NTHRYS

Online Internships

In-Silico · Online | NGS · Multi-Omics: Pipelines · Cloud/HPC

Review the focus areas below and choose one to apply for.

  1. NGS QC: FastQC / MultiQC / Adapter Trimming
  2. Read Alignment (DNA): BWA-MEM / Bowtie2
  3. Read Alignment (RNA): STAR / HISAT2
  4. Reference Indexing & Genome Resources
  5. WGS Variant Calling: GATK Best Practices
  6. WES Variant Calling & Panel Workflows
  7. DeepVariant & DRAGEN-style Pipelines (concepts)
  8. Variant Filtering, Recalibration & Quality Gates
  9. Structural Variants: Manta / DELLY
  10. Copy Number Variation: CNVkit / GATK-gCNV
  11. Loss of Heterozygosity & Purity/Ploidy Basics
  12. RNA-seq Quantification: Salmon / Kallisto
  13. DGE Analysis: DESeq2 / edgeR / limma-voom
  14. Transcript Assembly & Isoforms: StringTie
  15. Alternative Splicing Analysis: rMATS
  16. ChIP-seq Peak Calling & QC: MACS2
  17. ATAC-seq Footprinting & Motifs: HOMER
  18. DNA Methylation/WGBS: Bismark Pipelines
  19. Bisulfite QC & Differential Methylation
  20. Single-Cell RNA-seq: Seurat Basics
  21. Single-Cell RNA-seq: Scanpy Basics
  22. Single-Cell QC/Integration/Batch Correction
  23. Spatial Transcriptomics: Space Ranger → Seurat
  24. Metagenomics 16S: QIIME2 End-to-End
  25. Shotgun Metagenomics: HUMAnN / MetaPhlAn
  26. De-novo Assembly: SPAdes / MEGAHIT
  27. Genome Annotation: Prokka / PGAP
  28. Functional Enrichment: GSEA / Enrichr / clusterProfiler
  29. Gene Set Curation & Pathway Databases
  30. Multi-Omics Integration: MOFA / mixOmics
  31. Network Biology & Co-expression Modules
  32. Clinical Variant Interpretation: ClinVar / COSMIC
  33. Annotation & Databases: VEP / ANNOVAR
  34. IGV / UCSC Track Hubs / Visualization
  35. Reproducible Reports: Quarto / R Markdown
  36. Data Versioning & FAIR: DVC / Metadata
  37. Workflow Engines: Snakemake Fundamentals
  38. Workflow Engines: Nextflow Fundamentals
  39. Containerization: Conda / Docker / Singularity
  40. HPC Scheduling: SLURM Job Arrays & Logs
  41. Cloud Basics: AWS/GCP/Azure for Omics
  42. File Orchestration: NF-Tower / Snakemake Reports
  43. Quality Dashboards & MultiQC Custom Panels
  44. Sample Sheet Design & Cohort Metadata
  45. Batch Effects & Confounders Handling
  46. Benchmarking & Synthetic Controls
  47. Security & Compliance Considerations
  48. Regulatory Data Submission (ENA/GEO/SRA)
  49. Packaging & Sharing Pipelines (Git/GitHub)
  50. Capstone: End-to-End Omics Pipeline Build

In-Silico · Online | Clinical Genomics: Variant Interpretation

Review the focus areas below and choose one to apply for.

  1. Clinical NGS QC: coverage, uniformity, duplication
  2. Target capture/WES/WGS assay characteristics
  3. Read alignment & recalibration (BWA-GATK)
  4. Germline SNV/Indel calling (GATK Best Practices)
  5. Somatic SNV/Indel calling (Mutect2/VarDict)
  6. Low-allele fraction variant detection
  7. Structural variants (Manta/DELLY/LUMPY)
  8. Copy-number variants (ExomeDepth/gCNV)
  9. Mitochondrial variants & heteroplasmy
  10. Phasing & compound heterozygosity
  11. Trio analysis & inheritance models
  12. Repeat expansions & STR detection
  13. RNA-seq for splicing/allele-specific expression
  14. Splice impact prediction & validation routes
  15. Variant normalization & left-alignment
  16. ClinVar/OMIM/HGMD interrogation (concepts)
  17. Population frequencies (gnomAD) & sub-pops
  18. In-silico predictors (REVEL, CADD, SpliceAI)
  19. Gene-disease validity & constraint metrics
  20. ACMG/AMP classification framework (germline)
  21. ACMG evidence codes: PM/PP/PS/BP/BS
  22. Somatic guidelines (AMP/ASCO/CAP tiers)
  23. Actionability (OncoKB/CIViC/COSMIC)
  24. LOH, TMB, MSI pipelines (clinical context)
  25. Copy-number driven biomarkers (HER2, MET, etc.)
  26. Fusion detection (STAR-Fusion/Arriba) overview
  27. Clinical reporting templates & phrasing
  28. VUS management & reclassification policy
  29. Secondary/incidental findings (ACMG SF v3)
  30. Carrier screening panel curation
  31. Pharmacogenomics (CPIC, PharmGKB)
  32. Newborn screening & rare disease workflows
  33. Mendelian gene panels (virtual panels)
  34. Gene curation with ClinGen SOPs
  35. Reference transcripts & MANE Select
  36. HGVS nomenclature & transcript selection
  37. Data traceability, audit & LIMS linkage
  38. Verification/validation (CLSI, CAP) concepts
  39. QC dashboards & run acceptance criteria
  40. Proficiency testing & inter-lab concordance
  41. Ethical/legal: consent, privacy, reporting scope
  42. Regulatory submissions & documentation
  43. Cloud/HPC in clinical settings (guardrails)
  44. Automated evidence collection (VEP/ANNOVAR)
  45. Knowledge base building & evidence tagging
  46. Variant database hygiene & versioning
  47. Family-based reanalysis strategy
  48. Tumor-normal vs tumor-only trade-offs
  49. Clinical MTB case write-ups
  50. Capstone: end-to-end clinical interpretation

In-Silico · Online | Single-Cell · Spatial: scRNA-seq · STomics

Review the focus areas below and choose one to apply for.

  1. Single-cell experimental design & sample types
  2. Cell/nuclei isolation & quality considerations (concepts)
  3. Unique molecular identifiers (UMIs) & barcodes
  4. Library types: 3′/5′ tag, full-length, plate vs droplet
  5. Raw data structure: FASTQ + feature-barcode matrices
  6. Initial QC: sequencing depth & read structure checks
  7. Cell-level QC: library size, feature counts, mito/ribo%
  8. Empty droplets & ambient RNA (SoupX-style concepts)
  9. Doublet detection strategies (DoubletFinder/Scrublet)
  10. Normalization strategies (log, sctransform)
  11. Feature selection & HVG detection
  12. Dimensionality reduction: PCA, UMAP, t-SNE
  13. Graph construction & clustering (Louvain/Leiden)
  14. Cluster annotation with canonical markers
  15. Differential expression across clusters/conditions
  16. Pseudobulk aggregation for robust testing
  17. Trajectory inference & pseudotime (Monocle/Slingshot)
  18. RNA velocity (scVelo concepts)
  19. Cell–cell communication analysis (CellChat/NicheNet)
  20. Integration of multiple batches/experiments
  21. Reference-based label transfer & mapping
  22. CITE-seq & multi-modal data (RNA+ADT)
  23. scATAC-seq QC & peak calling overview
  24. Linking chromatin accessibility to gene expression
  25. Regulon analysis (SCENIC-style concepts)
  26. Single-cell multi-omics integration strategies
  27. Data subsetting & re-clustering workflows
  28. Handling large datasets & sparse matrices
  29. Metadata management & experimental covariates
  30. Spatial technologies overview (Visium, MERFISH, etc.)
  31. Tissue image / spot alignment & QC
  32. Spatial normalization & smoothing concepts
  33. Spatially variable gene detection
  34. Deconvolution of spots into cell-type mixtures
  35. Spatial clustering & neighborhood analysis
  36. Colocalization & ligand–receptor patterns in space
  37. Integration of single-cell and spatial data
  38. Building spatial maps of cell states
  39. Creating marker panels & signatures
  40. Benchmarking pipelines with public datasets
  41. Report-ready figures for manuscripts
  42. Reproducible single-cell workflows (best practices)
  43. Project structure & versioning for scRNA-seq
  44. Sharing count matrices & metadata (FAIR)
  45. Ethical handling of clinical single-cell data
  46. Automation with Snakemake/Nextflow (high level)
  47. Cloud/HPC strategies for large scRNA-seq
  48. Documentation & notebooks for reviewers
  49. Quality dashboards & interactive exploration (Shiny/Dash)
  50. Capstone: full single-cell + spatial analysis report

In-Silico · Online | Metagenomics: Microbiome Analytics

Review the focus areas below and choose one to apply for.

  1. Microbiome study design & sample types
  2. DNA extraction biases & quality checks (concepts)
  3. 16S/18S/ITS amplicon vs shotgun metagenomics
  4. Library prep considerations & sequencing depth
  5. Raw reads QC: quality, adapters, contaminants
  6. Host read removal & decontamination strategies
  7. Amplicon processing with QIIME2 end-to-end
  8. DADA2/UNOISE pipelines for ASV inference
  9. OTUs vs ASVs and choice of reference DB
  10. Taxonomic assignment (SILVA, Greengenes, GTDB)
  11. Alpha diversity metrics & rarefaction curves
  12. Beta diversity, distance metrics & ordination
  13. PERMANOVA and other community-level tests
  14. Differential abundance (DESeq2/ANCOM-BC concepts)
  15. Compositionality & appropriate normalisation
  16. Contaminant detection (decontam-style workflows)
  17. Longitudinal & repeated-measures microbiome data
  18. Shotgun taxonomic profiling (Kraken2/Bracken)
  19. Shotgun taxonomic profiling (MetaPhlAn)
  20. Functional profiling of pathways (HUMAnN)
  21. Resistome profiling (ARG databases concepts)
  22. Viruses, phages & virome analysis basics
  23. Mycobiome & fungal community profiling basics
  24. Metagenome assembly (MEGAHIT/SPAdes overview)
  25. Binning metagenome-assembled genomes (MAGs)
  26. Bin refinement, de-duplication & quality checks
  27. CheckM/GTDB-Tk for MAG quality & taxonomy
  28. Pangenome & strain-resolved analysis (concepts)
  29. Metatranscriptomics (RNA) workflows overview
  30. Metaproteomics/metabolomics integration (high level)
  31. Host–microbiome data integration
  32. Microbiome & clinical covariates (metadata models)
  33. Batch effects & technical confounders handling
  34. Environment/soil/water microbiome specifics
  35. Animal/human gut microbiome specifics
  36. Food/fermentation microbiome analytics
  37. AMR surveillance using metagenomics
  38. Functional enrichment & pathway interpretation
  39. Network analysis & co-occurrence patterns
  40. Biomarker discovery & machine learning (intro)
  41. Reproducible QIIME2 pipelines & artefacts
  42. Reproducible shotgun pipelines (Nextflow/Snakemake)
  43. Reporting standards & MIxS/FAIR principles
  44. Submission to public repositories (ENA/SRA/EBI)
  45. Dashboard-style visual summaries for stakeholders
  46. Project structure, naming & version control
  47. Privacy, ethics & human-associated microbiome data
  48. Basic cloud/HPC strategies for large microbiomes
  49. End-to-end case study: 16S workflow
  50. End-to-end case study: shotgun microbiome study

In-Silico · Online | Systems Biology: Networks & Pathways

Review the focus areas below and choose one to apply for.

  1. Systems biology principles: networks, feedback, robustness
  2. Pathways vs networks vs gene sets (conceptual distinctions)
  3. Interaction types: PPI, TF–target, metabolic, signalling
  4. Pathway databases: KEGG, Reactome, WikiPathways
  5. Interaction databases: STRING, BioGRID, IntAct
  6. Gene set collections: GO, MSigDB, custom panels
  7. Building co-expression networks (correlation-based)
  8. WGCNA basics: adjacency, TOM & scale-free topology
  9. Module detection & dynamic tree cut
  10. Module–trait relationships & hub gene identification
  11. Regulatory network enrichment (TFs & motifs)
  12. Gene regulatory network inference (ARACNe/GENIE3 concepts)
  13. Bayesian, ODE & logical network models (overview)
  14. Over-representation analysis (ORA) for pathways
  15. GSEA-style enrichment on ranked gene lists
  16. Gene set variation analysis (GSVA, ssGSEA)
  17. Topology-aware methods (SPIA/CAMERA/ROAST concepts)
  18. Combining multiple enrichment results coherently
  19. Multi-omics integration in pathway context (MOFA/mixOmics concepts)
  20. Integrating transcriptomics with proteomics in networks
  21. Integrating metabolomics to metabolic networks
  22. Differential network & network rewiring analysis
  23. Module preservation across datasets/cohorts
  24. Network propagation & diffusion concepts
  25. Random walk with restart (RWR) on biological graphs
  26. Disease gene prioritisation using interaction networks
  27. Disease modules & subnetwork extraction
  28. Network-based drug–target & drug–disease mapping
  29. Drug repurposing via network proximity
  30. Network-constrained feature selection & ML (intro)
  31. Graph representations: adjacency matrices & edgelists
  32. Graph metrics: degree, centrality, clustering, paths
  33. Community detection & graph clustering (Louvain etc.)
  34. Heterogeneous & bipartite biological networks
  35. Single-cell networks & gene modules per cell type
  36. Spatial transcriptomics neighborhood networks
  37. Ligand–receptor signalling networks (CellChat/NicheNet concepts)
  38. Host–pathogen interaction networks
  39. Host–microbiome & multi-kingdom interaction graphs
  40. Knowledge graphs & ontology-backed biomolecular networks
  41. FAIR & reusable network objects (GraphML, igraph, tidygraph)
  42. Network visualization styles & layout selection
  43. Cytoscape workflows: import, style, export
  44. Scripting Cytoscape (RCy3/py2cytoscape concepts)
  45. Automated reporting from network analyses (R Markdown/Quarto)
  46. Reproducible pipelines for enrichment + network analysis
  47. Evaluating database bias, coverage & confidence scores
  48. Best practices for network figure design in publications
  49. Project organisation, versioning & documentation
  50. Capstone: end-to-end systems biology & network report

In-Silico · Online | Comp Chem: Cheminformatics · QSAR/ADMET

Review the focus areas below and choose one to apply for.

  1. Chemical structure formats & representations (SMILES, SDF, MOL)
  2. 2D structure drawing & curation best practices
  3. Chemical databases & registries (ChEMBL, PubChem, internal)
  4. Standardization, salt-stripping & tautomer handling
  5. Physicochemical descriptors (logP, pKa, TPSA, HBD/HBA)
  6. Topological & fragment-based molecular descriptors
  7. 3D descriptors & conformer generation concepts
  8. Fingerprint types (MACCS, Morgan, Daylight-like)
  9. Similarity searches & diversity analysis
  10. Chemical space visualization (PCA, t-SNE, UMAP)
  11. QSAR workflow design & dataset preparation
  12. Endpoint curation & assay harmonization
  13. Train/validation/test splits & time-split concepts
  14. Class imbalance & resampling strategies
  15. Linear QSAR (MLR, PLS) basics
  16. Non-linear ML models (RF, GBM, SVM, NN) for QSAR
  17. Regression vs classification QSAR models
  18. Feature selection & dimensionality reduction
  19. Model validation: internal & external metrics
  20. Y-randomization & leakage checks
  21. Applicability domain (AD) concepts & methods
  22. Interpretability: feature importance & SHAP-style ideas
  23. Virtual screening pipeline design
  24. Hit triage and ranking strategies
  25. Statistical vs physics-based scoring synergy
  26. ADMET endpoints: solubility & permeability models
  27. Metabolism & clearance (hepatic, renal) modelling
  28. CYP450 & DDI risk in-silico predictions
  29. Toxicity endpoints: hERG, hepatotoxicity, genotoxicity
  30. Rule-based filters (Lipinski, Veber, PAINS)
  31. Multi-parameter optimization (MPO) scores
  32. Structure-based design overview (docking basics)
  33. Binding site preparation & protonation states (concepts)
  34. Ligand preparation & tautomers for docking
  35. Docking workflow & scoring function concepts
  36. Post-docking analysis & rescoring ideas
  37. Free energy approaches (MM/GBSA, FEP) overview
  38. Conformational analysis & flexibility
  39. Molecular mechanics vs quantum mechanics basics
  40. QM for pKa/tautomer & reactivity insight (high level)
  41. Cheminformatics data engineering & pipelines
  42. Compound ID management & tracking
  43. Data quality, outliers & curation SOPs
  44. Cheminformatics for library design & expansion
  45. Scaffold analysis & series expansion
  46. Integrating QSAR/ADMET with medicinal chemistry cycles
  47. Regulatory context for in-silico models (OECD principles)
  48. Documentation, versioning & model cards
  49. Basic automation of QSAR/ADMET workflows
  50. Capstone: build & document a QSAR/ADMET model

In-Silico · Online | Structural Biology: Docking · Molecular Dynamics

Review the focus areas below and choose one to apply for.

  1. Structural biology techniques overview (X-ray, NMR, cryo-EM)
  2. PDB/mmCIF formats, headers & annotations
  3. Structure validation: clashes, geometry & Ramachandran
  4. Missing residues, alternate conformations & occupancy
  5. Biological assembly vs asymmetric unit
  6. Protonation states, tautomers & pH-dependent features (concepts)
  7. Binding site identification & pocket detection
  8. Non-covalent interactions: H-bonds, hydrophobics, salt-bridges
  9. Ligand structure preparation & standardisation
  10. Protein preparation pipelines (cleanup & optimisation)
  11. Grid generation & binding box definition
  12. Docking search algorithms (systematic, stochastic, genetic)
  13. Scoring functions (force-field, empirical, knowledge-based)
  14. Covalent docking concepts & warhead considerations
  15. Docking protocol validation & enrichment metrics
  16. Redocking & cross-docking workflows
  17. Consensus scoring & rescoring strategies
  18. Post-docking filtering & visual triage
  19. Virtual screening campaigns & library handling
  20. Pose visualisation & interaction diagrams
  21. Molecular dynamics fundamentals & force fields
  22. Topology building for proteins & ligands
  23. Solvation, ion placement & box types
  24. Energy minimisation & relaxation
  25. Equilibration: NVT, NPT & restraints
  26. Production MD setup & run management
  27. Trajectory handling, stripping & storage
  28. MD analysis: RMSD, RMSF, radius of gyration
  29. Hydrogen bonds, contacts & distances over time
  30. Binding pocket stability & induced-fit insight
  31. MM/PBSA & MM/GBSA binding free energy estimates
  32. Free energy methods overview (FEP, TI, umbrella sampling concepts)
  33. Enhanced sampling basics (metadynamics, replica exchange concepts)
  34. Coarse-grained MD concepts & when to use
  35. Membrane protein system setup (lipid bilayers) overview
  36. Nucleic acids & protein–DNA/RNA complexes handling
  37. Allosteric site identification & analysis
  38. Fragment-based design in a structural context
  39. Water networks & displacement analysis
  40. Structure-based pharmacophore modelling
  41. Ensemble docking & induced-fit strategies
  42. Homology modelling & template selection
  43. Loop modelling & structure refinement
  44. Protein–protein docking overview & scoring
  45. Variant & resistance mutation analysis (structural)
  46. Integrating cryo-EM maps & models (concepts)
  47. Automating structural workflows via scripts
  48. Best practices for reproducibility in SBDD work
  49. Documenting protocols, parameters & seeds
  50. Preparing high-quality structural figures for publication
  51. Capstone: SBDD mini-project combining docking + MD

In-Silico · Online | Pharmacometrics: PK/PD Modeling · MIDD

Review the focus areas below and choose one to apply for.

  1. PK fundamentals: ADME, concentration–time profiles
  2. Noncompartmental analysis (NCA) basics
  3. One- and two-compartment PK models (IV bolus)
  4. Infusion and extravascular dosing models
  5. Clearance, volume, half-life & exposure relationships
  6. Absorption models: first-order, zero-order, transit compartments
  7. Lag time, flip–flop kinetics & absorption complexities
  8. Bioavailability (F) and bioequivalence metrics (AUC, Cmax)
  9. Linear vs nonlinear PK (capacity-limited, Michaelis–Menten)
  10. Time-varying processes (autoinduction, tolerance) basics
  11. Population PK (popPK) concepts & variability sources
  12. Random effects: inter-individual, inter-occasion, residual error
  13. Covariate model building strategies (graphical, stepwise, full)
  14. Handling body size, age & organ function as covariates
  15. Allometric scaling & maturation functions in paediatrics
  16. Dataset structure for popPK/PKPD (wide vs long formats)
  17. BLQ data handling (M1–M4 methods, concepts)
  18. Model diagnostics: GOF plots, residuals, IWRES, CWRES
  19. Prediction- and simulation-based diagnostics (VPC, pcVPC)
  20. Bootstrap & parameter precision assessment
  21. Shrinkage, identifiability & parameter correlation
  22. PK/PD structural models: direct effect models
  23. Indirect response & turnover models
  24. Emax & sigmoid Emax models for continuous PD
  25. Effect compartment & hysteresis loop concepts
  26. Exposure–response relationships for efficacy
  27. Exposure–safety & tolerability modelling
  28. Dose–response modelling for binary & ordered endpoints
  29. Time-to-event (TTE) modelling basics (hazard, survival)
  30. Therapeutic drug monitoring (TDM) & Bayesian forecasting
  31. Drug–drug interaction (DDI) modelling (inhibition/induction)
  32. Special populations: renal/hepatic impairment, paediatrics
  33. Physiologically based PK (PBPK) concepts & use-cases
  34. Model-based meta-analysis (MBMA) basics
  35. Clinical trial simulation for design optimisation
  36. Adaptive and seamless design concepts (model-informed)
  37. Regulatory guidance (FDA/EMA) on PK/PD & ER analyses (overview)
  38. Model-informed drug development (MIDD) case-study flow
  39. Workflows in NONMEM/Monolix/nlmixr (high level)
  40. Using PsN-style tools for automation & diagnostics
  41. R-based workflows: data prep, fitting, diagnostics & plots
  42. Good modelling practice & analysis plans (SAPs/MAPs)
  43. Documentation, model history & decision traceability
  44. Version control for scripts, models & datasets
  45. Reproducible reports in R Markdown/Quarto
  46. Communicating PK/PD results to non-modellers
  47. Visual storytelling: spaghetti plots, CI bands, forest plots
  48. Simple PK/PD dashboards for clinicians & teams
  49. Cross-functional collaboration with clinicians & statisticians
  50. Capstone: end-to-end popPK + exposure–response analysis

In-Silico · Online | Toxicology: Safety Prediction · Risk

Review the focus areas below and choose one to apply for.

  1. Basics of toxicology: dose–response & risk concepts
  2. In-silico toxicology landscape & applications
  3. Chemical structure curation for tox modelling
  4. Toxicity endpoints: acute, chronic, genotoxicity, carcinogenicity
  5. Organ-specific toxicity: hepatotoxicity, nephrotoxicity, cardiotoxicity
  6. hERG & QT prolongation risk prediction (concepts)
  7. In-silico mutagenicity & Ames test surrogates
  8. Skin sensitisation & irritation/corrosion models
  9. Respiratory & inhalation toxicity prediction
  10. Developmental & reproductive toxicity (DART) concepts
  11. Endocrine disruption & nuclear receptor alerts
  12. Tox read-across principles & category formation
  13. Structural alerts & rule-based profilers (PAINS/ToxAlerts style)
  14. QSAR models for toxicity endpoints (classification/regression)
  15. Descriptor & fingerprint choices for tox QSAR
  16. Applicability domain (AD) for tox models
  17. Model validation vs OECD principles (overview)
  18. Consensus modelling & model stacking for safety
  19. Handling imbalanced toxicity datasets
  20. Negative vs positive control selection & curation
  21. ADME vs tox: integrating clearance & exposure
  22. Margin of safety & safety indices (basic calculations)
  23. In-vitro to in-vivo extrapolation (IVIVE) concepts
  24. Benchmark dose (BMD) modelling basics
  25. PBPK/PD for safety assessment (intro)
  26. Occupational & environmental exposure scenarios
  27. REACH-style chemical safety assessment (overview)
  28. ICH M7 & genotoxic impurities (concepts)
  29. TTC (threshold of toxicological concern) concepts
  30. In-silico DILI (drug-induced liver injury) risk
  31. Idiosyncratic toxicity hypotheses & flags
  32. Nanomaterial toxicity: in-silico considerations
  33. Ecotoxicology endpoints & QSARs (fish, daphnia, algae)
  34. Bioaccumulation & persistence (PBT/vPvB criteria, concepts)
  35. Mixture toxicity & combination effects (intro)
  36. Off-target & polypharmacology in safety space
  37. Off-target activity mining from bioactivity databases
  38. Using omics & pathway data in toxicology
  39. Network-based toxicology & adverse outcome pathways (AOPs)
  40. Automated profiling with open-source tox platforms
  41. Uncertainty communication & model disclaimers
  42. Regulatory acceptance of in-silico toxicology (high level)
  43. Building transparent tox model reports
  44. Reproducible pipelines for tox data & models
  45. Visualisations: tox radar charts & traffic-light tables
  46. Prioritising compounds for testing using in-silico scores
  47. Integrating in-silico tox into medicinal chemistry cycles
  48. Data management, versioning & audit trails in tox projects
  49. Ethics of reducing animal use with computational methods
  50. Capstone: design an in-silico safety screening cascade

In-Silico · Online | AI/ML for Biosciences: LLMs · CV · Tabular

Review the focus areas below and choose one to apply for.

  1. Problem framing & ML workflow for bioscience use-cases
  2. Bioscience data types: tabular, images, sequences, text, graphs
  3. Data cleaning & QC for lab/clinical datasets
  4. Handling missing values, outliers & batch effects
  5. Train/validation/test splits & data leakage prevention
  6. Feature engineering for omics & lab measurements
  7. Classical ML: linear/logistic regression for biomarkers
  8. Tree-based models: Random Forest & Gradient Boosting (concepts)
  9. Regularisation (L1/L2, elastic net) & overfitting control
  10. Model evaluation metrics (AUC, PR, accuracy, RMSE, etc.)
  11. Imbalanced classes & rare-event modelling strategies
  12. Cross-validation, nested CV & robust performance estimation
  13. Calibration, decision thresholds & clinical utility curves
  14. Uncertainty estimation & confidence intervals (concepts)
  15. Dimensionality reduction (PCA, t-SNE, UMAP) for bioscience data
  16. Clustering & unsupervised learning for patient/assay stratification
  17. ML on tabular omics/clinical datasets end-to-end
  18. Feature selection & stability analysis
  19. Simple AutoML-style pipelines for small teams
  20. Model interpretability: global feature importance & partial dependence
  21. SHAP/LIME-style local explanations (concepts)
  22. Computer vision basics for microscopy & pathology images
  23. CNN concepts: convolutions, pooling & feature maps
  24. Transfer learning with pre-trained vision models
  25. Data augmentation & stain/illumination variability handling
  26. Object detection & segmentation for cells/tissues (high level)
  27. Quality control of imaging datasets & annotations
  28. Evaluation of CV models (IoU, Dice & pixel-wise metrics)
  29. Intro to sequence models (1D CNNs, RNNs, transformers – concepts)
  30. LLM foundations & prompt design for biosciences
  31. Using LLMs for protocol summarisation & documentation assistance
  32. LLM-assisted EDA & coding in R/Python notebooks
  33. Text classification & entity extraction for biomedical text (concepts)
  34. Embeddings & semantic search for scientific literature
  35. Domain-specialised models (biomedical LLMs – concepts)
  36. Basic RAG-style workflows with scientific PDFs (high level)
  37. Multimodal ideas: combining tabular + text or tabular + images
  38. Data governance, anonymisation & PHI/PII basics
  39. Bias, fairness & robustness checks in medical ML
  40. ML model lifecycle: experiment tracking & versioning
  41. Reproducible pipelines with notebooks & scripts
  42. Intro to MLOps: packaging, environments & deployment options (concepts)
  43. Monitoring model drift & refreshing datasets (concepts)
  44. Documentation & model cards for bioscience models
  45. Visual reporting: dashboards & interpretability reports
  46. Collaboration patterns with biologists, clinicians & engineers
  47. Reading & critiquing ML for health/bio papers
  48. Validation & translation of ML into lab/clinic workflows
  49. Regulatory & ethical overview for AI in health & biotech
  50. Capstone: end-to-end ML workflow on a bioscience dataset

In-Silico · Online | BioNLP · Text Mining: Literature · Clinical Notes

Review the focus areas below and choose one to apply for.

  1. Biomedical text sources: PubMed, preprints, patents & clinical notes
  2. Data acquisition and APIs (Entrez/Europe PMC concepts)
  3. Text pre-processing: tokenisation, sentence splitting, normalisation
  4. Handling Unicode, punctuation & abbreviations in biomedical text
  5. Stopwords, stemming, lemmatisation vs domain vocabulary
  6. Bag-of-words and tf–idf representations
  7. Word embeddings (word2vec, GloVe – concepts)
  8. Contextual embeddings (BioBERT, ClinicalBERT – concepts)
  9. Subword tokenisation (BPE/WordPiece) & vocabulary handling
  10. Biomedical corpora & annotation schemes basics
  11. Named Entity Recognition (NER) for genes, diseases, drugs, variants
  12. Dictionary & rule-based NER (lexicons, regex) foundations
  13. ML/neural NER pipelines overview
  14. Entity normalisation to UMLS, MeSH, SNOMED CT (concepts)
  15. Abbreviation detection & disambiguation in biomedical text
  16. Relation extraction: gene–disease, drug–drug, drug–event
  17. Co-occurrence vs supervised relation extraction
  18. Event extraction for biological processes (high level)
  19. Negation, speculation & assertion detection
  20. Document classification for topics, trial phases & study types
  21. Multi-label tagging (e.g., MeSH/subheading assignment)
  22. Sentence/passage ranking for evidence retrieval
  23. Information retrieval & BM25 basics
  24. Semantic search with dense embeddings (concepts)
  25. Question answering over biomedical literature (high level)
  26. Summarisation of biomedical articles (extractive/abstractive – concepts)
  27. RAG-style workflows with PDFs and databases (overview)
  28. Knowledge graph construction from text (entities + relations)
  29. Ontology & terminology integration into pipelines
  30. Text mining for systematic reviews & evidence synthesis
  31. Pharmacovigilance signal detection from case reports/text (concepts)
  32. Clinical note processing: de-identification & PHI basics
  33. Text classification for triage, routing & alerts (concepts)
  34. Bias, fairness & domain shift in clinical NLP
  35. Evaluation metrics for NER/RE/classification (precision, recall, F1, etc.)
  36. Annotation tools & workflows (BRAT-style concepts)
  37. Inter-annotator agreement & guidelines design
  38. Active learning & weak supervision for annotation (concepts)
  39. Pipeline design & orchestration for large-scale text mining
  40. Working with rate-limited APIs & big literature pulls
  41. Storage/indexing for text & embeddings (overview)
  42. Visualising entities, relations & evidence networks
  43. Reproducible notebooks & scripts for BioNLP pipelines
  44. Model documentation and “model cards” for BioNLP systems
  45. Prompt design for LLMs on biomedical text
  46. Hallucinations, verification & human-in-the-loop review
  47. Integrating text mining with omics/clinical data projects
  48. Reporting mined evidence to scientists & clinicians
  49. Maintaining/updating models, dictionaries & ontologies
  50. Packaging workflows for collaboration (repos, configs, docs)
  51. Capstone: mini literature-mining pipeline for a chosen disease/target

In-Silico · Online | Health Informatics & RWE: RWD · Epidemiological Modeling

Review the focus areas below and choose one to apply for.

  1. Health informatics landscape & data sources (EHR, claims, registries)
  2. Data models & standards (HL7 v2/v3 basics, FHIR concepts)
  3. Terminologies & coding (ICD, SNOMED CT, LOINC, RxNorm overview)
  4. Clinical data quality: completeness, correctness, consistency
  5. Phenotyping from EHR: rule-based cohort definitions
  6. Extract–transform–load (ETL) pipelines for clinical data
  7. De-identification & pseudonymisation basics
  8. HIPAA/GDPR-style privacy concepts (non-legal)
  9. Common data models (CDM) overview: OMOP, PCORnet, i2b2
  10. Mapping source data to a CDM (high level)
  11. Time-series structures: visits, episodes, person-time
  12. Basic epidemiologic measures: incidence, prevalence, risk, rate
  13. Person-time and dynamic cohorts
  14. Confounding, bias & effect modification (concepts)
  15. Study designs: cohort, case–control, case–crossover
  16. Target trial emulation using observational data (high level)
  17. Propensity scores & balancing methods (concepts)
  18. Matching/stratification & inverse probability weighting basics
  19. Survival analysis (Kaplan–Meier, Cox model overview)
  20. Competing risks & multi-state concepts
  21. Missing data mechanisms & simple handling strategies
  22. Sensitivity analyses for unmeasured confounding (concepts)
  23. RWE for safety: signal detection from real-world data
  24. RWE for effectiveness & comparative effectiveness research
  25. Pragmatic trials & hybrid designs (high level)
  26. Registries & post-marketing studies (overview)
  27. RWE in regulatory decision-making (FDA/EMA concepts)
  28. Vaccine effectiveness & safety monitoring (epidemiologic view)
  29. Infectious disease modelling: deterministic SIR-type models
  30. Stochastic & agent-based modelling ideas (concepts)
  31. Reproduction number (R₀, Rₜ) and epidemic curves
  32. Scenario analysis & intervention modelling (NPIs, vaccination)
  33. Spatial epidemiology basics: mapping incidence & risk
  34. Cluster detection & hotspot analysis concepts
  35. Dashboards for surveillance & situational awareness
  36. Data pipelines for automated refresh & QC checks
  37. Visual analytics: trend plots, heatmaps, small multiples
  38. Communicating risk, uncertainty & limitations to stakeholders
  39. Governance: data access boards, SOPs & audit trails
  40. Metadata, data dictionaries & lineage documentation
  41. Reproducible R/Python pipelines for RWE studies
  42. Notebook vs script workflows & project templates
  43. Version control for code, cohorts & definitions
  44. Simple containerisation for deployable analytic pipelines
  45. ETL + analytics orchestration (Airflow/Prefect-style concepts)
  46. Quality management & validation of analytic code
  47. Collaboration with clinicians, epidemiologists & IT teams
  48. Writing RWE/epi study reports & technical appendices
  49. Health informatics career paths and role definitions
  50. Capstone: RWE/epidemiology analysis plan + mini pipeline
In-Silico · Online Biostatistics Design · Reproducibility

Check below focused areas and choose one to apply

  1. Types of data, scales of measurement & study variables
  2. Descriptive statistics: centre, spread & shape
  3. Visualising data: histograms, boxplots, scatterplots
  4. Probability basics & common distributions (normal, binomial, Poisson)
  5. Sampling distributions & Central Limit Theorem (CLT)
  6. Point estimation & confidence intervals (means, proportions)
  7. Hypothesis testing concepts: null, alternative, p-values, errors
  8. t-tests: one-sample, paired & two-sample comparisons
  9. ANOVA: one-way & simple post-hoc comparisons
  10. Non-parametric tests (Wilcoxon, Mann–Whitney, Kruskal–Wallis): basics
  11. Chi-square tests for independence & goodness-of-fit
  12. Correlation & simple linear regression
  13. Multiple linear regression & model diagnostics (concepts)
  14. Logistic regression for binary outcomes
  15. Odds ratios, risk ratios & interpreting regression output
  16. Time-to-event data & survival endpoints
  17. Kaplan–Meier curves & log-rank tests (overview)
  18. Cox proportional hazards model (concepts)
  19. Power & sample size for means & proportions
  20. Power & sample size for survival/clinical endpoints (high level)
  21. Parallel-group vs crossover, factorial & cluster trial designs
  22. Randomisation methods: simple, block, stratified (overview)
  23. Blinding, allocation concealment & protocol adherence
  24. Diagnostic test evaluation: sensitivity, specificity, PPV, NPV
  25. ROC curves & AUC interpretation
  26. Repeated measures & mixed-effects concepts
  27. Longitudinal data basics & correlated observations
  28. Handling missing data: MCAR/MAR/MNAR concepts
  29. Complete case, simple imputation & multiple imputation (overview)
  30. Multiple testing, family-wise error & FDR concepts
  31. Interim analyses & stopping rules (high-level idea)
  32. Bias types: selection, information, confounding
  33. Effect modification vs confounding (conceptual)
  34. Causal diagrams (DAGs) for design & adjustment planning (overview)
  35. Design of Experiments (DoE) for lab/biotech studies
  36. Factorial & fractional factorial designs (concepts)
  37. Blocking, randomisation & replication in DoE
  38. Response surface methods & optimisation (high level)
  39. Experimental design for omics & high-throughput assays
  40. Pre-registration & statistical analysis plans (SAPs)
  41. Reporting standards: CONSORT, STROBE, PRISMA (concepts)
  42. Reproducible workflows: scripts vs point-and-click
  43. Literate programming: R Markdown / Quarto / Jupyter
  44. Version control for code & analysis (Git basics)
  45. Data provenance, metadata & tidy data principles
  46. Simulation-based power & design checking
  47. Bootstrap & permutation tests (conceptual)
  48. Sensitivity analyses & robustness checks for key assumptions
  49. Collaboration between biostatisticians & domain scientists
  50. Capstone: design + analysis + reproducible report for a study
In-Silico · Online Dashboards & Data Viz R · Python · BI

Check below focused areas and choose one to apply

  1. Foundations of tidy data and data cleaning for visualisation
  2. Choosing appropriate chart types for different biomedical questions
  3. Univariate plots: histograms, density plots and boxplots
  4. Bivariate plots: scatterplots, line charts and bar charts
  5. Multivariate visualisation with colour, facets and small multiples
  6. Perceptual principles and avoiding misleading scales/encodings
  7. Designing publication-quality figures for manuscripts and theses
  8. Colour palettes, accessibility & colour-blind–friendly design
  9. Annotating plots with statistical summaries & uncertainty
  10. Visualising distributions, outliers and batch effects in omics data
  11. Time-series plots for longitudinal and monitoring data
  12. Visualising survival curves, risk tables and confidence bands
  13. Forest plots for effect sizes and meta-analyses (concepts)
  14. Correlation matrices and pair-plots for quick EDA
  15. Heatmaps for gene expression and high-dimensional matrices
  16. Clustered heatmaps & dendrograms (concepts)
  17. Volcano and MA plots for differential expression results
  18. Visualising dimensionality reduction (PCA, t-SNE, UMAP)
  19. Geospatial maps for public health and epidemiology
  20. Dashboards vs static reports: when to use which
  21. Wireframing a scientific dashboard (layout & UX basics)
  22. KPI tiles, summary cards and drill-down design
  23. Filter panels, slicers and linked views for exploration
  24. Designing dashboards for non-technical stakeholders
  25. R/ggplot2 grammar of graphics for layered plots
  26. R-based interactive plots (plotly/ggplotly-style concepts)
  27. Python matplotlib foundations for scientific plots
  28. Python seaborn-style high-level statistical plots (concepts)
  29. Python plotly-style interactive visualisations (concepts)
  30. R Shiny-style dashboard concepts for interactive apps
  31. Python Dash/Streamlit-style dashboard concepts
  32. BI tools (Power BI/Tableau-style) in scientific contexts (overview)
  33. Embedding statistical models & uncertainty into dashboards
  34. Parameter controls & scenario sliders for "what-if" analysis
  35. Visualising model performance (ROC, PR, calibration plots)
  36. Visualising ML feature importance and SHAP-style outputs (concepts)
  37. Designing QC dashboards for labs and NGS pipelines (concepts)
  38. Designing monitoring dashboards for clinical/operational metrics
  39. Handling large datasets: sampling & aggregation strategies
  40. Performance considerations for interactive dashboards (high level)
  41. Exporting figures for journals (size, DPI, formats)
  42. Exporting dashboard views and snapshots for reports
  43. Automating report generation with R Markdown/Quarto
  44. Automating PowerPoint/PDF exports from R/Python (concepts)
  45. Reusable plotting functions & theming for consistent branding
  46. Version-controlling dashboards & visual assets with Git
  47. Documenting dashboard logic & data lineage for auditability
  48. Collecting feedback and iterating on dashboard design
  49. Communicating limitations, caveats & uncertainty in visuals
  50. Capstone: design and implement a small dashboard/report for a bio/health dataset
In-Silico · Online Bio-Data Engineering ETL · Pipelines · Cloud/HPC

Check below focused areas and choose one to apply

  1. Fundamentals of data engineering for bio/health domains
  2. Source systems for bio data: instruments, LIMS, EHR, external databases
  3. Data formats: CSV, TSV, JSON, XML, Parquet basics
  4. Bio-specific formats: FASTQ, BAM/CRAM, VCF, HDF5, AnnData (concepts)
  5. Schema design for experimental and clinical datasets
  6. Data modelling: star/snowflake vs wide tables (overview)
  7. ETL vs ELT concepts and patterns
  8. Batch vs streaming ingestion (high level)
  9. File naming conventions and folder hierarchies for labs
  10. Metadata capture and data dictionaries
  11. Using checksums and manifests for file integrity
  12. Data validation rules and schema checks (concepts)
  13. Handling missing, inconsistent and out-of-range values
  14. ID management, primary keys and foreign keys
  15. Patient/sample IDs and pseudonymisation (non-legal overview)
  16. Log design: capturing run, pipeline and audit logs
  17. Data lineage and provenance tracking (overview)
  18. Designing staging, raw, curated and analytics layers
  19. Partitioning strategies for large tables and object stores
  20. Indexing and query performance basics
  21. Scheduling ETL jobs with cron-style tools (concepts)
  22. Workflow orchestration: Airflow/Prefect-like concepts
  23. Retry, backoff and idempotency in pipelines
  24. Alerting and monitoring for failed jobs
  25. Using message queues/pub-sub (high level)
  26. Interfacing with NGS pipelines and QC outputs
  27. Aggregating MultiQC-style metrics for dashboards
  28. Loading data into analytic databases/warehouses (concepts)
  29. SQL patterns for cohort extraction and summaries
  30. Designing APIs for data access (REST-style concepts)
  31. Object storage (S3/GCS/Azure) and folder layouts
  32. Permissions, IAM roles and principle of least privilege
  33. Basic encryption in transit and at rest (conceptual)
  34. Cloud cost awareness: storage vs compute trade-offs
  35. HPC clusters vs cloud VMs: when to use which
  36. SLURM-style schedulers: jobs, queues and job arrays (concepts)
  37. Container images for pipelines (Docker/Singularity-style concepts)
  38. Environment management and reproducibility (Conda-style)
  39. Config-driven pipelines (YAML/JSON configs)
  40. Template repos and cookiecutter-style project skeletons
  41. Testing ETL code and pipeline components (unit/smoke tests)
  42. Sample data and synthetic datasets for development
  43. Documentation for pipelines and datasets (README, ADRs)
  44. Data catalog and discovery concepts
  45. Governance checklists and access request workflows
  46. Backup, archiving and retention for research/clinical data
  47. Migrating pipelines between on-prem and cloud (high level)
  48. Handover practices for data engineering artefacts
  49. Collaboration between data engineers, bioinformaticians and IT
  50. Capstone: design a simple bio-data lake plus ETL plus analytics layer
In-Silico · Online Knowledge Graphs Ontologies · Semantics

Check below focused areas and choose one to apply

  1. Introduction to knowledge graphs (KGs) and semantic networks
  2. Nodes, edges, labels and properties: KG data model basics
  3. Graph vs relational vs document databases: when to use what (concepts)
  4. RDF triples vs property graphs: comparative concepts
  5. URIs/IRIs and identifiers in biomedical data
  6. Ontologies, taxonomies and controlled vocabularies: key differences
  7. Basic description logics (DL) intuition for practitioners
  8. OWL and RDFS: classes, properties and individuals (concepts)
  9. Reusing standard biomedical ontologies: GO, HPO, DO, etc. (overview)
  10. OBO Foundry principles and ontology ecosystem (high level)
  11. Terminology servers and mapping services (concepts)
  12. Designing simple domain models for diseases, drugs and genes
  13. Representing pathways, interactions and phenotypes in graphs (conceptual)
  14. Entity–relationship and concept modelling prior to graph design
  15. Ontology engineering lifecycle: requirements & competency questions
  16. Ontology editing tools (Protégé-style usage and patterns)
  17. Naming conventions, IDs and annotation properties
  18. Logical vs annotation axioms: keeping models clean (concepts)
  19. Reasoning and classification: what DL reasoners actually do (high level)
  20. Consistency checks and debugging unsatisfiable classes (conceptual)
  21. Integrating heterogeneous datasets into a unified KG
  22. Schema/ontology alignment and mapping strategies
  23. Cross-references, synonyms and equivalence mappings
  24. Normalising identifiers: CURIE patterns for HGNC, UniProt, etc. (concepts)
  25. SPARQL querying for RDF-style knowledge graphs (basic patterns)
  26. Graph query languages: Cypher/Gremlin/PGQL-style (overview)
  27. Graph patterns for gene–disease–drug queries
  28. Path, neighbourhood and subgraph queries for hypothesis exploration
  29. Inference-enriched querying: leveraging reasoners (conceptual)
  30. Provenance and evidence modelling, e.g. ECO-style (overview)
  31. Attaching scores, confidence and provenance to edges
  32. Modelling temporal/contextual qualifiers (time, tissue, species)
  33. Graph design for clinical concepts: diagnoses, labs, medications (high level)
  34. FAIR data principles and how KGs support FAIRness (overview)
  35. KG construction pipelines from tabular/relational data
  36. ETL for KGs: mapping CSV/SQL to triples/edges
  37. Mapping languages: R2RML/OBDA-style (conceptual view)
  38. Incremental updates and versioning strategies for ontologies/KGs
  39. Quality metrics for KGs: coverage, connectivity, consistency
  40. Graph visualisation tools and layout choices
  41. Integrating KGs with ML (node embeddings, link prediction concepts)
  42. Using KGs to power search, recommendation and Q&A (conceptual)
  43. Working with public biomedical KGs: Bio2RDF-style, etc. (overview)
  44. API design for KG-backed applications (REST/GraphQL concepts)
  45. Security, access control and governance in KG deployments (high level)
  46. Collaboration workflows: curators, modellers and engineers
  47. Documentation and onboarding for ontology/KG reusers
  48. Evaluating KG usefulness with real user queries and feedback
  49. Lightweight semantic models for small teams and projects
  50. Capstone: scoped biomedical KG/ontology design + example queries
In-Silico · Online GWAS · Stat Genetics Polygenic Risk

Check below focused areas and choose one to apply

  1. Foundations of population and quantitative genetics (conceptual)
  2. Hardy–Weinberg equilibrium, allele/genotype frequencies
  3. Linkage disequilibrium (LD), haplotypes and LD decay (concepts)
  4. Common study designs: case–control, cohort, trio and GWAS meta-analysis
  5. Genotyping technologies and SNP arrays (high-level overview)
  6. Genotype calling, quality control and imputation (concepts)
  7. Sample- and variant-level QC metrics and thresholds (conceptual)
  8. Population structure, ancestry estimation and PCA plots
  9. Relatedness, kinship and cryptic relatedness checks
  10. Basic association testing: allelic, genotypic and trend tests
  11. Logistic regression for binary traits (GWAS context)
  12. Linear regression for quantitative traits (GWAS context)
  13. Covariate adjustment: age, sex, ancestry PCs and batch effects
  14. Multiple testing and genome-wide significance thresholds
  15. Manhattan and Q–Q plots: construction and interpretation
  16. Inflation factors (λGC) and genomic control concepts
  17. Mixed models and LMM-style association (high level)
  18. Handling stratification and relatedness in association studies
  19. Conditional and joint association analyses (concepts)
  20. Fine-mapping and credible sets (high-level overview)
  21. Gene-based and region-based association testing (concepts)
  22. Pathway and enrichment-style analyses for GWAS hits (conceptual)
  23. Rare-variant and burden test basics (overview)
  24. Imputation reference panels and reference bias (high level)
  25. Trans-ethnic GWAS and transferability challenges
  26. eQTL and QTL-style association concepts
  27. Colocalisation (GWAS with QTL traits) concepts
  28. Post-GWAS functional annotation of variants (conceptual)
  29. Variant-to-gene linking strategies (overview)
  30. Polygenic inheritance and SNP heritability (concepts)
  31. SNP-heritability estimation and LD score regression (conceptual)
  32. Construction of polygenic risk scores (PRS) from GWAS summary stats
  33. Clumping and thresholding for PRS (high level)
  34. Bayesian and shrinkage methods for PRS (concepts)
  35. Evaluating PRS: AUC, R² and calibration (overview)
  36. Portability of PRS across ancestries (issues and considerations)
  37. Translational aspects of PRS (screening, stratification; non-clinical overview)
  38. Simulating genotype–phenotype data for teaching and validation
  39. Data formats for GWAS: PLINK, VCF, text summary stats
  40. Working with public GWAS catalog and summary-statistics resources
  41. Basic scripting workflows to run QC and association steps
  42. Reproducible pipelines for GWAS and PRS analysis (high level)
  43. Visualising GWAS and PRS results for presentations and reports
  44. Interpreting and communicating GWAS findings responsibly
  45. Understanding common pitfalls and over-interpretation risks
  46. Ethical, legal and social considerations in genetic risk analysis (non-legal)
  47. Integrating GWAS with other omics (brief conceptual overview)
  48. Documentation, metadata and analysis plans for GWAS projects
  49. Collaboration between statisticians, geneticists and clinicians
  50. Capstone: mini GWAS/PRS analysis using public summary data (conceptual pipeline)
In-Silico · Online Synthetic Biology Design & CAD

Check below focused areas and choose one to apply

  1. Overview of synthetic biology and design-build-test-learn cycles
  2. DNA as a programmable substrate: parts, devices and systems (concepts)
  3. Biological parts libraries: promoters, RBSs, CDS, terminators (overview)
  4. Standards for genetic parts and assemblies (conceptual)
  5. Design constraints: host, chassis, context and burden
  6. Gene circuit motifs: toggle switches, oscillators and logic gates (concepts)
  7. In-silico prototyping of simple gene circuits
  8. Concepts of metabolic engineering within synthetic biology
  9. Pathway selection and retrosynthesis-style route planning (high level)
  10. Flux and cofactor considerations (conceptual)
  11. DNA sequence design: codon usage, GC content and constraints
  12. Minimising off-target effects and unwanted homology (overview)
  13. Basic design rules to avoid unwanted secondary structures
  14. Insulation, orthogonality and composability concepts
  15. Host chassis options: bacteria, yeast, mammalian cells (conceptual)
  16. Genome-scale design vs plasmid-level design (high level)
  17. Genome editing design concepts (CRISPR guide design, high level)
  18. In-silico design of gRNAs and off-target scanning (conceptual)
  19. Regulatory element tuning: promoter/RBS strength design (concepts)
  20. Transcriptional, translational and post-translational control layers
  21. Signal processing and biosensor circuit concepts
  22. Kill-switches and biocontainment designs (conceptual)
  23. Modular cloning and assembly strategy planning (Golden Gate-style concepts)
  24. DNA assembly maps, compatibility and overhang planning
  25. Basic SBML-style model concepts for gene circuits
  26. Deterministic vs stochastic modelling of gene networks (high level)
  27. Parameter ideas: transcription, translation, degradation rates (conceptual)
  28. Simulating circuit dynamics to evaluate design behaviour (conceptual)
  29. Sensitivity-style thinking: which parameters influence behaviour most
  30. Constraint-based modelling ideas for metabolic pathways (conceptual)
  31. Multi-objective trade-offs: productivity vs growth vs stability
  32. Digital twins and in-silico strain design concepts
  33. Data needed to calibrate and refine synbio models (overview)
  34. Integrating omics data into design decisions (conceptual)
  35. CAD workflows for DNA constructs (design to sequence file)
  36. Annotation of designs with features, landmarks and metadata
  37. Version control for constructs and design iterations
  38. Bill of materials (BOM) for DNA synthesis and cloning (concepts)
  39. Design review checklists before sending constructs for synthesis
  40. Laboratory protocols as structured design outputs (conceptual)
  41. Design of experiments (DoE) ideas for testing circuit variants
  42. Recording build-test results for feedback into design
  43. Pipelines linking CAD tools to LIMS/ELN-style systems (conceptual)
  44. Graph representations of parts and constructs (high level)
  45. Risk thinking: failure modes in constructs and circuits
  46. Ethical, biosafety and biosecurity considerations in synbio design (non-regulatory overview)
  47. Applications: biosensors, biomanufacturing, cell-based therapies (conceptual survey)
  48. Communication of synthetic biology designs to wet-lab teams
  49. Documentation packages: maps, sequence files and simulation notes
  50. Capstone: scoped in-silico design and CAD for a simple synthetic biology construct or circuit
In-Silico · Online Signal & Image MRI · Pathology · CV

Check below focused areas and choose one to apply

  1. Basics of biomedical signals (ECG, EEG, EMG) and images (MRI, CT, microscopy)
  2. Sampling, Nyquist concepts and anti-aliasing in biomedical acquisition
  3. Noise sources in biomedical data and denoising strategies (conceptual)
  4. Time-domain features: peaks, intervals, morphology descriptors
  5. Frequency-domain analysis: FFT and power spectra (high level)
  6. Time–frequency concepts: STFT/wavelet thinking (overview)
  7. Digital filtering concepts: low/high/band-pass and notch filters
  8. Baseline wander, motion artefact and powerline interference (ECG/EEG)
  9. ECG waveform segmentation: P–QRS–T detection (concepts)
  10. Heart rate variability (HRV) feature families (conceptual overview)
  11. EEG channel layouts and basic rhythm bands (δ, θ, α, β, γ concepts)
  12. Event-related potentials (ERP) and simple averaging concepts
  13. Signal quality indices (SQI) and QC thinking for biosignals
  14. 2D image basics: pixels, resolution, bit depth and colour models
  15. Image histograms, contrast stretching and basic enhancement
  16. Smoothing, sharpening and edge detection filters (conceptual)
  17. Segmentation concepts: thresholding, region-based, clustering approaches
  18. Connected components and basic morphology (erosion/dilation) concepts
  19. Feature extraction: shape, texture and intensity descriptors
  20. Classical ML for classification/regression on engineered features
  21. Introduction to computer vision with biomedical examples
  22. Deep learning concepts for images: CNN intuition (no heavy math)
  23. Segmentation networks (U-Net-style ideas) for lesion/tissue masks
  24. Detection and localisation concepts (bounding boxes, heatmaps)
  25. Patch- and tile-based analysis for whole-slide pathology images (conceptual)
  26. Registration concepts: aligning multimodal images (e.g. MRI/CT)
  27. Motion correction concepts for dynamic imaging
  28. Region-of-interest (ROI) selection and feature summarisation
  29. Radiomics-style feature families (shape, intensity, texture concepts)
  30. Basic MRI sequences overview (T1/T2/FLAIR; non-physics focus)
  31. Simple brain MRI workflows: skull strip → segment → quantify (conceptual)
  32. Quantitative metrics: volumes, thickness, signal ratios (overview)
  33. Pathology image colour normalisation (concepts)
  34. Basic pipelines for nuclei or cell segmentation in histology
  35. Quality assurance for imaging pipelines and outputs
  36. Dataset curation: de-identification concepts for images/signals
  37. Train/validation/test splits and leakage pitfalls (conceptual)
  38. Evaluation metrics: accuracy, ROC/AUC, Dice, IoU (overview)
  39. Cross-validation and robustness thinking for biomedical ML
  40. Simple explainability concepts (saliency/heatmap intuition)
  41. Pipeline design: from raw DICOM/waveforms to analysis-ready datasets
  42. File formats: DICOM, NIfTI, TIFF/WSI, EDF-style signals (high level)
  43. Organising datasets and metadata for reproducibility
  44. Basic scripting workflows to chain processing steps
  45. Documenting pipelines: configs, logs and reports
  46. Common pitfalls and artefacts in MRI and pathology image analysis
  47. Bias, generalisation and domain shift considerations (conceptual)
  48. Ethical/clinical caveats: decision support vs diagnosis (non-clinical training)
  49. Communicating results with clear caveats and limitations
  50. Capstone: design a scoped analysis pipeline for one signal or image use-case
In-Silico · Online LIMS · ELN Lab Automation · Digital

Check below focused areas and choose one to apply

  1. Overview of LIMS, ELN and lab automation ecosystems
  2. Sample lifecycle concepts: accessioning → testing → storage → disposal
  3. Sample identification, barcoding and labelling strategies (conceptual)
  4. Aliquots, derivatives, batches and pooling in digital workflows
  5. Test definitions, panels and method metadata in LIMS
  6. Instrument worklists and basic instrument integration concepts
  7. Result entry, verification and validation workflows (high level)
  8. QC samples, controls and flags in digital workflows (concepts)
  9. Differences between LIMS, ELN and inventory tools (roles)
  10. Designing structured ELN templates for experiments and assays
  11. Free-text vs structured fields: balance & trade-offs
  12. Metadata standards and ontologies for lab records (conceptual)
  13. Inventory management: reagents, consumables, lots and expiry
  14. Storage location hierarchies: room → freezer → rack → box → position
  15. Chain-of-custody logging and traceability concepts
  16. Scheduling and capacity: planners, calendars and resource booking
  17. User roles, permissions and segregation of duties (high level)
  18. Master data: tests, instruments, locations, units and reference ranges
  19. Configuration vs customisation: what to tweak vs what to avoid
  20. Workflow engines and state machines inside LIMS-style systems (concepts)
  21. Designing a sample accessioning workflow (from request to label)
  22. Stability studies and sample retention tracking (conceptual)
  23. Deviation, incident and CAPA logging in digital systems (overview)
  24. Audit trails and e-signatures: 21 CFR Part 11-style concepts (non-legal)
  25. Review and approval workflows for results and reports
  26. Basic validation and UAT thinking for LIMS/ELN changes
  27. Import/export patterns: CSV templates and simple APIs (conceptual)
  28. HL7/FHIR-style interfacing concepts for hospital/lab connectivity
  29. Barcode template design and label layout simulation
  30. Parsing simple instrument data files into LIMS-friendly structures
  31. Rules engines for auto-accept, auto-flag and reflex testing (high level)
  32. Dashboards for sample counts, TAT and workload monitoring
  33. Key KPIs for lab operations: TAT, pending, re-run, rejection metrics (conceptual)
  34. ELN templates for SOPs, methods and experiment notes
  35. Linking ELN pages to samples, runs, attachments and reports
  36. Template versioning, approvals and change history
  37. Scripting repetitive tasks and simple automations around LIMS data
  38. Logical scheduling of instruments and robots (simulation concepts)
  39. Simulating sample routing across benches, rooms and instruments
  40. What-if simulations: workload, capacity and staffing scenarios
  41. Designing a core data model for a small lab (entities & relationships)
  42. Choosing fields, constraints and validations in forms
  43. Test case design for new workflows and configuration changes
  44. Migration from spreadsheets/manual logs to LIMS/ELN (conceptual roadmap)
  45. Change control and configuration management basics for labs
  46. Backup, restore and archiving concepts for lab data
  47. Data integrity and ALCOA+ principles (high-level overview)
  48. Vendor-neutral selection checklists and RFP-style thinking (concepts)
  49. User training, SOPs and adoption strategies for digital lab systems
  50. Capstone: design and simulate a mini LIMS/ELN workflow for one lab scenario
In-Silico · Online Digital QA/QC Compliance & e-Validation

Check below focused areas and choose one to apply

  1. Role of QA vs QC vs digital teams in regulated-style environments (conceptual)
  2. Basics of GxP-style thinking for labs, manufacturing and R&D (non-legal overview)
  3. ALCOA+ data integrity principles and examples
  4. Master data, reference data and controlled vocabularies for QA/QC
  5. Capturing QC data digitally: checklists, forms and structured logs
  6. Deviation, incident and OOS/OOT logging concepts
  7. Change control records and impact assessment thinking (high level)
  8. CAPA lifecycle: root cause → actions → effectiveness checks (conceptual)
  9. Risk-based thinking: FMEA-style concepts for processes and systems
  10. Digital SOPs and controlled document management workflows
  11. Version control, approvals and training assignment concepts for SOPs
  12. Training records and competency tracking in digital systems
  13. Audit trail concepts: who changed what, when and why
  14. Basics of the computerised system lifecycle (plan → spec → build → test → release)
  15. User requirements vs functional specifications vs configuration specs (conceptual)
  16. Configuration vs customisation in lab/QA systems (trade-offs)
  17. Test plan, test case and test script design concepts
  18. Static vs dynamic testing; installation, operational and performance test ideas
  19. Traceability matrix concepts: linking requirements to tests and evidence
  20. Electronic signatures and identity verification concepts (non-legal overview)
  21. Part 11-style control concepts: access, audit trails, records (high level)
  22. Data classification and retention concepts for QA/QC data
  23. Backup, restore and archival testing concepts
  24. Configuring checks and limits: specifications, ranges and QC rules
  25. Digital QC calculations, rounding and significant figures (conceptual)
  26. Control charts and trend monitoring basics (X-bar, R charts concepts)
  27. Using dashboards to monitor deviations, CAPA, complaints and KPIs
  28. Key QA/QC KPIs: deviations, CAPA closure, investigation times, OOS rates
  29. Sampling plans and acceptance criteria concepts (non-statistical overview)
  30. Linking equipment, instruments and calibration records to QA/QC data
  31. Template design for forms: required fields, checks and picklists
  32. Valid values, lookup lists and reference tables for quality data
  33. Workflows for complaint intake, triage and investigation logging
  34. Internal and external audit planning and follow-up tracking (digital)
  35. Vendor and supplier qualification records (conceptual)
  36. Risk registers and mitigation tracking for digital systems
  37. Spreadsheet risk assessment and control concepts
  38. Computerised system risk assessment and classification ideas
  39. Data migration and cutover checklists for QA/QC systems
  40. Periodic review concepts for systems, configurations and data
  41. Using queries and simple analytics to detect anomalies in QA/QC datasets
  42. Basic statistical summaries for quality metrics (no advanced math)
  43. Storyboarding the e-validation journey for a small system (conceptual)
  44. Templates for validation plans, reports and test summaries
  45. Collaboration between QA, IT, vendors and end-users
  46. Common pitfalls in e-validation and compliance analytics (conceptual)
  47. Non-compliance scenarios and remediation planning (high level)
  48. Readiness for inspections: digital evidence, audit trails and reports (conceptual)
  49. Communication and training strategies for digital QA/QC initiatives
  50. Capstone: scoped digital QA/QC and e-validation concept for one system or process
In-Silico · Online Oncology Informatics · Biomarker Analytics

Review the focus areas below and choose one to apply for

  1. Landscape of oncology informatics: clinical, molecular & real-world data (overview)
  2. Cancer biology & hallmarks (high-level, non-clinical)
  3. Tumour classification, staging & grading concepts (non-clinical)
  4. Structured data in oncology: diagnosis, procedures, drugs, outcomes
  5. Coding systems & terminologies in oncology records (conceptual)
  6. Tumour boards, EMR and imaging systems as oncology data sources (overview)
  7. Cancer registry concepts: case capture, follow-up & outcomes (high level)
  8. Data models for patient, tumour, episode and treatment lines
  9. Time-to-event data structures: index dates and censoring fields (concepts)
  10. Handling longitudinal therapies, dose changes and regimen switches
  11. Real-world evidence (RWE) in oncology: opportunities & caveats
  12. Clinical trial data structures: arms, visits and endpoints (overview)
  13. Eligibility, inclusion/exclusion and line-of-therapy derivation concepts
  14. Oncology outcome endpoints (response, progression, survival – definitions only)
  15. Basic survival analysis concepts: KM curves & hazard thinking (non-math)
  16. Confounding & bias in observational oncology datasets
  17. Data quality checks specific to oncology (dates, stage, sites, regimens)
  18. Basic biomarker concepts: diagnostic, prognostic & predictive markers
  19. Genomic biomarkers: variants, fusions & signatures (high-level view)
  20. Immuno-oncology biomarkers: TMB, tumour microenvironment, etc. (conceptual)
  21. Multi-omics biomarkers: genomics, transcriptomics, proteomics (overview)
  22. Companion diagnostics & assay report structures (non-clinical)
  23. Integrating molecular reports with EMR/registry records (concepts)
  24. Curation pipelines for variant & biomarker annotations (high level)
  25. Knowledge bases & guidance for cancer variants (conceptual overview)
  26. Real-world biomarker testing patterns & adoption analytics
  27. Cohort definition for biomarker-enriched populations (concepts)
  28. Feature engineering for oncology models: lines, burden, prior therapies
  29. Building descriptive dashboards for oncology programmes
  30. Visualising timelines: swim-lane plots for treatment journeys
  31. Plotting response and burden-of-disease trajectories (high level)
  32. Basic ML concepts in oncology informatics (risk scores, stratification)
  33. Model evaluation metrics for oncology predictions (ROC, PR, calibration – overview)
  34. Fairness & subgroup performance considerations in oncology models (conceptual)
  35. Privacy & de-identification concepts for oncology datasets
  36. Data sharing frameworks & federated thinking (high-level)
  37. Data pipelines from source systems to oncology data marts (ETL concepts)
  38. Data dictionaries & metadata for oncology analytics
  39. QA/QC checks for survival & response-based analyses
  40. Change management when updating coding, staging or biomarker rules
  41. Collaborative workflows across clinicians, data teams & statisticians
  42. Documentation standards for oncology analysis artefacts
  43. Reporting templates for internal tumour boards & strategy teams
  44. Communicating limitations & uncertainty in oncology analytics
  45. High-level view of regulatory & HTA use of oncology data (non-advisory)
  46. Road-mapping an oncology informatics programme in an organisation
  47. Benchmarking and external comparison concepts (registries, publications)
  48. Role of AI/ML & NLP in extracting oncology variables from text (overview)
  49. Common pitfalls and “gotchas” in oncology & biomarker analytics
  50. Capstone: scoped oncology informatics or biomarker analytics mini-project
In-Silico · Online Agri & Plant Omics · Crop Informatics

Review the focus areas below and choose one to apply for

  1. Landscape of agri/plant bioinformatics and crop informatics (overview)
  2. Plant genome organisation, ploidy and reference resources (high level)
  3. Crop pan-genomes and germplasm diversity concepts
  4. Reference genome databases and browsers for major crops (conceptual)
  5. Gene and transcript annotation concepts for plants
  6. Functional annotation sources for plant genes and proteins (overview)
  7. Read mapping and variant calling workflows for crop genomes (conceptual)
  8. Handling polyploidy, homeologs and duplicated regions (high level)
  9. Variant types in crops: SNPs, InDels, SVs and CNVs (conceptual)
  10. Quality control and filtering of variant calls in plant datasets
  11. Constructing and managing variant panels for breeding programmes
  12. Genotyping-by-sequencing (GBS) and array data concepts
  13. Genotype matrices, missingness and imputation ideas
  14. Population structure and relatedness in germplasm panels (conceptual)
  15. Linkage disequilibrium and haplotype blocks (intuitive overview)
  16. QTL mapping concepts for agronomic traits (non-mathematical)
  17. GWAS-style association analysis for crop traits (conceptual)
  18. Multi-environment trial (MET) data structures and covariates
  19. Phenotyping data formats: field, greenhouse and high-throughput phenotyping (overview)
  20. Data cleaning and harmonisation for phenotypic traits
  21. Trait definitions, units and scales; basic transformations (conceptual)
  22. Integrating environmental and management data with phenotypes
  23. Genomic selection (GS) concepts and typical workflows (high level)
  24. Model inputs for GS: markers, kinship, environmental covariates
  25. Prediction accuracy, cross-validation and bias considerations (conceptual)
  26. Marker-assisted selection (MAS) vs genomic selection (comparison)
  27. Intro to crop-specific decision-support dashboards (conceptual)
  28. Designing simple dashboards for lines, traits and locations
  29. Basic spatial and GIS concepts for field experiments (overview)
  30. Plot-level vs line-level vs genotype-level aggregations
  31. Gene expression and RNA-seq concepts for plant stress/trait studies
  32. Co-expression networks and modules for plant genes (conceptual)
  33. Pathway and GO term enrichment for crop trait candidates
  34. Multi-omics concepts: linking genomics, transcriptomics and metabolomics in crops
  35. Intro to plant–pathogen interactome and resistance gene analytics (high level)
  36. Plant pan-genome presence/absence variation (conceptual)
  37. Curating metadata for accessions, locations and seasons
  38. Data standards and ontologies in plant breeding and trials (overview)
  39. File formats and organisation for multi-season, multi-location datasets
  40. Basic QC checklists for genotypes, phenotypes and environments
  41. Scenario thinking: benchmarking varieties across locations and years
  42. Intro to crop modelling concepts and linking with informatics outputs
  43. Communicating limitations and uncertainties in agronomic analytics
  44. Ethical and data-sharing considerations in breeding programmes (conceptual)
  45. Collaboration patterns between breeders, bioinformaticians and data teams
  46. Road-mapping data infrastructure for a breeding or crop-research unit
  47. Smallholder vs large-scale contexts: data implications (high-level)
  48. Opportunities for AI/ML and remote-sensing data in crop informatics (overview)
  49. Common pitfalls and misunderstandings in agri/plant bioinformatics
  50. Capstone: scoped crop-informatics or plant-bioinformatics mini-project design

