
FAIR & Reproducible Research — Versioned Data & Code Training | Biostatistics & ML for Omics


FAIR & Reproducible Research — Versioned Data & Code — Hands-on

Build FAIR and reproducible analysis practices for omics and clinical machine learning projects. This module covers data and metadata management, Git and data versioning, environment capture, workflow documentation, and practical templates, so that your work is rerunnable, auditable, and shareable.

Session 1
Fee: Rs 8800
FAIR Principles & Reproducible Mindset
  • FAIR and reproducible research basics
      • Findable, Accessible, Interoperable, Reusable
      • reproducible vs replicable vs robust
      • why this matters in omics and clinical ML
  • From ad hoc analysis to structured projects
      • project folder conventions
      • naming patterns for files and outputs
      • documenting decisions as you go
  • Roles, collaboration and responsibility sharing
      • analyst, PI and data steward roles
      • simple team agreements and checklists
      • expectations for future maintainers
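
As an illustration of the project folder conventions topic, here is a minimal Python sketch that creates a layered project skeleton. The folder names in `LAYOUT` and the helper `scaffold_project` are assumptions made for this example, not a layout prescribed by the course.

```python
from pathlib import Path

# Hypothetical layout: these folder names are illustrative assumptions only.
LAYOUT = ("data/raw", "data/interim", "data/processed", "code", "results", "docs")


def scaffold_project(root: Path) -> list:
    """Create the folders in LAYOUT under root and a starter README if missing."""
    created = []
    for relative in LAYOUT:
        folder = root / relative
        # exist_ok=True makes the function safe to call on an existing project.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    readme = root / "README.md"
    if not readme.exists():
        readme.write_text(
            f"# {root.name}\n\nDecision log and rerun notes go here.\n",
            encoding="utf-8",
        )
    return created
```

Calling `scaffold_project(Path("my_study"))` twice leaves an existing README untouched, so the helper can be rerun safely on a project that is already in progress.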
Session 2
Fee: Rs 11800
Data, Metadata & Versioned Storage
  • Data and metadata organisation patterns
      • raw, interim and processed layers
      • metadata tables and data dictionaries
      • basic standards and schemas for omics
  • Versioning data sets with simple tools
      • Git-friendly layouts for large files
      • data version control ideas (e.g. DVC-style)
      • tracking provenance from input to output
  • Storage, backup and access policies
      • local, network and cloud storage choices
      • snapshotting and backup basics
      • simple access control patterns
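
One simple way to version data sets without committing large files to Git is to track a checksum manifest instead. The sketch below uses only the Python standard library; the function names `sha256_of`, `build_manifest` and `changed_files` are hypothetical helpers for this example, and the approach is a simplified stand-in for what dedicated tools such as DVC do, not their actual implementation.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict:
    """Map each file's path (relative to data_dir) to its checksum."""
    files = sorted(p for p in data_dir.rglob("*") if p.is_file())
    return {p.relative_to(data_dir).as_posix(): sha256_of(p) for p in files}


def changed_files(old: dict, new: dict) -> list:
    """List paths that were added, removed or modified between two manifests."""
    keys = set(old) | set(new)
    return sorted(k for k in keys if old.get(k) != new.get(k))
```

The manifest is a small dictionary that can be saved as JSON and committed alongside the code, so a later rerun can confirm that it started from the same inputs.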
Session 3
Fee: Rs 14800
Code, Environments & Workflow Capture
  • Version controlling analysis code with Git
      • branching and pull request habits
      • commit messages that tell a story
      • tags and releases for milestones
  • Capturing environments and dependencies
      • conda, virtualenv and requirements files
      • lock files and environment exports
      • container images for stable reruns
  • Recording workflows and pipeline steps
      • notebooks vs scripts vs workflow tools
      • simple directed acyclic workflow diagrams
      • logs and run manifests for each execution
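
To make the idea of a run manifest concrete, here is a minimal Python sketch that records the interpreter, platform, inputs, parameters and selected package versions for one execution. The field names and the helpers `package_versions` and `build_run_manifest` are assumptions chosen for this example; a real project would likely also record the Git commit and input checksums.

```python
import json
import platform
import sys
from datetime import datetime, timezone
from importlib import metadata


def package_versions(names) -> dict:
    """Look up installed versions; packages that are not installed map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_run_manifest(inputs: dict, params: dict, packages=()) -> dict:
    """Collect a JSON-serialisable description of the current run."""
    return {
        "started_utc": datetime.now(timezone.utc).isoformat(),
        "python": platform.python_version(),
        "platform": platform.platform(),
        "argv": list(sys.argv),
        "inputs": dict(inputs),
        "params": dict(params),
        "packages": package_versions(packages),
    }


def manifest_to_json(manifest: dict) -> str:
    """Serialise with sorted keys so manifests from different runs diff cleanly."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Writing the output of `manifest_to_json` next to each set of results gives a reviewer a single file that states what was run, with which settings, and in which environment.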
Session 4
Fee: Rs 18800
Templates, Audits & Project Deliverables
  • Checklists and templates for daily work
      • README and contributing templates
      • analysis plan and protocol skeletons
      • issue and change log patterns
  • Internal reviews and reproducibility audits
      • rerun-by-colleague test
      • common failure modes and fixes
      • simple audit trail summary documents
  • Deliverables: reproducible project package
      • versioned repo with data pointers
      • environment and workflow definitions
      • rerun instructions for reviewers and collaborators

