NTHRYS

Version Control & Data Versioning — Git and DVC Training | Reproducible Research

Version Control & Data Versioning — Git & DVC — Hands-on

Create auditable, collaborative research stacks by pairing Git for code with DVC for data and models. Learn team-friendly branching strategies, signed commits, pre-commit quality gates, and dataset and model versioning on S3, Google Cloud Storage, or Azure remotes. Tie versions to Snakemake or Nextflow runs, notebooks, and releases so results are traceable and repeatable.

Session 1
Fee: Rs 17,400
Git Fundamentals, Branching & Quality Gates
  • Core Git workflows
  • feature/PR flow, rebase/squash/merge, monorepo/submodules
  • Quality & security gates
  • pre-commit hooks, signed commits (GPG/SSH), protected branches
  • Binary/data handling
  • Git LFS/annex, .gitattributes, repo hygiene
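
As a rough illustration of a pre-commit quality gate, the sketch below flags staged files above a size limit so they can be routed to Git LFS or DVC instead of plain Git. The function name `oversized_files`, the 5 MB limit, and the sample paths are assumptions made for illustration, not course material.

```python
from typing import Dict, List

# Hypothetical threshold; choose whatever limit suits your repository.
MAX_BYTES = 5 * 1024 * 1024


def oversized_files(staged: Dict[str, int], limit: int = MAX_BYTES) -> List[str]:
    """Return the staged paths whose size in bytes exceeds the limit, sorted."""
    return sorted(path for path, size in staged.items() if size > limit)


# Illustrative sizes only; a real hook would read them from the staged files.
staged_example = {
    "src/align.py": 4_096,
    "data/reads.fastq.gz": 250 * 1024 * 1024,
    "models/classifier.pkl": 12 * 1024 * 1024,
}
blocked = oversized_files(staged_example)
print(blocked)
```

A hook built on this idea would exit with a non-zero status whenever the returned list is non-empty, which stops the commit.
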
Session 2
Fee: Rs 23,200
DVC Basics: Remotes, Pipelines & Experiments
  • Track datasets & models
  • dvc add/commit, push/pull, lockfiles
  • Pipelines & metrics
  • dvc.yaml/stages, params & plots, experiment tracking
  • Remotes & auth
  • S3/GS/Azure, MinIO/NAS, encryption
Session 3
Fee: Rs 29,000
Data/Model Versioning at Scale (S3/GS/Azure)
  • Scale patterns & storage
  • sharding & dedupe, lifecycle & tiers, access controls
  • Ties to pipelines & notebooks
  • Snakemake/Nextflow, Jupyter/R Markdown, reports/plots
  • MLOps & registries
  • model registry, promotion rules, rollbacks
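
The deduplication idea listed above can be sketched with content addressing: identical bytes hash to the same key, so they are stored once however many paths point at them. The `ContentStore` class is a hypothetical in-memory stand-in and the choice of SHA-256 is an assumption; it is not DVC's implementation or API.

```python
import hashlib
from typing import Dict


class ContentStore:
    """Hypothetical in-memory store keyed by the SHA-256 digest of each blob."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}  # digest -> content
        self.index: Dict[str, str] = {}      # logical path -> digest

    def add(self, path: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        # setdefault keeps the first copy; later identical blobs add nothing.
        self.objects.setdefault(digest, content)
        self.index[path] = digest
        return digest

    def get(self, path: str) -> bytes:
        return self.objects[self.index[path]]


store = ContentStore()
a = store.add("v1/samples.csv", b"id,value\n1,0.5\n")
b = store.add("v2/samples.csv", b"id,value\n1,0.5\n")  # unchanged copy
c = store.add("v2/extra.csv", b"id,value\n2,0.9\n")
print(len(store.index), "paths,", len(store.objects), "stored objects")
```

Because the unchanged file in `v2` maps to the digest already stored for `v1`, a new dataset version costs storage only for the content that actually changed.
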
Session 4
Fee: Rs 36,200
Mini Capstone: Releasable Repo + Data Registry
  • Publish a tagged release (semver) with data snapshot
  • Theory + Practical
  • Automate checks & artifacts
  • pre-commit/CI hooks, CHANGELOG & SBOM, signed tags
  • Deliverables: repo, DVC remote & release notes
  • Git + DVC project, dataset/model snapshot, PDF/HTML summary
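
For the semver-tagged release in the capstone, a minimal sketch of validating and bumping a release tag might look like the following. The helper names `parse_tag` and `bump` and the optional `v` prefix are assumptions for illustration; the sketch handles only plain MAJOR.MINOR.PATCH tags and ignores pre-release and build metadata.

```python
import re
from typing import Tuple

# Matches tags such as "1.4.2" or "v1.4.2"; anything else is rejected.
_SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_tag(tag: str) -> Tuple[int, int, int]:
    """Split a release tag into (major, minor, patch) integers."""
    match = _SEMVER.match(tag)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH tag: {tag!r}")
    return int(match.group(1)), int(match.group(2)), int(match.group(3))


def bump(tag: str, part: str) -> str:
    """Return the next tag after incrementing the chosen part."""
    major, minor, patch = parse_tag(tag)
    if part == "major":
        return f"v{major + 1}.0.0"
    if part == "minor":
        return f"v{major}.{minor + 1}.0"
    if part == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


next_tag = bump("v1.4.2", "minor")
print(next_tag)
```

Rejecting malformed tags before release keeps the tag history sortable and makes it easier to pair each release with one data snapshot.
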

