
Feature Selection & Model Reproducibility for QSAR/QSPR | Leakage-Safe, Nested-CV Pipelines


Feature Selection & Model Reproducibility — Hands-on

Design robust QSAR/QSPR feature spaces and validation strategies so that models generalize to unseen compounds. This module focuses on principled feature selection, collinearity control, leakage prevention, and reproducibility practices. You will implement end-to-end, audit-ready pipelines with nested cross-validation, proper reporting, and reusable artifacts.

Session 1
Fee: Rs 12800
Data Hygiene & Collinearity Control
  • Pre-FS checks & filters: missingness & imputation; variance/near-zero variance; scaling/normalization
  • Collinearity diagnostics: correlation matrices; VIF thresholds; cluster-based pruning
  • Train/test hygiene: scaffold/stratified splits; no-peeking policies; pipelines for transformations
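
As an illustration of the Session 1 topics, here is a minimal sketch of a near-zero-variance filter followed by greedy correlation pruning, fitted on training data only. The descriptor names (`logp`, `logp_copy`, `const`, `tpsa`), the thresholds, and the helper `select_columns` are hypothetical choices for this example, not part of the course material.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def select_columns(train_cols, var_eps=1e-12, max_abs_r=0.95):
    """Return descriptor names to keep, decided from TRAINING values only.

    train_cols maps a descriptor name to its list of training values.
    A column is dropped if its variance is (near) zero or if it is highly
    correlated with a column that was already kept.
    """
    kept = []
    for name, values in train_cols.items():
        n = len(values)
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        if variance <= var_eps:
            continue  # near-zero variance: carries no information
        if any(abs(pearson(values, train_cols[k])) > max_abs_r for k in kept):
            continue  # redundant with an already-kept descriptor
        kept.append(name)
    return kept


# Hypothetical training descriptors (illustrative values only).
train = {
    "logp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "logp_copy": [2.1, 4.0, 6.2, 8.0, 10.1],  # almost a rescaled duplicate
    "const": [1.0] * 5,                        # zero variance
    "tpsa": [50.0, 20.0, 80.0, 30.0, 60.0],
}
kept = select_columns(train)
print(kept)
```

The kept column list is then applied unchanged to the test set, so no test-set statistic influences which descriptors survive.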
Session 2
Fee: Rs 15800
Filter/Wrapper/Embedded Methods & Dimensionality Reduction
  • Filter methods: univariate tests; mutual information; mRMR
  • Wrappers & embedded: RFE/RFECV; Boruta/SHAP; Lasso/Elastic Net
  • Dimension reduction: PCA/PLS; UMAP for EDA; targets & scaling effects
Session 3
Fee: Rs 18800
Validation: Nested CV, Leakage & Stability
  • Leakage-safe pipelines: Pipeline/ColumnTransformer; fit/transform boundaries; temporal/scaffold splits
  • Model selection the right way: nested cross-validation; grouped CV; bootstrap stability
  • Sanity checks & AD: y-scrambling; permutation tests; applicability domain
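
The nested cross-validation idea from Session 3 can be sketched with scikit-learn as follows. Scaling and feature selection live inside a `Pipeline`, so they are refit on each training fold only. The synthetic data, the `SelectKBest` filter, the `Ridge` model, and the parameter grid are assumptions made for this example. A real QSAR study would typically replace `KFold` with a grouped or scaffold-aware splitter.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a descriptor matrix X and an activity vector y.
X, y = make_regression(
    n_samples=120, n_features=30, n_informative=5, noise=10.0, random_state=0
)

# All data-dependent steps sit inside the pipeline, so each one is fitted
# only on the training portion of whichever fold is being evaluated.
pipe = Pipeline(
    [
        ("scale", StandardScaler()),
        ("select", SelectKBest(score_func=f_regression)),
        ("model", Ridge()),
    ]
)
param_grid = {"select__k": [5, 10, 20], "model__alpha": [0.1, 1.0, 10.0]}

inner = KFold(n_splits=3, shuffle=True, random_state=1)  # hyperparameter tuning
outer = KFold(n_splits=5, shuffle=True, random_state=2)  # performance estimate

search = GridSearchCV(pipe, param_grid, cv=inner, scoring="r2")
outer_scores = cross_val_score(search, X, y, cv=outer, scoring="r2")

print("outer-fold R2:", np.round(outer_scores, 3))
print("mean R2:", round(float(outer_scores.mean()), 3))
```

The outer-fold scores estimate how the whole procedure (selection, tuning, and fitting) performs on unseen data. The inner-loop best score should not be reported as the generalization estimate.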
Session 4
Fee: Rs 22800
Mini Capstone: Reproducible FS Pipeline
  • Build: filter → FS → model → validate (Theory + Practical)
  • Report & artifacts: config YAML/JSON; random seeds & env lock; model/FS cards
  • Deliverables: reproducible notebook/script; metrics + stability plots; pipeline diagram
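
One possible shape for the config and random-seed artifacts listed above is sketched below, using only the Python standard library. The config keys, the fingerprint helper, and the seed value are hypothetical. The course may prescribe a different layout.

```python
import hashlib
import json
import random

# Hypothetical run configuration; every key here is illustrative.
config = {
    "seed": 42,
    "split": {"strategy": "scaffold", "test_fraction": 0.2},
    "feature_selection": {"method": "correlation_filter", "max_abs_r": 0.95},
    "model": {"name": "ridge", "alpha": 1.0},
}


def config_fingerprint(cfg):
    """Short, stable hash of a config so reports can cite the exact settings."""
    canonical = json.dumps(cfg, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def shuffled_indices(n, seed):
    """Shuffle 0..n-1 with a local RNG so the global random state is untouched."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


# Two runs with the same seed yield the same ordering.
run_a = shuffled_indices(10, config["seed"])
run_b = shuffled_indices(10, config["seed"])
fingerprint = config_fingerprint(config)

print("identical runs:", run_a == run_b)
print("config fingerprint:", fingerprint)
print(json.dumps(config, indent=2, sort_keys=True))
```

Storing the serialized config and its fingerprint next to the metrics makes it possible to trace each reported number back to the settings that produced it.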

