
Descriptor Engineering & Feature Selection Training | Molecular Descriptors & QSAR Feature Space


Descriptor Engineering & Feature Selection — Hands-on

Learn how to turn raw chemical structures into clean, informative feature matrices ready for QSAR, ML and property prediction. This module walks through molecular descriptor families, descriptor calculation, quality control, scaling and leakage-aware feature selection so that downstream models remain stable, interpretable and reproducible.

Descriptor Engineering & Feature Selection
Session 1
Fee: Rs 8800
Molecular Descriptors Landscape (0D–3D)
  • Descriptor families and levels of representation
  • 0D / 1D / 2D / 3D descriptors; global vs local features; scalar vs vector forms
  • Physicochemical and topological descriptors
  • MW, logP, pKa, polar surface; topological indices; fragment counts
  • 3D and conformation-dependent descriptors
  • 3D pharmacophoric patterns; shape descriptors; when 3D really matters
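
To make the lowest level of the descriptor landscape concrete, here is a minimal sketch of 0D (constitutional) descriptors, which depend only on the molecular formula and not on connectivity or geometry. The function names, the small atomic-mass table and the restriction to simple formulas without brackets or charges are assumptions made for illustration. Higher-level descriptors (2D topological indices, 3D shape) need a cheminformatics toolkit such as those named in Session 2.

```python
import re

# Approximate standard atomic masses (g/mol) for a few common elements.
# Illustrative subset only; extend as needed.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "O": 15.999, "S": 32.06, "Cl": 35.45}


def atom_counts(formula):
    """Parse a simple formula such as 'C9H8O4' into element counts.

    Assumes no brackets, charges or isotopes (hypothetical helper).
    """
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def zero_d_descriptors(formula):
    """Return a few 0D descriptors computed from the formula alone."""
    counts = atom_counts(formula)
    mol_weight = sum(ATOMIC_MASS[el] * n for el, n in counts.items())
    heavy = sum(n for el, n in counts.items() if el != "H")
    return {
        "mol_weight": round(mol_weight, 3),
        "n_atoms": sum(counts.values()),
        "n_heavy_atoms": heavy,
    }


# Aspirin has the molecular formula C9H8O4.
aspirin = zero_d_descriptors("C9H8O4")
print(aspirin)
```

Each returned value is a global scalar: one number per molecule. Fragment counts and fingerprints, by contrast, produce vectors describing local features.
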
Session 2
Fee: Rs 11800
Descriptor Calculation & Data Hygiene
  • Descriptor engines and toolkits overview
  • RDKit and mordred; PaDEL and CDK concepts; batch processing of libraries
  • Handling missing values and unstable descriptors
  • constant and near-constant columns; error flags and NaNs; removal vs imputation
  • Scaling, normalization and transformation choices
  • standard vs min-max scaling; log transform and power methods; robust scaling for outliers
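
The hygiene steps in this session can be sketched in a few lines. The example below is a simplified illustration using only the Python standard library. The function name `clean_and_scale`, the variance threshold and the toy descriptor values are assumptions, and a real workflow would more likely use pandas and scikit-learn. It drops constant and near-constant columns, imputes NaNs with the column median, then applies standard scaling.

```python
import math
import statistics


def clean_and_scale(columns, min_variance=1e-8):
    """Clean a descriptor table given as {name: list of floats}.

    NaN marks a failed descriptor calculation. The threshold is an
    illustrative assumption, not a recommended default.
    """
    cleaned = {}
    for name, values in columns.items():
        observed = [v for v in values if not math.isnan(v)]
        # Drop columns that are (nearly) empty or (near) constant.
        if len(observed) < 2 or statistics.pvariance(observed) <= min_variance:
            continue
        # Median imputation is less sensitive to outliers than the mean.
        median = statistics.median(observed)
        imputed = [median if math.isnan(v) else v for v in values]
        # Standard scaling: zero mean, unit (population) standard deviation.
        mean = statistics.fmean(imputed)
        std = statistics.pstdev(imputed)
        cleaned[name] = [(v - mean) / std for v in imputed]
    return cleaned


# Toy values for illustration only.
table = {
    "mol_weight": [180.2, 151.2, float("nan"), 206.3],
    "n_boron": [0.0, 0.0, 0.0, 0.0],  # constant column, carries no information
    "logp": [1.2, 0.5, 3.1, 3.5],
}
scaled = clean_and_scale(table)
print(list(scaled))
```

For brevity, the median, mean and standard deviation here are estimated on the whole table. When the table feeds a model, those statistics should come from the training split only.
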
Session 3
Fee: Rs 14800
Feature Selection, Reduction & Leakage Checks
  • Multicollinearity diagnostics and pruning
  • correlation heatmaps; variance inflation factor; redundant feature removal
  • Filter, wrapper and embedded methods
  • univariate filters and mutual information; forward and backward selection; Lasso, tree-based importance
  • Dimensionality reduction and leakage-aware workflows
  • PCA and latent spaces; pipelines in scikit-learn; train-only fitting and data leakage
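
Two ideas from this session, correlation-based pruning and train-only fitting, are sketched below in plain Python. This is a simplified illustration rather than a prescribed method. The helper names, the 0.95 threshold and the toy data are assumptions, and in practice a scikit-learn `Pipeline` would normally handle the fit/transform separation. The key point is that both feature selection and scaling parameters are derived from the training split and only then applied to the test split.

```python
import statistics


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def select_uncorrelated(train, threshold=0.95):
    """Greedy filter: keep a column only if its |r| with every
    already-kept column is below the threshold."""
    kept = []
    for name in train:
        if all(abs(pearson(train[name], train[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def fit_scaler(train, names):
    """Estimate mean and standard deviation on the training split only."""
    return {n: (statistics.fmean(train[n]), statistics.pstdev(train[n]))
            for n in names}


def transform(data, params):
    """Apply previously fitted scaling parameters to any split."""
    return {n: [(v - mean) / std for v in data[n]]
            for n, (mean, std) in params.items()}


# Toy splits for illustration only.
train = {
    "mol_weight": [100.0, 200.0, 300.0, 400.0],
    "heavy_atoms": [7.0, 14.0, 21.0, 29.0],  # nearly collinear with mol_weight
    "logp": [2.0, 0.5, 3.0, 1.0],
}
test = {"mol_weight": [250.0], "heavy_atoms": [18.0], "logp": [1.5]}

kept = select_uncorrelated(train)   # decided on training data only
params = fit_scaler(train, kept)    # fitted on training data only
test_scaled = transform(test, params)
print(kept, test_scaled)
```

Fitting the selector or scaler on the combined train and test data would let test-set statistics influence the model inputs, which is the kind of leakage this session is about.
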
Session 4
Fee: Rs 18800
Mini Capstone: QSAR-Ready Feature Matrix
  • From structures to curated descriptor table
  • Theory + Practical
  • End-to-end feature engineering and selection workflow
  • descriptor generation script; quality checks and pruning; split-aware pipelines
  • Deliverables: feature matrix, code and documentation
  • CSV feature matrix; notebook or script file; README for QSAR teams

