Model Development & Training.ipynb — Building the predictor
Objective
Train an explainable multimodal network that fuses gene-expression profiles and chemical fingerprints to classify drug–cell-line pairs as sensitive or not sensitive.
Inputs
| File | Location | Description |
| --- | --- | --- |
| multimodal_dataset_final.pkl | processed_datasets/ | 108,696 rows × 2,783 features (735 genes + 2,048 fingerprint bits) |
| DrugSens-Train.csv | …/sensitivity/pivot/clas/ | Training labels |
| DrugSens-Validhyper-Subsampling.csv | same | Early stopping / hyperparameter tuning |
| DrugSens-Trainhyper-Subsampling.csv | same | Class-weight estimation |
| DrugSens-Test.csv | same | Held-out evaluation |
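The feature matrix and label splits can be loaded with pandas; a minimal sketch (the path prefix elided in the Location column and the exact file layout are assumptions):

```python
from pathlib import Path

import pandas as pd

# Fused feature matrix: 108,696 drug–cell-line pairs × 2,783 features
# (735 gene-expression values + 2,048 Morgan fingerprint bits).
features = pd.read_pickle("processed_datasets/multimodal_dataset_final.pkl")

# The prefix before sensitivity/pivot/clas/ is elided above; point
# LABEL_DIR at wherever that folder lives in your copy of the project.
LABEL_DIR = Path("sensitivity/pivot/clas")
train_labels = pd.read_csv(LABEL_DIR / "DrugSens-Train.csv")
test_labels = pd.read_csv(LABEL_DIR / "DrugSens-Test.csv")
```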
Architecture
Gene branch: 735 gene-expression features are reduced to a 128-dimensional embedding by a dense encoder.
Chemistry branch: 2,048-bit Morgan fingerprints are reduced to a 128-dimensional embedding by a dense encoder.
Cross-modal attention: an 8-head attention block lets each modality attend to the other, fusing the gene and chemical representations.
Fusion and output: the attended representations are concatenated and passed through three fully connected layers (512 → 256 → 64), ending in a sigmoid output for binary classification (sensitive vs not sensitive). A Keras sketch of this architecture follows.
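A minimal Keras sketch reconstructed from the description above; the layer sizes match the text, while the activations, the 8 × 16 token reshape feeding the attention block, and the bidirectional attention wiring are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

gene_in = layers.Input(shape=(735,), name="gene_expression")
chem_in = layers.Input(shape=(2048,), name="morgan_fingerprint")

# Dense encoders: reduce each modality to a 128-dimensional embedding.
gene = layers.Dense(128, activation="relu")(gene_in)
chem = layers.Dense(128, activation="relu")(chem_in)

# MultiHeadAttention expects token sequences, so split each embedding
# into 8 tokens of 16 dims (an assumed tokenisation).
gene_seq = layers.Reshape((8, 16))(gene)
chem_seq = layers.Reshape((8, 16))(chem)

# 8-head cross-modal attention: each modality attends to the other.
g2c = layers.MultiHeadAttention(num_heads=8, key_dim=16, name="gene_to_chem")
c2g = layers.MultiHeadAttention(num_heads=8, key_dim=16, name="chem_to_gene")
gene_att = g2c(query=gene_seq, value=chem_seq, key=chem_seq)
chem_att = c2g(query=chem_seq, value=gene_seq, key=gene_seq)

# Fusion head: concatenate both attended views, then 512 → 256 → 64.
x = layers.Concatenate()([layers.Flatten()(gene_att),
                          layers.Flatten()(chem_att)])
for units in (512, 256, 64):
    x = layers.Dense(units, activation="relu")(x)

# float32 output keeps the sigmoid numerically stable under mixed precision.
out = layers.Dense(1, activation="sigmoid", dtype="float32",
                   name="sensitive")(x)

model = tf.keras.Model([gene_in, chem_in], out)
```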
Training setup
| Setting | Value |
| --- | --- |
| Hardware | NVIDIA A100 (Colab) |
| Optimiser | AdamW, lr = 1 × 10⁻³ |
| Batch size | 512 |
| Loss | Weighted binary cross-entropy (positive weight ≈ 8.5) |
| Precision | Mixed float16 / float32 |
| Early stopping | Patience = 10 on val-AUROC |
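In Keras the table above translates roughly into the configuration below (a sketch continuing the architecture code; `X_gene_*`, `X_chem_*`, and `y_*` are placeholder arrays, and the precision policy must be set before the model is built):

```python
# Mixed precision: float16 compute with float32 master weights.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.AUC(name="auroc")],
)

# Stop after 10 epochs without val-AUROC improvement; keep the best weights.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_auroc", mode="max", patience=10, restore_best_weights=True
)

model.fit(
    [X_gene_train, X_chem_train], y_train,
    validation_data=([X_gene_val, X_chem_val], y_val),
    batch_size=512,
    epochs=100,  # upper bound; early stopping usually ends training sooner
    class_weight={0: 1.0, 1: 8.5},  # the ≈ 8.5 positive weight from the table
    callbacks=[early_stop,
               # Writes the logs/train/ and logs/validation/ event files.
               tf.keras.callbacks.TensorBoard(log_dir="logs")],
)
```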
Outputs
| Artifact | Contents |
| --- | --- |
| best_multimodal_model.keras | Final weights (best val-AUROC) |
| logs/train/, logs/validation/ | TensorBoard event files |
Test-set performance at the best epoch — AUROC = 0.981, Precision = 0.973, Recall = 0.985
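To browse the event files in Colab, the standard TensorBoard notebook magics are enough:

```
%load_ext tensorboard
%tensorboard --logdir logs
```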
Rationale
Attention maps provide gene–drug interaction insights for each prediction (see the extraction sketch after this list).
Class weighting handles the 8.5 : 1 imbalance without down-sampling.
Mixed precision halves GPU memory usage and speeds up training by ≈ 1.7 × on an A100.
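One way to surface those maps from a Keras model is MultiHeadAttention's `return_attention_scores` flag; a sketch continuing the architecture code above (`X_gene` and `X_chem` are placeholder arrays):

```python
# Calling the layer again reuses its trained weights, so these scores
# match the forward pass that produced `out`.
_, gene_scores = g2c(
    query=gene_seq, value=chem_seq, key=chem_seq,
    return_attention_scores=True,
)

# gene_scores has shape (batch, heads, gene_tokens, chem_tokens): one
# gene→drug attention map per prediction, exposed as a second output.
explain_model = tf.keras.Model([gene_in, chem_in], [out, gene_scores])
preds, attn_maps = explain_model.predict([X_gene, X_chem])
```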
The model file and logs live in the shared Google Drive; load them in Colab to reproduce or fine-tune.
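A loading sketch for Colab (the Drive sub-folder `drugsens` is illustrative; adjust to where the artifacts actually sit):

```python
from google.colab import drive
import tensorflow as tf

drive.mount("/content/drive")

# Path under the shared Drive is an assumption.
model = tf.keras.models.load_model(
    "/content/drive/MyDrive/drugsens/best_multimodal_model.keras"
)
model.summary()
```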
Last updated