Model Development & Training.ipynb — Building the predictor
Objective
Train an explainable multimodal network that fuses gene-expression profiles and chemical fingerprints to classify drug–cell-line pairs as sensitive or not sensitive.
Inputs
| File | Location | Description |
| --- | --- | --- |
| multimodal_dataset_final.pkl | processed_datasets/ | 108,696 rows × 2,783 features (735 genes + 2,048 fingerprint bits) |
| DrugSens-Train.csv | …/sensitivity/pivot/clas/ | Training labels |
| DrugSens-Validhyper-Subsampling.csv | same | Early stopping / hyperparameter tuning |
| DrugSens-Trainhyper-Subsampling.csv | same | Class-weight estimation |
| DrugSens-Test.csv | same | Held-out evaluation |
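The feature matrix and label splits can be loaded with pandas; a minimal sketch (the path prefix elided in the Location column and the exact file layout are assumptions):

```python
from pathlib import Path

import pandas as pd

# Fused feature matrix: 108,696 drug–cell-line pairs × 2,783 features
# (735 gene-expression values + 2,048 Morgan fingerprint bits).
features = pd.read_pickle("processed_datasets/multimodal_dataset_final.pkl")

# The prefix before sensitivity/pivot/clas/ is elided above; point
# LABEL_DIR at wherever that folder lives in your copy of the project.
LABEL_DIR = Path("sensitivity/pivot/clas")
train_labels = pd.read_csv(LABEL_DIR / "DrugSens-Train.csv")
test_labels = pd.read_csv(LABEL_DIR / "DrugSens-Test.csv")
```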
Architecture
Gene branch: 735 gene-expression features are reduced to a 128-dimensional embedding by a dense encoder.
Chemistry branch: 2,048-bit Morgan fingerprints are reduced to a 128-dimensional embedding by a dense encoder.
Cross-modal attention: an 8-head attention block lets each modality attend to the other, fusing the gene and chemical representations.
Fusion and output: the attended representations are concatenated and passed through three fully connected layers (512 → 256 → 64), ending in a sigmoid output for binary classification (sensitive vs not sensitive). A Keras sketch of this architecture follows.
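A minimal Keras sketch reconstructed from the description above; the layer sizes match the text, while the activations, the 8 × 16 token reshape feeding the attention block, and the bidirectional attention wiring are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

gene_in = layers.Input(shape=(735,), name="gene_expression")
chem_in = layers.Input(shape=(2048,), name="morgan_fingerprint")

# Dense encoders: reduce each modality to a 128-dimensional embedding.
gene = layers.Dense(128, activation="relu")(gene_in)
chem = layers.Dense(128, activation="relu")(chem_in)

# MultiHeadAttention expects token sequences, so split each embedding
# into 8 tokens of 16 dims (an assumed tokenisation).
gene_seq = layers.Reshape((8, 16))(gene)
chem_seq = layers.Reshape((8, 16))(chem)

# 8-head cross-modal attention: each modality attends to the other.
g2c = layers.MultiHeadAttention(num_heads=8, key_dim=16, name="gene_to_chem")
c2g = layers.MultiHeadAttention(num_heads=8, key_dim=16, name="chem_to_gene")
gene_att = g2c(query=gene_seq, value=chem_seq, key=chem_seq)
chem_att = c2g(query=chem_seq, value=gene_seq, key=gene_seq)

# Fusion head: concatenate both attended views, then 512 → 256 → 64.
x = layers.Concatenate()([layers.Flatten()(gene_att),
                          layers.Flatten()(chem_att)])
for units in (512, 256, 64):
    x = layers.Dense(units, activation="relu")(x)

# float32 output keeps the sigmoid numerically stable under mixed precision.
out = layers.Dense(1, activation="sigmoid", dtype="float32",
                   name="sensitive")(x)

model = tf.keras.Model([gene_in, chem_in], out)
```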
Training setup
| Setting | Value |
| --- | --- |
| Hardware | NVIDIA A100 (Colab) |
| Optimiser | AdamW, lr = 1 × 10⁻³ |
| Batch size | 512 |
| Loss | Weighted binary cross-entropy (positive weight ≈ 8.5) |
| Precision | Mixed float16 / float32 |
| Early stopping | Patience = 10 on val-AUROC |
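In Keras the table above translates roughly into the configuration below (a sketch continuing the architecture code; `X_gene_*`, `X_chem_*`, and `y_*` are placeholder arrays, and the precision policy must be set before the model is built):

```python
# Mixed precision: float16 compute with float32 master weights.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3),
    loss=tf.keras.losses.BinaryCrossentropy(),
    metrics=[tf.keras.metrics.AUC(name="auroc")],
)

# Stop after 10 epochs without val-AUROC improvement; keep the best weights.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_auroc", mode="max", patience=10, restore_best_weights=True
)

model.fit(
    [X_gene_train, X_chem_train], y_train,
    validation_data=([X_gene_val, X_chem_val], y_val),
    batch_size=512,
    epochs=100,  # upper bound; early stopping usually ends training sooner
    class_weight={0: 1.0, 1: 8.5},  # the ≈ 8.5 positive weight from the table
    callbacks=[early_stop,
               # Writes the logs/train/ and logs/validation/ event files.
               tf.keras.callbacks.TensorBoard(log_dir="logs")],
)
```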
Outputs
| Artifact | Contents |
| --- | --- |
| best_multimodal_model.keras | Final weights (best val-AUROC) |
| logs/train/, logs/validation/ | TensorBoard event files |
Test-set performance at the best epoch — AUROC = 0.981, Precision = 0.973, Recall = 0.985
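To browse the event files in Colab, the standard TensorBoard notebook magics are enough:

```
%load_ext tensorboard
%tensorboard --logdir logs
```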
Rationale
Attention maps provide gene–drug interaction insights for each prediction (see the extraction sketch after this list).
Class weighting handles the 8.5 : 1 imbalance without down-sampling.
Mixed precision halves GPU memory usage and speeds up training by ≈ 1.7 × on an A100.
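One way to surface those maps from a Keras model is MultiHeadAttention's `return_attention_scores` flag; a sketch continuing the architecture code above (`X_gene` and `X_chem` are placeholder arrays):

```python
# Calling the layer again reuses its trained weights, so these scores
# match the forward pass that produced `out`.
_, gene_scores = g2c(
    query=gene_seq, value=chem_seq, key=chem_seq,
    return_attention_scores=True,
)

# gene_scores has shape (batch, heads, gene_tokens, chem_tokens): one
# gene→drug attention map per prediction, exposed as a second output.
explain_model = tf.keras.Model([gene_in, chem_in], [out, gene_scores])
preds, attn_maps = explain_model.predict([X_gene, X_chem])
```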
The model file and logs live in the shared Google Drive; load them in Colab to reproduce or fine-tune.
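A loading sketch for Colab (the Drive sub-folder `drugsens` is illustrative; adjust to where the artifacts actually sit):

```python
from google.colab import drive
import tensorflow as tf

drive.mount("/content/drive")

# Path under the shared Drive is an assumption.
model = tf.keras.models.load_model(
    "/content/drive/MyDrive/drugsens/best_multimodal_model.keras"
)
model.summary()
```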
Last updated