Overview
This project develops an explainable multimodal model that predicts whether a small-molecule compound will inhibit a paediatric cancer cell line. A compound and cell-line pair counts as sensitive when its IC₅₀ is at or below 0.368 µM.
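The IC₅₀ threshold above can be turned into a binary label as sketched below. This is a minimal illustration, not the project's preprocessing code: the function names are hypothetical, the input values are made up, and the natural-log variant assumes the IC₅₀ values are stored as ln(IC₅₀ in µM), which is how GDSC releases commonly report them.

```python
import math

IC50_THRESHOLD_UM = 0.368  # threshold stated in the overview, in µM


def is_sensitive(ic50_um: float) -> int:
    """Return 1 if the IC50 (µM) is at or below the threshold, else 0."""
    return int(ic50_um <= IC50_THRESHOLD_UM)


def is_sensitive_from_ln(ln_ic50: float) -> int:
    """Same rule for natural-log IC50 values (assumed storage format)."""
    return int(ln_ic50 <= math.log(IC50_THRESHOLD_UM))


# Made-up IC50 values, for illustration only.
labels = [is_sensitive(v) for v in (0.05, 0.368, 1.2)]
print(labels)  # [1, 1, 0]
```

Since ln(0.368) ≈ -1.0, the same cut-off on a natural-log scale sits at roughly -1.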
AUROC: 0.981
Precision: 0.973
Recall: 0.985
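For readers who want to see how these three metrics are defined, here is a small dependency-free sketch. The labels and scores are toy values invented for the example and are unrelated to the project's results; the 0.5 decision threshold is also an assumption.

```python
def precision_recall(y_true, y_pred):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for this example.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 decision threshold

precision, recall = precision_recall(y_true, y_pred)
area = auroc(y_true, scores)
print(f"{precision:.3f} {recall:.3f} {area:.3f}")  # 0.667 0.667 0.889
```

Precision and recall depend on the chosen decision threshold, whereas AUROC summarises ranking quality across all thresholds.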
Data sources
CCLE 22Q2 — 735 high-variance gene-expression features
GDSC2 — 108,696 drug–cell-line IC₅₀ measurements
SMILES fingerprints — 2,048-bit Morgan fingerprints
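The text does not say how the 735 high-variance expression features were chosen. One common approach, shown here purely as an assumption, is to rank genes by their variance across cell lines and keep the top k. The function name, gene names, and values below are hypothetical.

```python
from statistics import pvariance


def top_variance_features(matrix, gene_names, k):
    """Keep the k columns with the highest variance.

    matrix: list of rows, one row per cell line, one column per gene.
    Returns the kept gene names and the reduced matrix, with columns
    left in their original order.
    """
    columns = list(zip(*matrix))
    variances = [pvariance(col) for col in columns]
    ranked = sorted(range(len(gene_names)),
                    key=lambda i: variances[i], reverse=True)
    keep = sorted(ranked[:k])  # restore original column order
    kept_names = [gene_names[i] for i in keep]
    reduced = [[row[i] for i in keep] for row in matrix]
    return kept_names, reduced


# Toy expression matrix: 3 cell lines x 3 genes (made-up values).
gene_names = ["GENE_A", "GENE_B", "GENE_C"]
expression = [
    [1.0, 5.0, 2.0],
    [1.1, 0.0, 2.0],
    [0.9, 10.0, 2.1],
]

kept, reduced = top_variance_features(expression, gene_names, k=2)
print(kept)  # ['GENE_A', 'GENE_B']
```

In the project's setting, k would be 735 and the matrix would come from the CCLE 22Q2 expression table.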
Model at a glance
Two dense branches (genes: 735 → 128, chemistry: 2,048 → 128) feed a multi-head cross-modal attention layer; the fused representation passes through three fully connected layers and a final sigmoid unit to predict sensitivity.
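The data flow described above can be sketched as a forward pass in plain Python. This is a dependency-free illustration with random, untrained weights, not the project's trained model or its framework code. Several details are assumptions because the text does not state them: the number of heads (4), the widths of the three fully connected layers (128, 64, 32), the ReLU activations, and the way attention tokens are formed. Here the two 128-dimensional embeddings are treated as a two-token sequence in which each token attends over both, and the attended tokens are concatenated before the dense layers.

```python
import math
import random

GENE_DIM, CHEM_DIM, EMBED_DIM = 735, 2048, 128  # sizes from the text
NUM_HEADS = 4            # assumed; must divide EMBED_DIM
HIDDEN = (128, 64, 32)   # assumed widths of the three dense layers


def init_linear(rng, n_in, n_out):
    """Random weights and zero bias for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_across_modalities(params, tokens):
    """Multi-head scaled dot-product attention over the modality tokens."""
    head_dim = EMBED_DIM // NUM_HEADS
    q = [linear(params["q"], t) for t in tokens]
    k = [linear(params["k"], t) for t in tokens]
    v = [linear(params["v"], t) for t in tokens]
    outputs = []
    for query in q:
        merged = []
        for h in range(NUM_HEADS):
            lo, hi = h * head_dim, (h + 1) * head_dim
            scores = [sum(a * b for a, b in zip(query[lo:hi], key[lo:hi]))
                      / math.sqrt(head_dim) for key in k]
            attn = softmax(scores)
            merged += [sum(w * val[d] for w, val in zip(attn, v))
                       for d in range(lo, hi)]
        outputs.append(linear(params["o"], merged))
    return outputs


def init_model(seed=0):
    rng = random.Random(seed)
    params = {"gene": init_linear(rng, GENE_DIM, EMBED_DIM),
              "chem": init_linear(rng, CHEM_DIM, EMBED_DIM)}
    for name in ("q", "k", "v", "o"):
        params[name] = init_linear(rng, EMBED_DIM, EMBED_DIM)
    n_in = 2 * EMBED_DIM  # the two attended tokens are concatenated
    for i, width in enumerate(HIDDEN):
        params[f"fc{i}"] = init_linear(rng, n_in, width)
        n_in = width
    params["out"] = init_linear(rng, n_in, 1)
    return params


def predict(params, genes, fingerprint):
    """Forward pass returning a sensitivity probability in (0, 1)."""
    gene_tok = relu(linear(params["gene"], genes))
    chem_tok = relu(linear(params["chem"], fingerprint))
    attended = attend_across_modalities(params, [gene_tok, chem_tok])
    hidden = attended[0] + attended[1]  # list concatenation -> 256 values
    for i in range(len(HIDDEN)):
        hidden = relu(linear(params[f"fc{i}"], hidden))
    logit = linear(params["out"], hidden)[0]
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(1)
genes = [rng.gauss(0.0, 1.0) for _ in range(GENE_DIM)]               # fake expression
fingerprint = [float(rng.random() < 0.05) for _ in range(CHEM_DIM)]  # fake bits
probability = predict(init_model(), genes, fingerprint)
print(f"{probability:.3f}")
```

Because the weights are untrained, the printed probability carries no meaning; the sketch only shows how the 735- and 2,048-dimensional inputs are reduced to 128 dimensions each, mixed by attention, and mapped to a single sigmoid output.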
Key links
Google Drive — full project folder with datasets, notebooks, trained model, and logs — https://drive.google.com/drive/folders/1PXvxMIHJs_etK4K1nf5Hm4fkfUrDreaO