Researcher

Semilogo Oketola

Biotechnologist | Oncology Research

I build computational systems to characterise cancer progression using integrated biological data. My work sits at the intersection of genomics, transcriptomics, and medical imaging, with a focus on multimodal feature integration, model interpretability, and biologically grounded validation methodology.

Why PancrionDX

Single-modality, mutation-based models have a structural limitation for stage prediction: driver mutations in pancreatic ductal adenocarcinoma (PDAC) accumulate early and persist throughout disease progression, so they provide little differential signal between stages. PancrionDX was built to address this by integrating transcriptomic expression data and radiomic imaging features alongside genomics. These biological layers shift measurably as the disease advances and give a more faithful representation of tumour state.

Project Evolution

SynGenix

Early-stage structured genomic modelling with systematic feature engineering, incorporating multi-gene panel analysis and variant impact scoring.

View Project

PanEcho

Broader exploration of computational modelling on biomedical data, establishing initial feature engineering pipelines and baseline classification approaches.

View Project

PancrionDX

Multimodal system integrating genomic, transcriptomic, and radiomic data for stage-based classification of pancreatic ductal adenocarcinoma.

Approach

Biological Grounding

Models are built to reflect known cancer biology, not just statistical patterns. Feature selection and architecture choices are anchored to the mechanistic behaviour of PDAC.

Interpretability First

Every prediction can be traced back to specific gene, expression, or imaging features. Opaque ensemble outputs are accompanied by SHAP attribution and gene-level contribution scores.
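As a rough illustration of what per-feature attribution means, the sketch below computes exact Shapley values for a single sample by enumerating feature coalitions. It is not the PancrionDX code and does not use the SHAP library: the toy linear model, the feature names (KRAS_mut, expr_score, radiomic_texture), the weights, and the baseline values are all hypothetical, and exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, baseline):
    """Exact Shapley attributions for one sample.

    Features absent from a coalition are set to their baseline value.
    Cost grows as 2^n, so this only suits very small feature sets.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for target in names:
        others = [f for f in names if f != target]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    f: (x[f] if f in coalition else baseline[f]) for f in names
                }
                with_target = dict(without)
                with_target[target] = x[target]
                total += weight * (predict(with_target) - predict(without))
        phi[target] = total
    return phi


# Hypothetical toy model: a weighted sum of three illustrative features.
WEIGHTS = {"KRAS_mut": 0.5, "expr_score": 2.0, "radiomic_texture": -1.0}


def toy_predict(features):
    return sum(WEIGHTS[name] * value for name, value in features.items())


# Made-up sample and baseline values, for illustration only.
sample = {"KRAS_mut": 1.0, "expr_score": 0.8, "radiomic_texture": 0.3}
baseline = {"KRAS_mut": 0.0, "expr_score": 0.5, "radiomic_texture": 0.5}

attributions = exact_shapley(toy_predict, sample, baseline)
for name, value in attributions.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that lets each prediction be decomposed into per-feature contributions.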

Robust Validation

Validation goes beyond held-out accuracy to include structured perturbation tests, gene ablation studies, and calibration analysis. Together these help separate genuine signal from overfitting.
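A gene ablation study can be sketched as follows: replace one feature at a time with its cohort mean and record how much accuracy falls. This is a minimal sketch under stated assumptions, not the project's validation code. The classifier, the feature names (KRAS_mut, expr_marker), and the four-row cohort are hypothetical toy values chosen only to make the mechanics visible.

```python
def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(labels)


def ablation_drops(predict, rows, labels, features):
    """Accuracy drop when each feature is replaced by its cohort mean."""
    reference = accuracy(predict, rows, labels)
    drops = {}
    for feature in features:
        mean_value = sum(row[feature] for row in rows) / len(rows)
        # Copy each row with the ablated feature neutralised.
        ablated = [{**row, feature: mean_value} for row in rows]
        drops[feature] = reference - accuracy(predict, ablated, labels)
    return drops


# Hypothetical classifier: predicts late stage (1) when the
# illustrative expression marker exceeds 0.5.
def toy_classifier(row):
    return 1 if row["expr_marker"] > 0.5 else 0


# Made-up cohort: the mutation is present in every sample,
# while the expression marker differs between stages.
rows = [
    {"KRAS_mut": 1, "expr_marker": 0.2},
    {"KRAS_mut": 1, "expr_marker": 0.3},
    {"KRAS_mut": 1, "expr_marker": 0.8},
    {"KRAS_mut": 1, "expr_marker": 0.9},
]
labels = [0, 0, 1, 1]

drops = ablation_drops(toy_classifier, rows, labels, ["KRAS_mut", "expr_marker"])
print(drops)
```

In this toy cohort, ablating the mutation feature changes nothing because it is constant across stages, while ablating the expression marker halves the accuracy. A feature whose removal leaves performance untouched is either redundant or unused by the model.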

Multimodal Thinking

Different data types are treated as complementary biological signals rather than merged blindly. Each modality is evaluated independently before integration, preserving interpretability at the source level.
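One way to picture this workflow is to score each modality on its own and only then combine the outputs. The sketch below uses simple late fusion (an equal-weight average of per-modality probabilities), which is an assumption for illustration and not necessarily how PancrionDX integrates its modalities. All probabilities and labels are invented toy values.

```python
def accuracy_from_probs(probabilities, labels, threshold=0.5):
    """Accuracy after thresholding predicted probabilities."""
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def late_fusion(modality_probs, weights):
    """Weighted average of per-modality probabilities for each sample."""
    total_weight = sum(weights.values())
    n_samples = len(next(iter(modality_probs.values())))
    return [
        sum(weights[m] * modality_probs[m][i] for m in modality_probs) / total_weight
        for i in range(n_samples)
    ]


# Made-up late-stage probabilities from three hypothetical
# single-modality models on four samples.
labels = [0, 0, 1, 1]
modality_probs = {
    "genomic": [0.6, 0.4, 0.5, 0.6],
    "transcriptomic": [0.2, 0.3, 0.7, 0.4],
    "radiomic": [0.3, 0.6, 0.8, 0.9],
}

# Step 1: evaluate every modality independently.
per_modality = {
    name: accuracy_from_probs(probs, labels)
    for name, probs in modality_probs.items()
}

# Step 2: integrate only after the standalone results are known.
equal_weights = {name: 1.0 for name in modality_probs}
fused = late_fusion(modality_probs, equal_weights)
fused_accuracy = accuracy_from_probs(fused, labels)

print(per_modality)
print(fused_accuracy)
```

Keeping the per-modality scores alongside the fused score makes it possible to see which source contributes what, and to notice when fusion adds nothing over the best single modality.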

More work and writing