TDC-2: Multimodal foundation for therapeutic science

A Velez-Arce, K Huang, MM Li, X Lin, W Gao, T Fu… - bioRxiv, 2024 - biorxiv.org
Therapeutics Data Commons (tdcommons.ai) is an open science initiative with
unified datasets, AI models, and benchmarks to support research across therapeutic …

PertEval-scFM: Benchmarking single-cell foundation models for perturbation effect prediction

A Wenteler, M Occhetta, N Branson, M Huebner… - bioRxiv, 2024 - biorxiv.org
In silico modeling of transcriptional responses to perturbations is crucial for advancing our
understanding of cellular processes and disease mechanisms. We present PertEval-scFM, a …

A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches

AR Baião, Z Cai, RC Poulos, PJ Robinson… - arXiv preprint arXiv …, 2025 - arxiv.org
The rapid advancement of high-throughput sequencing and other assay technologies has
resulted in the generation of large and complex multi-omics datasets, offering …

No Foundations without Foundations--Why semi-mechanistic models are essential for regulatory biology

L Kovačević, T Gaudelet, J Opzoomer, H Triendl… - arXiv preprint arXiv …, 2025 - arxiv.org
Despite substantial efforts, deep learning has not yet delivered a transformative impact on
elucidating regulatory biology, particularly in the realm of predicting gene expression …

A systematic comparison of computational methods for expression forecasting

E Kernfeld, Y Yang, JS Weinstock, A Battle, P Cahan - bioRxiv, 2023 - biorxiv.org
Due to the abundance of single cell RNA-seq data, a number of methods for predicting
expression after perturbation have recently been published. Expression prediction methods …

Controllable Sequence Editing for Counterfactual Generation

MM Li, K Li, Y Ektefaie, S Messica, M Zitnik - arXiv preprint arXiv …, 2025 - arxiv.org
Sequence models generate counterfactuals by modifying parts of a sequence based on a
given condition, enabling reasoning about "what if" scenarios. While these models excel at …

Benchmarking a foundational cell model for post-perturbation RNAseq prediction

G Csendes, KZ Szalay, B Szalai - bioRxiv, 2024 - biorxiv.org
Accurately predicting cellular responses to perturbations is essential for understanding cell
behaviour in both healthy and diseased states. While perturbation data is ideal for building …

Causal models and prediction in cell line perturbation experiments

JP Long, Y Yang, S Shimizu, T Pham, KA Do - BMC Bioinformatics, 2025 - Springer
In cell line perturbation experiments, a collection of cells is perturbed with external agents
and responses such as protein expression measured. Due to cost constraints, only a small …

Active learning for efficient discovery of optimal gene combinations in the combinatorial perturbation space

J Qin, HH Wessels, C Fernandez-Granda… - arXiv preprint arXiv …, 2024 - arxiv.org
The advancement of novel combinatorial CRISPR screening technologies enables the
identification of synergistic gene combinations on a large scale. This is crucial for …

Predicting perturbation targets with causal differential networks

M Wu, U Padia, SH Murphy, R Barzilay… - arXiv preprint arXiv …, 2024 - arxiv.org
Rationally identifying variables responsible for changes to a biological system can enable
myriad applications in disease understanding and cell engineering. From a causality …