Scientific discovery in the age of artificial intelligence

H Wang, T Fu, Y Du, W Gao, K Huang, Z Liu… - Nature, 2023 - nature.com
Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment
and accelerate research, helping scientists to generate hypotheses, design experiments …

Transformers in single-cell omics: a review and new perspectives

A Szałata, K Hrovatin, S Becker, A Tejada-Lapuerta… - Nature …, 2024 - nature.com
Recent efforts to construct reference maps of cellular phenotypes have expanded the
volume and diversity of single-cell omics data, providing an unprecedented resource for …

Accurate proteome-wide missense variant effect prediction with AlphaMissense

J Cheng, G Novati, J Pan, C Bycroft, A Žemgulytė… - Science, 2023 - science.org
The vast majority of missense variants observed in the human genome are of unknown
clinical significance. We present AlphaMissense, an adaptation of AlphaFold fine-tuned on …

Large language models generate functional protein sequences across diverse families

A Madani, B Krause, ER Greene, S Subramanian… - Nature …, 2023 - nature.com
Deep-learning language models have shown promise in various biotechnological
applications, including protein design and engineering. Here we describe ProGen, a …

Simulating 500 million years of evolution with a language model

T Hayes, R Rao, H Akin, NJ Sofroniew, D Oktay, Z Lin… - Science, 2025 - science.org
More than three billion years of evolution have produced an image of biology encoded into
the space of natural proteins. Here we show that language models trained at scale on …

Galactica: A large language model for science

R Taylor, M Kardas, G Cucurull, T Scialom… - arXiv preprint arXiv …, 2022 - arxiv.org
Information overload is a major obstacle to scientific progress. The explosive growth in
scientific literature and data has made it ever harder to discover useful insights in a large …

Evolutionary-scale prediction of atomic-level protein structure with a language model

Z Lin, H Akin, R Rao, B Hie, Z Zhu, W Lu, N Smetanin… - Science, 2023 - science.org
Recent advances in machine learning have leveraged evolutionary information in multiple
sequence alignments to predict protein structure. We demonstrate direct inference of full …

Nucleotide Transformer: building and evaluating robust foundation models for human genomics

H Dalla-Torre, L Gonzalez, J Mendoza-Revilla… - Nature …, 2024 - nature.com
The prediction of molecular phenotypes from DNA sequences remains a longstanding
challenge in genomics, often driven by limited annotated data and the inability to transfer …

Illuminating protein space with a programmable generative model

JB Ingraham, M Baranov, Z Costello, KW Barber… - Nature, 2023 - nature.com
Three billion years of evolution has produced a tremendous diversity of protein molecules,
but the full potential of proteins is likely to be much greater. Accessing this potential has …

Enzyme function prediction using contrastive learning

T Yu, H Cui, JC Li, Y Luo, G Jiang, H Zhao - Science, 2023 - science.org
Enzyme function annotation is a fundamental challenge, and numerous computational tools
have been developed. However, most of these tools cannot accurately predict functional …