A guide to machine learning for biologists

JG Greener, SM Kandathil, L Moffat… - Nature Reviews Molecular …, 2022 - nature.com
The expanding scale and inherent complexity of biological data have encouraged a growing
use of machine learning in biology to build informative and predictive models of the …

From variant to function in human disease genetics

T Lappalainen, DG MacArthur - Science, 2021 - science.org
Over the next decade, the primary challenge in human genetics will be to understand the
biological mechanisms by which genetic variants influence phenotypes, including disease …

Accurate proteome-wide missense variant effect prediction with AlphaMissense

J Cheng, G Novati, J Pan, C Bycroft, A Žemgulytė… - Science, 2023 - science.org
The vast majority of missense variants observed in the human genome are of unknown
clinical significance. We present AlphaMissense, an adaptation of AlphaFold fine-tuned on …

Efficient evolution of human antibodies from general protein language models

BL Hie, VR Shanker, D Xu, TUJ Bruun… - Nature …, 2024 - nature.com
Natural evolution must explore a vast landscape of possible sequences for desirable yet
rare mutations, suggesting that learning from natural evolutionary strategies could guide …

Loss-of-function, gain-of-function and dominant-negative mutations have profoundly different effects on protein structure

L Gerasimavicius, BJ Livesey, JA Marsh - Nature Communications, 2022 - nature.com
Most known pathogenic mutations occur in protein-coding regions of DNA and change the
way proteins are made. Taking protein structure into account has therefore provided great …

Unsupervised evolution of protein and antibody complexes with a structure-informed language model

VR Shanker, TUJ Bruun, BL Hie, PS Kim - Science, 2024 - science.org
Large language models trained on sequence information alone can learn high-level
principles of protein design. However, beyond sequence, the three-dimensional structures of …

An Atlas of Variant Effects to understand the genome at nucleotide resolution

DM Fowler, DJ Adams, AL Gloyn, WC Hahn, DS Marks… - Genome Biology, 2023 - Springer
Sequencing has revealed hundreds of millions of human genetic variants, and continued
efforts will only add to this variant avalanche. Insufficient information exists to interpret the …

Protein design and variant prediction using autoregressive generative models

JE Shin, AJ Riesselman, AW Kollasch… - Nature …, 2021 - nature.com
The ability to design functional sequences and predict effects of variation is central to protein
engineering and biotherapeutics. State-of-art computational methods rely on models that …

Embeddings from protein language models predict conservation and variant effects

C Marquet, M Heinzinger, T Olenyi, C Dallago… - Human Genetics, 2022 - Springer
The emergence of SARS-CoV-2 variants stressed the demand for tools allowing to interpret
the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational …

Artificial intelligence-aided protein engineering: from topological data analysis to deep protein language models

Y Qiu, GW Wei - Briefings in Bioinformatics, 2023 - academic.oup.com
Protein engineering is an emerging field in biotechnology that has the potential to
revolutionize various areas, such as antibody design, drug discovery, food security, ecology …