Novel machine learning approaches revolutionize protein knowledge

N Bordin, C Dallago, M Heinzinger, S Kim… - Trends in Biochemical …, 2023 - cell.com
Breakthrough methods in machine learning (ML), protein structure prediction, and novel
ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models …

Using machine learning to predict the effects and consequences of mutations in proteins

DJ Diaz, AV Kulikova, AD Ellington, CO Wilke - Current opinion in structural …, 2023 - Elsevier
Abstract Machine and deep learning approaches can leverage the increasingly available
massive datasets of protein sequences, structures, and mutational effects to predict variants …

Nucleotide Transformer: building and evaluating robust foundation models for human genomics

H Dalla-Torre, L Gonzalez, J Mendoza-Revilla… - Nature …, 2024 - nature.com
The prediction of molecular phenotypes from DNA sequences remains a longstanding
challenge in genomics, often driven by limited annotated data and the inability to transfer …

<? sty\usepackage {wasysym}?> Bilingual language model for protein sequence and structure

M Heinzinger, K Weissenow… - NAR Genomics and …, 2024 - academic.oup.com
Adapting language models to protein sequences spawned the development of powerful
protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein …

Prottrans: Toward understanding the language of life through self-supervised learning

A Elnaggar, M Heinzinger, C Dallago… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Computational biology and bioinformatics provide vast data gold-mines from protein
sequences, ideal for Language Models (LMs) taken from Natural Language Processing …

Proteingym: Large-scale benchmarks for protein fitness prediction and design

P Notin, A Kollasch, D Ritter… - Advances in …, 2024 - proceedings.neurips.cc
Predicting the effects of mutations in proteins is critical to many applications, from
understanding genetic disease to designing novel proteins to address our most pressing …

Protst: Multi-modality learning of protein sequences and biomedical texts

M Xu, X Yuan, S Miret, J Tang - International Conference on …, 2023 - proceedings.mlr.press
Current protein language models (PLMs) learn protein representations mainly based on
their sequences, thereby well capturing co-evolutionary information, but they are unable to …

Fine-tuning protein language models boosts predictions across diverse tasks

R Schmirler, M Heinzinger, B Rost - Nature Communications, 2024 - nature.com
Prediction methods inputting embeddings from protein language models have reached or
even surpassed state-of-the-art performance on many protein prediction tasks. In natural …

Updated benchmarking of variant effect predictors using deep mutational scanning

BJ Livesey, JA Marsh - Molecular systems biology, 2023 - embopress.org
The assessment of variant effect predictor (VEP) performance is fraught with biases
introduced by benchmarking against clinical observations. In this study, building on our …

Proteinnpt: Improving protein property prediction and design with non-parametric transformers

P Notin, R Weitzman, D Marks… - Advances in Neural …, 2023 - proceedings.neurips.cc
Protein design holds immense potential for optimizing naturally occurring proteins, with
broad applications in drug discovery, material design, and sustainability. However …