Machine learning for functional protein design

P Notin, N Rollins, Y Gal, C Sander, D Marks - Nature biotechnology, 2024 - nature.com
Recent breakthroughs in AI coupled with the rapid accumulation of protein sequence and
structure data have radically transformed computational protein design. New methods …

Opportunities and challenges for machine learning-assisted enzyme engineering

J Yang, FZ Li, FH Arnold - ACS Central Science, 2024 - ACS Publications
Enzymes can be engineered at the level of their amino acid sequences to optimize key
properties such as expression, stability, substrate range, and catalytic efficiency, or even to …

Large language models generate functional protein sequences across diverse families

A Madani, B Krause, ER Greene, S Subramanian… - Nature …, 2023 - nature.com
Deep-learning language models have shown promise in various biotechnological
applications, including protein design and engineering. Here we describe ProGen, a …

ProteinGym: Large-scale benchmarks for protein fitness prediction and design

P Notin, A Kollasch, D Ritter… - Advances in …, 2023 - proceedings.neurips.cc
Predicting the effects of mutations in proteins is critical to many applications, from
understanding genetic disease to designing novel proteins to address our most pressing …

ProGen2: exploring the boundaries of protein language models

E Nijkamp, JA Ruffolo, EN Weinstein, N Naik, A Madani - Cell systems, 2023 - cell.com
Attention-based models trained on protein sequences have demonstrated incredible
success at classification and generation tasks relevant for artificial-intelligence-driven …

The road to fully programmable protein catalysis

SL Lovelock, R Crawshaw, S Basler, C Levy, D Baker… - Nature, 2022 - nature.com
The ability to design efficient enzymes from scratch would have a profound effect on
chemistry, biotechnology and medicine. Rapid progress in protein engineering over the past …

Machine learning-guided protein engineering

P Kouba, P Kohout, F Haddadi, A Bushuiev… - ACS …, 2023 - ACS Publications
Recent progress in engineering highly promising biocatalysts has increasingly involved
machine learning methods. These methods leverage existing experimental and simulation …

MSA transformer

RM Rao, J Liu, R Verkuil, J Meier… - International …, 2021 - proceedings.mlr.press
Unsupervised protein language models trained across millions of diverse sequences learn
structure and function of proteins. Protein language models studied to date have been …

Learning protein fitness models from evolutionary and assay-labeled data

C Hsu, H Nisonoff, C Fannjiang, J Listgarten - Nature biotechnology, 2022 - nature.com
Machine learning-based models of protein fitness typically learn from either
unlabeled, evolutionarily related sequences or variant sequences with experimentally …

Language models generalize beyond natural proteins

R Verkuil, O Kabeli, Y Du, BIM Wicky, LF Milles… - bioRxiv, 2022 - biorxiv.org
Learning the design patterns of proteins from sequences across evolution may have
promise toward generative protein design. However it is unknown whether language …