Machine learning for functional protein design

P Notin, N Rollins, Y Gal, C Sander, D Marks - Nature biotechnology, 2024 - nature.com
Recent breakthroughs in AI coupled with the rapid accumulation of protein sequence and
structure data have radically transformed computational protein design. New methods …

Machine learning-enabled retrobiosynthesis of molecules

T Yu, AG Boob, MJ Volk, X Liu, H Cui, H Zhao - Nature Catalysis, 2023 - nature.com
Retrobiosynthesis provides an effective and sustainable approach to producing functional
molecules. The past few decades have witnessed a rapid expansion of biosynthetic …

Nucleotide Transformer: building and evaluating robust foundation models for human genomics

H Dalla-Torre, L Gonzalez, J Mendoza-Revilla… - Nature …, 2024 - nature.com
The prediction of molecular phenotypes from DNA sequences remains a longstanding
challenge in genomics, often driven by limited annotated data and the inability to transfer …

DeepLoc 2.0: multi-label subcellular localization prediction using protein language models

V Thumuluri, JJ Almagro Armenteros… - Nucleic acids …, 2022 - academic.oup.com
The prediction of protein subcellular localization is of great relevance for proteomics
research. Here, we propose an update to the popular tool DeepLoc with multi-localization …

Bilingual language model for protein sequence and structure

M Heinzinger, K Weissenow… - NAR Genomics and …, 2024 - pmc.ncbi.nlm.nih.gov
Adapting language models to protein sequences spawned the development of powerful
protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein …

ProtTrans: Toward understanding the language of life through self-supervised learning

A Elnaggar, M Heinzinger, C Dallago… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Computational biology and bioinformatics provide vast data gold-mines from protein
sequences, ideal for Language Models (LMs) taken from Natural Language Processing …

Learning functional properties of proteins with language models

S Unsal, H Atas, M Albayrak, K Turhan… - Nature Machine …, 2022 - nature.com
Data-centric approaches have been used to develop predictive methods for elucidating
uncharacterized properties of proteins; however, studies indicate that these methods should …

Fine-tuning protein language models boosts predictions across diverse tasks

R Schmirler, M Heinzinger, B Rost - Nature Communications, 2024 - nature.com
Prediction methods inputting embeddings from protein language models have reached or
even surpassed state-of-the-art performance on many protein prediction tasks. In natural …

Transfer learning to leverage larger datasets for improved prediction of protein stability changes

H Dieckhaus, M Brocidiacono, NZ Randolph… - Proceedings of the …, 2024 - pnas.org
Amino acid mutations that lower a protein's thermodynamic stability are implicated in
numerous diseases, and engineered proteins with enhanced stability can be important in …

ProteinNPT: Improving protein property prediction and design with non-parametric transformers

P Notin, R Weitzman, D Marks… - Advances in Neural …, 2023 - proceedings.neurips.cc
Protein design holds immense potential for optimizing naturally occurring proteins, with
broad applications in drug discovery, material design, and sustainability. However …