Controllable protein design with language models

N Ferruz, B Höcker - Nature Machine Intelligence, 2022 - nature.com
The twenty-first century is presenting humankind with unprecedented environmental and
medical challenges. The ability to design novel proteins tailored for specific purposes would …

Deep learning in protein structural modeling and design

W Gao, SP Mahajan, J Sulam, JJ Gray - Patterns, 2020 - cell.com
Deep learning is catalyzing a scientific revolution fueled by big data, accessible toolkits, and
powerful computational resources, impacting many fields, including protein structural …

Artificial intelligence for science in quantum, atomistic, and continuum systems

X Zhang, L Wang, J Helwig, Y Luo, C Fu, Y Xie… - arXiv preprint arXiv …, 2023 - arxiv.org
Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural
sciences. Today, AI has started to advance natural sciences by improving, accelerating, and …

Protein representation learning by geometric structure pretraining

Z Zhang, M Xu, A Jamasb… - arXiv preprint arXiv …, 2022 - arxiv.org
Learning effective protein representations is critical in a variety of tasks in biology such as
predicting protein function or structure. Existing approaches usually pretrain protein …

Ankh: Optimized protein language model unlocks general-purpose modelling

A Elnaggar, H Essam, W Salah-Eldin… - arXiv preprint arXiv …, 2023 - arxiv.org
As opposed to scaling-up protein language models (PLMs), we seek improving performance
via protein-specific optimization. Although the proportionality between the language model …

MSA transformer

RM Rao, J Liu, R Verkuil, J Meier… - International …, 2021 - proceedings.mlr.press
Unsupervised protein language models trained across millions of diverse sequences learn
structure and function of proteins. Protein language models studied to date have been …

Frozen pretrained transformers as universal computation engines

K Lu, A Grover, P Abbeel, I Mordatch - Proceedings of the AAAI …, 2022 - ojs.aaai.org
We investigate the capability of a transformer pretrained on natural language to generalize
to other modalities with minimal finetuning--in particular, without finetuning of the self …

Structure-based protein function prediction using graph convolutional networks

V Gligorijević, PD Renfrew, T Kosciolek… - Nature …, 2021 - nature.com
The rapid increase in the number of proteins in sequence databases and the diversity of
their functions challenge computational approaches for automated function prediction. Here …

xTrimoPGLM: unified 100B-scale pre-trained transformer for deciphering the language of protein

B Chen, X Cheng, P Li, Y Geng, J Gong, S Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Protein language models have shown remarkable success in learning biological information
from protein sequences. However, most existing models are limited by either autoencoding …

Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences

A Rives, J Meier, T Sercu, S Goyal, Z Lin, J Liu… - Proceedings of the …, 2021 - pnas.org
In the field of artificial intelligence, a combination of scale in data and model capacity
enabled by unsupervised learning has led to major advances in representation learning and …