Utilizing graph machine learning within drug discovery and development

T Gaudelet, B Day, AR Jamasb, J Soman… - Briefings in …, 2021 - academic.oup.com
Graph machine learning (GML) is receiving growing interest within the pharmaceutical and
biotechnology industries for its ability to model biomolecular structures, the functional …

Language models enable zero-shot prediction of the effects of mutations on protein function

J Meier, R Rao, R Verkuil, J Liu… - Advances in neural …, 2021 - proceedings.neurips.cc
Modeling the effect of sequence variation on function is a fundamental problem for
understanding and designing proteins. Since evolution encodes information about function …

ProtTrans: Toward understanding the language of life through self-supervised learning

A Elnaggar, M Heinzinger, C Dallago… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
Computational biology and bioinformatics provide vast data gold-mines from protein
sequences, ideal for Language Models (LMs) taken from Natural Language Processing …

Long range arena: A benchmark for efficient transformers

Y Tay, M Dehghani, S Abnar, Y Shen, D Bahri… - arXiv preprint arXiv …, 2020 - arxiv.org
Transformers do not scale very well to long sequence lengths largely because of quadratic
self-attention complexity. In the recent months, a wide spectrum of efficient, fast Transformers …

Generative pretraining from pixels

M Chen, A Radford, R Child, J Wu… - International …, 2020 - proceedings.mlr.press
Inspired by progress in unsupervised representation learning for natural language, we
examine whether similar models can learn useful representations for images. We train a …

MSA Transformer

RM Rao, J Liu, R Verkuil, J Meier… - International …, 2021 - proceedings.mlr.press
Unsupervised protein language models trained across millions of diverse sequences learn
structure and function of proteins. Protein language models studied to date have been …

Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences

A Rives, J Meier, T Sercu, S Goyal, Z Lin, J Liu… - Proceedings of the …, 2021 - pnas.org
In the field of artificial intelligence, a combination of scale in data and model capacity
enabled by unsupervised learning has led to major advances in representation learning and …

BERTology meets biology: Interpreting attention in protein language models

J Vig, A Madani, LR Varshney, C Xiong… - arXiv preprint arXiv …, 2020 - arxiv.org
Transformer architectures have proven to learn useful representations for protein
classification and generation tasks. However, these representations present challenges in …

ProGen: Language modeling for protein generation

A Madani, B McCann, N Naik, NS Keskar… - arXiv preprint arXiv …, 2020 - arxiv.org
Generative modeling for protein engineering is key to solving fundamental problems in
synthetic biology, medicine, and material science. We pose protein engineering as an …

Using deep learning to annotate the protein universe

ML Bileschi, D Belanger, DH Bryant, T Sanderson… - Nature …, 2022 - nature.com
Understanding the relationship between amino acid sequence and protein function is a long-
standing challenge with far-reaching scientific and translational implications. State-of-the-art …