A guide to machine learning for biologists
The expanding scale and inherent complexity of biological data have encouraged a growing
use of machine learning in biology to build informative and predictive models of the …
Machine learning in protein structure prediction
M AlQuraishi - Current opinion in chemical biology, 2021 - Elsevier
Prediction of protein structure from sequence has been intensely studied for many decades,
owing to the problem's importance and its uniquely well-defined physical and computational …
Evolutionary-scale prediction of atomic-level protein structure with a language model
Recent advances in machine learning have leveraged evolutionary information in multiple
sequence alignments to predict protein structure. We demonstrate direct inference of full …
On the opportunities and risks of foundation models
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …
Accurate prediction of protein structures and interactions using a three-track neural network
DeepMind presented notably accurate predictions at the recent 14th Critical Assessment of
Structure Prediction (CASP14) conference. We explored network architectures that …
ProtTrans: Toward understanding the language of life through self-supervised learning
Computational biology and bioinformatics provide vast data gold-mines from protein
sequences, ideal for Language Models (LMs) taken from Natural Language Processing …
Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations
Structure prediction for proteins lacking homologous templates in the Protein Data Bank
(PDB) remains a significant unsolved problem. We developed a protocol, CI-TASSER, to …
WILDS: A benchmark of in-the-wild distribution shifts
Distribution shifts—where the training distribution differs from the test distribution—can
substantially degrade the accuracy of machine learning (ML) systems deployed in the wild …
MSA transformer
Unsupervised protein language models trained across millions of diverse sequences learn
structure and function of proteins. Protein language models studied to date have been …
Learning the protein language: Evolution, structure, and function
Language models have recently emerged as a powerful machine-learning approach for
distilling information from massive protein sequence databases. From readily available …