Scientific large language models: A survey on biological & chemical domains

Q Zhang, K Ding, T Lv, X Wang, Q Yin, Y Zhang… - ACM Computing …, 2024 - dl.acm.org
Large Language Models (LLMs) have emerged as a transformative power in enhancing
natural language comprehension, representing a significant stride toward artificial general …

Artificial intelligence-aided protein engineering: from topological data analysis to deep protein language models

Y Qiu, GW Wei - Briefings in bioinformatics, 2023 - academic.oup.com
Protein engineering is an emerging field in biotechnology that has the potential to
revolutionize various areas, such as antibody design, drug discovery, food security, ecology …

OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization

G Ahdritz, N Bouatta, C Floristean, S Kadyan, Q Xia… - Nature …, 2024 - nature.com
AlphaFold2 revolutionized structural biology with the ability to predict protein structures with
exceptionally high accuracy. Its implementation, however, lacks the code and data required …

Learning inverse folding from millions of predicted structures

C Hsu, R Verkuil, J Liu, Z Lin, B Hie… - International …, 2022 - proceedings.mlr.press
We consider the problem of predicting a protein sequence from its backbone atom
coordinates. Machine learning approaches to this problem to date have been limited by the …

US-align: universal structure alignments of proteins, nucleic acids, and macromolecular complexes

C Zhang, M Shine, AM Pyle, Y Zhang - Nature methods, 2022 - nature.com
Structure comparison and alignment are of fundamental importance in structural
biology studies. We developed the first universal platform, US-align, to uniformly align …

Protein remote homology detection and structural alignment using deep learning

T Hamamsy, JT Morton, R Blackwell, D Berenberg… - Nature …, 2024 - nature.com
Exploiting sequence–structure–function relationships in biotechnology requires improved
methods for aligning proteins that have low sequence similarity to previously annotated …

Artificial intelligence for science in quantum, atomistic, and continuum systems

X Zhang, L Wang, J Helwig, Y Luo, C Fu, Y Xie… - arXiv preprint arXiv …, 2023 - arxiv.org
Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural
sciences. Today, AI has started to advance natural sciences by improving, accelerating, and …

Protein structure prediction with in-cell photo-crosslinking mass spectrometry and deep learning

K Stahl, A Graziadei, T Dau, O Brock… - Nature …, 2023 - nature.com
While AlphaFold2 can predict accurate protein structures from the primary sequence,
challenges remain for proteins that undergo conformational changes or for which few …

Structure-informed language models are protein designers

Z Zheng, Y Deng, D Xue, Y Zhou… - … on machine learning, 2023 - proceedings.mlr.press
This paper demonstrates that language models are strong structure-based protein
designers. We present LM-Design, a generic approach to reprogramming sequence-based …

BeStSel: webserver for secondary structure and fold prediction for protein CD spectroscopy

A Micsonai, É Moussong, F Wien, E Boros… - Nucleic acids …, 2022 - academic.oup.com
Circular dichroism (CD) spectroscopy is widely used to characterize the secondary structure
composition of proteins. To derive accurate and detailed structural information from the CD …