AlphaFold2 and its applications in the fields of biology and medicine

Z Yang, X Zeng, Y Zhao, R Chen - Signal Transduction and Targeted …, 2023 - nature.com
AlphaFold2 (AF2) is an artificial intelligence (AI) system developed by DeepMind
that can predict three-dimensional (3D) structures of proteins from amino acid sequences …

Protein representation learning by geometric structure pretraining

Z Zhang, M Xu, A Jamasb… - arXiv preprint arXiv …, 2022 - arxiv.org
Learning effective protein representations is critical in a variety of tasks in biology such as
predicting protein function or structure. Existing approaches usually pretrain protein …

Structure-informed language models are protein designers

Z Zheng, Y Deng, D Xue, Y Zhou… - … on machine learning, 2023 - proceedings.mlr.press
This paper demonstrates that language models are strong structure-based protein
designers. We present LM-Design, a generic approach to reprogramming sequence-based …

SaProt: Protein language modeling with structure-aware vocabulary

J Su, C Han, Y Zhou, J Shan, X Zhou, F Yuan - bioRxiv, 2023 - biorxiv.org
Large-scale protein language models (PLMs), such as the ESM family, have achieved
remarkable performance in various downstream tasks related to protein structure and …

Computational scoring and experimental evaluation of enzymes generated by neural networks

SR Johnson, X Fu, S Viknander, C Goldin… - Nature …, 2024 - nature.com
In recent years, generative protein sequence models have been developed to sample novel
sequences. However, predicting whether generated proteins will fold and function remains …

Machine learning for predicting protein properties: A comprehensive review

Y Wang, Y Zhang, X Zhan, Y He, Y Yang, L Cheng… - Neurocomputing, 2024 - Elsevier
In the field of protein engineering, the function and structure of proteins are key to
understanding cellular mechanisms, biological evolution, and biodiversity. With the …

PLMSearch: Protein language model powers accurate and fast sequence search for remote homology

W Liu, Z Wang, R You, C Xie, H Wei, Y Xiong… - Nature …, 2024 - nature.com
Homologous protein search is one of the most commonly used methods for protein
annotation and analysis. Compared to structure search, detecting distant evolutionary …

Diffusion language models are versatile protein learners

X Wang, Z Zheng, F Ye, D Xue, S Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper introduces diffusion protein language model (DPLM), a versatile protein
language model that demonstrates strong generative and predictive capabilities for protein …

PEER: a comprehensive and multi-task benchmark for protein sequence understanding

M Xu, Z Zhang, J Lu, Z Zhu, Y Zhang… - Advances in …, 2022 - proceedings.neurips.cc
We are now witnessing significant progress of deep learning methods in a variety of tasks
(or datasets) of proteins. However, there is a lack of a standard benchmark to evaluate the …

InstructProtein: Aligning human and protein language via knowledge instruction

Z Wang, Q Zhang, K Ding, M Qin, X Zhuang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have revolutionized the field of natural language
processing, but they fall short in comprehending biological sequences such as proteins. To …