UniProt: the universal protein knowledgebase in 2023

Nucleic Acids Research, 2023 - academic.oup.com
The aim of the UniProt Knowledgebase is to provide users with a comprehensive,
high-quality and freely accessible set of protein sequences annotated with functional information …

Novel machine learning approaches revolutionize protein knowledge

N Bordin, C Dallago, M Heinzinger, S Kim… - Trends in Biochemical …, 2023 - cell.com
Breakthrough methods in machine learning (ML), protein structure prediction, and novel
ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models …

Transformer architecture and attention mechanisms in genome data analysis: a comprehensive review

SR Choi, M Lee - Biology, 2023 - mdpi.com
Simple Summary: The rapidly advancing field of deep learning, specifically
transformer-based architectures and attention mechanisms, has found substantial applicability in …

Nucleotide Transformer: building and evaluating robust foundation models for human genomics

H Dalla-Torre, L Gonzalez, J Mendoza-Revilla… - Nature …, 2024 - nature.com
The prediction of molecular phenotypes from DNA sequences remains a longstanding
challenge in genomics, often driven by limited annotated data and the inability to transfer …

Annotation of biologically relevant ligands in UniProtKB using ChEBI

E Coudert, S Gehant, E De Castro, M Pozzato… - …, 2023 - academic.oup.com
Motivation: To provide high quality, computationally tractable annotation of binding sites for
biologically relevant (cognate) ligands in UniProtKB using the chemical ontology ChEBI …

Bilingual language model for protein sequence and structure

M Heinzinger, K Weissenow… - NAR Genomics and …, 2024 - academic.oup.com
Adapting language models to protein sequences spawned the development of powerful
protein language models (pLMs). Concurrently, AlphaFold2 broke through in protein …

Contrastive learning in protein language space predicts interactions between drugs and protein targets

R Singh, S Sledzieski, B Bryson… - Proceedings of the …, 2023 - National Acad Sciences
Sequence-based prediction of drug–target interactions has the potential to accelerate drug
discovery by complementing experimental screens. Such computational prediction needs to …

Fine-tuning protein language models boosts predictions across diverse tasks

R Schmirler, M Heinzinger, B Rost - Nature Communications, 2024 - nature.com
Prediction methods inputting embeddings from protein language models have reached or
even surpassed state-of-the-art performance on many protein prediction tasks. In natural …

Embeddings from protein language models predict conservation and variant effects

C Marquet, M Heinzinger, T Olenyi, C Dallago… - Human genetics, 2022 - Springer
The emergence of SARS-CoV-2 variants stressed the demand for tools allowing to interpret
the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational …

Contrastive learning on protein embeddings enlightens midnight zone

M Heinzinger, M Littmann, I Sillitoe… - NAR Genomics and …, 2022 - academic.oup.com
Experimental structures are leveraged through multiple sequence alignments, or more
generally through homology-based inference (HBI), facilitating the transfer of information …