Deep learning methods for de novo peptide sequencing

W Bittremieux, V Ananth, WE Fondrie… - Mass Spectrometry …, 2024 - Wiley Online Library
Protein tandem mass spectrometry data are most often interpreted by matching observed
mass spectra to a protein database derived from the reference genome of the sample being …

Machine learning strategies to tackle data challenges in mass spectrometry-based proteomics

C Dens, C Adams, K Laukens… - Journal of the American …, 2024 - ACS Publications
In computational proteomics, machine learning (ML) has emerged as a vital tool for
enhancing data analysis. Despite significant advancements, the diversity of ML model …

π-PrimeNovo: an accurate and efficient non-autoregressive deep learning model for de novo peptide sequencing

X Zhang, T Ling, Z **, S Xu, Z Gao, B Sun… - Nature …, 2025 - nature.com
Peptide sequencing via tandem mass spectrometry (MS/MS) is essential in proteomics.
Unlike traditional database searches, deep learning excels at de novo peptide sequencing …

A multi-species benchmark for training and validating mass spectrometry proteomics machine learning models

B Wen, WS Noble - Scientific Data, 2024 - nature.com
Training machine learning models for tasks such as de novo sequencing or spectral
clustering requires large collections of confidently identified spectra. Here we describe a …

Contrastive meta-reinforcement learning for heterogeneous graph neural architecture search

Z Xu, J Wu - Expert Systems with Applications, 2025 - Elsevier
Heterogeneous Graph Neural Networks (HGNNs) have demonstrated significant
success in capturing complex interactions within heterogeneous graphs to learn graph …

A transformer model for de novo sequencing of data-independent acquisition mass spectrometry data

J Sanders, B Wen, P Rudnick, R Johnson, CC Wu… - bioRxiv, 2024 - biorxiv.org
A core computational challenge in the analysis of mass spectrometry data is the de novo
sequencing problem, in which the generating amino acid sequence is inferred directly from …

Counting Ability of Large Language Models and Impact of Tokenization

X Zhang, J Cao, C You - arXiv preprint arXiv:2410.19730, 2024 - arxiv.org
Transformers, the backbone of modern large language models (LLMs), face inherent
architectural limitations that impede their reasoning capabilities. Unlike recurrent networks …

NeoMS: Mass Spectrometry-based Method for Uncovering Mutated MHC-I Neoantigens

S Wang, M Zhu, B Ma - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Major Histocompatibility Complex (MHC) molecules play a critical role in the immune system
by presenting peptides on the cell surface for recognition by T-cells. Tumor cells often …

π-PrimeNovo: An Accurate and Efficient Non-Autoregressive Deep Learning Model for De Novo Peptide Sequencing

X Zhang, T Ling, Z **, S Xu, Z Gao, B Sun, Z Qiu… - bioRxiv, 2024 - biorxiv.org
Peptide sequencing via tandem mass spectrometry (MS/MS) is fundamental in proteomics
data analysis, playing a pivotal role in unraveling the complex world of proteins within …

Deep Learning Methods for Novel Peptide Discovery and Function Prediction

S Wang - 2024 - uwspace.uwaterloo.ca
This thesis explores deep learning methods for protein identification and property prediction,
encompassing two primary areas: mass spectrometry-based protein sequence identification …