Guiding questions to avoid data leakage in biological machine learning applications
Abstract: Machine learning methods for extracting patterns from high-dimensional data are
very important in the biological sciences. However, in certain cases, real-world applications …
TEMPRO: nanobody melting temperature estimation model using protein embeddings
JAE Alvarez, SN Dean - Scientific Reports, 2024 - nature.com
Single-domain antibodies (sdAbs) or nanobodies have received widespread attention due
to their small size (~15 kDa) and diverse applications in bio-derived therapeutics. As many …
TemStaPro: protein thermostability prediction using sequence representations from protein language models
Motivation: Reliable prediction of protein thermostability from its sequence is valuable for
both academic and industrial research. This prediction problem can be tackled using …
Transitioning from wet lab to artificial intelligence: a systematic review of AI predictors in CRISPR
The revolutionary CRISPR-Cas9 system leverages a programmable guide RNA (gRNA) and
Cas9 proteins to precisely cleave problematic regions within DNA sequences. This …
TemBERTure: advancing protein thermostability prediction with deep learning and attention mechanisms
Motivation: Understanding protein thermostability is essential for numerous biotechnological
applications, but traditional experimental methods are time-consuming, expensive, and error …
PTSP-BERT: Predict the thermal stability of proteins using sequence-based bidirectional representations from transformer-embedded features
Z Lv, M Wei, H Pei, S Peng, M Li, L Jiang - Computers in Biology and …, 2025 - Elsevier
Thermophilic proteins, mesophilic proteins and psychrophilic proteins have wide industrial
applications, as enzymes with different optimal temperatures are often needed for different …
ThermoFinder: A sequence-based thermophilic proteins prediction framework
Thermophilic proteins are important for academic research and industrial processes, and
various computational methods have been developed to identify and screen them. However …
Classifying alkaliphilic proteins using embeddings from protein language model
Alkaliphilic proteins have great potential as biocatalysts in biotechnology, especially for
enzyme engineering. Extensive research has focused on exploring the enzymatic potential …
RNA sequence analysis landscape: A comprehensive review of task types, databases, datasets, word embedding methods, and language models
Deciphering information of RNA sequences reveals their diverse roles in living organisms,
including gene regulation and protein synthesis. Aberrations in RNA sequence such as …
Leveraging protein language model embeddings and logistic regression for efficient and accurate in-silico acidophilic proteins classification
The increasing demand for eco-friendly technologies in biotechnology necessitates effective
and sustainable catalysts. Acidophilic proteins, functioning optimally in highly acidic …