Amino acid encoding methods for protein sequences: a comprehensive review and assessment

X Jing, Q Dong, D Hong, R Lu - IEEE/ACM transactions on …, 2019 - ieeexplore.ieee.org
As the first step of machine-learning based protein structure and function prediction, the
amino acid encoding plays a fundamental role in the final success of those methods. Different …

Continuous distributed representation of biological sequences for deep proteomics and genomics

E Asgari, MRK Mofrad - PloS one, 2015 - journals.plos.org
We introduce a new representation and feature extraction method for biological sequences.
Named bio-vectors (BioVec) to refer to biological sequences in general with protein-vectors …

PLEK: a tool for predicting long non-coding RNAs and messenger RNAs based on an improved k-mer scheme

A Li, J Zhang, Z Zhou - BMC bioinformatics, 2014 - Springer
Background High-throughput transcriptome sequencing (RNA-seq) technology promises to
discover novel protein-coding and non-coding transcripts, particularly the identification of …

Novel Bioactive Peptides from Meretrix meretrix Protect Caenorhabditis elegans against Free Radical-Induced Oxidative Stress through the Stress Response Factor …

W Jia, Q Peng, L Su, X Yu, CW Ma, M Liang, X Yin… - Marine drugs, 2018 - mdpi.com
The hard clam Meretrix meretrix, which has been traditionally used as medicine and
seafood, was used in this study to isolate antioxidant peptides. First, a peptide-rich extract …

Learning to predict single-wall carbon nanotube-recognition DNA sequences

Y Yang, M Zheng, A Jagota - Npj Computational Materials, 2019 - nature.com
DNA/single-wall carbon nanotube (SWCNT) hybrids have enabled many applications
because of their special ability to disperse and sort SWCNTs by their chirality and …

VacPred: Sequence-based prediction of plant vacuole proteins using machine-learning techniques

AK Yadav, D Singla - Journal of Biosciences, 2020 - Springer
Subcellular localization prediction of the proteome is one of the major goals of large-scale
genome or proteome sequencing projects to define the gene functions that could be …

Identification of sucrose synthase from Micractinium conductrix to favor biocatalytic glycosylation

K Chen, L Lin, R Ma, J Ding, H Pan, Y Tao… - Frontiers in …, 2023 - frontiersin.org
Sucrose synthase (SuSy, EC 2.4.1.13) is a unique glycosyltransferase (GT) for developing
cost-effective glycosylation processes. Up to now, some SuSys derived from plants and …

A surrogate model of sigma profile and cosmosac activity coefficient predictions of using transformer with smiles input

JL Kang, CT Chiu, JS Huang, DSH Wong - Digital Chemical Engineering, 2022 - Elsevier
COSMOSAC is a model that allows a priori predictions of activity coefficients for
characterizing solute-solvent interactions. The method requires the input of sigma profile, the …

Identification of cytokine via an improved genetic algorithm

X Zeng, S Yuan, X Huang, Q Zou - Frontiers of Computer Science, 2015 - Springer
With the explosive growth in the number of protein sequences generated in the postgenomic
age, research into identifying cytokines from proteins and detecting their biochemical …

Numeric Lyndon-based feature embedding of sequencing reads for machine learning approaches

P Bonizzoni, M Costantini, C De Felice, A Petescia… - Information …, 2022 - Elsevier
Feature embedding methods have been proposed in the literature to represent sequences
as numeric vectors to be used in some bioinformatics investigations, such as family …