Retrotransposons in plant genomes: structure, identification, and classification through bioinformatics and machine learning

S Orozco-Arias, G Isaza, R Guyot - International journal of molecular …, 2019 - mdpi.com
Transposable elements (TEs) are genomic units able to move within the genome of virtually
all organisms. Due to their natural repetitive numbers and their high structural diversity, the …

Gene prediction based on DNA spectral analysis: a literature review

SA Marhon, SC Kremer - Journal of computational biology, 2011 - liebertpub.com
The identification of regions of DNA sequences that code for proteins is one of the most
fundamental applications in bioinformatics. These protein-coding regions are in contrast to …

Survey on encoding schemes for genomic data representation and feature learning—from signal processing to machine learning

N Yu, Z Li, Z Yu - Big Data Mining and Analytics, 2018 - ieeexplore.ieee.org
Data-driven machine learning, especially deep learning technology, is becoming an
important tool for handling big data issues in bioinformatics. In machine learning, DNA …

Measuring performance metrics of machine learning algorithms for detecting and classifying transposable elements

S Orozco-Arias, JS Piña, R Tabares-Soto… - Processes, 2020 - mdpi.com
Because of the promising results obtained by machine learning (ML) approaches in several
fields, every day is more common, the utilization of ML to solve problems in bioinformatics. In …

Lung cancer prediction using neural network ensemble with histogram of oriented gradient genomic features

E Adetiba, OO Olugbara - The Scientific World Journal, 2015 - Wiley Online Library
This paper reports an experimental comparison of artificial neural network (ANN) and
support vector machine (SVM) ensembles and their “nonensemble” variants for lung cancer …

Classification of SARS-CoV-2 and non-SARS-CoV-2 using machine learning algorithms

OP Singh, M Vallejo, IM El-Badawy, A Aysha… - Computers in biology …, 2021 - Elsevier
Due to the continued evolution of the SARS-CoV-2 pandemic, researchers worldwide are
working to mitigate, suppress its spread, and better understand it by deploying digital signal …

Genome analysis with inter-nucleotide distances

V Afreixo, CAC Bastos, AJ Pinho, SP Garcia… - …, 2009 - academic.oup.com
Motivation: DNA sequences can be represented by sequences of four symbols, but it is often
useful to convert the symbols into real or complex numbers for further analysis. Several …

K-mer-based machine learning method to classify LTR-retrotransposons in plant genomes

S Orozco-Arias, MS Candamil-Cortés, PA Jaimes… - PeerJ, 2021 - peerj.com
Every day more plant genomes are available in public databases and additional massive
sequencing projects (i.e., that aim to sequence thousands of individuals) are formulated and …

Protein sequence comparison based on representation on a finite dimensional unit hypercube

S Ghosh, J Pal, C Cattani, B Maji… - Journal of …, 2024 - Taylor & Francis
Numerous techniques are used to compare protein sequences based on the values of the
physiochemical properties of amino acids. In this work, a single physical/chemical property …

On DNA numerical representations for genomic similarity computation

G Mendizabal-Ruiz, I Román-Godínez… - PloS one, 2017 - journals.plos.org
Genomic signal processing (GSP) refers to the use of signal processing for the analysis of
genomic data. GSP methods require the transformation or mapping of the genomic data to a …