Computational cluster validation in post-genomic data analysis
Motivation: The discovery of novel biological knowledge from the ab initio analysis of
post-genomic data relies upon the use of unsupervised processing methods, in particular …
A roadmap of clustering algorithms: finding a match for a biomedical application
Clustering is ubiquitously applied in bioinformatics with hierarchical clustering and k-means
partitioning being the most popular methods. Numerous improvements of these two …
High-throughput genome scaffolding from in vivo DNA interaction frequency
Despite advances in DNA sequencing technology, assembly of complex genomes remains
a major challenge, particularly for genomes sequenced using short reads, which yield highly …
Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space
Motivation: UPGMA (average linking) is probably the most popular algorithm for hierarchical
data clustering, especially in computational biology. However, UPGMA requires the entire …
Three invariant Hi-C interaction patterns: applications to genome assembly
S Oddes, A Zelig, N Kaplan - Methods, 2018 - Elsevier
Assembly of reference-quality genomes from next-generation sequencing data is a key
challenge in genomics. Recently, we and others have shown that Hi-C data can be used to …
A generalized enhanced quantum fuzzy approach for efficient data clustering
Data clustering is a challenging task to gain insights into data in various fields. In this paper,
an Enhanced Quantum-Inspired Evolutionary Fuzzy C-Means (EQIE-FCM) algorithm is …
Functional annotation prediction: all for one and one for all
In an era of rapid genome sequencing and high‐throughput technology, automatic function
prediction for a novel sequence is of utter importance in bioinformatics. While automatic …
EVEREST: automatic identification and classification of protein domains in all protein sequences
Background: Proteins are comprised of one or several building blocks, known as domains.
Such domains can be classified into families according to their evolutionary origin. Whereas …
Model order selection for bio-molecular data clustering
A Bertoni, G Valentini - BMC bioinformatics, 2007 - Springer
Background: Cluster analysis has been widely applied for investigating structure in
bio-molecular data. A drawback of most clustering algorithms is that they cannot automatically …
Gene cluster statistics with gene families
N Raghupathy, D Durand - Molecular biology and evolution, 2009 - academic.oup.com
Identifying genomic regions that descended from a common ancestor is important for
understanding the function and evolution of genomes. In distantly related genomes, clusters …