Time-series data mining

P Esling, C Agon - ACM Computing Surveys (CSUR), 2012 - dl.acm.org
In almost every scientific field, measurements are performed over time. These observations
lead to a collection of organized data called time series. The purpose of time-series data …

Alignment-free sequence comparison—a review

S Vinga, J Almeida - Bioinformatics, 2003 - academic.oup.com
Motivation: Genetic recombination and, in particular, genetic shuffling are at odds with
sequence comparison by alignment, which assumes conservation of contiguity between …

[BOOK] Ten lectures on wavelets

I Daubechies - 1992 - SIAM
Wavelets are a relatively recent development in applied mathematics. Their name itself was
coined approximately a decade ago (Morlet, Arens, Fourgeau, and Giard (1982), Morlet …

The similarity metric

M Li, X Chen, X Li, B Ma… - IEEE Transactions on …, 2004 - ieeexplore.ieee.org
A new class of distances appropriate for measuring similarity relations between sequences,
say one type of similarity per distance, is studied. We propose a new "normalized …

Towards parameter-free data mining

E Keogh, S Lonardi, CA Ratanamahatana - Proceedings of the tenth …, 2004 - dl.acm.org
Most data mining algorithms require the setting of many input parameters. Two main
dangers of working with parameter-laden algorithms are the following. First, incorrect …

Causal inference using the algorithmic Markov condition

D Janzing, B Schölkopf - IEEE Transactions on Information …, 2010 - ieeexplore.ieee.org
Inferring the causal structure that links n observables is usually based upon detecting
statistical dependences and choosing simple graphs that make the joint measure …

Static analysis tools as early indicators of pre-release defect density

N Nagappan, T Ball - Proceedings of the 27th international conference …, 2005 - dl.acm.org
During software development it is helpful to obtain early estimates of the defect density of
software components. Such estimates identify fault-prone areas of code requiring further …

A new sequence distance measure for phylogenetic tree construction

HH Otu, K Sayood - Bioinformatics, 2003 - academic.oup.com
Motivation: Most existing approaches for phylogenetic inference use multiple alignment of
sequences and assume some sort of an evolutionary model. The multiple alignment strategy …

Shared information and program plagiarism detection

X Chen, B Francia, M Li, B McKinnon… - IEEE Transactions on …, 2004 - ieeexplore.ieee.org
A fundamental question in information theory and in computer science is how to measure
similarity or the amount of shared information between two sequences. We have proposed a …

A simple statistical algorithm for biological sequence compression

MD Cao, TI Dix, L Allison… - 2007 Data Compression …, 2007 - ieeexplore.ieee.org
This paper introduces a novel algorithm for biological sequence compression that makes
use of both statistical properties and repetition within sequences. A panel of experts is …