A guided tour to approximate string matching
G Navarro - ACM computing surveys (CSUR), 2001 - dl.acm.org
We survey the current techniques to cope with the problem of string matching that allows
errors. This is becoming a more and more relevant issue for many fast growing areas such …
The taxonomic name resolution service: an online tool for automated standardization of plant names
Background The digitization of biodiversity data is leading to the widespread application of
taxon names that are superfluous, ambiguous or incorrect, resulting in mismatched records …
Approximate string-matching with q-grams and maximal matches
E Ukkonen - Theoretical computer science, 1992 - Elsevier
We study approximate string-matching in connection with two string distance functions that
are computable in linear time. The first function is based on the so-called q-grams. An …
[BOOK][B] Computational molecular biology: an algorithmic approach
P Pevzner - 2000 - books.google.com
In one of the first major texts in the emerging field of computational molecular biology, Pavel
Pevzner covers a broad range of algorithmic and combinatorial topics and shows how they …
Fast string correction with Levenshtein automata
KU Schulz, S Mihov - International Journal on Document Analysis and …, 2002 - Springer
The Levenshtein distance between two words is the minimal number of insertions, deletions
or substitutions that are needed to transform one word into the other. Levenshtein automata …
[BOOK][B] String searching algorithms
GA Stephen - 1994 - books.google.com
String searching is a subject of both theoretical and practical interest in computer science.
This book presents a bibliographic overview of the field and an anthology of detailed …
[HTML][HTML] Privacy-preserving record linkage on large real world datasets
Record linkage typically involves the use of dedicated linkage units who are supplied with
personally identifying information to determine individuals from within and across datasets …
Two algorithms for approximate string matching in static texts
P Jokinen, E Ukkonen - International Symposium on Mathematical …, 1991 - Springer
The problem of finding all approximate occurrences P′ of a pattern string P in a text string T
such that the edit distance between P and P′ is ≤ k is considered. We concentrate on a …
RazerS—fast read mapping with sensitivity control
Second-generation sequencing technologies deliver DNA sequence data at unprecedented
high throughput. Common to most biological applications is a mapping of the reads to an …
Indexing methods for approximate dictionary searching: Comparative analysis
L Boytsov - Journal of Experimental Algorithmics (JEA), 2011 - dl.acm.org
The primary goal of this article is to survey state-of-the-art indexing methods for approximate
dictionary searching. To improve understanding of the field, we introduce a taxonomy that …