A survey on unsupervised outlier detection in high‐dimensional numerical data

A Zimek, E Schubert, HP Kriegel - Statistical Analysis and Data …, 2012 - Wiley Online Library
High‐dimensional data in Euclidean space pose special challenges to data mining
algorithms. These challenges are often indiscriminately subsumed under the term 'curse of …

Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases

C Böhm, S Berchtold, DA Keim - ACM Computing Surveys (CSUR), 2001 - dl.acm.org
During the last decade, multimedia databases have become increasingly important in many
application areas such as medicine, CAD, geography, and molecular biology. An important …

[BOOK][B] Modern information retrieval

R Baeza-Yates, B Ribeiro-Neto - 1999 - people.ischool.berkeley.edu
Information retrieval (IR) has changed considerably in recent years with the expansion of the
World Wide Web and the advent of modern and inexpensive graphical user interfaces and …

[PDF][PDF] Similarity search in high dimensions via hashing

A Gionis, P Indyk, R Motwani - VLDB, 1999 - cs.princeton.edu
The nearest- or near-neighbor query problems arise in a large variety of database
applications, usually in the context of similarity searching. Of late, there has been increasing …

Content-based image retrieval at the end of the early years

AWM Smeulders, M Worring, S Santini… - … on pattern analysis …, 2000 - ieeexplore.ieee.org
Presents a review of 200 references in content-based image retrieval. The paper starts with
discussing the working conditions of content-based retrieval: patterns of use, types of …

Fibonacci heaps and their uses in improved network optimization algorithms

ML Fredman, RE Tarjan - Journal of the ACM (JACM), 1987 - dl.acm.org
In this paper we develop a new data structure for implementing heaps (priority queues). Our
structure, Fibonacci heaps (abbreviated F-heaps), extends the binomial queues proposed …

On the surprising behavior of distance metrics in high dimensional space

CC Aggarwal, A Hinneburg, DA Keim - … London, UK, January 4–6, 2001 …, 2001 - Springer
In recent years, the effect of the curse of high dimensionality has been studied in great detail
on several problems such as clustering, nearest neighbor search, and indexing. In high …

When is “nearest neighbor” meaningful?

K Beyer, J Goldstein, R Ramakrishnan… - Database Theory—ICDT'99 …, 1999 - Springer
We explore the effect of dimensionality on the “nearest neighbor” problem. We show that
under a broad set of conditions (much broader than independent and identically distributed …

Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering

HP Kriegel, P Kröger, A Zimek - … on knowledge discovery from data (tkdd …, 2009 - dl.acm.org
As a prolific research area in data mining, subspace clustering and related problems
induced a vast quantity of proposed solutions. However, many publications compare a new …

[PDF][PDF] A quantitative analysis and performance study for similarity-search methods in high-dimensional spaces

R Weber, HJ Schek, S Blott - VLDB, 1998 - vldb.org
For similarity search in high-dimensional vector spaces (or 'HDVSs'), researchers have
proposed a number of new methods (or adaptations of existing methods) based, in the main …