Hierarchical density estimates for data clustering, visualization, and outlier detection

RJGB Campello, D Moulavi, A Zimek… - ACM Transactions on …, 2015 - dl.acm.org
An integrated framework for density-based cluster analysis, outlier detection, and data
visualization is introduced in this article. The main module consists of an algorithm to …

Estimating the number of clusters in a data set via the gap statistic

R Tibshirani, G Walther, T Hastie - Journal of the Royal …, 2001 - Wiley Online Library
We propose a method (the 'gap statistic') for estimating the number of clusters (groups) in a
set of data. The technique uses the output of any clustering algorithm (e.g. K‐means or …

A review on modal clustering

G Menardi - International Statistical Review, 2016 - Wiley Online Library
In spite of the current availability of numerous methods of cluster analysis, evaluating a
clustering configuration is questionable without the definition of a true population structure …

Functional data analysis of amplitude and phase variation

JS Marron, JO Ramsay, LM Sangalli, A Srivastava - Statistical Science, 2015 - JSTOR
The abundance of functional observations in scientific endeavors has led to a significant
development in tools for functional data analysis (FDA). This kind of data comes with several …

Detecting the number of clusters in n-way probabilistic clustering

Z He, A Cichocki, S Xie, K Choi - IEEE Transactions on Pattern …, 2010 - ieeexplore.ieee.org
Recently, there has been a growing interest in multiway probabilistic clustering. Some
efficient algorithms have been developed for this problem. However, not much attention has …

Generalized density clustering

A Rinaldo, L Wasserman - 2010 - projecteuclid.org
We study generalized density-based clustering in which sharply defined clusters such as
clusters on lower-dimensional manifolds are allowed. We show that accurate clustering is …

Clustering via nonparametric density estimation

A Azzalini, N Torelli - Statistics and Computing, 2007 - Springer
Although Hartigan (1975) had already put forward the idea of connecting identification of
subpopulations with regions with high density of the underlying probability distribution, the …

Randomized algorithms in automatic control and data mining

The authors start their book with a basic question: Why is randomization beneficial in the
context of algorithms? Or, to put it another way: When is random choice better than …

A generalized single linkage method for estimating the cluster tree of a density

W Stuetzle, R Nugent - Journal of Computational and Graphical …, 2010 - Taylor & Francis
The goal of clustering is to detect the presence of distinct groups in a dataset and assign
group labels to the observations. Nonparametric clustering is based on the premise that the …

On boundary estimation

A Cuevas, A Rodríguez-Casal - Advances in Applied Probability, 2004 - cambridge.org
We consider the problem of estimating the boundary of a compact set S ⊂ ℝ^d from a random
sample of points taken from S. We use the Devroye-Wise estimator which is a union of balls …