A survey of clustering algorithms for big data: Taxonomy and empirical analysis
Clustering algorithms have emerged as an alternative powerful meta-learning tool to
accurately analyze the massive volume of data generated by modern applications. In …
Performance evaluation of some clustering algorithms and validity indices
In this article, we evaluate the performance of three clustering algorithms, hard K-Means,
single linkage, and a simulated annealing (SA) based technique, in conjunction with four …
An empirical comparison of four initialization methods for the k-means algorithm
In this paper, we aim to compare empirically four initialization methods for the K-Means
algorithm: random, Forgy, MacQueen and Kaufman. Although this algorithm is known for its …
[BOOK][B] Constrained clustering: Advances in algorithms, theory, and applications
This volume encompasses many new types of constraints and clustering methods as well as
delivers thorough coverage of the capabilities and limitations of constrained clustering. With …
The effectiveness of Lloyd-type methods for the k-means problem
We investigate variants of Lloyd's heuristic for clustering high-dimensional data in an attempt
to explain its popularity (a half century after its introduction) among practitioners, and in …
Pulse: Mining customer opinions from free text
We present a prototype system, code-named Pulse, for mining topics and sentiment
orientation jointly from free text customer feedback. We describe the application of the …
Correlation clustering in general weighted graphs
We consider the following general correlation-clustering problem [N. Bansal, A. Blum, S.
Chawla, Correlation clustering, in: Proc. 43rd Annu. IEEE Symp. on Foundations of …
[PDF][PDF] A unified framework for model-based clustering
Model-based clustering techniques have been widely used and have shown
promising results in many applications involving complex data. This paper presents a unified …
An experimental comparison of model-based clustering methods
We compare the three basic algorithms for model-based clustering on high-dimensional
discrete-variable datasets. All three algorithms use the same underlying model: a naive …
Adaptive dimension reduction for clustering high dimensional data
It is well-known that for high dimensional data clustering, standard algorithms such as EM
and K-means are often trapped in a local minimum. Many initialization methods have been …