[BOOK][B] An introduction to outlier analysis

CC Aggarwal - 2017 - Springer
Outliers are also referred to as abnormalities, discordants, deviants, or anomalies in the data
mining and statistics literature. In most applications, the data is created by one or more …

K-means clustering with outlier removal

G Gan, MKP Ng - Pattern Recognition Letters, 2017 - Elsevier
Outlier detection is an important data analysis task in its own right and removing the outliers
from clusters can improve the clustering accuracy. In this paper, we extend the k-means …

[BOOK][B] Outlier ensembles

CC Aggarwal - 2017 - Springer
Ensemble analysis is a popular method used to improve the accuracy of various data mining
algorithms. Ensemble methods combine the outputs of multiple algorithms or base detectors …

Local search methods for k-means with outliers

S Gupta, R Kumar, K Lu, B Moseley… - Proceedings of the VLDB …, 2017 - dl.acm.org
We study the problem of k-means clustering in the presence of outliers. The goal is to cluster
a set of data points to minimize the variance of the points assigned to the same cluster, with …

Constant approximation for k-median and k-means with outliers via iterative rounding

R Krishnaswamy, S Li, S Sandeep - Proceedings of the 50th annual ACM …, 2018 - dl.acm.org
In this paper, we present a new iterative rounding framework for many clustering problems.
Using this, we obtain an (α₁ + ε ≤ 7.081 + ε)-approximation algorithm for k-median with …

Clustering with outlier removal

H Liu, J Li, Y Wu, Y Fu - IEEE transactions on knowledge and …, 2019 - ieeexplore.ieee.org
Cluster analysis and outlier detection are two continuously rising topics in data mining area,
which in fact connect to each other deeply. Cluster structure is vulnerable to outliers; …

Efficiency of random swap clustering

P Fränti - Journal of big data, 2018 - Springer
Random swap algorithm aims at solving clustering by a sequence of prototype swaps, and
by fine-tuning their exact location by k-means. This randomized search strategy is simple to …

A local search algorithm for k-means with outliers

Z Zhang, Q Feng, J Huang, Y Guo, J Xu, J Wang - Neurocomputing, 2021 - Elsevier
k-Means is a well-studied clustering problem that finds applications in many fields
related to unsupervised learning. It is known that k-means clustering is highly sensitive to the …

Expected similarity estimation for large-scale batch and streaming anomaly detection

M Schneider, W Ertel, F Ramos - Machine Learning, 2016 - Springer
We present a novel algorithm for anomaly detection on very large datasets and data
streams. The method, named EXPected Similarity Estimation (EXPoSE), is kernel-based and …

Size matters: Cardinality-constrained clustering and outlier detection via conic optimization

N Rujeerapaiboon, K Schindler, D Kuhn… - SIAM Journal on …, 2019 - SIAM
Plain vanilla K-means clustering has proven to be successful in practice, yet it suffers from
outlier sensitivity and may produce highly unbalanced clusters. To mitigate both …