Survey of state-of-the-art mixed data clustering algorithms

A Ahmad, SS Khan - IEEE Access, 2019 - ieeexplore.ieee.org
Mixed data comprises both numeric and categorical features, and mixed datasets occur
frequently in many domains, such as health, finance, and marketing. Clustering is often …

Skinny-dip: clustering in a sea of noise

S Maurus, C Plant - Proceedings of the 22nd ACM SIGKDD international …, 2016 - dl.acm.org
Can we find heterogeneous clusters hidden in data sets with 80% noise? Although such
settings occur in the real-world, we struggle to find methods from the abundance of …

Using knowledge units of programming languages to recommend reviewers for pull requests: an empirical study

M Ahasanuzzaman, GA Oliva, AE Hassan - Empirical Software …, 2024 - Springer
Determining the right code reviewer for a given code change requires understanding the
characteristics of the changed code, identifying the skills of each potential reviewer …

Towards an optimal subspace for k-means

D Mautz, W Ye, C Plant, C Böhm - Proceedings of the 23rd ACM SIGKDD …, 2017 - dl.acm.org
Is there an optimal dimensionality reduction for k-means, revealing the prominent cluster
structure hidden in the data? We propose SUBKMEANS, which extends the classic k-means …

Non-redundant subspace clusterings with nr-kmeans and nr-dipmeans

D Mautz, W Ye, C Plant, C Böhm - ACM Transactions on Knowledge …, 2020 - dl.acm.org
A huge object collection in high-dimensional space can often be clustered in more than one
way, for instance, objects could be clustered by their shape or alternatively by their color …

Density-based multiscale analysis for clustering in strong noise settings with varying densities

TT Zhang, B Yuan - IEEE Access, 2018 - ieeexplore.ieee.org
Finding meaningful clustering patterns in data can be very challenging when the clusters are
of arbitrary shapes, different sizes, or densities, and especially when the data set contains …

Details (Don't) Matter: Isolating Cluster Information in Deep Embedded Spaces

L Miklautz, LGM Bauer, D Mautz, S Tschiatschek… - IJCAI, 2021 - ijcai.org
Deep clustering techniques combine representation learning with clustering objectives to
improve their performance. Among existing deep clustering techniques, autoencoder-based …

Enhancing cluster analysis via topological manifold learning

M Herrmann, D Kazempour, F Scheipl… - Data Mining and …, 2024 - Springer
We discuss topological aspects of cluster analysis and show that inferring the topological
structure of a dataset before clustering it can considerably enhance cluster detection: we …

Large-scale subspace clustering by fast regression coding

J Li, H Zhao - IJCAI, 2017 - par.nsf.gov
Large-Scale Subspace Clustering (LSSC) is an interesting and important problem
in the big data era. However, most existing methods (i.e., sparse or low-rank subspace clustering) …

Extension of the Dip-test Repertoire - Efficient and Differentiable p-value Calculation for Clustering

LGM Bauer, C Leiber, C Böhm, C Plant - Proceedings of the 2023 SIAM …, 2023 - SIAM
Over the last decade, the Dip-test of unimodality has gained increasing interest in the data
mining community as it is a parameter-free statistical test that reliably rates the modality in …