Staleness-Reduction Mini-Batch K-Means

X Zhu, J Sun, Z He, J Jiang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
K-means (km) is a clustering algorithm that has been widely adopted due to its simple
implementation and high clustering quality. However, the standard km suffers from high …

Fault tolerant decentralised k-means clustering for asynchronous large-scale networks

G Di Fatta, F Blasa, S Cafiero, G Fortino - Journal of Parallel and Distributed …, 2013 - Elsevier
The K-Means algorithm for cluster analysis is one of the most influential and popular data
mining methods. Its straightforward parallel formulation is well suited for distributed memory …

Dynamic load balancing based on constrained kd tree decomposition for parallel particle tracing

J Zhang, H Guo, F Hong, X Yuan… - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
We propose a dynamically load-balanced algorithm for parallel particle tracing, which
periodically attempts to evenly redistribute particles across processes based on kd tree …

A Hybrid MPI/OpenMP Parallelization of K-Means Algorithms Accelerated Using the Triangle Inequality

W Kwedlo, PJ Czochanski - IEEE Access, 2019 - ieeexplore.ieee.org
The standard formulation of the K-means clustering (Lloyd's method) performs many
unnecessary distance calculations. In this paper, we focus on four approaches that use the …

A Survey and Experimental Review on Data Distribution Strategies for Parallel Spatial Clustering Algorithms

JS Challa, N Goyal, A Sharma, N Sreekumar… - Journal of Computer …, 2024 - Springer
The advent of Big Data has led to the rapid growth in the usage of parallel
clustering algorithms that work over distributed computing frameworks such as MPI …

A new method to construct the KD tree based on presorted results

Y Cao, H Wang, W Zhao, B Duan, X Zhang - Complexity, 2020 - Wiley Online Library
Searching is one of the most fundamental operations in many complex systems. However,
the complexity of the search process would increase dramatically in high‐dimensional …

Data mining of mass storage based on cloud computing

J Wang, J Wan, Z Liu, P Wang - 2010 Ninth International …, 2010 - ieeexplore.ieee.org
Cloud computing is an elastic computing model that the users can lease the resources from
the rentable infrastructure. Cloud computing is gaining popularity due to its lower cost, high …

Efficient Delaunay tessellation through KD tree decomposition

D Morozov, T Peterka - SC'16: Proceedings of the International …, 2016 - ieeexplore.ieee.org
Delaunay tessellations are fundamental data structures in computational geometry. They are
important in data analysis, where they can represent the geometry of a point set or …

Accelerated K-means algorithms for low-dimensional data on parallel shared-memory systems

W Kwedlo, M Łubowicz - IEEE Access, 2021 - ieeexplore.ieee.org
This paper considers the problem of exact accelerated algorithms for the K-means clustering
of low-dimensional data on modern multi-core systems. A version of the filtering algorithm …

Exact, fast and scalable parallel DBSCAN for commodity platforms

S Kumari, P Goyal, A Sood, D Kumar… - Proceedings of the 18th …, 2017 - dl.acm.org
DBSCAN is one of the most popular density-based clustering algorithms capable of
identifying arbitrary shaped clusters and noise. It is computationally expensive for large data …