Manifold learning: What, how, and why

M Meilă, H Zhang - Annual Review of Statistics and Its …, 2024 - annualreviews.org
Manifold learning (ML), also known as nonlinear dimension reduction, is a set of methods to
find the low-dimensional structure of data. Dimension reduction for large, high-dimensional …

Accelerated hierarchical density based clustering

L McInnes, J Healy - 2017 IEEE international conference on …, 2017 - ieeexplore.ieee.org
We present an accelerated algorithm for hierarchical density based clustering. Our new
algorithm improves upon HDBSCAN*, which itself provided a significant qualitative …

[BOOK][B] Frontiers in massive data analysis

National Research Council, Division on Engineering… - 2013 - books.google.com
Data mining of massive data sets is transforming the way we think about crisis response,
marketing, entertainment, cybersecurity and national intelligence. Collections of documents …

Maximum inner-product search using cone trees

P Ram, AG Gray - Proceedings of the 18th ACM SIGKDD international …, 2012 - dl.acm.org
The problem of efficiently finding the best match for a query in a given set with respect to the
Euclidean distance or the cosine similarity has been extensively studied. However, the …

Density estimation trees

P Ram, AG Gray - Proceedings of the 17th ACM SIGKDD international …, 2011 - dl.acm.org
In this paper we develop density estimation trees (DETs), the natural analog of classification
trees and regression trees, for the task of density estimation. We consider the estimation of a …

[BOOK][B] Advances in machine learning and data mining for astronomy

MJ Way, JD Scargle, KM Ali, AN Srivastava - 2012 - api.taylorfrancis.com
Advances in Machine Learning and Data Mining for Astronomy Page 1 Way, Scargle, Chapman
& Hall/CRC Data Mining and Knowledge Discovery Series Advances in Machine Learning …

End-to-end differentiable clustering with associative memories

B Saha, D Krotov, MJ Zaki… - … Conference on Machine …, 2023 - proceedings.mlr.press
Clustering is a widely used unsupervised learning technique involving an intensive discrete
optimization problem. Associative Memory models or AMs are differentiable neural networks …

Conditional t-SNE: more informative t-SNE embeddings

B Kang, D Garcia Garcia, J Lijffijt, R Santos-Rodríguez… - Machine Learning, 2021 - Springer
Dimensionality reduction and manifold learning methods such as t-distributed stochastic
neighbor embedding (t-SNE) are frequently used to map high-dimensional data into a two …

Fast euclidean minimum spanning tree: algorithm, analysis, and applications

WB March, P Ram, AG Gray - Proceedings of the 16th ACM SIGKDD …, 2010 - dl.acm.org
The Euclidean Minimum Spanning Tree problem has applications in a wide range of fields,
and many efficient algorithms have been developed to solve it. We present a new, fast …

[PDF][PDF] Using the mutual k-nearest neighbor graphs for semi-supervised classification on natural language data

K Ozaki, M Shimbo, M Komachi… - Proceedings of the …, 2011 - aclanthology.org
The first step in graph-based semi-supervised classification is to construct a graph from input
data. While the k-nearest neighbor graphs have been the de facto standard method of graph …