Statistical inference on random dot product graphs: a survey

A Athreya, DE Fishkind, M Tang, CE Priebe… - Journal of Machine …, 2018 - jmlr.org
The random dot product graph (RDPG) is an independent-edge random graph that is
analytically tractable and, simultaneously, either encompasses or can successfully …

Provable guarantees for self-supervised deep learning with spectral contrastive loss

JZ HaoChen, C Wei, A Gaidon… - Advances in Neural …, 2021 - proceedings.neurips.cc
Recent works in self-supervised learning have advanced the state-of-the-art by relying on
the contrastive learning paradigm, which learns representations by pushing positive pairs, or …

Connect, not collapse: Explaining contrastive learning for unsupervised domain adaptation

K Shen, RM Jones, A Kumar, SM Xie… - International …, 2022 - proceedings.mlr.press
We consider unsupervised domain adaptation (UDA), where labeled data from a source
domain (e.g., photos) and unlabeled data from a target domain (e.g., sketches) are used to …

Spectral methods for data science: A statistical perspective

Y Chen, Y Chi, J Fan, C Ma - Foundations and Trends® in …, 2021 - nowpublishers.com
Spectral methods have emerged as a simple yet surprisingly effective approach for
extracting information from massive, noisy and incomplete data. In a nutshell, spectral …

Surprising combinations of research contents and contexts are related to impact and emerge with scientific outsiders from distant disciplines

F Shi, J Evans - Nature Communications, 2023 - nature.com
We investigate the degree to which impact in science and technology is associated with
surprising breakthroughs, and how those breakthroughs arise. Identifying breakthroughs …

[HTML] Entrywise eigenvector analysis of random matrices with low expected rank

E Abbe, J Fan, K Wang, Y Zhong - Annals of Statistics, 2020 - ncbi.nlm.nih.gov
Recovering low-rank structures via eigenvector perturbation analysis is a common problem
in statistical machine learning, such as in factor analysis, community detection, ranking …

Approximating spectral clustering via sampling: a review

N Tremblay, A Loukas - … Techniques for Supervised or Unsupervised Tasks, 2020 - Springer
Spectral clustering refers to a family of well-known unsupervised learning algorithms. Rather
than attempting to cluster points in their native domain, one constructs a (usually sparse) …

[BOOK][B] Statistical foundations of data science

J Fan, R Li, CH Zhang, H Zou - 2020 - taylorfrancis.com
Statistical Foundations of Data Science gives a thorough introduction to commonly used
statistical models, contemporary statistical machine learning techniques and algorithms …

Network cross-validation by edge sampling

T Li, E Levina, J Zhu - Biometrika, 2020 - academic.oup.com
While many statistical models and methods are now available for network analysis,
resampling of network data remains a challenging problem. Cross-validation is a useful …

Guarantees for spectral clustering with fairness constraints

M Kleindessner, S Samadi, P Awasthi… - International …, 2019 - proceedings.mlr.press
Given the widespread popularity of spectral clustering (SC) for partitioning graph data, we
study a version of constrained SC in which we try to incorporate the fairness notion …