Statistical inference on random dot product graphs: a survey
The random dot product graph (RDPG) is an independent-edge random graph that is
analytically tractable and, simultaneously, either encompasses or can successfully …
Provable guarantees for self-supervised deep learning with spectral contrastive loss
Recent works in self-supervised learning have advanced the state-of-the-art by relying on
the contrastive learning paradigm, which learns representations by pushing positive pairs, or …
Connect, not collapse: Explaining contrastive learning for unsupervised domain adaptation
We consider unsupervised domain adaptation (UDA), where labeled data from a source
domain (e.g., photos) and unlabeled data from a target domain (e.g., sketches) are used to …
Spectral methods for data science: A statistical perspective
Spectral methods have emerged as a simple yet surprisingly effective approach for
extracting information from massive, noisy and incomplete data. In a nutshell, spectral …
Surprising combinations of research contents and contexts are related to impact and emerge with scientific outsiders from distant disciplines
We investigate the degree to which impact in science and technology is associated with
surprising breakthroughs, and how those breakthroughs arise. Identifying breakthroughs …
Entrywise eigenvector analysis of random matrices with low expected rank
Recovering low-rank structures via eigenvector perturbation analysis is a common problem
in statistical machine learning, such as in factor analysis, community detection, ranking …
Approximating spectral clustering via sampling: a review
Spectral clustering refers to a family of well-known unsupervised learning algorithms. Rather
than attempting to cluster points in their native domain, one constructs a (usually sparse) …
[BOOK] Statistical foundations of data science
Statistical Foundations of Data Science gives a thorough introduction to commonly used
statistical models, contemporary statistical machine learning techniques and algorithms …
Network cross-validation by edge sampling
While many statistical models and methods are now available for network analysis,
resampling of network data remains a challenging problem. Cross-validation is a useful …
Guarantees for spectral clustering with fairness constraints
Given the widespread popularity of spectral clustering (SC) for partitioning graph data, we
study a version of constrained SC in which we try to incorporate the fairness notion …