Algorithms and SQ lower bounds for PAC learning one-hidden-layer ReLU networks

I Diakonikolas, DM Kane… - … on Learning Theory, 2020 - proceedings.mlr.press
We study the problem of PAC learning one-hidden-layer ReLU networks with $k$ hidden
units on $\mathbb{R}^d$ under Gaussian marginals in the presence of additive label …

Agnostically learning multi-index models with queries

I Diakonikolas, DM Kane, V Kontonis… - 2024 IEEE 65th …, 2024 - ieeexplore.ieee.org
We study the power of query access for the fundamental task of agnostic learning under the
Gaussian distribution. In the agnostic model, no assumptions are made on the labels of the …

On the limits of language generation: Trade-offs between hallucination and mode collapse

A Kalavasis, A Mehrotra, G Velegkas - arXiv preprint arXiv:2411.09642, 2024 - arxiv.org
Specifying all desirable properties of a language model is challenging, but certain
requirements seem essential. Given samples from an unknown language, the trained model …

Transfer learning beyond bounded density ratios

A Kalavasis, I Zadik, M Zampetakis - arXiv preprint arXiv:2403.11963, 2024 - arxiv.org
We study the fundamental problem of transfer learning where a learning algorithm collects
data from some source distribution $P$ but needs to perform well with respect to a different …

Efficient algorithms for learning from coarse labels

D Fotakis, A Kalavasis, V Kontonis… - … on Learning Theory, 2021 - proceedings.mlr.press
For many learning problems one may not have access to fine-grained label information; e.g.,
an image can be labeled as husky, dog, or even animal depending on the expertise of the …

Learning exponential families from truncated samples

J Lee, A Wibisono… - Advances in Neural …, 2023 - proceedings.neurips.cc
Missing data problems have many manifestations across many scientific fields. A
fundamental type of missing data problem arises when samples are truncated, i.e. …

Testing convex truncation

A De, S Nadimpalli, RA Servedio - Proceedings of the 2023 Annual ACM …, 2023 - SIAM
We study the basic statistical problem of testing whether normally distributed n-dimensional
data has been truncated, i.e., altered by only retaining points that lie in some unknown …

Unraveling overoptimism and publication bias in ML-driven science

P Saidi, G Dasarathy, V Berisha - arXiv preprint arXiv:2405.14422, 2024 - arxiv.org
Machine Learning (ML) is increasingly used across many disciplines with impressive
reported results. However, recent studies suggest published performance of ML models are …

Efficient Statistics With Unknown Truncation, Polynomial Time Algorithms, Beyond Gaussians

JH Lee, A Mehrotra… - 2024 IEEE 65th Annual …, 2024 - ieeexplore.ieee.org
We study the estimation of distributional parameters when samples are shown only if they
fall in some unknown set. Kontonis, Tzamos, and Zampetakis (FOCS'19) gave an algorithm …

Detecting low-degree truncation

A De, H Li, S Nadimpalli, RA Servedio - Proceedings of the 56th Annual …, 2024 - dl.acm.org
We consider the following basic, and very broad, statistical problem: Given a known high-dimensional
distribution $D$ over $\mathbb{R}^n$ and a collection of data points in $\mathbb{R}^n$, distinguish …