Convergence guarantees for the Good-Turing estimator

A Painsky - Journal of Machine Learning Research, 2022 - jmlr.org
Consider a finite sample from an unknown distribution over a countable alphabet. The
occupancy probability (OP) refers to the total probability of symbols that appear exactly k …

Finite-sample symmetric mean estimation with Fisher information rate

S Gupta, JCH Lee, E Price - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press
The mean of an unknown variance-$\sigma^2$ distribution $f$ can be estimated from $n$
samples with variance $\frac{\sigma^2}{n}$ and nearly corresponding subgaussian rate …

On the efficient implementation of high accuracy optimality of profile maximum likelihood

M Charikar, Z Jiang, K Shiragur… - Advances in Neural …, 2022 - proceedings.neurips.cc
We provide an efficient unified plug-in approach for estimating symmetric properties of
distributions given $n$ independent samples. Our estimator is based on profile-maximum …

Profile entropy: A fundamental measure for the learnability and compressibility of distributions

Y Hao, A Orlitsky - Advances in Neural Information …, 2020 - proceedings.neurips.cc
The profile of a sample is the multiset of its symbol frequencies. We show that for samples of
discrete distributions, profile entropy is a fundamental measure unifying the concepts of …

Compressed Maximum Likelihood

Y Hao, A Orlitsky - International Conference on Machine …, 2021 - proceedings.mlr.press
Maximum likelihood (ML) is one of the most fundamental and general statistical estimation
techniques. Inspired by recent advances in estimating distribution functionals, we propose …

[BOOK][B] Efficient Universal Estimators for Symmetric Property Estimation

K Shiragur - 2022 - search.proquest.com
Given iid samples from an unknown distribution, estimating its symmetric properties is a
classical problem in information theory, statistics, operations research and computer …

[BOOK][B] Competitive and Universal Learning

Y Hao - 2021 - search.proquest.com
Modern data science calls for statistical inference algorithms that are both data-efficient and
computation-efficient. We design and analyze methods that 1) outperform existing …