Structured prediction with stronger consistency guarantees

A Mao, M Mohri, Y Zhong - Advances in Neural Information …, 2023 - proceedings.neurips.cc
We present an extensive study of surrogate losses for structured prediction supported by
$H$-consistency bounds. These are recently introduced guarantees that are more relevant …

A universal growth rate for learning with smooth surrogate losses

A Mao, M Mohri, Y Zhong - arXiv preprint arXiv:2405.05968, 2024 - arxiv.org
This paper presents a comprehensive analysis of the growth rate of $H$-consistency
bounds (and excess error bounds) for various surrogate losses used in classification. We …

Moment distributionally robust tree structured prediction

Y Li, D Saeed, X Zhang, B Ziebart… - Advances in neural …, 2022 - proceedings.neurips.cc
Structured prediction of tree-shaped objects is heavily studied under the name of syntactic
dependency parsing. Current practice based on maximum likelihood or margin is either …

An embedding framework for the design and analysis of consistent polyhedral surrogates

J Finocchiaro, RM Frongillo, B Waggoner - Journal of Machine Learning …, 2024 - jmlr.org
We formalize and study the natural approach of designing convex surrogate loss functions
via embeddings, for discrete problems such as classification, ranking, or structured …

On the inconsistency of separable losses for structured prediction

C Corro - arXiv preprint arXiv:2301.10810, 2023 - arxiv.org
In this paper, we prove that separable negative log-likelihood losses for structured prediction
are not necessarily Bayes consistent, or, in other words, minimizing these losses may not …

Online Structured Prediction with Fenchel-Young Losses and Improved Surrogate Regret for Online Multiclass Classification with Logistic Loss

S Sakaue, H Bao, T Tsuchiya, T Oki - arXiv preprint arXiv:2402.08180, 2024 - arxiv.org
This paper studies online structured prediction with full-information feedback. For online
multiclass classification, van der Hoeven (2020) has obtained surrogate regret bounds …

Teacher guided training: An efficient framework for knowledge transfer

M Zaheer, AS Rawat, S Kim, C You, H Jain… - arXiv preprint arXiv …, 2022 - arxiv.org
The remarkable performance gains realized by large pretrained models, e.g., GPT-3, hinge
on the massive amounts of data they are exposed to during training. Analogously, distilling …