A framework for bilevel optimization that enables stochastic and global variance reduction algorithms

M Dagréou, P Ablin, S Vaiter… - Advances in Neural …, 2022 - proceedings.neurips.cc
Bilevel optimization, the problem of minimizing a value function which involves the arg-
minimum of another function, appears in many areas of machine learning. In a large scale …

A near-optimal algorithm for stochastic bilevel optimization via double-momentum

P Khanduri, S Zeng, M Hong, HT Wai… - Advances in neural …, 2021 - proceedings.neurips.cc
This paper proposes a new algorithm, the Single-timescale Double-momentum
Stochastic Approximatio …

On implicit bias in overparameterized bilevel optimization

P Vicol, JP Lorraine, F Pedregosa… - International …, 2022 - proceedings.mlr.press
Many problems in machine learning involve bilevel optimization (BLO), including
hyperparameter optimization, meta-learning, and dataset distillation. Bilevel problems …

iDARTS: Differentiable architecture search with stochastic implicit gradients

M Zhang, SW Su, S Pan, X Chang… - International …, 2021 - proceedings.mlr.press
Differentiable ARchiTecture Search (DARTS) has recently become the mainstream
in the neural architecture search (NAS) due to its efficiency and simplicity. With a gradient …

Amortized implicit differentiation for stochastic bilevel optimization

M Arbel, J Mairal - arXiv preprint arXiv:2111.14580, 2021 - arxiv.org
We study a class of algorithms for solving bilevel optimization problems in both stochastic
and deterministic settings when the inner-level objective is strongly convex. Specifically, we …

Probabilistic bilevel coreset selection

X Zhou, R Pi, W Zhang, Y Lin… - … on machine learning, 2022 - proceedings.mlr.press
The goal of coreset selection in supervised learning is to produce a weighted subset of data,
so that training only on the subset achieves similar performance as training on the entire …

Implicit differentiation for fast hyperparameter selection in non-smooth convex learning

Q Bertrand, Q Klopfenstein, M Massias… - Journal of Machine …, 2022 - jmlr.org
Finding the optimal hyperparameters of a model can be cast as a bilevel optimization
problem, typically solved using zero-order techniques. In this work we study first-order …

Achieving hierarchy-free approximation for bilevel programs with equilibrium constraints

J Li, J Yu, B Liu, Y Nie, Z Wang - … Conference on Machine …, 2023 - proceedings.mlr.press
In this paper, we develop an approximation scheme for solving bilevel programs with
equilibrium constraints, which are generally difficult to solve. Among other things, calculating …

Analyzing inexact hypergradients for bilevel learning

MJ Ehrhardt, L Roberts - IMA Journal of Applied Mathematics, 2024 - academic.oup.com
Estimating hyperparameters has been a long-standing problem in machine learning. We
consider the case where the task at hand is modeled as the solution to an optimization …

Bilevel optimization with a lower-level contraction: Optimal sample complexity without warm-start

R Grazzi, M Pontil, S Salzo - Journal of Machine Learning Research, 2023 - jmlr.org
We analyse a general class of bilevel problems, in which the upper-level problem consists in
the minimization of a smooth objective function and the lower-level problem is to find the …