A modern introduction to online learning
F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org
In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …
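Since the snippet cuts off at the notion of regret, the standard definition may help as context (generic notation, not taken from the monograph itself):

```latex
% Regret of an online learner against the best fixed decision in hindsight.
\[
  \mathrm{Regret}_T \;=\; \sum_{t=1}^{T} \ell_t(x_t) \;-\; \min_{x \in \mathcal{X}} \sum_{t=1}^{T} \ell_t(x)
\]
% Here $x_t$ is the learner's decision at round $t$, $\ell_t$ is the convex loss
% revealed only after the decision, and $\mathcal{X}$ is the feasible decision set.
```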
Regret analysis of stochastic and nonstochastic multi-armed bandit problems
Multi-armed bandit problems are the most basic examples of sequential decision problems
with an exploration-exploitation trade-off. This is the balance between staying with the option …
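As a concrete illustration of the exploration-exploitation trade-off mentioned here (a minimal epsilon-greedy sketch, not an algorithm analyzed in the survey), the arms and reward model below are placeholders:

```python
import random

def epsilon_greedy_bandit(arms, rounds, epsilon=0.1, seed=0):
    """Minimal epsilon-greedy sketch: `arms` is a list of callables that
    return a stochastic reward when pulled; returns estimated arm means."""
    rng = random.Random(seed)
    counts = [0] * len(arms)          # number of pulls per arm
    means = [0.0] * len(arms)         # running average reward per arm
    for _ in range(rounds):
        if rng.random() < epsilon:                       # explore: try a random arm
            i = rng.randrange(len(arms))
        else:                                            # exploit: stay with the best-looking arm
            i = max(range(len(arms)), key=lambda j: means[j])
        reward = arms[i]()
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]      # incremental mean update
    return means

# Example: two Bernoulli arms with success probabilities 0.3 and 0.7.
arms = [lambda: float(random.random() < 0.3), lambda: float(random.random() < 0.7)]
print(epsilon_greedy_bandit(arms, rounds=5000))
```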
Forecasting electricity consumption by aggregating specialized experts: A review of the sequential aggregation of specialized experts, with an application to Slovakian …
We consider the setting of sequential prediction of arbitrary sequences based on specialized
experts. We first provide a review of the relevant literature and present two theoretical …
Adaptive subgradient methods for online learning and stochastic optimization
We present a new family of subgradient methods that dynamically incorporate knowledge of
the geometry of the data observed in earlier iterations to perform more informative gradient …
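A minimal sketch of the diagonal adaptive-subgradient update described here, with per-coordinate step sizes built from accumulated squared gradients; the toy objective below is a placeholder:

```python
import numpy as np

def adagrad(grad_fn, x0, lr=0.1, eps=1e-8, steps=100):
    """Diagonal AdaGrad-style update: scale each coordinate of the gradient
    by the inverse square root of its accumulated squared magnitude."""
    x = np.asarray(x0, dtype=float).copy()
    g_sq = np.zeros_like(x)                  # running sum of squared gradients
    for _ in range(steps):
        g = grad_fn(x)
        g_sq += g * g                        # geometry observed in earlier iterations
        x -= lr * g / (np.sqrt(g_sq) + eps)  # larger steps on rarely-updated coordinates
    return x

# Toy usage: minimize a poorly scaled quadratic f(x) = 0.5 * x^T diag(1, 100) x.
scales = np.array([1.0, 100.0])
x_star = adagrad(lambda x: scales * x, x0=[1.0, 1.0], lr=0.5, steps=500)
print(x_star)  # should approach [0, 0]
```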
Optimization for machine learning
An up-to-date account of the interplay between optimization and machine learning,
accessible to students and researchers in both communities. The interplay between …
The multiplicative weights update method: a meta-algorithm and applications
Algorithms in varied fields use the idea of maintaining a distribution over a certain set and
use the multiplicative update rule to iteratively change these weights. Their analyses are …
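A minimal sketch of the multiplicative weights update in its standard form: keep a distribution over options and shrink each weight multiplicatively according to its observed loss. The specific losses are placeholders:

```python
import numpy as np

def multiplicative_weights(losses, eta=0.1):
    """Maintain weights over n options and apply the multiplicative update
    each round. `losses` is a (T, n) array with entries in [0, 1]."""
    losses = np.asarray(losses, dtype=float)
    n = losses.shape[1]
    w = np.ones(n)                          # start from the uniform distribution
    for loss_t in losses:
        w *= (1.0 - eta * loss_t)           # multiplicative update rule
    return w / w.sum()                      # final distribution over options

# Toy usage: option 2 incurs the smallest losses, so it should end up dominant.
rng = np.random.default_rng(0)
losses = rng.uniform(0, 1, size=(200, 3))
losses[:, 2] *= 0.2
print(multiplicative_weights(losses))
```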
Online learning with predictable sequences
We present methods for online linear optimization that take advantage of benign (as
opposed to worst-case) sequences. Specifically, if the sequence encountered by the learner …
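A simplified sketch of one common way to exploit predictable gradients: use the last observed gradient as the hint for the next round (an instance of optimistic gradient descent; the paper's methods may differ in detail). The loss sequence below is a placeholder:

```python
import numpy as np

def optimistic_ogd(grad_fns, x0, eta=0.1, radius=1.0):
    """Optimistic online gradient descent sketch: take each step using the
    previous gradient as a guess of the next one, which helps when the
    sequence of gradients is benign (slowly varying) rather than worst-case.
    `grad_fns` is a list of per-round gradient oracles; decisions stay in a
    Euclidean ball of the given radius."""
    def project(z):
        norm = np.linalg.norm(z)
        return z if norm <= radius else z * (radius / norm)

    y = np.asarray(x0, dtype=float).copy()    # secondary iterate
    hint = np.zeros_like(y)                   # guess of the upcoming gradient
    plays = []
    for grad_fn in grad_fns:
        x = project(y - eta * hint)           # optimistic step using the hint
        g = grad_fn(x)                        # gradient revealed this round
        y = project(y - eta * g)              # standard gradient step
        hint = g                              # reuse the last gradient as the next hint
        plays.append(x)
    return plays

# Toy usage: a slowly drifting linear loss <c_t, x>, so consecutive gradients barely change.
grad_fns = [(lambda t: (lambda x: np.array([1.0 + 0.001 * t, -1.0])))(t) for t in range(100)]
plays = optimistic_ogd(grad_fns, x0=[0.0, 0.0], eta=0.1)
print(plays[-1])
```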
Learning in games: a systematic review
RJ Qin, Y Yu - Science China Information Sciences, 2024 - Springer
Game theory studies the mathematical models for self-interested individuals. Nash
equilibrium is arguably the most central solution in game theory. While finding the Nash …
The best of both worlds: Stochastic and adversarial bandits
We present a new bandit algorithm, SAO (Stochastic and Adversarial Optimal), whose regret
is (essentially) optimal both for adversarial rewards and for stochastic rewards. Specifically …
Online optimization with gradual variations
We study the online convex optimization problem, in which an online algorithm has to make
repeated decisions with convex loss functions and hopes to achieve a small regret. We …