A modern introduction to online learning

F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org

In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …

Cited by 418

Regret analysis of stochastic and nonstochastic multi-armed bandit problems

S Bubeck, N Cesa-Bianchi - Foundations and Trends® in …, 2012 - nowpublishers.com

Multi-armed bandit problems are the most basic examples of sequential decision problems
with an exploration-exploitation trade-off. This is the balance between staying with the option …

Cited by 3282

Forecasting electricity consumption by aggregating specialized experts: A review of the sequential aggregation of specialized experts, with an application to Slovakian …

M Devaine, P Gaillard, Y Goude, G Stoltz - Machine Learning, 2013 - Springer

We consider the setting of sequential prediction of arbitrary sequences based on specialized
experts. We first provide a review of the relevant literature and present two theoretical …

Cited by 128

Adaptive subgradient methods for online learning and stochastic optimization

J Duchi, E Hazan, Y Singer - Journal of machine learning research, 2011 - jmlr.org

We present a new family of subgradient methods that dynamically incorporate knowledge of
the geometry of the data observed in earlier iterations to perform more informative gradient …

Cited by 14851

[BOOK] Optimization for machine learning

S Sra, S Nowozin, SJ Wright - 2011 - books.google.com

An up-to-date account of the interplay between optimization and machine learning,
accessible to students and researchers in both communities. The interplay between …

Cited by 1042

The multiplicative weights update method: a meta-algorithm and applications

S Arora, E Hazan, S Kale - Theory of computing, 2012 - theoryofcomputing.org

Algorithms in varied fields use the idea of maintaining a distribution over a certain set and
use the multiplicative update rule to iteratively change these weights. Their analyses are …

Cited by 1303

Online learning with predictable sequences

A Rakhlin, K Sridharan - Conference on Learning Theory, 2013 - proceedings.mlr.press

We present methods for online linear optimization that take advantage of benign (as
opposed to worst-case) sequences. Specifically if the sequence encountered by the learner …

Cited by 395

Learning in games: a systematic review

RJ Qin, Y Yu - Science China Information Sciences, 2024 - Springer

Game theory studies the mathematical models for self-interested individuals. Nash
equilibrium is arguably the most central solution in game theory. While finding the Nash …

Cited by 2

The best of both worlds: Stochastic and adversarial bandits

S Bubeck, A Slivkins - Conference on Learning Theory, 2012 - proceedings.mlr.press

We present a new bandit algorithm, SAO (Stochastic and Adversarial Optimal) whose regret
is (essentially) optimal both for adversarial rewards and for stochastic rewards. Specifically …

Cited by 278

Online optimization with gradual variations

CK Chiang, T Yang, CJ Lee… - … on Learning Theory, 2012 - proceedings.mlr.press

We study the online convex optimization problem, in which an online algorithm has to make
repeated decisions with convex loss functions and hopes to achieve a small regret. We …

Cited by 273

Improved second-order bounds for prediction with expert advice