Google Scholar

A modern introduction to online learning

F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org

In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …

Cited by 444 - All 3 versions

Optimal rates for bandit nonstochastic control

YJ Sun, S Newman, E Hazan - Advances in Neural …, 2023 - proceedings.neurips.cc

Linear Quadratic Regulator (LQR) and Linear Quadratic Gaussian (LQG) control
are foundational and extensively researched problems in optimal control. We investigate …

Cited by 10 - All 6 versions

Synthetic control as online linear regression

J Chen - Econometrica, 2023 - Wiley Online Library

This paper notes a simple connection between synthetic control and online learning.
Specifically, we recognize synthetic control as an instance of Follow‐The‐Leader (FTL) …

Cited by 27 - All 10 versions

Multi-agent online optimization with delays: Asynchronicity, adaptivity, and optimism

YG Hsieh, F Iutzeler, J Malick… - Journal of Machine …, 2022 - jmlr.org

In this paper, we provide a general framework for studying multi-agent online learning
problems in the presence of delays and asynchronicities. Specifically, we propose and …

Cited by 38 - All 13 versions

No-regret learning in games with noisy feedback: Faster rates and adaptivity via learning rate separation

YG Hsieh, K Antonakopoulos… - Advances in …, 2022 - proceedings.neurips.cc

We examine the problem of regret minimization when the learner is involved in a continuous
game with other optimizing agents: in this case, if all players follow a no-regret algorithm, it is …

Cited by 26 - All 27 versions

Fast last-iterate convergence of learning in games requires forgetful algorithms

Y Cai, G Farina, J Grand-Clément, C Kroer… - arXiv preprint arXiv …, 2024 - arxiv.org

Self-play via online learning is one of the premier ways to solve large-scale two-player
zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic …

Cited by 5 - All 4 versions

On anytime learning at macroscale

L Caccia, J Xu, M Ott, M Ranzato… - … on Lifelong Learning …, 2022 - proceedings.mlr.press

In many practical applications of machine learning data arrives sequentially over time in
large chunks. Practitioners have then to decide how to allocate their computational budget in …

Cited by 24 - All 5 versions

Online Frank-Wolfe with arbitrary delays

Y Wan, WW Tu, L Zhang - Advances in Neural Information …, 2022 - proceedings.neurips.cc

The online Frank-Wolfe (OFW) method has gained much popularity for online
convex optimization due to its projection-free property. Previous studies show that OFW can …

Cited by 10 - All 7 versions

Learning with Asynchronous Labels

YY Qian, ZY Zhang, P Zhao, ZH Zhou - ACM Transactions on …, 2024 - dl.acm.org

Learning with data streams has attracted much attention in recent decades. Conventional
approaches typically assume that the feature and label of a data item can be timely …

Cited by 1 - All 4 versions

Nonstochastic bandits and experts with arm-dependent delays

D Van Der Hoeven… - … Conference on Artificial …, 2022 - proceedings.mlr.press

We study nonstochastic bandits and experts in a delayed setting where delays depend on
both time and arms. While the setting in which delays only depend on time has been …

Cited by 14 - All 5 versions

Online learning with optimism and delay

A modern introduction to online learning
