A simple approach for non-stationary linear bandits

P Zhao, L Zhang, Y Jiang… - … Conference on Artificial …, 2020 - proceedings.mlr.press
This paper investigates the problem of non-stationary linear bandits, where the unknown
regression parameter is evolving over time. Previous studies have adopted sophisticated …

Adaptivity and non-stationarity: Problem-dependent dynamic regret for online convex optimization

P Zhao, YJ Zhang, L Zhang, ZH Zhou - Journal of Machine Learning …, 2024 - jmlr.org
We investigate online convex optimization in non-stationary environments and choose
dynamic regret as the performance measure, defined as the difference between cumulative …

No-regret learning in time-varying zero-sum games

M Zhang, P Zhao, H Luo… - … Conference on Machine …, 2022 - proceedings.mlr.press
Learning from repeated play in a fixed two-player zero-sum game is a classic problem in
game theory and online learning. We consider a variant of this problem where the game …

Adapting to online label shift with provable guarantees

Y Bai, YJ Zhang, P Zhao… - Advances in Neural …, 2022 - proceedings.neurips.cc
The standard supervised learning paradigm works effectively when training data shares the
same distribution as the upcoming testing samples. However, this stationary assumption is …

Optimistic online mirror descent for bridging stochastic and adversarial online convex optimization

S Chen, YJ Zhang, WW Tu, P Zhao, L Zhang - Journal of Machine Learning …, 2024 - jmlr.org
The stochastically extended adversarial (SEA) model, introduced by Sachs et al. (2022),
serves as an interpolation between stochastic and adversarial online convex optimization …

Regret and cumulative constraint violation analysis for online convex optimization with long term constraints

X Yi, X Li, T Yang, L Xie, T Chai… - … on machine learning, 2021 - proceedings.mlr.press
This paper considers online convex optimization with long term constraints, where
constraints can be violated in intermediate rounds, but need to be satisfied in the long run …

Non-stationary online learning with memory and non-stochastic control

P Zhao, YH Yan, YX Wang, ZH Zhou - Journal of Machine Learning …, 2023 - jmlr.org
We study the problem of Online Convex Optimization (OCO) with memory, which allows loss
functions to depend on past decisions and thus captures temporal effects of learning …

Improved analysis for dynamic regret of strongly convex and smooth functions

P Zhao, L Zhang - Learning for Dynamics and Control, 2021 - proceedings.mlr.press
In this paper, we present an improved analysis for dynamic regret of strongly convex and
smooth functions. Specifically, we investigate the Online Multiple Gradient Descent (OMGD) …

Dynamic regret of online Markov decision processes

P Zhao, LF Li, ZH Zhou - International Conference on …, 2022 - proceedings.mlr.press
We investigate online Markov Decision Processes (MDPs) with adversarially
changing loss functions and known transitions. We choose dynamic regret as the …

Adapting to continuous covariate shift via online density ratio estimation

YJ Zhang, ZY Zhang, P Zhao… - Advances in Neural …, 2023 - proceedings.neurips.cc
Dealing with distribution shifts is one of the central challenges for modern machine learning.
One fundamental situation is the covariate shift, where the input distributions of data change …