Online (multinomial) logistic bandit: Improved regret and constant computation cost

YJ Zhang, M Sugiyama - Advances in Neural Information …, 2024 - proceedings.neurips.cc
This paper investigates the logistic bandit problem, a variant of the generalized linear bandit
model that utilizes a logistic model to depict the feedback from an action. While most existing …

Adaptivity and non-stationarity: Problem-dependent dynamic regret for online convex optimization

P Zhao, YJ Zhang, L Zhang, ZH Zhou - Journal of Machine Learning …, 2024 - jmlr.org
We investigate online convex optimization in non-stationary environments and choose
dynamic regret as the performance measure, defined as the difference between cumulative …

Online composite optimization between stochastic and adversarial environments

Y Wang, S Chen, W Jiang, W Yang… - Advances in Neural …, 2025 - proceedings.neurips.cc
We study online composite optimization under the Stochastically Extended Adversarial
(SEA) model. Specifically, each loss function consists of two parts: a fixed non-smooth and …

Online conformal prediction with decaying step sizes

AN Angelopoulos, RF Barber, S Bates - arXiv preprint arXiv:2402.01139, 2024 - arxiv.org
We introduce a method for online conformal prediction with decaying step sizes. Like
previous methods, ours possesses a retrospective guarantee of coverage for arbitrary …

Universal online learning with gradient variations: A multi-layer online ensemble approach

YH Yan, P Zhao, ZH Zhou - Advances in Neural Information …, 2023 - proceedings.neurips.cc
In this paper, we propose an online convex optimization approach with two different levels of
adaptivity. On a higher level, our approach is agnostic to the unknown types and curvatures …

Byzantine-robust distributed online learning: Taming adversarial participants in an adversarial environment

X Dong, Z Wu, Q Ling, Z Tian - IEEE Transactions on Signal …, 2023 - ieeexplore.ieee.org
This paper studies distributed online learning under Byzantine attacks. The performance of
an online learning algorithm is often characterized by (adversarial) regret, which evaluates …

Universal Online Convex Optimization with Projection per Round

W Yang, Y Wang, P Zhao, L Zhang - arXiv preprint arXiv:2405.19705, 2024 - arxiv.org
To address the uncertainty in function types, recent progress in online convex optimization
(OCO) has spurred the development of universal algorithms that simultaneously attain …

Gradient-variation online learning under generalized smoothness

YF Xie, P Zhao, ZH Zhou - arXiv preprint arXiv:2408.09074, 2024 - arxiv.org
Gradient-variation online learning aims to achieve regret guarantees that scale with
variations in the gradients of online functions, which has been shown to be crucial for …

Online optimization under randomly corrupted attacks

Z Qu, X Li, L Li, X Yi - IEEE Transactions on Signal Processing, 2024 - ieeexplore.ieee.org
Existing algorithms in online optimization usually rely on trustful information, e.g., reliable
knowledge of gradients, which makes them vulnerable to attacks. To take into account the …

Efficient non-stationary online learning by wavelets with applications to online distribution shift adaptation

YY Qian, P Zhao, YJ Zhang, M Sugiyama… - Forty-first International …, 2024 - openreview.net
Dynamic regret minimization offers a principled way for non-stationary online learning,
where the algorithm's performance is evaluated against changing comparators. Prevailing …