[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Concrete problems in AI safety

D Amodei, C Olah, J Steinhardt, P Christiano… - arXiv preprint arXiv …, 2016 - arxiv.org
Rapid progress in machine learning and artificial intelligence (AI) has brought increasing
attention to the potential impacts of AI technologies on society. In this paper we discuss one …

[BOOK][B] Prediction, learning, and games

N Cesa-Bianchi, G Lugosi - 2006 - books.google.com
This important text and reference for researchers and students in machine learning, game
theory, statistics and information theory offers a comprehensive treatment of the problem of …

[PDF][PDF] Online convex programming and generalized infinitesimal gradient ascent

M Zinkevich - Proceedings of the 20th international conference on …, 2003 - cdn.aaai.org
Convex programming involves a convex set F ⊆ R^n and a convex cost function c: F → R. The
goal of convex programming is to find a point in F which minimizes c. In online convex …

Online learning with kernels

J Kivinen, AJ Smola… - IEEE transactions on …, 2004 - ieeexplore.ieee.org
Kernel-based algorithms such as support vector machines have achieved considerable
success in various problems in batch setting, where all of the training data is available in …

Evolutionary clustering

D Chakrabarti, R Kumar, A Tomkins - Proceedings of the 12th ACM …, 2006 - dl.acm.org
We consider the problem of clustering data over time. An evolutionary clustering should
simultaneously optimize two potentially conflicting criteria: first, the clustering at any point in …

Online convex optimization in dynamic environments

EC Hall, RM Willett - IEEE Journal of Selected Topics in Signal …, 2015 - ieeexplore.ieee.org
High-velocity streams of high-dimensional data pose significant “big data” analysis
challenges across a range of applications and settings. Online learning and online convex …

Dynamic regret of convex and smooth functions

P Zhao, YJ Zhang, L Zhang… - Advances in Neural …, 2020 - proceedings.neurips.cc
We investigate online convex optimization in non-stationary environments and choose the
dynamic regret as the performance measure, defined as the difference between cumulative …

Adaptivity and non-stationarity: Problem-dependent dynamic regret for online convex optimization

P Zhao, YJ Zhang, L Zhang, ZH Zhou - Journal of Machine Learning …, 2024 - jmlr.org
We investigate online convex optimization in non-stationary environments and choose
dynamic regret as the performance measure, defined as the difference between cumulative …

[HTML][HTML] Online transfer learning

P Zhao, SCH Hoi, J Wang, B Li - Artificial intelligence, 2014 - Elsevier
In this paper, we propose a novel machine learning framework called “Online Transfer
Learning” (OTL), which aims to attack an online learning task on a target domain by …