The statistical complexity of interactive decision making
A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …
[BOOK] Bandit algorithms
T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …
Derivative-free optimization methods
In many optimization problems arising from scientific, engineering and artificial intelligence
applications, objective and constraint functions are available only as the output of a black …
Introduction to multi-armed bandits
A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com
Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …
Introduction to online convex optimization
E Hazan - Foundations and Trends® in Optimization, 2016 - nowpublishers.com
This monograph portrays optimization as a process. In many practical applications the
environment is so complex that it is infeasible to lay out a comprehensive theoretical model …
Strategic classification from revealed preferences
We study an online linear classification problem in which the data is generated by strategic
agents who manipulate their features in an effort to change the classification outcome. In …
More adaptive algorithms for adversarial bandits
We develop a novel and generic algorithm for the adversarial multi-armed bandit problem
(or more generally the combinatorial semi-bandit problem). When instantiated differently, our …
Corralling a band of bandit algorithms
We study the problem of combining multiple bandit algorithms (that is, online learning
algorithms with partial feedback) with the goal of creating a master algorithm that performs …
Tight guarantees for interactive decision making with the decision-estimation coefficient
A foundational problem in reinforcement learning and interactive decision making is to
understand what modeling assumptions lead to sample-efficient learning guarantees, and …
Adversarial bandits with knapsacks
We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed
bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a …