The statistical complexity of interactive decision making
A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …
On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\varepsilon$-Greedy Exploration
This paper provides a theoretical understanding of deep Q-Network (DQN) with the
$\varepsilon$-greedy exploration in deep reinforcement learning. Despite the tremendous …
The sample complexity of online contract design
We study the hidden-action principal-agent problem in an online setting. In each round, the
principal posts a contract that specifies the payment to the agent based on each outcome …
Online learning in Stackelberg games with an omniscient follower
We study the problem of online learning in a two-player decentralized cooperative
Stackelberg game. In each round, the leader first takes an action, followed by the follower …
MADE: Exploration via maximizing deviation from explored regions
In online reinforcement learning (RL), efficient exploration remains particularly challenging
in high-dimensional environments with sparse rewards. In low-dimensional environments …
Understanding Deep Neural Function Approximation in Reinforcement Learning via $\epsilon$-Greedy Exploration
This paper provides a theoretical study of deep neural function approximation in
reinforcement learning (RL) with the $\epsilon$-greedy exploration under the online setting …
First steps toward understanding the extrapolation of nonlinear models to unseen domains
Real-world machine learning applications often involve deploying neural networks to
domains that are not seen in the training time. Hence, we need to understand the …
Lifting the information ratio: An information-theoretic analysis of Thompson sampling for contextual bandits
We study the Bayesian regret of the renowned Thompson Sampling algorithm in contextual
bandits with binary losses and adversarially-selected contexts. We adapt the information …
Fast rates for nonparametric online learning: from realizability to learning in games
C Daskalakis, N Golowich - Proceedings of the 54th Annual ACM …, 2022 - dl.acm.org
We study fast rates of convergence in the setting of nonparametric online regression, namely
where regret is defined with respect to an arbitrary function class which has bounded …
Representation learning beyond linear prediction functions
Recent papers on the theory of representation learning have shown the importance of a
quantity called diversity when generalizing from a set of source tasks to a target task. Most of …