Nonstationary bandit learning via predictive sampling
Thompson sampling has proven effective across a wide range of stationary bandit
environments. However, as we demonstrate in this paper, it can perform poorly when …
Causal semantic communication for digital twins: A generalizable imitation learning approach
A digital twin (DT) leverages a virtual representation of the physical world, along with
communication (e.g., 6G), computing (e.g., edge computing), and artificial intelligence (AI) …
Bayesian reinforcement learning with limited cognitive load
All biological and artificial agents must act given limits on their ability to acquire and process
information. As such, a general theory of adaptive behavior should be able to account for the …
Contextual information-directed sampling
Information-directed sampling (IDS) has recently demonstrated its potential as a
data-efficient reinforcement learning algorithm. However, it is still unclear what is the right …
Deciding what to model: Value-equivalent sampling for reinforcement learning
The quintessential model-based reinforcement-learning agent iteratively refines its
estimates or prior beliefs about the true underlying model of the environment. Recent …
Satisficing exploration for deep reinforcement learning
A default assumption in the design of reinforcement-learning algorithms is that a decision-
making agent always explores to learn optimal behavior. In sufficiently complex …
Provably efficient information-directed sampling algorithms for multi-agent reinforcement learning
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement
learning (MARL) based on the principle of information-directed sampling (IDS). These …
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning
Throughout the cognitive-science literature, there is widespread agreement that decision-
making agents operating in the real world do so under limited information-processing …
Exploration Unbound
A sequential decision-making agent balances between exploring to gain new knowledge
about an environment and exploiting current knowledge to maximize immediate reward. For …
Parallel Bayesian Optimization Using Satisficing Thompson Sampling for Time-Sensitive Black-Box Optimization
X Song, B Jiang - arXiv preprint arXiv:2310.12526, 2023 - arxiv.org
Bayesian optimization (BO) is widely used for black-box optimization problems, and has
been shown to perform well in various real-world tasks. However, most of the existing BO …