Nonstationary bandit learning via predictive sampling

Y Liu, B Van Roy, K Xu - International Conference on …, 2023 - proceedings.mlr.press

Thompson sampling has proven effective across a wide range of stationary bandit
environments. However, as we demonstrate in this paper, it can perform poorly when …

Cited by 24

Causal semantic communication for digital twins: A generalizable imitation learning approach

CK Thomas, W Saad, Y Xiao - IEEE Journal on Selected Areas …, 2023 - ieeexplore.ieee.org

A digital twin (DT) leverages a virtual representation of the physical world, along with
communication (e.g., 6G), computing (e.g., edge computing), and artificial intelligence (AI) …

Cited by 17

Bayesian reinforcement learning with limited cognitive load

D Arumugam, MK Ho, ND Goodman, B Van Roy - Open Mind, 2024 - direct.mit.edu

All biological and artificial agents must act given limits on their ability to acquire and process
information. As such, a general theory of adaptive behavior should be able to account for the …

Cited by 12

Contextual information-directed sampling

B Hao, T Lattimore, C Qin - International Conference on …, 2022 - proceedings.mlr.press

Information-directed sampling (IDS) has recently demonstrated its potential as a
data-efficient reinforcement learning algorithm. However, it is still unclear what is the right …

Cited by 19

Deciding what to model: Value-equivalent sampling for reinforcement learning

D Arumugam, B Van Roy - Advances in neural information …, 2022 - proceedings.neurips.cc

The quintessential model-based reinforcement-learning agent iteratively refines its
estimates or prior beliefs about the true underlying model of the environment. Recent …

Cited by 16

Satisficing exploration for deep reinforcement learning

D Arumugam, S Kumar, R Gummadi… - arXiv preprint arXiv …, 2024 - arxiv.org

A default assumption in the design of reinforcement-learning algorithms is that a decision-
making agent always explores to learn optimal behavior. In sufficiently complex …

Cited by 2

Provably efficient information-directed sampling algorithms for multi-agent reinforcement learning

Q Zhang, C Bai, S Hu, Z Wang, X Li - arXiv preprint arXiv:2404.19292, 2024 - arxiv.org

This work designs and analyzes a novel set of algorithms for multi-agent reinforcement
learning (MARL) based on the principle of information-directed sampling (IDS). These …

Cited by 2

On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning

D Arumugam, MK Ho, ND Goodman… - arXiv preprint arXiv …, 2022 - arxiv.org

Throughout the cognitive-science literature, there is widespread agreement that decision-
making agents operating in the real world do so under limited information-processing …

Cited by 4

Exploration Unbound

D Arumugam, W Xu, B Van Roy - arXiv preprint arXiv:2407.12178, 2024 - arxiv.org

A sequential decision-making agent balances between exploring to gain new knowledge
about an environment and exploiting current knowledge to maximize immediate reward. For …

Parallel Bayesian Optimization Using Satisficing Thompson Sampling for Time-Sensitive Black-Box Optimization

X Song, B Jiang - arXiv preprint arXiv:2310.12526, 2023 - arxiv.org

Bayesian optimization (BO) is widely used for black-box optimization problems, and has
been shown to perform well in various real-world tasks. However, most of the existing BO …

The value of information when deciding what to learn

Nonstationary bandit learning via predictive sampling