BYOL-Explore: Exploration by bootstrapped prediction
We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven
exploration in visually complex environments. BYOL-Explore learns the world …
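A minimal sketch of the core mechanism, assuming PyTorch (module names, sizes, and the EMA rate are illustrative, not the paper's architecture): an online encoder plus predictor tries to predict the next observation's latent as produced by a slowly updated target encoder, and the prediction error doubles as the world-model loss and the intrinsic reward.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CuriosityModule(nn.Module):
        """Toy bootstrapped-prediction curiosity: predict the target
        network's next-step latent; the error is the exploration bonus."""
        def __init__(self, obs_dim, act_dim, latent_dim=64, ema=0.99):
            super().__init__()
            self.online_encoder = nn.Sequential(nn.Linear(obs_dim, latent_dim), nn.ReLU())
            self.target_encoder = nn.Sequential(nn.Linear(obs_dim, latent_dim), nn.ReLU())
            self.target_encoder.load_state_dict(self.online_encoder.state_dict())
            for p in self.target_encoder.parameters():
                p.requires_grad_(False)  # target is updated only by EMA
            self.predictor = nn.Linear(latent_dim + act_dim, latent_dim)
            self.ema = ema

        def intrinsic_reward(self, obs, act, next_obs):
            pred = self.predictor(torch.cat([self.online_encoder(obs), act], dim=-1))
            with torch.no_grad():
                target = self.target_encoder(next_obs)
            # Squared error between normalized latents; backprop this as the
            # world-model loss, and detach it when using it as a reward bonus.
            return (F.normalize(pred, dim=-1) - F.normalize(target, dim=-1)).pow(2).sum(-1)

        @torch.no_grad()
        def update_target(self):
            for p_t, p_o in zip(self.target_encoder.parameters(),
                                self.online_encoder.parameters()):
                p_t.mul_(self.ema).add_((1 - self.ema) * p_o)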
Model-free representation learning and exploration in low-rank MDPs
The low-rank MDP has emerged as an important model for studying representation learning
and exploration in reinforcement learning. With a known representation, several model-free …
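For reference, the low-rank MDP assumes the transition kernel factorizes through an unknown d-dimensional feature map, which is precisely the representation to be learned (with \phi^\star known, this reduces to the linear MDP):

    P(s' \mid s, a) = \langle \phi^\star(s, a), \mu^\star(s') \rangle,
    \qquad \phi^\star : \mathcal{S} \times \mathcal{A} \to \mathbb{R}^d,
    \quad \mu^\star : \mathcal{S} \to \mathbb{R}^d.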
Fast active learning for pure exploration in reinforcement learning
Realistic environments often provide agents with very limited feedback. When the
environment is initially unknown, the feedback, in the beginning, can be completely absent …
Reward is enough for convex MDPs
Maximising a cumulative reward function that is Markov and stationary, i.e., defined over state-
action pairs and independent of time, is sufficient to capture many kinds of goals in a Markov …
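In occupancy-measure terms, the generalization at stake is from a linear objective to a convex one (standard notation, not necessarily the paper's): writing d^\pi for the state-action occupancy measure of policy \pi,

    \text{standard RL:} \quad \max_\pi \; \langle r, d^\pi \rangle
    \qquad\longrightarrow\qquad
    \text{convex MDP:} \quad \min_\pi \; f(d^\pi), \quad f \text{ convex},

which covers exploration, imitation, and constrained objectives as particular choices of f.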
Unified algorithms for RL with decision-estimation coefficients: No-regret, PAC, and reward-free learning
Finding unified complexity measures and algorithms for sample-efficient learning is a central
topic of research in reinforcement learning (RL). The Decision-Estimation Coefficient (DEC) …
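One common form of the definition, relative to a reference model \bar{M} (notation varies across papers, so treat this as a sketch):

    \mathrm{dec}_\gamma(\mathcal{M}, \bar{M})
      = \inf_{p \in \Delta(\Pi)} \sup_{M \in \mathcal{M}}
        \mathbb{E}_{\pi \sim p}\!\left[ f^M(\pi_M) - f^M(\pi)
          - \gamma\, D_\mathrm{H}^2\big(M(\pi), \bar{M}(\pi)\big) \right],

where f^M(\pi) is the value of \pi under model M, \pi_M is M's optimal policy, and D_\mathrm{H}^2 is the squared Hellinger distance between the observation distributions the two models induce under \pi.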
Policy finetuning in reinforcement learning via design of experiments using offline data
In some applications of reinforcement learning, a dataset of pre-collected experience is
already available, but it is also possible to acquire some additional online data to help …
On the statistical efficiency of reward-free exploration in non-linear RL
We study reward-free reinforcement learning (RL) under general non-linear function
approximation, and establish sample efficiency and hardness results under various standard …
The challenges of exploration for offline reinforcement learning
Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked
processes of reinforcement learning: collecting informative experience and inferring optimal …
DrM: Mastering visual reinforcement learning through dormant ratio minimization
Visual reinforcement learning (RL) has shown promise in continuous control tasks. Despite
its progress, current algorithms are still unsatisfactory in virtually every aspect of the …
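A sketch of the quantity being minimized, assuming a PyTorch model with ReLU activations: a unit counts as dormant when its mean absolute activation, normalized by the layer average, falls below a threshold tau (the hooked layers and threshold below are illustrative):

    import torch
    import torch.nn as nn

    @torch.no_grad()
    def dormant_ratio(model: nn.Module, batch: torch.Tensor, tau: float = 0.025) -> float:
        """Fraction of units whose normalized mean |activation| is <= tau."""
        acts = []
        def hook(_module, _inputs, out):
            acts.append(out.abs().mean(dim=0))  # per-unit mean |activation|
        handles = [m.register_forward_hook(hook)
                   for m in model.modules() if isinstance(m, nn.ReLU)]
        model(batch)
        for h in handles:
            h.remove()
        dormant, total = 0, 0
        for a in acts:
            a = a.flatten()
            norm = a / (a.mean() + 1e-9)  # normalize by the layer's average
            dormant += (norm <= tau).sum().item()
            total += norm.numel()
        return dormant / max(total, 1)

    # e.g. dormant_ratio(nn.Sequential(nn.Linear(8, 32), nn.ReLU(),
    #                                  nn.Linear(32, 4)), torch.randn(256, 8))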
Provably efficient reward-agnostic navigation with linear value iteration
There has been growing progress on theoretical analyses for provably efficient learning in
MDPs with linear function approximation, but much of the existing work has made strong …
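For context, such analyses typically assume action values are (near-)linear in a known feature map \phi, so each step of least-squares value iteration is a d-dimensional ridge regression (exploration bonuses omitted from this sketch):

    Q_h(s, a) \approx \phi(s, a)^\top w_h, \qquad
    w_h = \arg\min_w \sum_k \Big( \phi(s_h^k, a_h^k)^\top w
          - \max_{a'} Q_{h+1}(s_{h+1}^k, a') \Big)^2 + \lambda \lVert w \rVert_2^2.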