Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org
In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Accelerating reinforcement learning with learned skill priors

K Pertsch, Y Lee, J Lim - Conference on robot learning, 2021 - proceedings.mlr.press
Intelligent agents rely heavily on prior experience when learning a new task, yet most
modern reinforcement learning (RL) approaches learn every task from scratch. One …

Learning to navigate in cities without a map

P Mirowski, M Grimes, M Malinowski… - Advances in neural …, 2018 - proceedings.neurips.cc
Navigating through unstructured environments is a basic capability of intelligent creatures,
and thus is of fundamental interest in the study and development of artificial intelligence …

The StreetLearn environment and dataset

P Mirowski, A Banki-Horvath, K Anderson… - arXiv preprint arXiv …, 2019 - arxiv.org
Navigation is a rich and well-grounded problem domain that drives progress in many
different areas of research: perception, planning, memory, exploration, and optimisation in …

A survey of demonstration learning

A Correia, LA Alexandre - Robotics and Autonomous Systems, 2024 - Elsevier
With the fast improvement of machine learning, reinforcement learning (RL) has been used
to automate human tasks in different areas. However, training such agents is difficult and …

Embodied visual navigation with automatic curriculum learning in real environments

SD Morad, R Mecca, RPK Poudel… - IEEE Robotics and …, 2021 - ieeexplore.ieee.org
We present NavACL, a method of automatic curriculum learning tailored to the navigation
task. NavACL is simple to train and efficiently selects relevant tasks using geometric …

MO2: Model-based offline options

S Salter, M Wulfmeier, D Tirumala… - Conference on …, 2022 - proceedings.mlr.press
The ability to discover useful behaviours from past experience and transfer them to new
tasks is considered a core component of natural embodied intelligence. Inspired by …

A dynamic adjusting reward function method for deep reinforcement learning with adjustable parameters

Z Hu, K Wan, X Gao, Y Zhai - Mathematical Problems in …, 2019 - Wiley Online Library
In deep reinforcement learning, network convergence speed is often slow and easily
converges to local optimal solutions. For an environment with reward saltation, we propose …

Cross-view policy learning for street navigation

A Li, H Hu, P Mirowski… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
The ability to navigate from visual observations in unfamiliar environments is a core
component of intelligent agents and an ongoing challenge for Deep Reinforcement …

Offline reinforcement learning with representations for actions

X Lou, Q Yin, J Zhang, C Yu, Z He, N Cheng… - Information Sciences, 2022 - Elsevier
Prevailing offline reinforcement learning (RL) methods limit the policy within the area
supported by the offline dataset to avoid the distributional shift problem. But potential high …