Interference and generalization in temporal difference learning

E Bengio, J Pineau, D Precup - International Conference on …, 2020 - proceedings.mlr.press
We study the link between generalization and interference in temporal-difference (TD)
learning. Interference is defined as the inner product of two different gradients, representing …

[PDF] State-of-the-art reinforcement learning algorithms

D Mehta - International Journal of Engineering Research and …, 2020 - academia.edu
This research paper brings together many different aspects of the current research on
several fields associated to Reinforcement Learning which has been growing rapidly …

PBCS: Efficient Exploration and Exploitation Using a Synergy Between Reinforcement Learning and Motion Planning

G Matheron, N Perrin, O Sigaud - International Conference on Artificial …, 2020 - Springer
The exploration-exploitation trade-off is at the heart of reinforcement learning (RL). However,
most continuous control benchmarks used in recent RL research only require local …

Adaptive temporal-difference learning for policy evaluation with per-state uncertainty estimates

C Riquelme, H Penedones, D Vincent… - Advances in …, 2019 - proceedings.neurips.cc
We consider the core reinforcement-learning problem of on-policy value function
approximation from a batch of trajectory data, and focus on various issues of Temporal …

[BOOK] Generalization, optimization, diverse generation: insights and advances in the use of bootstrapping in deep neural networks

E Bengio - 2022 - search.proquest.com
This thesis investigates the use of bootstrapping in Temporal Difference (TD) learning, a
central mechanism in reinforcement learning (RL), when applied to deep neural networks. I …

Using neural networks to solve game problems, illustrated by the problem of finding a path in a maze

ДО Романников, АА Воевода - … , вычислительная техника и …, 2018 - cyberleninka.ru
The solution of game problems is considered, using the example of finding a path in a maze
with a neural network. Such a problem can be solved by one of …

Unsupervised Pretraining of State Representations in a Rewardless Environment

A Merckling - 2021 - theses.hal.science
This thesis seeks to extend the capabilities of state representation learning (SRL) to help
scale deep reinforcement learning (DRL) algorithms to continuous control tasks with high …

Integrating motion planning into reinforcement learning to solve hard exploration problems

G Matheron - 2020 - theses.hal.science
Motion planning is able to solve robotics problems much quicker than any reinforcement
learning algorithm by efficiently searching for a viable trajectory. Indeed, while the main …