Accelerated gradient temporal difference learning

Y Pan, A White, M White - Proceedings of the AAAI Conference on …, 2017 - ojs.aaai.org
The family of temporal difference (TD) methods spans a spectrum from computationally frugal
linear methods like TD(λ) to data-efficient least-squares methods. Least-squares methods …
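The computationally frugal end of this spectrum is the standard linear TD(λ) update with eligibility traces, which costs O(d) per step in the number of features. A minimal sketch (function name and parameter values are illustrative, not from the paper):

```python
import numpy as np

def td_lambda_update(w, z, phi, phi_next, r, alpha=0.1, gamma=0.99, lam=0.8):
    """One linear TD(lambda) step with accumulating eligibility traces.

    w: weight vector, z: eligibility trace,
    phi/phi_next: feature vectors for the current and next state.
    """
    delta = r + gamma * (phi_next @ w) - phi @ w  # TD error
    z = gamma * lam * z + phi                     # accumulating trace
    w = w + alpha * delta * z                     # O(d) update per step
    return w, z
```

Least-squares methods such as LSTD instead maintain a d×d matrix of feature statistics, trading O(d²) computation for better data efficiency.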

Meta-descent for online, continual prediction

A Jacobsen, M Schlegel, C Linke, T Degris… - Proceedings of the …, 2019 - ojs.aaai.org
This paper investigates different vector step-size adaptation approaches for non-stationary
online, continual prediction problems. Vanilla stochastic gradient descent can be …
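A classic example of the per-weight (vector) step-size adaptation this line of work builds on is Sutton's IDBD, which adapts one step-size per weight by gradient descent on a meta-level. A minimal sketch for online linear prediction (the meta step-size value is an illustrative choice, not from the paper):

```python
import numpy as np

def idbd_update(w, beta, h, x, y, meta_rate=0.01):
    """One IDBD step (Sutton, 1992) for a linear predictor w @ x.

    beta holds log step-sizes (one per weight), so alpha = exp(beta)
    stays positive; h is a decaying memory of recent weight updates.
    """
    delta = y - w @ x                                  # prediction error
    beta = beta + meta_rate * delta * x * h            # meta-level ascent
    alpha = np.exp(beta)                               # per-weight step-sizes
    w = w + alpha * delta * x                          # base-level update
    h = h * np.maximum(0.0, 1.0 - alpha * x * x) + alpha * delta * x
    return w, beta, h
```

Because each weight carries its own step-size, relevant inputs can be tracked quickly while irrelevant ones are damped, which is exactly the property that matters for non-stationary, continual prediction.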

Representation alignment in neural networks

E Imani, W Hu, M White - arXiv … - arxiv.org

Bootstrapping in deep neural networks
E Bengio - 2022 - search.proquest.com
This thesis investigates the use of bootstrapping in Temporal Difference (TD) learning, a
central mechanism in reinforcement learning (RL), when applied to deep neural networks. I …

Improving Sample Efficiency of Online Temporal Difference Learning

Y Pan - 2021 - era.library.ualberta.ca
A common scientific challenge in putting a reinforcement learning agent into practice is how
to improve sample efficiency as much as possible with limited computational or memory …

Vector Step-size Adaptation for Continual, Online Prediction

A Jacobsen - 2019 - era.library.ualberta.ca
In this thesis, we investigate different vector step-size adaptation approaches for continual,
online prediction problems. Vanilla stochastic gradient descent can be considerably …