Google Scholar

Y Pan, A White, M White - Proceedings of the AAAI Conference on …, 2017 - ojs.aaai.org

The family of temporal difference (TD) methods span a spectrum from computationally frugal
linear methods like TD(λ) to data efficient least squares methods. Least square methods …

Cited by 37

[PDF] aaai.org

Meta-descent for online, continual prediction

A Jacobsen, M Schlegel, C Linke, T Degris… - Proceedings of the …, 2019 - ojs.aaai.org

This paper investigates different vector step-size adaptation approaches for non-stationary
online, continual prediction problems. Vanilla stochastic gradient descent can be …

Cited by 25

[PDF] arxiv.org

Representation alignment in neural networks

E Imani, W Hu, M White - ar…

… bootstrapping in deep neural networks

E Bengio - 2022 - search.proquest.com

This thesis investigates the use of bootstrapping in Temporal Difference (TD) learning, a
central mechanism in reinforcement learning (RL), when applied to deep neural networks. I …

[PDF] ualberta.ca

Improving Sample Efficiency of Online Temporal Difference Learning

Y Pan - 2021 - era.library.ualberta.ca

A common scientific challenge for putting a reinforcement learning agent into practice is how
to improve sample efficiency as much as possible with limited computational or memory …

[PDF] ualberta.ca

Vector Step-size Adaptation for Continual, Online Prediction

A Jacobsen - 2019 - era.library.ualberta.ca

In this thesis, we investigate different vector step-size adaptation approaches for continual,
online prediction problems. Vanilla stochastic gradient descent can be considerably …

Accelerated gradient temporal difference learning algorithms

Accelerated gradient temporal difference learning
