Unbiased gradient estimation in unrolled computation graphs with persistent evolution strategies

P Vicol, L Metz, J Sohl-Dickstein - … Conference on Machine …, 2021 - proceedings.mlr.press
Unrolled computation graphs arise in many scenarios, including training RNNs, tuning
hyperparameters through unrolled optimization, and training learned optimizers. Current …
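
Since the snippet is cut off, a brief gloss may help: PES estimates gradients of an unrolled system with evolution strategies while letting perturbations persist across truncation windows, which removes the bias of short truncated unrolls. A minimal sketch on a toy unrolled system (the dynamics, window length, and particle bookkeeping are illustrative assumptions, not the paper's code):

```python
import jax
import jax.numpy as jnp

def unroll(state, theta, k=10):
    """Unroll a toy linear system for k steps; return final state, summed loss."""
    def body(s, _):
        s = 0.9 * s + theta
        return s, s ** 2          # per-step loss
    state, losses = jax.lax.scan(body, state, None, length=k)
    return state, losses.sum()

def pes_window(key, states, accum, theta, sigma=0.1):
    """One truncation window of the PES estimator.

    states: per-particle unroll states, carried across windows.
    accum:  per-particle sum of every perturbation applied so far;
            correlating losses with these accumulated perturbations
            (not just the current one) is what removes truncation bias.
    """
    n = states.shape[0]
    eps = sigma * jax.random.normal(key, (n,))
    states, losses = jax.vmap(unroll)(states, theta + eps)
    accum = accum + eps
    grad_est = ((losses - losses.mean()) * accum).sum() / (n * sigma ** 2)
    return states, accum, grad_est

# Usage: particles share theta but keep their own state and accumulator.
theta = jnp.float32(0.5)
states, accum = jnp.zeros(32), jnp.zeros(32)
states, accum, g = pes_window(jax.random.PRNGKey(0), states, accum, theta)
```

Between windows, `grad_est` would feed any outer optimizer for `theta`; in the full method the per-particle states and accumulators reset whenever the underlying unroll restarts.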

Online learning of long-range dependencies

N Zucchet, R Meier, S Schug… - Advances in Neural …, 2023 - proceedings.neurips.cc
Online learning holds the promise of enabling efficient long-term credit assignment in
recurrent neural networks. However, current algorithms fall short of offline backpropagation …

A unified framework of online learning algorithms for training recurrent neural networks

O Marschall, K Cho, C Savin - Journal of Machine Learning Research, 2020 - jmlr.org
We present a framework for compactly summarizing many recent results in efficient and/or
biologically plausible online training of recurrent neural networks (RNNs). The framework …

Online spatio-temporal learning in deep neural networks

T Bohnstingl, S Woźniak, A Pantazi… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Biological neural networks are equipped with an inherent capability to continuously adapt
through online learning. This aspect stands in stark contrast to learning with error …

Learning by directional gradient descent

D Silver, A Goyal, I Danihelka, M Hessel… - International …, 2021 - openreview.net
How should state be constructed from a sequence of observations, so as to best achieve
some objective? Most deep learning methods update the parameters of the state …
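
The core idea the title points at, estimating a gradient from a single forward-mode directional derivative with no backward pass through time, can be sketched in a few lines (the quadratic loss is a toy stand-in; the paper's state-construction setting is not reproduced here):

```python
import jax
import jax.numpy as jnp

def loss(theta):
    return jnp.sum((theta - 1.0) ** 2)   # toy objective

def directional_grad(key, theta):
    """Gradient estimate g = (grad_loss(theta) . v) v for a random direction v.

    The directional derivative comes from a forward-mode JVP, so no
    reverse pass and no stored activation history are needed.
    """
    v = jax.random.normal(key, theta.shape)
    _, dd = jax.jvp(loss, (theta,), (v,))   # dd = grad_loss(theta) . v
    return dd * v

theta = jnp.zeros(3)
g = directional_grad(jax.random.PRNGKey(0), theta)   # E[g] = grad_loss(theta)
```

Since E[vvᵀ] = I for v ~ N(0, I), the estimate is unbiased, though its variance grows with parameter dimension; averaging several directions trades compute for variance.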

Gradient descent on neurons and its link to approximate second-order optimization

F Benzing - International Conference on Machine Learning, 2022 - proceedings.mlr.press
Second-order optimizers are thought to hold the potential to speed up neural network
training, but due to the enormous size of the curvature matrix, they typically require …

Exploring the promise and limits of real-time recurrent learning

K Irie, A Gopalakrishnan, J Schmidhuber - arXiv preprint arXiv:2305.19044, 2023 - arxiv.org
Real-time recurrent learning (RTRL) for sequence-processing recurrent neural networks
(RNNs) offers certain conceptual advantages over backpropagation through time (BPTT) …
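
For readers meeting RTRL here for the first time, a minimal sketch of its forward influence-matrix recursion for a tiny vanilla RNN (the architecture and squared-error readout are illustrative):

```python
import jax
import jax.numpy as jnp

def rnn_step(h, x, W):
    """h' = tanh(W @ [h; x]); W is the only parameter in this sketch."""
    return jnp.tanh(W @ jnp.concatenate([h, x]))

def rtrl_step(h, J, x, target, W):
    """One online RTRL step under a squared-error readout.

    J carries the influence matrix dh/dvec(W) forward in time:
        J' = (dh'/dh) J + dh'/dvec(W),
    so a gradient is available at every step without backprop through time.
    """
    H = h.shape[0]
    dh_dh = jax.jacobian(rnn_step, argnums=0)(h, x, W)               # (H, H)
    dh_dW = jax.jacobian(rnn_step, argnums=2)(h, x, W).reshape(H, -1)
    J = dh_dh @ J + dh_dW                                            # (H, P)
    h = rnn_step(h, x, W)
    err = h - target                 # dL/dh' for L = 0.5 * ||h' - target||^2
    grad_W = (err @ J).reshape(W.shape)
    return h, J, grad_W

# Usage: J starts at zero and has H * H * (H + X) entries -- the cost of RTRL.
H, X = 4, 3
W = 0.1 * jnp.ones((H, H + X))
h, J = jnp.zeros(H), jnp.zeros((H, H * (H + X)))
h, J, g = rtrl_step(h, J, jnp.ones(X), jnp.zeros(H), W)
```

The influence matrix is what makes RTRL memoryless in time but expensive in parameters (roughly cubic storage and quartic compute per step in the number of units), which is the practical limit the title alludes to and which motivates the sparse approximation in the next entry.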

A practical sparse approximation for real time recurrent learning

J Menick, E Elsen, U Evci, S Osindero… - arXiv preprint arXiv …, 2020 - arxiv.org
Current methods for training recurrent neural networks are based on backpropagation
through time, which requires storing a complete history of network states, and prohibits …
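
A minimal sketch of the general recipe: run the RTRL recursion from the previous entry, but project the influence matrix back onto a fixed sparsity pattern after every step (the mask construction below is an illustrative one-step pattern, not necessarily the paper's):

```python
import jax.numpy as jnp

def sparse_rtrl_update(J, dh_dh, dh_dW, mask):
    """Sparse stand-in for the dense RTRL recursion J' = dh_dh @ J + dh_dW.

    Projecting back onto a fixed 0/1 pattern after every step keeps only
    the unmasked influence entries, trading some bias for tractable memory.
    """
    return mask * (dh_dh @ J + dh_dW)

# An illustrative one-step pattern: keep entry (i, j) only if parameter j
# can affect state unit i within a single step, i.e. where dh_dW is nonzero.
def one_step_mask(dh_dW):
    return (dh_dW != 0).astype(jnp.float32)
```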

Amortized proximal optimization

J Bae, P Vicol, JZ HaoChen… - Advances in Neural …, 2022 - proceedings.neurips.cc
We propose a framework for online meta-optimization of parameters that govern
optimization, called Amortized Proximal Optimization (APO). We first interpret various …
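
The snippet names the proximal interpretation but is cut off before the mechanism; one toy way to read "online meta-optimization of parameters that govern optimization" is to adapt a step size by descending a proximal objective through a single optimizer step (the loss, the penalty weight, and the one-step meta-gradient below are all assumptions):

```python
import jax
import jax.numpy as jnp

def toy_loss(theta):
    return jnp.sum((theta - 1.0) ** 2)

def proximal_objective(lr, theta, lam=1.0):
    """Value of the proximal trade-off after one step with step size lr."""
    theta_new = theta - lr * jax.grad(toy_loss)(theta)
    return toy_loss(theta_new) + 0.5 * lam * jnp.sum((theta_new - theta) ** 2)

def meta_step(lr, theta, meta_lr=0.01):
    """Descend the proximal objective in lr, then take the base step."""
    lr = lr - meta_lr * jax.grad(proximal_objective)(lr, theta)
    theta = theta - lr * jax.grad(toy_loss)(theta)
    return lr, theta

lr, theta = jnp.float32(0.05), jnp.zeros(3)
for _ in range(10):
    lr, theta = meta_step(lr, theta)
```

Run once per training iteration, this keeps the learning rate tracking whatever value best balances progress on the loss against movement in parameter space.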

General value function networks

M Schlegel, A Jacobsen, Z Abbas, A Patterson… - Journal of Artificial …, 2021 - jair.org
State construction is important for learning in partially observable environments. A general-purpose
strategy for state construction is to learn the state update using a Recurrent Neural …