Predictive representations: Building blocks of intelligence

W Carvalho, MS Tomov, W de Cothi, C Barry… - Neural …, 2024 - direct.mit.edu
Adaptive behavior often requires predicting future events. The theory of reinforcement
learning prescribes what kinds of predictive representations are useful and how to compute …

For SALE: State-action representation learning for deep reinforcement learning

S Fujimoto, WD Chang, E Smith… - Advances in …, 2024 - proceedings.neurips.cc
In reinforcement learning (RL), representation learning is a proven tool for complex image-
based tasks, but is often overlooked for environments with low-level states, such as physical …

Data representativity for machine learning and AI systems

LH Clemmensen, RD Kjærsgaard - arXiv preprint arXiv:2203.04706, 2022 - arxiv.org
Data representativity is crucial when drawing inference from data through machine learning
models. Scholars have increased focus on unraveling the bias and fairness in models, also …

Beyond uniform sampling: Offline reinforcement learning with imbalanced datasets

ZW Hong, A Kumar, S Karnik… - Advances in …, 2023 - proceedings.neurips.cc
Offline reinforcement learning (RL) enables learning a decision-making policy without
interaction with the environment. This makes it particularly beneficial in situations where …

Marginal density ratio for off-policy evaluation in contextual bandits

MF Taufiq, A Doucet, R Cornish… - Advances in Neural …, 2024 - proceedings.neurips.cc
Off-Policy Evaluation (OPE) in contextual bandits is crucial for assessing new
policies using existing data without costly experimentation. However, current OPE methods …

Why should I trust you, Bellman? The Bellman error is a poor replacement for value error

S Fujimoto, D Meger, D Precup, O Nachum… - arXiv preprint arXiv …, 2022 - arxiv.org
In this work, we study the use of the Bellman equation as a surrogate objective for value
prediction accuracy. While the Bellman equation is uniquely solved by the true value …

Sample complexity of nonparametric off-policy evaluation on low-dimensional manifolds using deep networks

X Ji, M Chen, M Wang, T Zhao - arXiv preprint arXiv:2206.02887, 2022 - arxiv.org
We consider the off-policy evaluation problem of reinforcement learning using deep
convolutional neural networks. We analyze the deep fitted Q-evaluation method for …

Composing task knowledge with modular successor feature approximators

W Carvalho, A Filos, RL Lewis, S Singh - arXiv preprint arXiv …, 2023 - arxiv.org
Recently, the Successor Features and Generalized Policy Improvement (SF&GPI) framework
has been proposed as a method for learning, composing, and transferring predictive …

Policy Correction and State-Conditioned Action Evaluation for Few-Shot Lifelong Deep Reinforcement Learning

M Xu, X Chen, J Wang - IEEE Transactions on Neural …, 2024 - ieeexplore.ieee.org
Lifelong deep reinforcement learning (DRL) approaches are commonly employed to adapt
continuously to new tasks without forgetting previously acquired knowledge. While current …

Knowledge transfer in multi-agent reinforcement learning with incremental number of agents

W Liu, L Dong, J Liu, C Sun - Journal of Systems Engineering …, 2022 - ieeexplore.ieee.org
In this paper, the reinforcement learning method for cooperative multi-agent systems (MAS)
with incremental number of agents is studied. The existing multi-agent reinforcement …