JARVIS-1: Open-world multi-task agents with memory-augmented multimodal language models

Z Wang, S Cai, A Liu, Y Jin, J Hou… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Achieving human-like planning and control with multimodal observations in an open world is
a key milestone for more functional generalist agents. Existing approaches can handle …

A definition of continual reinforcement learning

D Abel, A Barreto, B Van Roy… - Advances in …, 2024 - proceedings.neurips.cc
In a standard view of the reinforcement learning problem, an agent's goal is to efficiently
identify a policy that maximizes long-term reward. However, this perspective is based on a …

Deep reinforcement learning for task offloading in mobile edge computing systems

M Tang, VWS Wong - IEEE Transactions on Mobile Computing, 2020 - ieeexplore.ieee.org
In mobile edge computing systems, an edge node may have a high load when a large
number of mobile devices offload their tasks to it. Those offloaded tasks may experience …

Optimistic linear support and successor features as a basis for optimal policy transfer

LN Alegre, A Bazzan… - … conference on machine …, 2022 - proceedings.mlr.press
In many real-world applications, reinforcement learning (RL) agents might have to solve
multiple tasks, each one typically modeled via a reward function. If reward functions are …

VLAD: Task-agnostic VAE-based lifelong anomaly detection

K Faber, R Corizzo, B Sniezynski, N Japkowicz - Neural Networks, 2023 - Elsevier
Lifelong learning represents an emerging machine learning paradigm that aims at designing
new methods providing accurate analyses in complex and dynamic real-world …

Near-optimal model-free reinforcement learning in non-stationary episodic MDPs

W Mao, K Zhang, R Zhu… - … on Machine Learning, 2021 - proceedings.mlr.press
We consider model-free reinforcement learning (RL) in non-stationary Markov decision
processes. Both the reward functions and the state transition functions are allowed to vary …

Prediction and control in continual reinforcement learning

N Anand, D Precup - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Temporal difference (TD) learning is often used to update the estimate of the value function
which is used by RL agents to extract useful policies. In this paper, we focus on value …

Non-stationary Markov decision processes, a worst-case approach using model-based reinforcement learning

E Lecarpentier, E Rachelson - Advances in neural …, 2019 - proceedings.neurips.cc
This work tackles the problem of robust zero-shot planning in non-stationary stochastic
environments. We study Markov Decision Processes (MDPs) evolving over time and …

Fast TRAC: A parameter-free optimizer for lifelong reinforcement learning

A Muppidi, Z Zhang, H Yang - Advances in Neural …, 2025 - proceedings.neurips.cc
A key challenge in lifelong reinforcement learning (RL) is the loss of plasticity, where
previous learning progress hinders an agent's adaptation to new tasks. While regularization …

Representative task self-selection for flexible clustered lifelong learning

G Sun, Y Cong, Q Wang, B Zhong… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Consider the lifelong machine learning paradigm whose objective is to learn a sequence of
tasks depending on previous experiences, e.g., knowledge library or deep network weights …