How to reuse and compose knowledge for a lifetime of tasks: A survey on continual learning and functional composition

JA Mendez, E Eaton - arXiv preprint arXiv:2207.07730, 2022 - arxiv.org
A major goal of artificial intelligence (AI) is to create an agent capable of acquiring a general
understanding of the world. Such an agent would require the ability to continually …

Explainable reinforcement learning (XRL): a systematic literature review and taxonomy

Y Bekkemoen - Machine Learning, 2024 - Springer
In recent years, reinforcement learning (RL) systems have shown impressive performance
and remarkable achievements. Many achievements can be attributed to combining RL with …

A survey on interpretable reinforcement learning

C Glanois, P Weng, M Zimmer, D Li, T Yang, J Hao… - Machine Learning, 2024 - Springer
Although deep reinforcement learning has become a promising machine learning approach
for sequential decision-making problems, it is still not mature enough for high-stake domains …

RL-GPT: Integrating reinforcement learning and code-as-policy

S Liu, H Yuan, M Hu, Y Li, Y Chen… - Advances in Neural …, 2025 - proceedings.neurips.cc
Large Language Models (LLMs) have demonstrated proficiency in utilizing various
tools by coding, yet they face limitations in handling intricate logic and precise control. In …

Generating code world models with large language models guided by Monte Carlo tree search

N Dainese, M Merler, M Alakuijala… - Advances in Neural …, 2025 - proceedings.neurips.cc
In this work we consider Code World Models, world models generated by a Large Language
Model (LLM) in the form of Python code for model-based Reinforcement Learning (RL) …

Interpretable and editable programmatic tree policies for reinforcement learning

H Kohler, Q Delfosse, R Akrour, K Kersting… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep reinforcement learning agents are prone to goal misalignments. The black-box nature
of their policies hinders the detection and correction of such misalignments, and the trust …

Artificial collective intelligence engineering: a survey of concepts and perspectives

R Casadei - Artificial Life, 2023 - ieeexplore.ieee.org
Collectiveness is an important property of many systems—both natural and artificial. By
exploiting a large number of individuals, it is often possible to produce effects that go far …

Instructing goal-conditioned reinforcement learning agents with temporal logic objectives

W Qiu, W Mao, H Zhu - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Goal-conditioned reinforcement learning (RL) is a powerful approach for learning general-
purpose skills by reaching diverse goals. However, it has limitations when it comes to task …

Show me the way! Bilevel search for synthesizing programmatic strategies

DS Aleixo, LHS Lelis - Proceedings of the AAAI Conference on Artificial …, 2023 - ojs.aaai.org
The synthesis of programmatic strategies requires one to search in large non-differentiable
spaces of computer programs. Current search algorithms use self-play approaches to guide …

Synthesizing programmatic reinforcement learning policies with large language model guided search

M Liu, CH Yu, WH Lee, CW Hung, YC Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Programmatic reinforcement learning (PRL) has been explored for representing policies
through programs as a means to achieve interpretability and generalization. Despite …