First return, then explore

A Ecoffet, J Huizinga, J Lehman, KO Stanley, J Clune - Nature, 2021 - nature.com
Reinforcement learning promises to solve complex sequential-decision problems
autonomously by specifying a high-level reward function only. However, reinforcement …

Go-Explore: a new approach for hard-exploration problems

A Ecoffet, J Huizinga, J Lehman, KO Stanley… - arXiv preprint arXiv …, 2019 - arxiv.org
A grand challenge in reinforcement learning is intelligent exploration, especially when
rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard …

Revisiting the arcade learning environment: Evaluation protocols and open problems for general agents

MC Machado, MG Bellemare, E Talvitie… - Journal of Artificial …, 2018 - jair.org
The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge
of building AI agents with general competency across dozens of Atari 2600 games. It …

[BOOK][B] Artificial intelligence and games

GN Yannakakis, J Togelius - 2018 - Springer

A survey of algorithms for black-box safety validation of cyber-physical systems

A Corso, R Moss, M Koren, R Lee… - Journal of Artificial …, 2021 - jair.org
Autonomous cyber-physical systems (CPS) can improve safety and efficiency for
safety-critical applications, but require rigorous testing before deployment. The complexity of these …

The benchmark lottery

M Dehghani, Y Tay, AA Gritsenko, Z Zhao… - arXiv preprint arXiv …, 2021 - arxiv.org
The world of empirical machine learning (ML) strongly relies on benchmarks in order to
determine the relative effectiveness of different algorithms and methods. This paper …

[BOOK][B] Distributional reinforcement learning

MG Bellemare, W Dabney, M Rowland - 2023 - books.google.com
The first comprehensive guide to distributional reinforcement learning, providing a new
mathematical formalism for thinking about decisions from a probabilistic perspective …

State of the art control of Atari games using shallow reinforcement learning

Y Liang, MC Machado, E Talvitie, M Bowling - arXiv preprint arXiv …, 2015 - arxiv.org
The recently introduced Deep Q-Networks (DQN) algorithm has gained attention as one of
the first successful combinations of deep neural networks and reinforcement learning. Its …

Best-first width search: Exploration and exploitation in classical planning

N Lipovetzky, H Geffner - Proceedings of the AAAI Conference on …, 2017 - ojs.aaai.org
It has been shown recently that the performance of greedy best-first search (GBFS) for
computing plans that are not necessarily optimal can be improved by adding forms of …

Model-free, model-based, and general intelligence

H Geffner - arXiv preprint arXiv:1806.02308, 2018 - arxiv.org
During the 60s and 70s, AI researchers explored intuitions about intelligence by writing
programs that displayed intelligent behavior. Many good ideas came out from this work but …