Dota 2 with large scale deep reinforcement learning
C Berner, G Brockman, B Chan, V Cheung… - arXiv preprint arXiv …, 2019 - arxiv.org
On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions
at an esports game. The game of Dota 2 presents novel challenges for AI systems such as …
Go-explore: a new approach for hard-exploration problems
A grand challenge in reinforcement learning is intelligent exploration, especially when
rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard …
Paired open-ended trailblazer (POET): Endlessly generating increasingly complex and diverse learning environments and their solutions
While the history of machine learning so far largely encompasses a series of problems
posed by researchers and algorithms that learn their solutions, an important question is …
Latent learning, cognitive maps, and curiosity
Curiosity is a desire for information that is not motivated by strategic concerns. Latent
learning is not driven by standard reinforcement processes. We propose that curiosity serves …
Obstacle tower: A generalization challenge in vision, control, and planning
The rapid pace of recent research in AI has been driven in part by the presence of fast and
challenging simulation environments. These environments often take the form of games; …
Curiosity-driven exploration in sparse-reward multi-agent reinforcement learning
J Li, P Gajane - arXiv preprint arXiv:2302.10825, 2023 - arxiv.org
Sparsity of rewards while applying a deep reinforcement learning method negatively affects
its sample-efficiency. A viable solution to deal with the sparsity of rewards is to learn via …
Go-blend behavior and affect
This paper proposes a paradigm shift for affective computing by viewing the affect modeling
task as a reinforcement learning process. According to our proposed framework the context …
Monte-Carlo graph search for AlphaZero
The AlphaZero algorithm has been successfully applied in a range of discrete domains,
most notably board games. It utilizes a neural network, that learns a value and policy …
Predictive coding for boosting deep reinforcement learning with sparse rewards
While recent progress in deep reinforcement learning has enabled robots to learn complex
behaviors, tasks with long horizons and sparse rewards remain an ongoing challenge. In …
Toybox: a suite of environments for experimental evaluation of deep reinforcement learning
Evaluation of deep reinforcement learning (RL) is inherently challenging. In particular,
learned policies are largely opaque, and hypotheses about the behavior of deep RL agents …