Google Scholar

Dota 2 with large scale deep reinforcement learning

C Berner, G Brockman, B Chan, V Cheung… - arxiv preprint arxiv …, 2019 - arxiv.org

On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions
at an esports game. The game of Dota 2 presents novel challenges for AI systems such as …

[PDF] arxiv.org

Go-explore: a new approach for hard-exploration problems

A Ecoffet, J Huizinga, J Lehman, KO Stanley… - arxiv preprint arxiv …, 2019 - arxiv.org

A grand challenge in reinforcement learning is intelligent exploration, especially when
rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard …

Cited by 468 · All 2 versions

[PDF] arxiv.org

Paired open-ended trailblazer (poet): Endlessly generating increasingly complex and diverse learning environments and their solutions

R Wang, J Lehman, J Clune, KO Stanley - arxiv preprint arxiv:1901.01753, 2019 - arxiv.org

While the history of machine learning so far largely encompasses a series of problems
posed by researchers and algorithms that learn their solutions, an important question is …

Cited by 279 · All 3 versions

[HTML] nih.gov

Latent learning, cognitive maps, and curiosity

MZ Wang, BY Hayden - Current Opinion in Behavioral Sciences, 2021 - Elsevier

Curiosity is a desire for information that is not motivated by strategic concerns. Latent
learning is not driven by standard reinforcement processes. We propose that curiosity serves …

Cited by 49 · All 7 versions

[PDF] arxiv.org

Obstacle tower: A generalization challenge in vision, control, and planning

A Juliani, A Khalifa, VP Berges, J Harper… - arxiv preprint arxiv …, 2019 - arxiv.org

The rapid pace of recent research in AI has been driven in part by the presence of fast and
challenging simulation environments. These environments often take the form of games; …

[PDF] arxiv.org

Curiosity-driven exploration in sparse-reward multi-agent reinforcement learning

J Li, P Gajane - arxiv preprint arxiv:2302.10825, 2023 - arxiv.org

Sparsity of rewards while applying a deep reinforcement learning method negatively affects
its sample-efficiency. A viable solution to deal with the sparsity of rewards is to learn via …

Cited by 7 · All 5 versions

[PDF] arxiv.org

Go-blend behavior and affect

M Barthet, A Liapis… - 2021 9th International …, 2021 - ieeexplore.ieee.org

This paper proposes a paradigm shift for affective computing by viewing the affect modeling
task as a reinforcement learning process. According to our proposed framework the context …

Cited by 9 · All 7 versions

[PDF] arxiv.org

Monte-Carlo graph search for AlphaZero

J Czech, P Korus, K Kersting - arxiv preprint arxiv:2012.11045, 2020 - arxiv.org

The AlphaZero algorithm has been successfully applied in a range of discrete domains,
most notably board games. It utilizes a neural network, that learns a value and policy …

Cited by 9 · All 2 versions

[PDF] arxiv.org

Predictive coding for boosting deep reinforcement learning with sparse rewards

X Lu, S Tiomkin, P Abbeel - arxiv preprint arxiv:1912.13414, 2019 - arxiv.org

While recent progress in deep reinforcement learning has enabled robots to learn complex
behaviors, tasks with long horizons and sparse rewards remain an ongoing challenge. In …

Cited by 7 · All 4 versions

[PDF] arxiv.org

Toybox: a suite of environments for experimental evaluation of deep reinforcement learning

E Tosch, K Clary, J Foley, D Jensen - arxiv preprint arxiv:1905.02825, 2019 - arxiv.org

Evaluation of deep reinforcement learning (RL) is inherently challenging. In particular,
learned policies are largely opaque, and hypotheses about the behavior of deep RL agents …

Cited by 6 · All 3 versions

Montezuma’s revenge solved by go-explore, a new algorithm for hard-exploration problems...
