- Academic Search

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org

Following the remarkable success of the AlphaGo series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Cited by 352 · All 2 versions

[PDF] neurips.cc

Combining deep reinforcement learning and search for imperfect-information games

N Brown, A Bakhtin, A Lerer… - Advances in Neural …, 2020 - proceedings.neurips.cc

The combination of deep reinforcement learning and search at both training and test time is
a powerful paradigm that has led to a number of successes in single-agent settings and …

Cited by 172 · All 10 versions

[PDF] science.org

Student of Games: A unified learning algorithm for both perfect and imperfect information games

M Schmid, M Moravčík, N Burch, R Kadlec… - Science …, 2023 - science.org

Games have a long history as benchmarks for progress in artificial intelligence. Approaches
using search and learning produced strong performance across many perfect information …

Cited by 79 · All 8 versions

[PDF] arxiv.org

Causal multi-agent reinforcement learning: Review and open problems

SJ Grimbly, J Shock, A Pretorius - arXiv preprint arXiv:2111.06721, 2021 - arxiv.org

This paper serves to introduce the reader to the field of multi-agent reinforcement learning
(MARL) and its intersection with methods from the study of causality. We highlight key …

Cited by 18 · All 2 versions

[PDF] aaai.org

Improving policies via search in cooperative partially observable games

A Lerer, H Hu, J Foerster, N Brown - … of the AAAI conference on artificial …, 2020 - ojs.aaai.org

Recent superhuman results in games have largely been achieved in a variety of zero-sum
settings, such as Go and Poker, in which agents need to compete against others. However …

Cited by 85 · All 8 versions

[PDF] mlr.press

Near-optimal learning of extensive-form games with imperfect information

Y Bai, C Jin, S Mei, T Yu - International Conference on …, 2022 - proceedings.mlr.press

This paper resolves the open question of designing near-optimal algorithms for learning
imperfect-information extensive-form games from bandit feedback. We present the first line …

Cited by 33 · All 6 versions

[PDF] arxiv.org

DREAM: Deep regret minimization with advantage baselines and model-free learning

E Steinberger, A Lerer, N Brown - arXiv preprint arXiv:2006.10410, 2020 - arxiv.org

We introduce DREAM, a deep reinforcement learning algorithm that finds optimal strategies
in imperfect-information games with multiple agents. Formally, DREAM converges to a Nash …

Cited by 59 · All 4 versions

[PDF] arxiv.org

ESCHER: Eschewing importance sampling in games by computing a history value function to estimate regret

S McAleer, G Farina, M Lanctot, T Sandholm - arXiv preprint arXiv …, 2022 - arxiv.org

Recent techniques for approximating Nash equilibria in very large games leverage neural
networks to learn approximately optimal policies (strategies). One promising line of research …

Cited by 22 · All 5 versions

[PDF] ieee.org

Honeypot allocation for cyber deception under uncertainty

AH Anwar, CA Kamhoua, NO Leslie… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Cyber deception aims to misrepresent the state of the network to mislead the attackers,
falsify their reconnaissance conclusions, and deflect them away from their goals. Honeypots …

Cited by 26

[PDF] arxiv.org

HSVI can solve zero-sum partially observable stochastic games

A Delage, O Buffet, JS Dibangoye… - Dynamic Games and …, 2024 - Springer

State-of-the-art methods for solving 2-player zero-sum imperfect information games rely on
linear programming or regret minimization, though not on dynamic programming (DP) or …

Cited by 13 · All 13 versions

Rethinking formal models of partially observable multiagent decision making

An overview of multi-agent reinforcement learning from game theoretical perspective
