Multi-agent reinforcement learning: A selective overview of theories and algorithms
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …
An overview of multi-agent reinforcement learning from game theoretical perspective
Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGo series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …
Near-optimal learning of extensive-form games with imperfect information
This paper resolves the open question of designing near-optimal algorithms for learning
imperfect-information extensive-form games from bandit feedback. We present the first line …
Efficient deviation types and learning for hindsight rationality in extensive-form games
Hindsight rationality is an approach to playing general-sum games that prescribes no-regret
learning dynamics for individual agents with respect to a set of deviations, and further …
Double neural counterfactual regret minimization
Counterfactual Regret Minimization (CFR) is a fundamental and effective technique for
solving Imperfect Information Games (IIG). However, the original CFR algorithm only works …
Escher: Eschewing importance sampling in games by computing a history value function to estimate regret
Recent techniques for approximating Nash equilibria in very large games leverage neural
networks to learn approximately optimal policies (strategies). One promising line of research …
Single deep counterfactual regret minimization
E Steinberger - arXiv preprint arXiv:1901.07621, 2019 - arxiv.org
Counterfactual Regret Minimization (CFR) is the most successful algorithm for finding
approximate Nash equilibria in imperfect information games. However, CFR's reliance on …
Time and space: Why imperfect information games are hard
N Burch - 2018 - era.library.ualberta.ca
Decision-making problems with two agents can be modeled as two-player games, and a
Nash equilibrium is the basic solution concept describing good play in adversarial games …
Steering language models with game-theoretic solvers
Mathematical models of strategic interactions among rational agents have long been studied
in game theory. However, the interactions studied are often over a small set of discrete …
The advantage regret-matching actor-critic
Regret minimization has played a key role in online learning, equilibrium computation in
games, and reinforcement learning (RL). In this paper, we describe a general model-free RL …