Multi-agent reinforcement learning: A selective overview of theories and algorithms
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …
An overview of multi-agent reinforcement learning from game theoretical perspective
Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …
Human-level play in the game of Diplomacy by combining language models with strategic reasoning
Meta Fundamental AI Research Diplomacy Team … - Science, 2022 - science.org
Despite much progress in training artificial intelligence (AI) systems to imitate human
language, building agents that use language to communicate intentionally with humans in …
Solving imperfect-information games via discounted regret minimization
Counterfactual regret minimization (CFR) is a family of iterative algorithms that are the most
popular and, in practice, fastest approach to approximately solving large …
Safe and nested subgame solving for imperfect-information games
In imperfect-information games, the optimal strategy in a subgame may depend on the
strategy in other, unreached subgames. Thus a subgame cannot be solved in isolation and …
Actor-critic policy optimization in partially observable multiagent environments
Optimization of parameterized policies for reinforcement learning (RL) is an important and
challenging problem in artificial intelligence. Among the most common approaches are …
Mastering the game of no-press Diplomacy via human-regularized reinforcement learning and planning
No-press Diplomacy is a complex strategy game involving both cooperation and competition
that has served as a benchmark for multi-agent AI research. While self-play reinforcement …
XDO: A double oracle algorithm for extensive-form games
Policy Space Response Oracles (PSRO) is a reinforcement learning (RL) algorithm
for two-player zero-sum games that has been empirically shown to find approximate Nash …
Faster game solving via predictive Blackwell approachability: Connecting regret matching and mirror descent
Blackwell approachability is a framework for reasoning about repeated games with vector-valued
payoffs. We introduce predictive Blackwell approachability, where an estimate of the …
Computing approximate equilibria in sequential adversarial games by exploitability descent
In this paper, we present exploitability descent, a new algorithm to compute approximate
equilibria in two-player zero-sum extensive-form games with imperfect information, by direct …