Google Scholar

V-Learning--A Simple, Efficient, Decentralized Algorithm for Multiagent RL

C Jin, Q Liu, Y Wang, T Yu - arXiv preprint arXiv:2110.14555, 2021 - arxiv.org

A major challenge of multiagent reinforcement learning (MARL) is the curse of multiagents,
where the size of the joint action space scales exponentially with the number of agents. This …

Cited by 128 · All 4 versions

[PDF] arxiv.org

When can we learn general-sum Markov games with a large number of players sample-efficiently?

Z Song, S Mei, Y Bai - arXiv preprint arXiv:2110.04184, 2021 - arxiv.org

Multi-agent reinforcement learning has made substantial empirical progresses in solving
games with a large number of players. However, theoretically, the best known sample …

Cited by 118 · All 3 versions

[PDF] arxiv.org

A survey of decision making in adversarial games

X Li, M Meng, Y Hong, J Chen - Science China Information Sciences, 2024 - Springer

In many practical applications, such as poker, chess, drug interdiction, cybersecurity, and
national defense, players often have adversarial stances, i.e., the selfish actions of each …

Cited by 18 · All 4 versions

[PDF] mlr.press

The power of exploiter: Provable multi-agent RL in large state spaces

C Jin, Q Liu, T Yu - International Conference on Machine …, 2022 - proceedings.mlr.press

Modern reinforcement learning (RL) commonly engages practical problems with large state
spaces, where function approximation must be deployed to approximate either the value …

Cited by 69 · All 8 versions

[PDF] springer.com

Breaking the traditional: a survey of algorithmic mechanism design applied to economic and complex environments

Q Chen, X Wang, ZL Jiang, Y Wu, H Li, L Cui… - Neural Computing and …, 2023 - Springer

The mechanism design theory can be applied not only in the economy but also in many
fields, such as politics and military affairs, which has important practical and strategic …

Cited by 6 · All 8 versions

[PDF] neurips.cc

Sequential information design: Learning to persuade in the dark

M Bernasconi, M Castiglioni… - Advances in …, 2022 - proceedings.neurips.cc

We study a repeated information design problem faced by an informed sender who tries to
influence the behavior of a self-interested receiver. We consider settings where the receiver …

Cited by 32 · All 8 versions

[PDF] neurips.cc

Computing optimal equilibria and mechanisms via learning in zero-sum extensive-form games

B Zhang, G Farina, I Anagnostides… - Advances in …, 2023 - proceedings.neurips.cc

We introduce a new approach for computing optimal equilibria via learning in games. It
applies to extensive-form settings with any number of players, including mechanism design …

Cited by 18 · All 11 versions

[PDF] arxiv.org

DREAM: Deep regret minimization with advantage baselines and model-free learning

E Steinberger, A Lerer, N Brown - arXiv preprint arXiv:2006.10410, 2020 - arxiv.org

We introduce DREAM, a deep reinforcement learning algorithm that finds optimal strategies
in imperfect-information games with multiple agents. Formally, DREAM converges to a Nash …

Cited by 61 · All 2 versions

[PDF] mlr.press

Near-optimal learning of extensive-form games with imperfect information

Y Bai, C Jin, S Mei, T Yu - International Conference on …, 2022 - proceedings.mlr.press

This paper resolves the open question of designing near-optimal algorithms for learning
imperfect-information extensive-form games from bandit feedback. We present the first line …

Cited by 34 · All 7 versions

[PDF] neurips.cc

Polynomial-time linear-swap regret minimization in imperfect-information sequential games

G Farina, C Pipis - Advances in Neural Information …, 2023 - proceedings.neurips.cc

No-regret learners seek to minimize the difference between the loss they cumulated through
the actions they played, and the loss they would have cumulated in hindsight had they …

Cited by 13 · All 6 versions
