Deep counterfactual regret minimization

N Brown, A Lerer, S Gross… - … conference on machine …, 2019 - proceedings.mlr.press
Counterfactual Regret Minimization (CFR) is the leading algorithm for solving large
imperfect-information games. It converges to an equilibrium by iteratively traversing the …
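
The snippet names CFR without showing the rule it iterates. As a rough, hedged illustration, here is a minimal sketch of the regret-matching step that tabular CFR applies at each information set; the action count and regret values are made up, and Deep CFR's actual contribution (approximating these tables with neural networks) is not shown.

import numpy as np

def regret_matching(cumulative_regret):
    """Map a vector of cumulative counterfactual regrets to a strategy."""
    positive = np.maximum(cumulative_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # If no action has positive regret, fall back to uniform play.
    return np.ones_like(positive) / len(positive)

# Toy usage: three actions with regrets accumulated over earlier traversals.
regrets = np.array([2.0, -1.0, 0.5])
print(regret_matching(regrets))  # -> [0.8, 0. , 0.2]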

Combining deep reinforcement learning and search for imperfect-information games

N Brown, A Bakhtin, A Lerer… - Advances in neural …, 2020 - proceedings.neurips.cc
The combination of deep reinforcement learning and search at both training and test time is
a powerful paradigm that has led to a number of successes in single-agent settings and …

A unified approach to reinforcement learning, quantal response equilibria, and two-player zero-sum games

S Sokota, R D'Orazio, JZ Kolter, N Loizou… - arXiv preprint arXiv …, 2022 - arxiv.org
This work studies an algorithm, which we call magnetic mirror descent, that is inspired by
mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is …
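
As I read the snippet, magnetic mirror descent augments a mirror-descent step with a term that pulls the iterate toward a "magnet" distribution. The sketch below implements the entropic, probability-simplex special case under that reading; the closed form, which solves argmin_x eta*<g,x> + alpha*eta*KL(x, magnet) + KL(x, x_t), is my own derivation from that description rather than a quotation of the paper, and the parameter names eta and alpha are mine.

import numpy as np

def mmd_simplex_step(x, g, magnet, eta=0.1, alpha=0.05):
    # Closed-form proximal step for the entropic mirror map on the simplex.
    logits = (np.log(x) + alpha * eta * np.log(magnet) - eta * g) / (1.0 + alpha * eta)
    w = np.exp(logits - logits.max())  # subtract max for numerical stability
    return w / w.sum()

# Toy usage: nudge a 3-action policy toward the uniform magnet while
# descending along a loss gradient g.
x = np.array([0.5, 0.3, 0.2])
g = np.array([1.0, 0.0, -1.0])
magnet = np.full(3, 1.0 / 3.0)
print(mmd_simplex_step(x, g, magnet))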

Robust multi-agent reinforcement learning with state uncertainty

S He, S Han, S Su, S Han, S Zou, F Miao - arXiv preprint arXiv:2307.16212, 2023 - arxiv.org
In real-world multi-agent reinforcement learning (MARL) applications, agents may not have
perfect state information (e.g., due to inaccurate measurement or malicious attacks), which …

Faster game solving via predictive blackwell approachability: Connecting regret matching and mirror descent

G Farina, C Kroer, T Sandholm - … of the AAAI Conference on Artificial …, 2021 - ojs.aaai.org
Blackwell approachability is a framework for reasoning about repeated games with vector-
valued payoffs. We introduce predictive Blackwell approachability, where an estimate of the …
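
Under my reading of the abstract, the predictive idea amounts to playing against an estimate of the next regret vector before it is observed. The sketch below shows a predictive regret-matching+-style learner in that spirit, using the previous instantaneous regret as the prediction; the game, opponent, and loop are illustrative and not taken from the paper.

import numpy as np

def prm_plus_strategy(R, m):
    # Play proportionally to the clipped sum of cumulative regret and prediction.
    w = np.maximum(R + m, 0.0)
    return w / w.sum() if w.sum() > 0 else np.full_like(w, 1.0 / len(w))

def prm_plus_update(R, instantaneous_regret):
    # Regret-matching+ style update: clip cumulative regrets at zero.
    return np.maximum(R + instantaneous_regret, 0.0)

# Toy usage: the row player of rock-paper-scissors against a fixed opponent.
payoff = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)
opponent = np.array([0.5, 0.3, 0.2])
R = np.zeros(3)
m = np.zeros(3)                      # prediction of the next instantaneous regret
for _ in range(100):
    x = prm_plus_strategy(R, m)
    u = payoff @ opponent            # expected payoff of each action
    r = u - x @ u                    # instantaneous regret vector
    R = prm_plus_update(R, r)
    m = r                            # predict that the future resembles the present
print(prm_plus_strategy(R, m))       # concentrates on the best response (paper)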

From external to swap regret 2.0: An efficient reduction for large action spaces

Y Dagan, C Daskalakis, M Fishelson… - Proceedings of the 56th …, 2024 - dl.acm.org
We provide a novel reduction from swap-regret minimization to external-regret minimization,
which improves upon the classical reductions of Blum-Mansour and Stoltz-Lugosi in that it …
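
For context on the baseline the abstract says it improves, here is a hedged sketch of the classical Blum-Mansour-style reduction: run one external-regret learner per action, stack their outputs into a row-stochastic matrix Q, play a stationary distribution p = pQ, and feed learner a the loss scaled by p[a]. The multiplicative-weights learner, loss source, and parameters are placeholders of my choosing.

import numpy as np

class MW:
    def __init__(self, n, eta=0.1):
        self.w = np.ones(n)
        self.eta = eta
    def play(self):
        return self.w / self.w.sum()
    def update(self, loss):
        self.w *= np.exp(-self.eta * loss)

def stationary(Q, iters=200):
    p = np.full(Q.shape[0], 1.0 / Q.shape[0])
    for _ in range(iters):
        p = p @ Q                      # power iteration on the row-stochastic Q
    return p / p.sum()

n = 3
learners = [MW(n) for _ in range(n)]   # one external-regret learner per action
rng = np.random.default_rng(0)
for t in range(50):
    Q = np.stack([a.play() for a in learners])
    p = stationary(Q)                  # the strategy actually played this round
    loss = rng.random(n)               # stand-in for the adversary's loss vector
    for a, learner in enumerate(learners):
        learner.update(p[a] * loss)    # scale the loss by how often row a is "used"
print(p)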

Last-iterate convergence in extensive-form games

CW Lee, C Kroer, H Luo - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Regret-based algorithms are highly efficient at finding approximate Nash equilibria in
sequential games such as poker games. However, most regret-based algorithms, including …
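
A hedged illustration of the average-iterate versus last-iterate distinction this abstract gestures at: in regret-matching self-play on matching pennies, the current strategies keep cycling rather than settling, while the time-averaged strategies approach the uniform equilibrium. The game, learner, and asymmetric initialization are my illustrative choices, not the paper's setting.

import numpy as np

def regret_matching(R):
    w = np.maximum(R, 0.0)
    return w / w.sum() if w.sum() > 0 else np.full_like(w, 1.0 / len(w))

A = np.array([[1.0, -1.0], [-1.0, 1.0]])      # matching pennies, row player maximizes
Rx, Ry = np.array([1.0, 0.0]), np.zeros(2)    # small asymmetry to start the dynamics
avg_x = np.zeros(2)
T = 5000
for t in range(T):
    x, y = regret_matching(Rx), regret_matching(Ry)
    ux, uy = A @ y, -(A.T @ x)                # each player's per-action values
    Rx += ux - x @ ux
    Ry += uy - y @ uy
    avg_x += x
print("last iterate:", regret_matching(Rx))   # keeps cycling from round to round
print("average     :", avg_x / T)             # approaches the (0.5, 0.5) equilibrium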

Learning in two-player zero-sum partially observable Markov games with perfect recall

T Kozuno, P Ménard, R Munos… - Advances in Neural …, 2021 - proceedings.neurips.cc
We study the problem of learning a Nash equilibrium (NE) in an extensive game with
imperfect information (EGII) through self-play. Precisely, we focus on two-player, zero-sum …

Kernelized multiplicative weights for 0/1-polyhedral games: Bridging the gap between learning in extensive-form and normal-form games

G Farina, CW Lee, H Luo… - … Conference on Machine …, 2022 - proceedings.mlr.press
While extensive-form games (EFGs) can be converted into normal-form games (NFGs),
doing so comes at the cost of an exponential blowup of the strategy space. So, progress on …
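
As background for the blowup the snippet describes, here is the plain multiplicative-weights update over the pure strategies of a normal-form game, i.e. the update whose strategy space becomes exponentially large when an extensive-form game is flattened. The kernel trick the paper introduces (simulating this update over the vertices of a 0/1 polytope without enumerating them) is not reproduced here; the losses and strategy count are made up.

import numpy as np

def mwu_step(weights, loss, eta=0.1):
    new = weights * np.exp(-eta * loss)     # exponential reweighting by observed loss
    return new / new.sum()                  # renormalize to a distribution

# Toy usage: four pure strategies, one observed loss vector per round.
w = np.full(4, 0.25)
for loss in ([1.0, 0.0, 0.5, 0.2], [0.3, 0.9, 0.1, 0.4]):
    w = mwu_step(w, np.asarray(loss))
print(w)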

Stochastic mirror descent: Convergence analysis and adaptive variants via the mirror stochastic polyak stepsize

R D'Orazio, N Loizou, I Laradji, I Mitliagkas - arXiv preprint arXiv …, 2021 - arxiv.org
We investigate the convergence of stochastic mirror descent (SMD) under interpolation in
relatively smooth and smooth convex optimization. In relatively smooth convex optimization …
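
The snippet does not state the paper's stepsize, so the sketch below pairs an entropic stochastic mirror descent step on the simplex with the standard stochastic Polyak stepsize, gamma = (f_i(x) - f_i*) / (c * ||grad f_i(x)||^2) with f_i* = 0 under the interpolation assumption the abstract mentions; this is a stand-in of my choosing, and the toy least-squares-on-the-simplex problem is likewise illustrative.

import numpy as np

def smd_entropic_step(x, grad, gamma):
    # Entropic mirror descent: multiplicative update followed by renormalization.
    w = x * np.exp(-gamma * grad)
    return w / w.sum()

rng = np.random.default_rng(0)
A = rng.random((20, 5))
x_true = np.full(5, 0.2)
b = A @ x_true                                  # interpolation: every f_i vanishes at x_true
x = rng.dirichlet(np.ones(5))                   # random starting point on the simplex
print("initial max residual:", np.abs(A @ x - b).max())
for t in range(2000):
    i = rng.integers(len(b))                    # sample one component function f_i
    resid = A[i] @ x - b[i]
    f_i = 0.5 * resid ** 2
    g = resid * A[i]                            # gradient of f_i at x
    gamma = f_i / (0.5 * (g @ g) + 1e-12)       # Polyak-style stepsize with c = 0.5
    x = smd_entropic_step(x, g, gamma)
print("final max residual:  ", np.abs(A @ x - b).max())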