Adversarial policies beat superhuman Go AIs

TT Wang, A Gleave, T Tseng, K Pelrine… - International …, 2023 - proceedings.mlr.press
We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies
against it, achieving a >97% win rate against KataGo running at superhuman settings …

Adversarial Machine Learning Attacks and Defences in Multi-Agent Reinforcement Learning

M Standen, J Kim, C Szabo - ACM Computing Surveys, 2023 - dl.acm.org
Multi-Agent Reinforcement Learning (MARL) is susceptible to Adversarial Machine Learning
(AML) attacks. Execution-time AML attacks against MARL are complex due to effects that …

Evaluating superhuman models with consistency checks

L Fluri, D Paleka, F Tramèr - 2024 IEEE Conference on Secure …, 2024 - ieeexplore.ieee.org
If machine learning models were to achieve superhuman abilities at various reasoning or
decision-making tasks, how would we go about evaluating such models, given that humans …

Adversarial policies beat professional-level Go AIs

TT Wang, A Gleave, N Belrose, T Tseng… - Deep Reinforcement …, 2022 - openreview.net
We attack the state-of-the-art Go-playing AI system, KataGo, by training an adversarial policy
that plays against a frozen KataGo victim. Our attack achieves a >99% win-rate against …

Learning near-optimal intrusion responses against dynamic attackers

K Hammar, R Stadler - IEEE Transactions on Network and …, 2023 - ieeexplore.ieee.org
We study automated intrusion response and formulate the interaction between an attacker
and a defender as an optimal stopping game where attack and defense strategies evolve …

Last-iterate convergence with full and noisy feedback in two-player zero-sum games

K Abe, K Ariu, M Sakamoto, K Toyoshima… - arXiv preprint arXiv …, 2022 - arxiv.org
This paper proposes Mutation-Driven Multiplicative Weights Update (M2WU) for learning an
equilibrium in two-player zero-sum normal-form games and proves that it exhibits the last …

Scalable learning of intrusion response through recursive decomposition

K Hammar, R Stadler - International Conference on Decision and Game …, 2023 - Springer
We study automated intrusion response for an IT infrastructure and formulate the interaction
between an attacker and a defender as a partially observed stochastic game. To solve the …

Mutation-driven follow the regularized leader for last-iterate convergence in zero-sum games

K Abe, M Sakamoto, A Iwasaki - Uncertainty in Artificial …, 2022 - proceedings.mlr.press
In this study, we consider a variant of the Follow the Regularized Leader (FTRL) dynamics in
two-player zero-sum games. FTRL is guaranteed to converge to a Nash equilibrium when …

Computing ex ante coordinated team-maxmin equilibria in zero-sum multiplayer extensive-form games

Y Zhang, B An, J Černý - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
Computational game theory has many applications in the modern world in both adversarial
situations and the optimization of social good. While there exist many algorithms for …

Robust Deep Reinforcement Learning Through Adversarial Attacks and Training: A Survey

L Schott, J Delas, H Hajri, E Gherbi, R Yaich… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep Reinforcement Learning (DRL) is a subfield of machine learning for training
autonomous agents that take sequential actions across complex environments. Despite its …