Adversarial policies: Attacking deep reinforcement learning

A Gleave, M Dennis, C Wild, N Kant, S Levine… - arXiv preprint arXiv …, 2019 - arxiv.org
Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial
perturbations to their observations, similar to adversarial examples for classifiers. However …

From motor control to team play in simulated humanoid football

S Liu, G Lever, Z Wang, J Merel, SMA Eslami… - Science Robotics, 2022 - science.org
Learning to combine control at the level of joint torques with longer-term goal-directed
behavior is a long-standing challenge for physically embodied artificial agents. Intelligent …

Scalable evaluation of multi-agent reinforcement learning with Melting Pot

JZ Leibo, EA Dueñez-Guzman… - International …, 2021 - proceedings.mlr.press
Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess
generalization to novel situations as their primary objective (unlike supervised learning …

The AI Economist: Improving equality and productivity with AI-driven tax policies

S Zheng, A Trott, S Srinivasa, N Naik… - arXiv preprint arXiv …, 2020 - arxiv.org
Tackling real-world socio-economic challenges requires designing and testing economic
policies. However, this is hard in practice, due to a lack of appropriate (micro-level) …

Multi-objective multi-agent decision making: a utility-based analysis and survey

R Rădulescu, P Mannion, DM Roijers… - Autonomous Agents and …, 2020 - Springer
The majority of multi-agent system implementations aim to optimise agents' policies with
respect to a single objective, despite the fact that many real-world problem domains are …

Adversarial policies beat superhuman Go AIs

TT Wang, A Gleave, T Tseng, K Pelrine… - International …, 2023 - proceedings.mlr.press
We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies
against it, achieving a >97% win rate against KataGo running at superhuman settings …