SMACv2: An improved benchmark for cooperative multi-agent reinforcement learning | B Ellis, J Cook, S Moalla, M Samvelyan, M Sun, A Mahajan, J Foerster, ... | Advances in Neural Information Processing Systems 36, 2024 | 96 | 2024 |
LIFT: Reinforcement learning in computer systems by learning from demonstrations | M Schaarschmidt, A Kuhnle, B Ellis, K Fricke, F Gessert, E Yoneki | arXiv preprint arXiv:1808.07903, 2018 | 47 | 2018 |
JaxMARL: Multi-agent RL environments in JAX | A Rutherford*, B Ellis*, M Gallici*, J Cook, A Lupu, G Ingvarsson, T Willi, ... | arXiv preprint arXiv:2311.10090, 2023 | 46* | 2023 |
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning | M Matthews, M Beukman, B Ellis, M Samvelyan, M Jackson, S Coward, ... | arXiv preprint arXiv:2402.16801, 2024 | 25 | 2024 |
Generalization in cooperative multi-agent systems | A Mahajan, M Samvelyan, T Gupta, B Ellis, M Sun, T Rocktäschel, ... | arXiv preprint arXiv:2202.00104, 2022 | 22 | 2022 |
Simplifying deep temporal difference learning | M Gallici, M Fellows, B Ellis, B Pou, I Masmitja, JN Foerster, M Martin | arXiv preprint arXiv:2407.04811, 2024 | 10 | 2024 |
Policy-guided diffusion | MT Jackson, MT Matthews, C Lu, B Ellis, S Whiteson, J Foerster | arXiv preprint arXiv:2404.06356, 2024 | 9 | 2024 |
Trust-region-free policy optimization for stochastic policies | M Sun, B Ellis, A Mahajan, S Devlin, K Hofmann, S Whiteson | arXiv preprint arXiv:2302.07985, 2023 | 4 | 2023 |
CURATe: Benchmarking Personalised Alignment of Conversational AI Assistants | L Alberts, B Ellis, A Lupu, J Foerster | arXiv preprint arXiv:2410.21159, 2024 | 1 | 2024 |
Adaptive stream processing with deep reinforcement learning | B Ellis | Technical Report, 2018 | 1 | 2018 |
Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps | B Ellis, MT Jackson, A Lupu, AD Goldie, M Fellows, S Whiteson, ... | arXiv preprint arXiv:2412.17113, 2024 | | 2024 |
Beyond the Boundaries of Proximal Policy Optimization | CB Tan, E Toledo, B Ellis, JN Foerster, F Huszár | arXiv preprint arXiv:2411.00666, 2024 | | 2024 |
Investigating Ratio Clipping in Multi-agent Reinforcement Learning | B Ellis, M Sun, S Whiteson | | | |