Gemini: a family of highly capable multimodal models G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ... arXiv preprint arXiv:2312.11805, 2023 | 2556 | 2023 |
A generalist agent S Reed, K Zolna, E Parisotto, SG Colmenarejo, A Novikov, G Barth-Maron, ... arXiv preprint arXiv:2205.06175, 2022 | 1007 | 2022 |
Distributed prioritized experience replay D Horgan, J Quan, D Budden, G Barth-Maron, M Hessel, H Van Hasselt, ... arXiv preprint arXiv:1803.00933, 2018 | 977 | 2018 |
Distributed distributional deterministic policy gradients G Barth-Maron, MW Hoffman, D Budden, W Dabney, D Horgan, D TB, ... arXiv preprint arXiv:1804.08617, 2018 | 693 | 2018 |
Data-efficient deep reinforcement learning for dexterous manipulation I Popov, N Heess, T Lillicrap, R Hafner, G Barth-Maron, M Vecerik, ... arXiv preprint arXiv:1704.03073, 2017 | 338 | 2017 |
Acme: A research framework for distributed reinforcement learning MW Hoffman, B Shahriari, J Aslanides, G Barth-Maron, N Momchev, ... arXiv preprint arXiv:2006.00979, 2020 | 272 | 2020 |
Observe and look further: Achieving consistent performance on Atari T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ... arXiv preprint arXiv:1805.11593, 2018 | 145 | 2018 |
Making efficient use of demonstrations to solve hard exploration problems TL Paine, C Gulcehre, B Shahriari, M Denil, M Hoffman, H Soyer, ... arXiv preprint arXiv:1909.01387, 2019 | 99 | 2019 |
Goal-based action priors D Abel, D Hershkowitz, G Barth-Maron, S Brawner, K O'Farrell, ... Proceedings of the International Conference on Automated Planning and …, 2015 | 59 | 2015 |
QuaRL: Quantization for fast and environmentally sustainable reinforcement learning S Krishnan, M Lam, S Chitlangia, Z Wan, G Barth-Maron, A Faust, ... arXiv preprint arXiv:1910.01055, 2019 | 30 | 2019 |
One-shot high-fidelity imitation: Training large-scale deep nets with RL TL Paine, SG Colmenarejo, Z Wang, S Reed, Y Aytar, T Pfaff, ... arXiv preprint arXiv:1810.05017, 2018 | 29 | 2018 |
Reverb: A framework for experience replay A Cassirer, G Barth-Maron, E Brevdo, S Ramos, T Boyd, T Sottiaux, ... arXiv preprint arXiv:2102.04736, 2021 | 26 | 2021 |
Launchpad: A programming model for distributed machine learning research F Yang, G Barth-Maron, P Stańczyk, M Hoffman, S Liu, M Kroiss, A Pope, ... arXiv preprint arXiv:2106.04516, 2021 | 22 | 2021 |
Toward affordance-aware planning D Abel, G Barth-Maron, J MacGlashan, S Tellex First Workshop on Affordances: Affordances in Vision for Cognitive Robotics, 2014 | 16 | 2014 |
Data-efficient reinforcement learning for continuous control tasks M Riedmiller, R Hafner, M Vecerik, TP Lillicrap, T Lampe, I Popov, ... US Patent 10,664,725, 2020 | 15 | 2020 |
Reinforcement learning using distributed prioritized replay D Budden, G Barth-Maron, J Quan, DG Horgan US Patent 11,625,604, 2023 | 11 | 2023 |
Distributional reinforcement learning for continuous control tasks D Budden, MW Hoffman, G Barth-Maron US Patent 11,481,629, 2022 | 10 | 2022 |
Open sourcing Sonnet - a new library for constructing neural … M Reynolds, G Barth-Maron, F Besse, D de Las Casas, A Fidjeland, T Green, A Puigdomènech, S Racanière, J Rae, F Viola | 10 | 2017 |
Affordances as transferable knowledge for planning agents G Barth-Maron, D Abel, J MacGlashan, S Tellex 2014 AAAI Fall Symposium Series, 2014 | 7 | 2014 |
Quantized reinforcement learning (QuaRL) Z Wan | 6 | 2019 |