Johan Ferret
Research Scientist, Google DeepMind
Verified email at google.com - Homepage
Title
Cited by
Year
Gemini: a Family of Highly Capable Multimodal Models
G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 3208, Year: 2023
Gemma: Open Models Based on Gemini Research and Technology
G Team, T Mesnard, C Hardin, R Dadashi, S Bhupatiraju, S Pathak, ...
arXiv preprint arXiv:2403.08295, 2024
Cited by: 1187*, Year: 2024
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
H Lee, S Phatale, H Mansoor, T Mesnard, J Ferret, K Lu, C Bishop, E Hall, ...
International Conference on Machine Learning (ICML 2024), 2023
Cited by: 548*, Year: 2023
Gemma 2: Improving Open Language Models at a Practical Size
G Team, M Riviere, S Pathak, PG Sessa, C Hardin, S Bhupatiraju, ...
arXiv preprint arXiv:2408.00118, 2024
Cited by: 473*, Year: 2024
Acme: A Research Framework for Distributed Reinforcement Learning
MW Hoffman, B Shahriari, J Aslanides, G Barth-Maron, N Momchev, ...
arXiv preprint arXiv:2006.00979, 2020
Cited by: 275, Year: 2020
Direct Language Model Alignment from Online AI Feedback
S Guo, B Zhang, T Liu, T Liu, M Khalman, F Llinares, A Rame, T Mesnard, ...
arXiv preprint arXiv:2402.04792, 2024
Cited by: 107, Year: 2024
Adversarially Guided Actor-Critic
Y Flet-Berliac*, J Ferret*, O Pietquin, P Preux, M Geist
International Conference on Learning Representations (ICLR 2021), 2021
Cited by: 96, Year: 2021
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
P Roit*, J Ferret*, L Shani*, R Aharoni, G Cideron, R Dadashi, M Geist, ...
ACL, 2023
Cited by: 77, Year: 2023
WARM: On the Benefits of Weight Averaged Reward Models
A Ramé, N Vieillard, L Hussenot, R Dadashi, G Cideron, O Bachem, ...
International Conference on Machine Learning (ICML 2024), 2024
Cited by: 66*, Year: 2024
Self-Attentional Credit Assignment for Transfer in Reinforcement Learning
J Ferret, R Marinier, M Geist, O Pietquin
International Joint Conference on Artificial Intelligence (IJCAI 2020), 2019
Cited by: 36, Year: 2019
Lazy-MDPs: Towards Interpretable Reinforcement Learning By Learning When To Act
A Jacq*, J Ferret*, O Pietquin, M Geist
International Conference on Autonomous Agents and Multiagent Systems (AAMAS …, 2022
Cited by: 28*, Year: 2022
Self-Imitation Advantage Learning
J Ferret, O Pietquin, M Geist
International Conference on Autonomous Agents and Multiagent Systems (AAMAS …, 2020
Cited by: 26, Year: 2020
There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning
N Grinsztajn*, J Ferret*, O Pietquin, P Preux, M Geist
Advances in Neural Information Processing Systems (NeurIPS 2021), 2021
Cited by: 24, Year: 2021
BOND: Aligning LLMs with Best-of-N Distillation
PG Sessa, R Dadashi, L Hussenot, J Ferret, N Vieillard, A Ramé, ...
International Conference on Learning Representations (ICLR 2025), 2025
Cited by: 21, Year: 2025
WARP: On the Benefits of Weight Averaged Rewarded Policies
A Ramé, J Ferret, N Vieillard, R Dadashi, L Hussenot, PL Cedoz, ...
arXiv preprint arXiv:2406.16768, 2024
Cited by: 17*, Year: 2024
A Survey of Temporal Credit Assignment in Deep Reinforcement Learning
E Pignatelli, J Ferret, M Geist, T Mesnard, H van Hasselt, O Pietquin, ...
Transactions on Machine Learning Research (TMLR), 2023
Cited by: 14, Year: 2023
Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning
K Wang, R Kidambi, R Sullivan, A Agarwal, C Dann, A Michi, M Gelmi, ...
EMNLP Findings, 2024
Cited by: 9, Year: 2024
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
A Botev, S De, SL Smith, A Fernando, GC Muraru, R Haroun, L Berrada, ...
arXiv preprint arXiv:2404.07839, 2024
Cited by: 9*, Year: 2024
Credit assignment as a proxy for transfer in reinforcement learning
J Ferret, R Marinier, M Geist, O Pietquin
Learning Transferrable Skills Workshop, NeurIPS, 2019
Cited by: 6, Year: 2019
Humanity's Last Exam
L Phan, A Gatti, Z Han, N Li, J Hu, H Zhang, S Shi, M Choi, A Agrawal, ...
arXiv preprint arXiv:2501.14249, 2025
Cited by: 5, Year: 2025