Geoffrey Cideron
Google DeepMind
Verified email at google.com
Title
Cited by
Year
Gemini: a family of highly capable multimodal models
Gemini Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 3208 | Year: 2023
QD-RL: Efficient mixing of quality and diversity in reinforcement learning
G Cideron, T Pierrot, N Perrin, K Beguir, O Sigaud
arXiv preprint arXiv:2006.08505, 2020
Cited by: 97* | Year: 2020
Factually consistent summarization via reinforcement learning with textual entailment feedback
P Roit, J Ferret, L Shani, R Aharoni, G Cideron, R Dadashi, M Geist, ...
arXiv preprint arXiv:2306.00186, 2023
Cited by: 77 | Year: 2023
WARM: On the benefits of weight averaged reward models
A Ramé, N Vieillard, L Hussenot, R Dadashi, G Cideron, O Bachem, ...
arXiv preprint arXiv:2401.12187, 2024
Cited by: 59 | Year: 2024
HIGhER: Improving instruction following with hindsight generation for experience replay
G Cideron, M Seurin, F Strub, O Pietquin
2020 IEEE Symposium Series on Computational Intelligence (SSCI), 225-232, 2020
Cited by: 56* | Year: 2020
BOND: Aligning LLMs with best-of-N distillation
PG Sessa, R Dadashi, L Hussenot, J Ferret, N Vieillard, A Ramé, ...
arXiv preprint arXiv:2407.14622, 2024
Cited by: 21 | Year: 2024
MusicRL: Aligning music generation to human preferences
G Cideron, S Girgin, M Verzetti, D Vincent, M Kastelic, Z Borsos, ...
arXiv preprint arXiv:2402.04229, 2024
Cited by: 14 | Year: 2024
Conditional Language Policy: A General Framework for Steerable Multi-Objective Finetuning
K Wang, R Kidambi, R Sullivan, A Agarwal, C Dann, A Michi, M Gelmi, ...
arXiv preprint arXiv:2407.15762, 2024
Cited by: 9 | Year: 2024
Get back here: Robust imitation by return-to-distribution planning
G Cideron, B Tabanpour, S Curi, S Girgin, L Hussenot, G Dulac-Arnold, ...
arXiv preprint arXiv:2305.01400, 2023
Cited by: 6 | Year: 2023
vec2text with round-trip translations
G Cideron, S Girgin, A Raichuk, O Pietquin, O Bachem, L Hussenot
arXiv preprint arXiv:2209.06792, 2022
Cited by: 4 | Year: 2022
Diversity-rewarded CFG distillation
G Cideron, A Agostinelli, J Ferret, S Girgin, R Elie, O Bachem, S Perrin, ...
arXiv preprint arXiv:2410.06084, 2024
Cited by: 2 | Year: 2024