Zhaohan Daniel Guo

Cited by

	All	Since 2020
Citations	10673	10514
h-index	19	18
i10-index	24	24

Citations per year: 2018: 46; 2019: 73; 2020: 245; 2021: 1147; 2022: 2285; 2023: 2970; 2024: 3426; 2025: 422

Public access

3 articles

0 articles

available

not available

Based on funding mandates

Co-authors

Emma Brunskill, Associate Professor of Computer Science, Stanford University. Verified email at cs.stanford.edu
Philip Thomas, University of Massachusetts Amherst. Verified email at cs.umass.edu
Shayan Doroudi, Assistant Professor at the University of California, Irvine. Verified email at uci.edu
Yao Liu, Amazon. Verified email at stanford.edu

Zhaohan Daniel Guo

DeepMind

Verified email at google.com

Reinforcement learning


Title	Authors	Venue	Cited by	Year
Bootstrap your own latent: A new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	7414	2020
Agent57: Outperforming the Atari human benchmark	AP Badia, B Piot, S Kapturowski, P Sprechmann, A Vitvitskyi, ZD Guo, ...	International conference on machine learning, 507-517, 2020	720	2020
Koray Kavukcuoglu, Remi Munos, and Michal Valko. Bootstrap your own latent: A new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...	Advances in neural information processing systems 33, 21271-21284, 2020	544	2020
Never give up: Learning directed exploration strategies	AP Badia, P Sprechmann, A Vitvitskyi, D Guo, B Piot, S Kapturowski, ...	arXiv preprint arXiv:2002.06038, 2020	379	2020
A general theoretical paradigm to understand learning from human preferences	MG Azar, ZD Guo, B Piot, R Munos, M Rowland, M Valko, D Calandriello	International Conference on Artificial Intelligence and Statistics, 4447-4455, 2024	366	2024
Joint semantic utterance classification and slot filling with recursive neural networks	D Guo, G Tur, W Yih, G Zweig	2014 IEEE Spoken Language Technology Workshop (SLT), 554-559, 2014	258	2014
Bootstrap latent-predictive representations for multitask reinforcement learning	ZD Guo, BA Pires, B Piot, JB Grill, F Altché, R Munos, MG Azar	International Conference on Machine Learning, 3875-3886, 2020	168	2020
Nash learning from human feedback	R Munos, M Valko, D Calandriello, MG Azar, M Rowland, ZD Guo, Y Tang, ...	arXiv preprint arXiv:2312.00886, 2023	95	2023
Neural predictive belief representations	ZD Guo, MG Azar, B Piot, BA Pires, R Munos	arXiv preprint arXiv:1811.06407, 2018	95	2018
BYOL-Explore: Exploration by bootstrapped prediction	Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ...	Advances in neural information processing systems 35, 31855-31870, 2022	72	2022
A PAC RL algorithm for episodic POMDPs	ZD Guo, S Doroudi, E Brunskill	Artificial Intelligence and Statistics, 510-518, 2016	72	2016
Generalized preference optimization: A unified approach to offline alignment	Y Tang, ZD Guo, Z Zheng, D Calandriello, R Munos, M Rowland, ...	arXiv preprint arXiv:2402.05749, 2024	65	2024
Using options and covariance testing for long horizon off-policy policy evaluation	Z Guo, PS Thomas, E Brunskill	Advances in Neural Information Processing Systems 30, 2017	52	2017
Bootstrap your own latent: A new approach to self-supervised learning	JB Grill, F Strub, F Altché, C Tallec, PH Richemond, E Buchatskaya, ...	arXiv preprint arXiv:2006.07733, 2020	47	2020
Geometric entropic exploration	ZD Guo, MG Azar, A Saade, S Thakoor, B Piot, BA Pires, M Valko, ...	arXiv preprint arXiv:2101.02055, 2021	43	2021
Understanding the performance gap between online and offline alignment algorithms	Y Tang, DZ Guo, Z Zheng, D Calandriello, Y Cao, E Tarassov, R Munos, ...	arXiv preprint arXiv:2405.08448, 2024	34	2024
Understanding self-predictive learning for reinforcement learning	Y Tang, ZD Guo, PH Richemond, BA Pires, Y Chandak, R Munos, ...	International Conference on Machine Learning, 33632-33656, 2023	34	2023
Concurrent PAC RL	Z Guo, E Brunskill	Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015	33	2015
PAC continuous state online multitask reinforcement learning with identification	Y Liu, Z Guo, E Brunskill	Proceedings of the 2016 International Conference on Autonomous Agents …, 2016	22	2016
Charline Le Lan, Michal Valko, Tianqi Liu, et al. Human alignment of large language models through online preference optimisation	D Calandriello, D Guo, R Munos, M Rowland, Y Tang, BA Pires, ...	arXiv preprint arXiv:2403.08635, 2024	18	2024
