Mohammad Gheshlaghi Azar

Cited by

	All	Since 2020
Citations	16659	15322
h-index	30	27
i10-index	39	38

Citations per year
Year	Citations
2015	44
2016	54
2017	92
2018	339
2019	714
2020	1005
2021	2023
2022	3190
2023	3975
2024	4362
2025	750

Public access

3 articles available

0 articles not available

Based on funding mandates

Co-authors

Rémi Munos: FAIR, Meta (verified email at inria.fr)
Bilal Piot: Google Deepmind (verified email at google.com)
Michal Valko: Chief Models Officer @ Stealth Startup, Inria & MVA - Ex: Llama at Meta; Gemini and BYOL @ Deepmind (verified email at meta.com)
Zhaohan Daniel Guo: DeepMind (verified email at google.com)
Florent Altché: Research Engineer, DeepMind (verified email at google.com)
Jean-bastien Grill (verified email at google.com)
Olivier Pietquin: Earth Species Project | ex Google DeepMind (On leave - Professor at University of Lille) (verified email at univ-lille.fr)
Corentin Tallec: DeepMind (verified email at google.com)
Florian STRUB: Cohere (verified email at cohere.com)
Hado van Hasselt: Research Scientist, DeepMind; Honorary Professor, UCL (verified email at google.com)
Pierre Richemond: Google DeepMind (verified email at deepmind.com)
Hilbert Johan Kappen: Radboud University (verified email at science.ru.nl)
Will Dabney: DeepMind (verified email at google.com)
Elena Buchatskaya: Research Engineer, Google DeepMind (verified email at google.com)
Eva L. Dyer: Georgia Institute of Technology (verified email at gatech.edu)
Matteo Hessel: Research Engineer, Google DeepMind (verified email at google.com)
Dan Horgan: Google DeepMind (verified email at google.com)
Mark Rowland: Research Scientist, Google DeepMind (verified email at google.com)
Ian Osband: OpenAI (verified email at openai.com)
Shantanu Thakoor: Research Engineer at DeepMind (verified email at google.com)


Mohammad Gheshlaghi Azar

Cohere

Verified email at cohere.com

RL for Generative AI, Self-Supervised Learning, Exploration, Optimization


Title	Cited by	Year
Bootstrap your own latent-a new approach to self-supervised learning. JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ... Advances in neural information processing systems 33, 21271-21284, 2020	7478	2020
Rainbow: Combining improvements in deep reinforcement learning. M Hessel, J Modayil, H Van Hasselt, T Schaul, G Ostrovski, W Dabney, ... Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018	2956	2018
Noisy Networks for Exploration. M Fortunato, MG Azar, B Piot. International Conference on Learning Representations	1212*
Minimax regret bounds for reinforcement learning. MG Azar, I Osband, R Munos. International conference on machine learning, 263-272, 2017	905	2017
Large-scale representation learning on graphs via bootstrapping. S Thakoor, C Tallec, MG Azar, M Azabou, EL Dyer, R Munos, P Veličković, ... arXiv preprint arXiv:2102.06514, 2021	541*	2021
koray kavukcuoglu, Remi Munos, and Michal Valko. Bootstrap your own latent-a new approach to self-supervised learning. JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ... Advances in neural information processing systems 33, 21271-21284, 2020	529	2020
A general theoretical paradigm to understand learning from human preferences. MG Azar, ZD Guo, B Piot, R Munos, M Rowland, M Valko, D Calandriello. International Conference on Artificial Intelligence and Statistics, 4447-4455, 2024	413	2024
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model. M Gheshlaghi Azar, R Munos, HJ Kappen. Machine learning 91, 325-349, 2013	330	2013
Speedy Q-Learning. MG Azar, M Ghavamzadeh, HJ Kappen, R Munos. Advances in Neural Information Processing Systems, 2411-2419, 2011	218*	2011
The reactor: A fast and sample-efficient actor-critic agent for reinforcement learning. A Gruslys, W Dabney, MG Azar, B Piot, M Bellemare, R Munos. arXiv preprint arXiv:1704.04651, 2017	183*	2017
Bootstrap latent-predictive representations for multitask reinforcement learning. ZD Guo, BA Pires, B Piot, JB Grill, F Altché, R Munos, MG Azar. International Conference on Machine Learning, 3875-3886, 2020	167	2020
Dynamic Policy Programming. M Gheshlaghi Azar, V Gomez, HJ Kappen. Journal of Machine Learning Research 13, 3207-3245, 2012	160	2012
Observe and look further: Achieving consistent performance on atari. T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ... arXiv preprint arXiv:1805.11593, 2018	146	2018
Sequential transfer in multi-armed bandit with finite set of models. MG Azar, A Lazaric, E Brunskill. Advances in Neural Information Processing Systems, 2220-2228, 2013	126	2013
On the sample complexity of reinforcement learning with a generative model. MG Azar, R Munos, B Kappen. arXiv preprint arXiv:1206.6461, 2012	118	2012
Meta-learning of sequential strategies. PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ... arXiv preprint arXiv:1905.03030, 2019	114	2019
Nash learning from human feedback. R Munos, M Valko, D Calandriello, MG Azar, M Rowland, ZD Guo, Y Tang, ... arXiv preprint arXiv:2312.00886 18, 2023	112	2023
Hindsight credit assignment. A Harutyunyan, W Dabney, T Mesnard, M Gheshlaghi Azar, B Piot, ... Advances in neural information processing systems 32, 2019	106	2019
Neural predictive belief representations. ZD Guo, MG Azar, B Piot, BA Pires, R Munos. arXiv preprint arXiv:1811.06407, 2018	95	2018
Byol-explore: Exploration by bootstrapped prediction. Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ... Advances in neural information processing systems 35, 31855-31870, 2022	73	2022
