Jan Leike

Cited by

	All	Since 2020
Citations	35213	34437
h-index	29	25
i10-index	36	31

Citations per year

Year	Citations
2018	188
2019	289
2020	382
2021	510
2022	1196
2023	7135
2024	20837
2025	4294

Public access

10 articles available, 0 articles not available (based on funding mandates)

Co-authors

Jeffrey Wu: Anthropic AI, OpenAI. Verified email at anthropic.com
Paul Christiano: National Institute of Standards and Technology. Verified email at nist.gov
John Schulman: Anthropic. Verified email at anthropic.com
Ryan Lowe: OpenAI. Verified email at openai.com
Marcus Hutter: Researcher@DeepMind & Professor at ANU. Verified email at anu.edu.au
Dario Amodei: CEO and Co-Founder at Anthropic. Verified email at anthropic.com
Ilya Sutskever: Co-Founder and Chief Scientist at Safe Superintelligence Inc. Verified email at ssi.inc
David Scott Krueger: University Assistant Professor, University of Cambridge. Verified email at cam.ac.uk
Matthias Heizmann: University of Stuttgart, Germany. Verified email at heizmann.name
Tom Everitt: Staff Research Scientist at Google DeepMind. Verified email at google.com
Yuri Burda: OpenAI. Verified email at openai.com
Pushmeet Kohli: DeepMind. Verified email at google.com
Andreas Podelski: Professor of Computer Science, Freiburg University. Verified email at informatik.uni-freiburg.de
Geoffrey Irving: UK AI Security Institute (AISI). Verified email at naml.us
Tegan Maharaj: Assistant Professor at Mila. Verified email at polymtl.ca
William Saunders: OpenAI. Verified email at cs.toronto.edu
Collin Burns: Researcher, Anthropic. Verified email at anthropic.com
Pavel Izmailov: Anthropic; NYU. Verified email at anthropic.com
Adam Gleave: CEO at FAR AI. Verified email at far.ai
Andrew Trask: University of Oxford and OpenMined. Verified email at openmined.org

Jan Leike

OpenAI

Verified email at openai.com

reinforcement learning, deep learning, agent alignment


Title	Authors	Publication	Cited by	Year
Training language models to follow instructions with human feedback	L Ouyang, J Wu, X Jiang, D Almeida, C Wainwright, P Mishkin, C Zhang, ...	Advances in Neural Information Processing Systems 35, 27730-27744, 2022	12507	2022
GPT-4 technical report	OpenAI	arXiv, 2023	10731*	2023
Evaluating large language models trained on code	M Chen, J Tworek, H Jun, Q Yuan, HPO Pinto, J Kaplan, H Edwards, ...	arXiv preprint arXiv:2107.03374, 2021	4016	2021
Deep reinforcement learning from human preferences	PF Christiano, J Leike, T Brown, M Martic, S Legg, D Amodei	Advances in Neural Information Processing Systems 30, 4299-4307, 2017	3585	2017
Let's Verify Step by Step	H Lightman, V Kosaraju, Y Burda, H Edwards, B Baker, T Lee, J Leike, ...	arXiv preprint arXiv:2305.20050, 2023	650	2023
Reward learning from human preferences and demonstrations in Atari	B Ibarz, J Leike, T Pohlen, G Irving, S Legg, D Amodei	Advances in Neural Information Processing Systems, 8011-8023, 2018	464	2018
Scalable agent alignment via reward modeling: a research direction	J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg	arXiv preprint arXiv:1811.07871, 2018	379	2018
AI Safety Gridworlds	J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ...	arXiv preprint arXiv:1711.09883, 2017	362	2017
Recursively summarizing books with human feedback	J Wu, L Ouyang, DM Ziegler, N Stiennon, R Lowe, J Leike, P Christiano	arXiv preprint arXiv:2109.10862, 2021	282	2021
Language models can explain neurons in language models	S Bills, N Cammarata, D Mossing, H Tillman, L Gao, G Goh, I Sutskever, ...	URL https://openaipublic.blob.core.windows.net/neuron-explainer/paper …, 2023	248	2023
Self-critiquing models for assisting human evaluators	W Saunders, C Yeh, J Wu, S Bills, L Ouyang, J Ward, J Leike	arXiv preprint arXiv:2206.05802, 2022	240	2022
GPT-4o System Card	A Hurst, A Lerer, AP Goucher, A Perelman, A Ramesh, A Clark, AJ Ostrow, ...	arXiv preprint arXiv:2410.21276, 2024	225	2024
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision	C Burns, P Izmailov, JH Kirchner, B Baker, L Gao, L Aschenbrenner, ...	arXiv preprint arXiv:2312.09390, 2023	224	2023
Learning to Understand Goal Specifications by Modelling Reward	D Bahdanau, F Hill, J Leike, E Hughes, P Kohli, E Grefenstette	arXiv preprint arXiv:1806.01946, 2018	209*	2018
Ranking Templates for Linear Loops	J Leike, M Heizmann	Logical Methods in Computer Science, 2015	100	2015
Scaling and evaluating sparse autoencoders	L Gao, TD la Tour, H Tillman, G Goh, R Troll, A Radford, I Sutskever, ...	arXiv preprint arXiv:2406.04093, 2024	99	2024
Learning human objectives by evaluating hypothetical behavior	S Reddy, A Dragan, S Levine, S Legg, J Leike	International Conference on Machine Learning, 8020-8029, 2020	91	2020
Quantifying Differences in Reward Functions	A Gleave, M Dennis, S Legg, S Russell, J Leike	arXiv preprint arXiv:2006.13900, 2020	74	2020
Institutionalizing ethics in AI through broader impact requirements	CEA Prunkl, C Ashurst, M Anderljung, H Webb, J Leike, A Dafoe	Nature Machine Intelligence 3 (2), 104-110, 2021	72	2021
Linear ranking for linear lasso programs	M Heizmann, J Hoenicke, J Leike, A Podelski	Automated Technology for Verification and Analysis, 365-380, 2013	72	2013
