Tom Everitt

Cited by

	All	Since 2020
Citations	2441	2080
h-index	23	21
i10-index	41	36

Citations per year

Year	Citations
2015	12
2016	12
2017	28
2018	101
2019	178
2020	250
2021	279
2022	324
2023	456
2024	643
2025	123

Public access

10 articles available
0 articles not available

Based on funding mandates

Co-authors

Name	Affiliation	Verified email domain
Marcus Hutter	Researcher@DeepMind & Professor at ANU	anu.edu.au
Victoria Krakovna	Senior Research Scientist at Google DeepMind	google.com
Ramana Kumar	DeepMind	cl.cam.ac.uk
Ryan Carey	University of Oxford	philosophy.ox.ac.uk
Pedro A. Ortega	Artificial Intelligence & Machine Learning	adaptiveagents.org
Zachary Kenton	Google DeepMind	google.com
Miljan Martic	DeepMind	google.com
Jan Leike	OpenAI	openai.com
Laurent Orseau	Research Scientist at Google DeepMind	google.com
Vishal Maini	DeepMind	deepmind.com
Matt MacDermott	PhD Student, Imperial College London	ic.ac.uk
Francesco Belardinelli	Imperial College London	imperial.ac.uk
Eric Langlois	Sanctuary AI	cs.toronto.edu
James Fox	University of Oxford	keble.ox.ac.uk
Jarryd Martin	The Walter and Eliza Hall Institute	wehi.edu.au
David Scott Krueger	University Assistant Professor, University of Cambridge	cam.ac.uk
Andrew Lefrancq	DeepMind	google.com
Matthew Rahtz	Google DeepMind	google.com
Jonathan Richens	DeepMind	google.com
Alessandro Abate	Professor of Verification and Control, University of Oxford, UK	cs.ox.ac.uk

Tom Everitt

Staff Research Scientist at Google DeepMind

Verified email at google.com

AI Safety, Artificial General Intelligence, Causality, Incentives


Title	Authors	Venue	Cited by	Year
Scalable agent alignment via reward modeling: a research direction	J Leike, D Krueger, T Everitt, M Martic, V Maini, S Legg	arXiv preprint arXiv:1811.07871, 2018	379	2018
AI safety gridworlds	J Leike, M Martic, V Krakovna, PA Ortega, T Everitt, A Lefrancq, L Orseau, ...	arXiv preprint arXiv:1711.09883, 2017	362	2017
Alignment of language agents	Z Kenton, T Everitt, L Weidinger, I Gabriel, V Mikulik, G Irving	arXiv preprint arXiv:2103.14659, 2021	167	2021
AGI safety literature review	T Everitt, G Lea, M Hutter	International Joint Conference on AI (IJCAI), 2018	160	2018
Count-based exploration in feature space for reinforcement learning	J Martin, SN Sasikumar, T Everitt, M Hutter	International Joint Conference on AI (IJCAI), 2017	151	2017
Reinforcement Learning with Corrupted Reward Channel	T Everitt, V Krakovna, L Orseau, M Hutter, S Legg	26th International Joint Conference on Artificial Intelligence (IJCAI), 2017	131	2017
Specification gaming: the flip side of AI ingenuity	V Krakovna, J Uesato, V Mikulik, M Rahtz, T Everitt, R Kumar, Z Kenton, ...	DeepMind Blog 3, 2020	123	2020
Reward tampering problems and solutions in reinforcement learning: A causal influence diagram perspective	T Everitt, M Hutter, R Kumar, V Krakovna	Synthese, 2021	104	2021
Shaking the foundations: delusions in sequence models for interaction and control	PA Ortega, M Kunesch, G Delétang, T Genewein, J Grau-Moya, J Veness, ...	arXiv preprint arXiv:2110.10819, 2021	65	2021
Agent incentives: A causal perspective	T Everitt, R Carey, ED Langlois, PA Ortega, S Legg	Proceedings of the AAAI Conference on Artificial Intelligence 35 (13), 11487 …, 2021	59	2021
Avoiding wireheading with value reinforcement learning	T Everitt, M Hutter	International Conference on Artificial General Intelligence (AGI), 12-22, 2016	53	2016
Towards safe artificial general intelligence	T Everitt	PQDT-Global, 2019	41	2019
Robust agents learn causal world models	J Richens, T Everitt	The Twelfth International Conference on Learning Representations, 2024	39	2024
Discovering agents	Z Kenton, R Kumar, S Farquhar, J Richens, M MacDermott, T Everitt	Artificial Intelligence 322, 103963, 2023	38	2023
Universal artificial intelligence: Practical agents and fundamental challenges	T Everitt, M Hutter	Foundations of trusted autonomy, 15-46, 2018	36	2018
Understanding agent incentives using causal influence diagrams. Part I: Single action settings	T Everitt, PA Ortega, E Barnes, S Legg	arXiv preprint arXiv:1902.09980, 2019	33	2019
Self-modification of policy and utility function in rational agents	T Everitt, D Filan, M Daswani, M Hutter	International Conference on Artificial General Intelligence (AGI), 1-11, 2016	33	2016
Honesty is the best policy: defining and mitigating AI deception	F Ward, F Toni, F Belardinelli, T Everitt	Advances in neural information processing systems 36, 2313-2341, 2023	30	2023
Path-specific objectives for safer agent incentives	S Farquhar, R Carey, T Everitt	Proceedings of the AAAI Conference on Artificial Intelligence 36 (9), 9529-9538, 2022	29	2022
A game-theoretic analysis of the off-switch game	T Wängberg, M Böörs, E Catt, T Everitt, M Hutter	Artificial General Intelligence: 10th International Conference, AGI 2017 …, 2017	27	2017

Articles 1–20
