Joar Skalse
DPhil Student in Computer Science, Oxford University
Verified email at cs.ox.ac.uk
Title
Cited by
Year
Defining and characterizing reward gaming
J Skalse, N Howe, D Krasheninnikov, D Krueger
Advances in Neural Information Processing Systems 35, 9460-9471, 2022
Cited by 241 · 2022
Risks from learned optimization in advanced machine learning systems
E Hubinger, C van Merwijk, V Mikulik, J Skalse, S Garrabrant
arXiv preprint arXiv:1906.01820, 2019
Cited by 151 · 2019
Is SGD a Bayesian sampler? Well, almost
C Mingard, G Valle-Pérez, J Skalse, AA Louis
Journal of Machine Learning Research 22 (79), 1-64, 2021
Cited by 57 · 2021
Invariance in policy optimisation and partial identifiability in reward learning
JMV Skalse, M Farrugia-Roberts, S Russell, A Abate, A Gleave
International Conference on Machine Learning, 32033-32058, 2023
Cited by 51 · 2023
Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
D Dalrymple, J Skalse, Y Bengio, S Russell, M Tegmark, S Seshia, ...
arXiv preprint arXiv:2405.06624, 2024
Cited by 37 · 2024
Neural networks are a priori biased towards boolean functions with low entropy
C Mingard, J Skalse, G Valle-Pérez, D Martínez-Rubio, V Mikulik, ...
arXiv preprint arXiv:1909.11522, 2019
Cited by 33 · 2019
Misspecification in inverse reinforcement learning
J Skalse, A Abate
Proceedings of the AAAI Conference on Artificial Intelligence 37 (12), 15136 …, 2023
Cited by 31 · 2023
Lexicographic multi-objective reinforcement learning
J Skalse, L Hammond, C Griffin, A Abate
arXiv preprint arXiv:2212.13769, 2022
Cited by 27 · 2022
Reinforcement learning in Newcomblike environments
J Bell, L Linsefors, C Oesterheld, J Skalse
Advances in Neural Information Processing Systems 34, 22146-22157, 2021
Cited by 17 · 2021
On the limitations of Markovian rewards to express multi-objective, risk-sensitive, and modal tasks
J Skalse, A Abate
Uncertainty in Artificial Intelligence, 1974-1984, 2023
Cited by 13 · 2023
Goodhart's Law in Reinforcement Learning
J Karwowski, O Hayman, X Bai, K Kiendlhofer, C Griffin, J Skalse
arXiv preprint arXiv:2310.09144, 2023
Cited by 12 · 2023
STARC: A General Framework For Quantifying Differences Between Reward Functions
J Skalse, L Farnik, SR Motwani, E Jenner, A Gleave, A Abate
arXiv preprint arXiv:2309.15257, 2023
Cited by 7 · 2023
The reward hypothesis is false
JMV Skalse, A Abate
Cited by 5 · 2022
Quantifying the Sensitivity of Inverse Reinforcement Learning to Misspecification
J Skalse, A Abate
arXiv preprint arXiv:2403.06854, 2024
Cited by 4 · 2024
On The Expressivity of Objective-Specification Formalisms in Reinforcement Learning
R Subramani, M Williams, M Heitmann, H Holm, C Griffin, J Skalse
arXiv preprint arXiv:2310.11840, 2023
Cited by 4 · 2023
A general framework for reward function distances
E Jenner, JMV Skalse, A Gleave
NeurIPS ML Safety Workshop, 2022
Cited by 4 · 2022
All’s Well That Ends Well: Avoiding Side Effects with Distance-Impact Penalties
C Griffin, JMV Skalse, L Hammond, A Abate
NeurIPS ML Safety Workshop, 2022
Cited by 2 · 2022
A General Counterexample to Any Decision Theory and Some Responses
J Skalse
arXiv preprint arXiv:2101.00280, 2021
Cited by 2 · 2021
Safety Properties of Inductive Logic Programming.
G Leech, N Schoots, J Skalse
SafeAI@AAAI, 2021
Cited by 2 · 2021
Articles 1–20