Mantas Mazeika

Cited by

	All	Since 2020
Citations	11480	11310
h-index	21	21
i10-index	23	23

Citations per year: 2019: 132; 2020: 449; 2021: 909; 2022: 1276; 2023: 2515; 2024: 5585; 2025: 552

Public access

2 articles available
0 articles not available

Based on funding mandates

Co-authors

Dan Hendrycks	Director of the Center for AI Safety (advisor for xAI and Scale)	Verified email at berkeley.edu
Dawn Song	Professor of Computer Science, UC Berkeley	Verified email at cs.berkeley.edu
Andy Zou	PhD Student, Carnegie Mellon University	Verified email at andrew.cmu.edu
Bo Li	University of Illinois at Urbana–Champaign	Verified email at illinois.edu
David Forsyth	Professor of Computer Science, University of Illinois, Urbana Champaign	Verified email at uiuc.edu


Mantas Mazeika

University of Illinois Urbana-Champaign

Verified email at illinois.edu

ML Safety, AI Safety, Machine Ethics, ML Reliability


Title	Authors	Venue	Cited by	Year
Measuring massive multitask language understanding	D Hendrycks, C Burns, S Basart, A Zou, M Mazeika, D Song, J Steinhardt	arXiv preprint arXiv:2009.03300, 2020	3007	2020
Deep anomaly detection with outlier exposure	D Hendrycks, M Mazeika, T Dietterich	arXiv preprint arXiv:1812.04606, 2018	1799	2018
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models	A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ...	arXiv preprint arXiv:2206.04615, 2022	1286	2022
Using self-supervised learning can improve model robustness and uncertainty	D Hendrycks, M Mazeika, S Kadavath, D Song	Advances in neural information processing systems 32, 2019	1133	2019
Using pre-training can improve model robustness and uncertainty	D Hendrycks, K Lee, M Mazeika	International Conference on Machine Learning, 2712-2721, 2019	897	2019
Using trusted data to train deep networks on labels corrupted by severe noise	D Hendrycks, M Mazeika, D Wilson, K Gimpel	Advances in neural information processing systems 31, 2018	672	2018
Measuring coding challenge competence with APPS	D Hendrycks, S Basart, S Kadavath, M Mazeika, A Arora, E Guo, C Burns, ...	arXiv preprint arXiv:2105.09938, 2021	544	2021
Scaling out-of-distribution detection for real-world settings	D Hendrycks, S Basart, M Mazeika, A Zou, J Kwon, M Mostajabi, ...	arXiv preprint arXiv:1911.11132, 2019	488	2019
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models	B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu, Z Xiong, R Dutta, ...	NeurIPS, 2023	382	2023
Representation engineering: A top-down approach to AI transparency	A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ...	arXiv preprint arXiv:2310.01405, 2023	281	2023
An overview of catastrophic AI risks	D Hendrycks, M Mazeika, T Woodside	arXiv preprint arXiv:2306.12001, 2023	199	2023
HarmBench: A standardized evaluation framework for automated red teaming and robust refusal	M Mazeika, L Phan, X Yin, A Zou, Z Wang, N Mu, E Sakhaee, N Li, ...	arXiv preprint arXiv:2402.04249, 2024	165	2024
PixMix: Dreamlike pictures comprehensively improve safety measures	D Hendrycks, A Zou, M Mazeika, L Tang, B Li, D Song, J Steinhardt	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	144	2022
The WMDP benchmark: Measuring and reducing malicious use with unlearning	N Li, A Pan, A Gopal, S Yue, D Berrios, A Gatti, JD Li, AK Dombrowski, ...	arXiv preprint arXiv:2403.03218, 2024	94	2024
X-risk analysis for AI research	D Hendrycks, M Mazeika	arXiv preprint arXiv:2206.05862, 2022	80	2022
A benchmark for anomaly segmentation	D Hendrycks, S Basart, M Mazeika, M Mostajabi, J Steinhardt, D Song		74	2019
What would Jiminy Cricket do? Towards agents that behave morally	D Hendrycks, M Mazeika, A Zou, S Patel, C Zhu, J Navarro, D Song, B Li, ...	arXiv preprint arXiv:2110.13136, 2021	70	2021
Forecasting future world events with neural networks	A Zou, T Xiao, R Jia, J Kwon, M Mazeika, R Li, D Song, J Steinhardt, ...	Advances in Neural Information Processing Systems 35, 27293-27305, 2022	34	2022
The trojan detection challenge	M Mazeika, D Hendrycks, H Li, X Xu, S Hough, A Zou, A Rajabi, Q Yao, ...	NeurIPS 2022 Competition Track, 279-291, 2022	31	2022
How to steer your adversary: Targeted and efficient model stealing defenses with gradient redirection	M Mazeika, B Li, D Forsyth	International Conference on Machine Learning, 15241-15254, 2022	28	2022
