Steven Basart
PhD, University of Chicago
Verified email at ttic.edu
Title
Cited by
Year
Measuring massive multitask language understanding
D Hendrycks, C Burns, S Basart, A Zou, M Mazeika, D Song, J Steinhardt
arXiv preprint arXiv:2009.03300, 2020
Cited by: 3403; Year: 2020
The many faces of robustness: A critical analysis of out-of-distribution generalization
D Hendrycks, S Basart, N Mu, S Kadavath, F Wang, E Dorundo, R Desai, ...
Proceedings of the IEEE/CVF international conference on computer vision …, 2021
Cited by: 1859; Year: 2021
Natural adversarial examples
D Hendrycks, K Zhao, S Basart, J Steinhardt, D Song
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021
Cited by: 1671; Year: 2021
Measuring mathematical problem solving with the MATH dataset
D Hendrycks, C Burns, S Kadavath, A Arora, S Basart, E Tang, D Song, ...
arXiv preprint arXiv:2103.03874, 2021
Cited by: 1425; Year: 2021
Measuring coding challenge competence with APPS
D Hendrycks, S Basart, S Kadavath, M Mazeika, A Arora, E Guo, C Burns, ...
arXiv preprint arXiv:2105.09938, 2021
Cited by: 574; Year: 2021
Improving and Assessing Anomaly Detectors for Large-Scale Settings
D Hendrycks, S Basart, M Mazeika, A Zou, J Kwon, M Mostajabi, ...
Cited by: 570*; Year: 2022
Aligning AI with shared human values
D Hendrycks, C Burns, S Basart, A Critch, J Li, D Song, J Steinhardt
arXiv preprint arXiv:2008.02275, 2020
Cited by: 505; Year: 2020
Representation engineering: A top-down approach to AI transparency
A Zou, L Phan, S Chen, J Campbell, P Guo, R Ren, A Pan, X Yin, ...
arXiv preprint arXiv:2310.01405, 2023
Cited by: 325; Year: 2023
DIODE: A dense indoor and outdoor depth dataset
I Vasiljevic, N Kolkin, S Zhang, R Luo, H Wang, FZ Dai, AF Daniele, ...
arXiv preprint arXiv:1908.00463, 2019
Cited by: 230; Year: 2019
HarmBench: A standardized evaluation framework for automated red teaming and robust refusal
M Mazeika, L Phan, X Yin, A Zou, Z Wang, N Mu, E Sakhaee, N Li, ...
arXiv preprint arXiv:2402.04249, 2024
Cited by: 217; Year: 2024
Testing robustness against unforeseen adversaries
D Kang, Y Sun, D Hendrycks, T Brown, J Steinhardt
Cited by: 151; Year: 2019
Do the rewards justify the means? Measuring trade-offs between rewards and ethical behavior in the MACHIAVELLI benchmark
A Pan, JS Chan, A Zou, N Li, S Basart, T Woodside, H Zhang, S Emmons, ...
International conference on machine learning, 26837-26867, 2023
Cited by: 132; Year: 2023
The WMDP benchmark: Measuring and reducing malicious use with unlearning
N Li, A Pan, A Gopal, S Yue, D Berrios, A Gatti, JD Li, AK Dombrowski, ...
arXiv preprint arXiv:2403.03218, 2024
Cited by: 115; Year: 2024
Measuring mathematical problem solving with the MATH dataset, 2021
D Hendrycks, C Burns, S Kadavath, A Arora, S Basart, E Tang, D Song, ...
URL https://arxiv.org/abs/2103.03874, 2024
Cited by: 50; Year: 2024
The many faces of robustness: A critical analysis of out-of-distribution generalization. 2021 IEEE
D Hendrycks, S Basart, N Mu, S Kadavath, F Wang, E Dorundo, R Desai, ...
CVF International Conference on Computer Vision (ICCV) 2, 2020
Cited by: 29; Year: 2020
Scaling out-of-distribution detection for real-world settings
S Basart, M Mantas, M Mohammadreza, S Jacob, S Dawn
International Conference on Machine Learning, 2022
Cited by: 20; Year: 2022
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
R Ren, S Basart, A Khoja, A Gatti, L Phan, X Yin, M Mazeika, A Pan, ...
Advances in Neural Information Processing Systems 37, 68559-68594, 2025
Cited by: 17; Year: 2025
How would the viewer feel? Estimating wellbeing from video scenarios
M Mazeika, E Tang, A Zou, S Basart, JS Chan, D Song, D Forsyth, ...
Advances in Neural Information Processing Systems 35, 18571-18585, 2022
Cited by: 17; Year: 2022
A quantitative measure of generative adversarial network distributions
D Hendrycks, S Basart
Cited by: 4; Year: 2017
Evaluating Robustness to Unforeseen Adversarial Attacks
M Kaufmann, D Kang, Y Sun, X Yin, S Basart, M Mazeika, A Dziedzic, ...
Year: 2023
Articles 1–20