Jacob Steinhardt

Cited by

	All	Since 2020
Citations	25231	23304
h-index	51	50
i10-index	90	83

Citations per year

Year	Citations
2016	67
2017	178
2018	487
2019	1007
2020	1589
2021	2174
2022	2806
2023	5062
2024	9523
2025	2103

Public access

27 articles available
0 articles not available

Based on funding mandates

Co-authors

Name	Affiliation	Verified email at
Dan Hendrycks	Director of the Center for AI Safety (advisor for xAI and Scale)	berkeley.edu
Dawn Song	Professor of Computer Science, UC Berkeley	cs.berkeley.edu
Steven Basart	PhD, University of Chicago	ttic.edu
Percy Liang	Associate Professor of Computer Science, Stanford University	cs.stanford.edu
Christopher Olah	Anthropic	google.com
John Schulman	Anthropic	anthropic.com
Dario Amodei	CEO and Co-Founder at Anthropic	anthropic.com
Aditi Raghunathan	Assistant professor, Carnegie Mellon University	cmu.edu
Paul Christiano	National Institute of Standards and Technology	nist.gov
Gregory Valiant	Assistant Professor of Computer Science, Stanford University	stanford.edu
Zachary C. Lipton	Raj Reddy Associate Professor of Machine Learning @ Carnegie Mellon University; CTO + CSO @ Abridge	cmu.edu
Pang Wei Koh	University of Washington	cs.washington.edu
Moses Charikar	Professor of Computer Science, Stanford University	cs.stanford.edu
Jerry Li	University of Washington	cs.washington.edu
Daniel Kang	UIUC	illinois.edu
Tom B Brown	Anthropic	anthropic.com
Andrew Ilyas	Massachusetts Institute of Technology	mit.edu
Banghua Zhu	University of California, Berkeley	berkeley.edu
Jiantao Jiao	Assistant Professor of EECS and Statistics, University of California, Berkeley	berkeley.edu
Pravesh K. Kothari	Princeton University	cs.cmu.edu

Jacob Steinhardt

Stanford University

Verified email at cs.stanford.edu

Machine learning, Statistics


Title	Authors	Venue	Cited by	Year
Measuring massive multitask language understanding	D Hendrycks, C Burns, S Basart, A Zou, M Mazeika, D Song, J Steinhardt	arXiv preprint arXiv:2009.03300, 2020	3403	2020
Concrete problems in AI safety	D Amodei, C Olah, J Steinhardt, P Christiano, J Schulman, D Mané	arXiv preprint arXiv:1606.06565, 2016	3168	2016
The many faces of robustness: A critical analysis of out-of-distribution generalization	D Hendrycks, S Basart, N Mu, S Kadavath, F Wang, E Dorundo, R Desai, ...	Proceedings of the IEEE/CVF international conference on computer vision …, 2021	1859	2021
Natural adversarial examples	D Hendrycks, K Zhao, S Basart, J Steinhardt, D Song	Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021	1671	2021
Measuring mathematical problem solving with the MATH dataset	D Hendrycks, C Burns, S Kadavath, A Arora, S Basart, E Tang, D Song, ...	arXiv preprint arXiv:2103.03874, 2021	1425	2021
The malicious use of artificial intelligence: Forecasting, prevention, and mitigation	M Brundage, S Avin, J Clark, H Toner, P Eckersley, B Garfinkel, A Dafoe, ...	arXiv preprint arXiv:1802.07228, 2018	1243	2018
Certified defenses against adversarial examples	A Raghunathan, J Steinhardt, P Liang	arXiv preprint arXiv:1801.09344, 2018	1147	2018
Certified defenses for data poisoning attacks	J Steinhardt, PWW Koh, PS Liang	Advances in neural information processing systems 30, 2017	957	2017
Jailbroken: How does LLM safety training fail?	A Wei, N Haghtalab, J Steinhardt	Advances in Neural Information Processing Systems 36, 80079-80110, 2023	827	2023
Measuring coding challenge competence with APPS	D Hendrycks, S Basart, S Kadavath, M Mazeika, A Arora, E Guo, C Burns, ...	arXiv preprint arXiv:2105.09938, 2021	574	2021
Semidefinite relaxations for certifying robustness to adversarial examples	A Raghunathan, J Steinhardt, PS Liang	Advances in neural information processing systems 31, 2018	514	2018
Scaling out-of-distribution detection for real-world settings	D Hendrycks, S Basart, M Mazeika, A Zou, J Kwon, M Mostajabi, ...	arXiv preprint arXiv:1911.11132, 2019	509	2019
Aligning AI with shared human values	D Hendrycks, C Burns, S Basart, A Critch, J Li, D Song, J Steinhardt	arXiv preprint arXiv:2008.02275, 2020	505	2020
Interpretability in the wild: a circuit for indirect object identification in GPT-2 small	K Wang, A Variengien, A Conmy, B Shlegeris, J Steinhardt	arXiv preprint arXiv:2211.00593, 2022	429	2022
Troubling Trends in Machine Learning Scholarship: Some ML papers suffer from flaws that could mislead the public and stymie future research.	ZC Lipton, J Steinhardt	Queue 17 (1), 45-77, 2019	389	2019
Progress measures for grokking via mechanistic interpretability	N Nanda, L Chan, T Lieberum, J Smith, J Steinhardt	arXiv preprint arXiv:2301.05217, 2023	360	2023
Sever: A robust meta-algorithm for stochastic optimization	I Diakonikolas, G Kamath, D Kane, J Li, J Steinhardt, A Stewart	International Conference on Machine Learning, 1596-1606, 2019	350	2019
Unsolved problems in ML safety	D Hendrycks, N Carlini, J Schulman, J Steinhardt	arXiv preprint arXiv:2109.13916, 2021	348	2021
SONYC: A system for monitoring, analyzing, and mitigating urban noise pollution	JP Bello, C Silva, O Nov, RL Dubois, A Arora, J Salamon, C Mydlarz, ...	Communications of the ACM 62 (2), 68-77, 2019	348	2019
Learning from untrusted data	M Charikar, J Steinhardt, G Valiant	Proceedings of the 49th annual ACM SIGACT symposium on theory of computing …, 2017	347	2017

Articles 1–20
