Wes Gurnee

Cited by

	All	Since 2020
Citations	656	656
h-index	10	10
i10-index	10	10

Citations per year: 2022: 12, 2023: 85, 2024: 458, 2025: 98

Public access

2 articles available
0 articles not available

Based on funding mandates

Co-authors

Neel Nanda - Mechanistic Interpretability Team Lead, Google DeepMind - Verified email at deepmind.com
Dimitris Bertsimas - Boeing Professor of Operations Research, MIT - Verified email at mit.edu
Max Tegmark - Professor of Physics, MIT - Verified email at mit.edu
Nina Panickssery - Anthropic - Verified email at anthropic.com
Matthew Pauly - Undergraduate Student, Harvard University - Verified email at college.harvard.edu
David Shmoys - Professor of Operations Research & Information Engineering and of Computer Science - Verified email at cs.cornell.edu
Isaac Liao - Carnegie Mellon University - Verified email at andrew.cmu.edu
Josh Engels - PhD Student, MIT - Verified email at mit.edu
Zifan Carl Guo - MIT - Verified email at mit.edu
Eric J. Michaud - Graduate student, MIT - Verified email at mit.edu
Nikhil Garg - Assistant Professor, Cornell Tech - Verified email at cornell.edu
David Rothschild - Microsoft Research - Verified email at researchdmr.com
Lovis Heindrich - Max Planck Institute for Intelligent Systems - Verified email at tuebingen.mpg.de
Andy Arditi

Wes Gurnee

Anthropic

Verified email at mit.edu

Mechanistic Interpretability, AI Alignment, Optimization, Governance


Title	Authors	Venue	Cited by	Year
Language models represent space and time	W Gurnee, M Tegmark	ICLR 2024, 2023	198*	2023
Finding Neurons in a Haystack: Case Studies with Sparse Probing	W Gurnee, N Nanda, M Pauly, K Harvey, D Troitskii, D Bertsimas	Transactions of Machine Learning Research (TMLR), 2023	146*	2023
Refusal in language models is mediated by a single direction	A Arditi, O Obeso, A Syed, D Paleka, N Panickssery, W Gurnee, N Nanda	arXiv preprint arXiv:2406.11717, 2024	90*	2024
Learning sparse nonlinear dynamics via mixed-integer optimization	D Bertsimas, W Gurnee	Nonlinear Dynamics 111 (7), 6585-6604, 2023	47	2023
Not all language model features are linear	J Engels, EJ Michaud, I Liao, W Gurnee, M Tegmark	arXiv preprint arXiv:2405.14860, 2024	44*	2024
Fairmandering: A column generation heuristic for fairness-optimized political districting	W Gurnee, DB Shmoys	SIAM Conference on Applied and Computational Discrete Algorithms (ACDA21), 88-99, 2021	42	2021
Universal neurons in GPT2 language models	W Gurnee, T Horsley, ZC Guo, TR Kheirkhah, Q Sun, W Hathaway, ...	Transactions of Machine Learning Research (TMLR), 2024	28*	2024
The Remarkable Robustness of LLMs: Stages of Inference?	V Lad, W Gurnee, M Tegmark	arXiv preprint arXiv:2406.19384, 2024	21*	2024
Combatting gerrymandering with social choice: The design of multi-member districts	N Garg, W Gurnee, D Rothschild, D Shmoys	Proceedings of the 23rd ACM Conference on Economics and Computation, 560-561, 2022	15	2022
Confidence regulation neurons in language models	A Stolfo, B Wu, W Gurnee, Y Belinkov, X Song, M Sachan, N Nanda	Advances in Neural Information Processing Systems 37, 125019-125049, 2025	10*	2025
SAE reconstruction errors are (empirically) pathological	W Gurnee	AI Alignment Forum, 16, 2024	9*	2024
Training Dynamics of Contextual N-Grams in Language Models	L Quirke, L Heindrich, W Gurnee, N Nanda	NeurIPS 2023 Workshop on Attributing Model Behavior at Scale, 2023	4	2023
Multilevel interpretability of artificial neural networks: leveraging framework and methods from neuroscience	Z He, J Achterberg, K Collins, K Nejad, D Akarca, Y Yang, W Gurnee, ...	arXiv preprint arXiv:2408.12664, 2024	1	2024
Scalable approximations of capacitated k-medians for political districting	W Gurnee	Technical report, Cornell University, Ithaca, United States, 2020	1	2020
