Abhay Sheshadri

Citations per year: 42 (2024), 18 (2025)


Aidan Ewart - Maths Undergrad @ University of Bristol - Verified email at bristol.ac.uk
Phillip Guo - University of Maryland - Verified email at umd.edu
Aengus Lynch - University College London, MATS - Verified email at ucl.ac.uk
Stephen Casper - PhD student, MIT - Verified email at mit.edu
Ethan Perez - Anthropic; New York University - Verified email at anthropic.com
Dylan Hadfield-Menell - Massachusetts Institute of Technology - Verified email at csail.mit.edu
Asa Cooper Stickland - Postdoctoral Researcher, New York University - Verified email at ed.ac.uk
Henry Sleight - Research Manager, Anthropic Fellows Program, Program Manager, Constellation - Verified email at constellation.org
Jannik Brinkmann - PhD student, University of Mannheim - Verified email at uni-mannheim.de
Aaquib Syed - MATS 5.0 | Student, University of Maryland - Verified email at umd.edu
Gintare Karolina Dziugaite - Google DeepMind - Verified email at google.com
Jacob Pfau - NYU - Verified email at nyu.edu
Alex Infanger - Graduate Student, Stanford University - Verified email at stanford.edu

Abhay Sheshadri - Verified email at gatech.edu


Title	Cited by	Year
Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs. A Sheshadri, A Ewart, P Guo, A Lynch, C Wu, V Hebbar, H Sleight, ... arXiv e-prints, arXiv:2407.15549, 2024	31*	2024
A mechanistic analysis of a transformer trained on a symbolic multi-step reasoning task. J Brinkmann, A Sheshadri, V Levoso*, P Swoboda, C Bartelt. ACL 2024 (Findings), 2024	14	2024
Eliciting Language Model Behaviors using Reverse Language Models. J Pfau, A Infanger, A Sheshadri*, A Panda, J Michael, C Huebner. NeurIPS SOLAR Workshop, 2023	9	2023
Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization. P Guo, A Syed, A Sheshadri, A Ewart, GK Dziugaite. arXiv preprint arXiv:2410.12949, 2024	6*	2024
Obfuscated Activations Bypass LLM Latent-Space Defenses. L Bailey, A Serrano, A Sheshadri, M Seleznyov, J Taylor, E Jenner, ... arXiv preprint arXiv:2412.09565, 2024		2024

