Aaquib Syed
Title
Cited by
Year
Refusal in language models is mediated by a single direction
A Arditi*, O Obeso*, A Syed, D Paleka, N Panickssery, W Gurnee, ...
NeurIPS 2024, Poster, 2024
93* 2024
Attribution patching outperforms automated circuit discovery
A Syed, C Rager, A Conmy
BlackboxNLP at EMNLP 2024, 2023
45 2023
Machine learning with textural analysis of longitudinal multiparametric MRI and molecular subtypes accurately predicts pathologic complete response in patients with invasive …
A Syed, R Adam, T Ren, J Lu, T Maldjian, TQ Duong
PloS one 18 (1), e0280320, 2023
20 2023
Prune and tune: Improving efficient pruning techniques for massive language models
A Syed, PH Guo, V Sundarapandiyan
TinyPapers Workshop at ICLR 2023 (Notable - Top 6%), 2023
16* 2023
Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization
P Guo*, A Syed*, A Sheshadri, A Ewart, GK Dziugaite
arXiv preprint arXiv:2410.12949, 2024
6* 2024
Articles 1–5