Tinghao Xie
Title · Cited by · Year
Fine-tuning aligned language models compromises safety, even when users do not intend to!
X Qi, Y Zeng, T Xie, PY Chen, R Jia, P Mittal, P Henderson
ICLR 2024 (Oral), 2023
Cited by 413 · 2023
Revisiting the assumption of latent separability for backdoor defenses
X Qi, T Xie, Y Li, S Mahloujifar, P Mittal
ICLR 2023, 2022
Cited by 113* · 2022
Towards practical deployment-stage backdoor attack on deep neural networks
X Qi, T Xie, R Pan, J Zhu, Y Yang, K Bu
CVPR 2022 (Oral), 13347-13357, 2022
Cited by 69 · 2022
Assessing the brittleness of safety alignment via pruning and low-rank modifications
B Wei, K Huang, Y Huang, T Xie, X Qi, M Xia, P Mittal, M Wang, ...
ICML 2024, 2024
Cited by 67 · 2024
Towards a proactive ML approach for detecting backdoor poison samples
X Qi, T Xie, JT Wang, T Wu, S Mahloujifar, P Mittal
32nd USENIX Security Symposium (USENIX Security 23), 1685-1702, 2023
Cited by 46 · 2023
SORRY-Bench: Systematically evaluating large language model safety refusal behaviors
T Xie, X Qi, Y Zeng, Y Huang, UM Sehwag, K Huang, L He, B Wei, D Li, ...
ICLR 2025, 2024
Cited by 28 · 2024
AI Risk Management Should Incorporate Both Safety and Security
X Qi, Y Huang, Y Zeng, E Debenedetti, J Geiping, L He, K Huang, ...
arXiv preprint arXiv:2405.19524, 2024
Cited by 12 · 2024
Fantastic Copyrighted Beasts and How (Not) to Generate Them
L He, Y Huang, W Shi, T Xie, H Liu, Y Wang, L Zettlemoyer, C Zhang, ...
ICLR 2025, 2024
Cited by 9 · 2024
BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection
T Xie, X Qi, P He, Y Li, JT Wang, P Mittal
ICLR 2024, 2023
Cited by 5 · 2023
On evaluating the durability of safeguards for open-weight LLMs
X Qi, B Wei, N Carlini, Y Huang, T Xie, L He, M Jagielski, M Nasr, P Mittal, ...
ICLR 2025, 2024
Cited by 3 · 2024