Yi Zeng
PhD Candidate, Virginia Tech
Verified email at vt.edu - Homepage
Title
Cited by
Year
Fine-tuning aligned language models compromises safety, even when users do not intend to!
X Qi, Y Zeng, T Xie, PY Chen, R Jia, P Mittal, P Henderson
ICLR 2024 Oral (top 1.2%), 2024
Cited by 462, 2024
Deep-Full-Range: A Deep Learning Based Network Encrypted Traffic Classification and Intrusion Detection Framework
Y Zeng, H Gu, W Wei, Y Guo
IEEE Access 7, 45182-45190, 2019
Cited by 272, 2019
Rethinking the Backdoor Attacks' Triggers: A Frequency Perspective
Y Zeng, W Park, ZM Mao, R Jia
International Conference on Computer Vision (ICCV), 2021
Cited by 259, 2021
DeepSweep: An evaluation framework for mitigating DNN backdoor attacks using data augmentation
H Qiu, Y Zeng, S Guo, T Zhang, M Qiu, B Thuraisingham
Proceedings of the 2021 ACM Asia Conference on Computer and Communications …, 2021
Cited by 238*, 2021
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
Y Zeng, H Lin, J Zhang, D Yang, R Jia, W Shi
ACL 2024 (Best Social Impact Award), 2024
Cited by 208, 2024
Adversarial Unlearning of Backdoors via Implicit Hypergradient
Y Zeng, S Chen, W Park, ZM Mao, M Jin, R Jia
The Tenth International Conference on Learning Representations (ICLR 2022), 2021
Cited by 203, 2021
Narcissus: A practical clean-label backdoor attack with limited information
Y Zeng, M Pan, HA Just, L Lyu, M Qiu, R Jia
ACM SIGSAC Conference on Computer and Communications Security (CCS), 2023
Cited by 192, 2023
CATER: Intellectual property protection on text generation APIs via conditional watermarks
X He, Q Xu, Y Zeng, L Lyu, F Wu, J Li, R Jia
Advances in Neural Information Processing Systems 35, 5431-5445, 2022
Cited by 82, 2022
A data augmentation-based defense method against adversarial attacks in neural networks
Y Zeng, H Qiu, G Memmi, M Qiu
Algorithms and Architectures for Parallel Processing: 20th International …, 2020
Cited by 77, 2020
DeepVCM: A deep learning based intrusion detection method in VANET
Y Zeng, M Qiu, D Zhu, Z Xue, J Xiong, M Liu
2019 IEEE 5th intl conference on big data security on cloud (BigDataSecurity …, 2019
Cited by 72, 2019
LAVA: Data Valuation without Pre-Specified Learning Algorithms
HA Just, F Kang, JT Wang, Y Zeng, M Ko, M Jin, R Jia
The Eleventh International Conference on Learning Representations (ICLR 2023), 2023
Cited by 63, 2023
Fine-tuning Is Not Enough: A Simple yet Effective Watermark Removal Attack for DNN Models
S Guo, T Zhang, H Qiu, Y Zeng, T Xiang, Y Liu
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Cited by 57*, 2021
Senior2Local: A machine learning based intrusion detection method for VANETs
Y Zeng, M Qiu, Z Ming, M Liu
Smart Computing and Communication: Third International Conference, SmartCom …, 2018
Cited by 57, 2018
An efficient preprocessing-based approach to mitigate advanced adversarial attacks
H Qiu, Y Zeng, Q Zheng, S Guo, T Zhang, H Li
IEEE Transactions on Computers 73 (3), 645-655, 2021
Cited by 45*, 2021
SORRY-Bench: Systematically evaluating large language model safety refusal behaviors
T Xie, X Qi, Y Zeng, Y Huang, UM Sehwag, K Huang, L He, B Wei, D Li, ...
ICLR 2025, 2025
Cited by 38, 2025
A safe harbor for AI evaluation and red teaming
S Longpre, S Kapoor, K Klyman, A Ramaswami, R Bommasani, ...
ICML 2024, 2024
Cited by 36, 2024
Introducing v0.5 of the AI Safety Benchmark from MLCommons
B Vidgen, A Agrawal, AM Ahmed, V Akinwande, N Al-Nuaimi, N Alfaraj, ...
arXiv preprint arXiv:2404.12241, 2024
Cited by 33, 2024
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content
Z Yuan, Z Xiong, Y Zeng, N Yu, R Jia, D Song, B Li
ICML 2024, 2024
Cited by 33, 2024
META-SIFT: How to Sift Out a Clean Data Subset in the Presence of Data Poisoning?
Y Zeng, M Pan, H Jahagirdar, M Jin, L Lyu, R Jia
USENIX Security Symposium, 2023
Cited by 31*, 2023
ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms
M Pan, Y Zeng, L Lyu, X Lin, R Jia
USENIX Security Symposium, 2023
Cited by 30, 2023