Zeming Wei
Undergraduate, Peking University
Verified email at stu.pku.edu.cn
Title
Cited by
Year
Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations
Z Wei, Y Wang, A Li, Y Mo, Y Wang
arXiv preprint arXiv:2310.06387, 2023
Cited by: 205; Year: 2023
CFA: Class-wise Calibrated Fair Adversarial Training
Z Wei, Y Wang, Y Guo, Y Wang
CVPR 2023, 2023
Cited by: 65; Year: 2023
Jatmo: Prompt injection defense by task-specific finetuning
J Piet, M Alrashed, C Sitawarin, S Chen, Z Wei, E Sun, ..., D Wagner
ESORICS 2024, 2024
Cited by: 51; Year: 2024
Fight back against jailbreaking via prompt adversarial tuning
Y Mo, Y Wang, Z Wei, Y Wang
NeurIPS 2024, 2024
Cited by: 22*; Year: 2024
Boosting Jailbreak Attack with Momentum
Y Zhang, Z Wei(✉️)
ICASSP 2025, 2024
Cited by: 18; Year: 2024
Sharpness-aware minimization alone can improve adversarial robustness
Z Wei(✉️), J Zhu, Y Zhang
ICML 2023 Workshop on Adversarial Machine Learning, 2023
Cited by: 17*; Year: 2023
Architecture Matters: Uncovering Implicit Mechanisms in Graph Contrastive Learning
X Guo, Y Wang, Z Wei, Y Wang
NeurIPS 2023, 2023
Cited by: 15; Year: 2023
A Theoretical Understanding of Self-Correction through In-context Alignment
Y Wang, Y Wu, Z Wei, S Jegelka, Y Wang
NeurIPS 2024, 2024
Cited by: 11; Year: 2024
Weighted Automata Extraction and Explanation of Recurrent Neural Networks for Natural Language Tasks
Z Wei, X Zhang, Y Zhang, M Sun
Journal of Logical and Algebraic Methods in Programming 136, 100907, 2023
Cited by: 11; Year: 2023
Extracting weighted finite automata from recurrent neural networks for natural languages
Z Wei, X Zhang, M Sun
ICFEM 2022, 2022
Cited by: 10; Year: 2022
On the Duality Between Sharpness-Aware Minimization and Adversarial Training
Y Zhang, H He, J Zhu, H Chen, Y Wang, Z Wei(✉️)
ICML 2024, 2024
Cited by: 9; Year: 2024
Using Z3 for Formal Modeling and Verification of FNN Global Robustness
Y Zhang, Z Wei, X Zhang, M Sun
arXiv preprint arXiv:2304.10558, 2023
Cited by: 7; Year: 2023
Adversarial Representation Engineering: A General Model Editing Framework for Large Language Models
Y Zhang, Z Wei, J Sun, M Sun
NeurIPS 2024, 2024
Cited by: 5*; Year: 2024
Exploring the Robustness of In-Context Learning with Noisy Labels
C Cheng, X Yu, H Wen, J Sun, G Yue, Y Zhang, Z Wei(✉️)
ICASSP 2025, 2024
Cited by: 4; Year: 2024
Automata Extraction from Transformers
Y Zhang, Z Wei, M Sun
arXiv preprint arXiv:2406.05564, 2024
Cited by: 1; Year: 2024
Identifying and Understanding Cross-Class Features in Adversarial Training
Z Wei, Y Guo, Y Wang
OpenReview preprint, 2023
Cited by: 1*; Year: 2023
Towards the Worst-case Robustness of Large Language Models
H Chen, Y Dong, Z Wei, H Su, J Zhu
arXiv preprint arXiv:2501.19040, 2025
Year: 2025
MILE: A Mutation Testing Framework of In-Context Learning Systems
Z Wei, Y Zhang, M Sun
SETTA 2024, 2024
Year: 2024