Kaijie Zhu
Verified email at ucsb.edu - Homepage
Title
Cited by
Year
A survey on evaluation of large language models
Y Chang, X Wang, J Wang, Y Wu, L Yang, K Zhu, H Chen, X Yi, C Wang, ...
ACM Transactions on Intelligent Systems and Technology (TIST) 15 (3), 1-45, 2024
Cited by: 2191 | Year: 2024
PromptRobust: Towards evaluating the robustness of large language models on adversarial prompts
K Zhu, J Wang, J Zhou, Z Wang, H Chen, Y Wang, L Yang, W Ye, Y Zhang, ...
CCS 2024 LAMPS Workshop, 2023
Cited by: 270* | Year: 2023
The good, the bad, and why: Unveiling emotions in generative AI
C Li, J Wang, Y Zhang, K Zhu, X Wang, W Hou, J Lian, F Luo, Q Yang, ...
International Conference on Machine Learning (ICML), 2023
Cited by: 174* | Year: 2023
CompeteAI: Understanding the Competition Dynamics of Large Language Model-based Agents
Q Zhao, J Wang, Y Zhang, Y Jin, K Zhu, H Chen, X Xie
International Conference on Machine Learning (ICML), Oral, 2024
Cited by: 62* | Year: 2024
DyVal: Dynamic evaluation of large language models for reasoning tasks
K Zhu, J Chen, J Wang, NZ Gong, D Yang, X Xie
International Conference on Learning Representations (ICLR), Spotlight, 2024
Cited by: 46* | Year: 2024
PromptBench: A unified library for evaluation of large language models
K Zhu, Q Zhao, H Chen, J Wang, X Xie
JMLR MLOSS Track, 2023
Cited by: 28 | Year: 2023
Improving generalization of adversarial training via robust critical fine-tuning
K Zhu, X Hu, J Wang, X Xie, G Yang
International Conference on Computer Vision (ICCV), 2023
Cited by: 23 | Year: 2023
Dynamic Evaluation of Large Language Models by Meta Probing Agents
K Zhu, J Wang, Q Zhao, R Xu, X Xie
International Conference on Machine Learning (ICML), 2024
Cited by: 18* | Year: 2024
AgentReview: Exploring Peer Review Dynamics with LLM Agents
Y Jin, Q Zhao, Y Wang, H Chen, K Zhu, Y Xiao, J Wang
EMNLP, Oral, 2024
Cited by: 16 | Year: 2024
NPHardEval4V: A Dynamic Reasoning Benchmark of Multimodal Large Language Models
L Fan, W Hua, X Li, K Zhu, M Jin, L Li, H Ling, J Chi, J Wang, X Ma, ...
arXiv preprint arXiv:2403.01777, 2024
Cited by: 8 | Year: 2024
Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities
W Hua, K Zhu, L Li, L Fan, S Lin, M Jin, H Xue, Z Li, JD Wang, Y Zhang
arXiv preprint arXiv:2406.02787, 2024
Cited by: 4 | Year: 2024
MELON: Indirect Prompt Injection Defense via Masked Re-execution and Tool Comparison
K Zhu, X Yang, J Wang, W Guo, WY Wang
arXiv preprint arXiv:2502.05174, 2025
Year: 2025
Flatter Minima of Loss Landscapes Correspond with Strong Corruption Robustness
L Zhong, K Zhu, G Yang
International Conference on Pattern Recognition, 314-328, 2024
Year: 2024
Articles 1–13