Zhiheng Xi
Verified email at m.fudan.edu.cn - Homepage
Title
Cited by
Year
The rise and potential of large language model based agents: A survey
Z Xi, W Chen, X Guo, W He, Y Ding, B Hong, M Zhang, J Wang, S Jin, ...
Science China Information Sciences 68 (2), 121101, 2025
Cited by: 755; Year: 2025
Delve into ppo: Implementation matters for stable rlhf
R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Y Zhou, ...
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023
Cited by: 139*; Year: 2023
Secrets of rlhf in large language models part ii: Reward modeling
B Wang, R Zheng, L Chen, Y Liu, S Dou, C Huang, W Shen, S Jin, E Zhou, ...
arXiv preprint arXiv:2401.06080, 2024
Cited by: 84*; Year: 2024
Self-polish: Enhance reasoning in large language models via problem refinement
Z Xi, S Jin, Y Zhou, R Zheng, S Gao, T Gui, Q Zhang, X Huang
EMNLP 2023 Findings, 2023
Cited by: 38; Year: 2023
LoRAMoE: Alleviating world knowledge forgetting in large language models via MoE-style plugin
S Dou, E Zhou, Y Liu, S Gao, W Shen, L Xiong, Y Zhou, X Wang, Z Xi, ...
Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024
Cited by: 22; Year: 2024
The rise and potential of large language model based agents: A survey. arXiv 2023
Z Xi, W Chen, X Guo, W He, Y Ding, B Hong, M Zhang, J Wang, S Jin, ...
arXiv preprint arXiv:2309.07864, 2023
Cited by: 22; Year: 2023
Towards understanding the capability of large language models on code clone detection: a survey
S Dou, J Shan, H Jia, W Deng, Z Xi, W He, Y Wu, T Gui, Y Liu, X Huang
arXiv preprint arXiv:2308.01191, 2023
Cited by: 20; Year: 2023
AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
Z Xi, Y Ding, W Chen, B Hong, H Guo, J Wang, D Yang, C Liao, X Guo, ...
arXiv preprint arXiv:2406.04151, 2024
Cited by: 19; Year: 2024
Loramoe: Revolutionizing mixture of experts for maintaining world knowledge in language model alignment
S Dou, E Zhou, Y Liu, S Gao, J Zhao, W Shen, Y Zhou, Z Xi, X Wang, ...
ACL 2024, 2023
Cited by: 19*; Year: 2023
Safety and Ethical Concerns of Large Language Models
Z Xi, R Zheng, T Gui
CCL 2023, 9-16, 2023
Cited by: 19*; Year: 2023
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
W Zhou, X Wang, L Xiong, H Xia, Y Gu, M Chai, F Zhu, C Huang, S Dou, ...
arXiv preprint arXiv:2403.12171, 2024
Cited by: 14; Year: 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Z Xi, W Chen, B Hong, S Jin, R Zheng, W He, Y Ding, S Liu, X Guo, ...
ICML 2024, 2024
Cited by: 12; Year: 2024
MouSi: Poly-Visual-Expert Vision-Language Models
X Fan, T Ji, C Jiang, S Li, S Jin, S Song, J Wang, B Hong, L Chen, ...
arXiv preprint arXiv:2401.17221, 2024
Cited by: 12; Year: 2024
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
S Dou, Y Liu, H Jia, L Xiong, E Zhou, J Shan, C Huang, W Shen, X Fan, ...
ACL 2024, 2024
Cited by: 11; Year: 2024
Efficient Adversarial Training with Robust Early-bird Tickets
Z Xi, R Zheng, T Gui, Q Zhang, X Huang
EMNLP 2022, 2022
Cited by: 10; Year: 2022
TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models
X Wang, Y Zhang, T Chen, S Gao, S Jin, X Yang, Z Xi, R Zheng, Y Zou, ...
arXiv preprint arXiv:2310.06762, 2023
Cited by: 9; Year: 2023
Connectivity Patterns are Task Embeddings
Z Xi, R Zheng, Y Zhang, XJ Huang, Z Wei, M Peng, M Sun, Q Zhang, T Gui
ACL 2023 Findings, 2023
Cited by: 5; Year: 2023
Improving generalization of alignment with human preferences through group invariant learning
R Zheng, W Shen, Y Hua, W Lai, S Dou, Y Zhou, Z Xi, X Wang, H Huang, ...
ICLR 2024, 2023
Cited by: 4; Year: 2023
Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision
Z Xi, D Yang, J Huang, J Tang, G Li, Y Ding, W He, B Hong, S Do, W Zhan, ...
arXiv preprint arXiv:2411.16579, 2024
Cited by: 3; Year: 2024
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data
H Xia, S Gao, Q Ge, Z Xi, Q Zhang, X Huang
arXiv preprint arXiv:2408.14874, 2024
Cited by: 3; Year: 2024
Articles 1–20