limao xiong
Verified email at m.fudan.edu.cn
Title
Cited by
Year
The rise and potential of large language model based agents: A survey
Z Xi, W Chen, X Guo, W He, Y Ding, B Hong, M Zhang, J Wang, S Jin, ...
Science China Information Sciences 68 (2), 121101, 2025
Cited by: 756; Year: 2025
Secrets of RLHF in large language models part I: PPO
R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Q Liu, ...
arXiv preprint arXiv:2307.04964, 2023
Cited by: 140*; Year: 2023
EasyJailbreak: A unified framework for jailbreaking large language models
W Zhou, X Wang, L Xiong, H Xia, Y Gu, M Chai, F Zhu, C Huang, S Dou, ...
arXiv preprint arXiv:2403.12171, 2024
Cited by: 36; Year: 2024
MINER: Improving out-of-vocabulary named entity recognition from an information theoretic perspective
X Wang, S Dou, L Xiong, Y Zou, Q Zhang, T Gui, L Qiao, Z Cheng, ...
Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022
Cited by: 32; Year: 2022
The rise and potential of large language model based agents: A survey, 2023
Z Xi, W Chen, X Guo, W He, Y Ding, B Hong, M Zhang, J Wang, S Jin, ...
arXiv preprint arXiv:2309.07864, 2023
Cited by: 29; Year: 2023
The rise and potential of large language model based agents: A survey. arXiv 2023
Z Xi, W Chen, X Guo, W He, Y Ding, B Hong, M Zhang, J Wang, S Jin, ...
arXiv preprint arXiv:2309.07864, 2023
Cited by: 22; Year: 2023
Zhiheng Xi
W Zhou, X Wang, L Xiong, H Xia, Y Gu, M Chai, F Zhu, C Huang, S Dou
Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun …, 2024
Cited by: 18; Year: 2024
LoRAMoE: Alleviate world knowledge forgetting in large language models via MoE-style plugin
S Dou, E Zhou, Y Liu, S Gao, J Zhao, W Shen, Y Zhou, Z Xi, X Wang, ...
arXiv preprint arXiv:2312.09979, 2023
Cited by: 16; Year: 2023
StepCoder: Improve code generation with reinforcement learning from compiler feedback
S Dou, Y Liu, H Jia, L Xiong, E Zhou, W Shen, J Shan, C Huang, X Wang, ...
arXiv preprint arXiv:2402.01391, 2024
Cited by: 12; Year: 2024
Zhiheng Xi, Xiaoran Fan, et al. 2024. LoRAMoE: Alleviating world knowledge forgetting in large language models via MoE-style plugin
S Dou, E Zhou, Y Liu, S Gao, W Shen, L Xiong, Y Zhou, X Wang
Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024
Cited by: 10; Year: 2024
Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, and Tao Gui. Stepcoder: Improve code generation with reinforcement learning from compiler feedback. CoRR …
S Dou, Y Liu, H Jia, L Xiong
arXiv preprint arXiv:2402.01391
Cited by: 8
Zhiheng Xi, et al. 2024. StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
S Dou, Y Liu, H Jia, L Xiong, E Zhou, J Shan, C Huang, W Shen, X Fan
arXiv preprint arXiv:2402.01391, 2024
Cited by: 7; Year: 2024
The rise and potential of large language model based agents: a survey. CoRR abs/2309.07864 (2023)
Z Xi, W Chen, X Guo, W He, Y Ding, B Hong, M Zhang, J Wang, S Jin, ...
Cited by: 7; Year: 2023
Delve into PPO: Implementation matters for stable RLHF
R Zheng, S Dou, S Gao, Y Hua, W Shen, B Wang, Y Liu, S Jin, Y Zhou, ...
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023
Cited by: 6; Year: 2023
A Confidence-based Partial Label Learning Model for Crowd-Annotated Named Entity Recognition
L Xiong, J Zhou, Q Zhu, X Wang, Y Wu, Q Zhang, T Gui, X Huang, J Ma, ...
Findings of the Association for Computational Linguistics: ACL 2023, 2023
Cited by: 5; Year: 2023
MetaRM: Shifted distributions alignment via meta-learning
S Dou, Y Liu, E Zhou, T Li, H Jia, L Xiong, X Zhao, J Ye, R Zheng, T Gui, ...
arXiv preprint arXiv:2405.00438, 2024
Cited by: 2; Year: 2024
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
E Zhou, G Zheng, B Wang, Z Xi, S Dou, R Bao, W Shen, L Xiong, J Fan, ...
arXiv preprint arXiv:2410.09893, 2024
Cited by: 1; Year: 2024
Multi-Programming Language Sandbox for LLMs
S Dou, J Zhang, J Zang, Y Tao, W Zhou, H Jia, S Liu, Y Yang, Z Xi, S Wu, ...
arXiv preprint arXiv:2410.23074, 2024
Year: 2024
Articles 1–18