Ziyu Wan
Verified email at sjtu.edu.cn
Title
Cited by
Year
Alphazero-like tree-search can guide large language model decoding and training
Z Wan, X Feng, M Wen, SM McAleer, Y Wen, W Zhang, J Wang
Forty-first International Conference on Machine Learning, 2024
Cited by: 110; Year: 2024
Malib: A parallel framework for population-based multi-agent reinforcement learning
M Zhou, Z Wan, H Wang, M Wen, R Wu, Y Wen, Y Yang, Y Yu, J Wang, ...
Journal of Machine Learning Research 24 (150), 1-12, 2023
Cited by: 62; Year: 2023
Neural auto-curricula in two-player zero-sum games
X Feng, O Slumbers, Z Wan, B Liu, S McAleer, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems 34, 3504-3517, 2021
Cited by: 50*; Year: 2021
Order matters: Agent-by-agent policy optimization
X Wang, Z Tian, Z Wan, Y Wen, J Wang, W Zhang
arXiv preprint arXiv:2302.06205, 2023
Cited by: 25; Year: 2023
Openr: An open source framework for advanced reasoning with large language models
J Wang, M Fang, Z Wan, M Wen, J Zhu, A Liu, Z Gong, Y Song, L Chen, ...
arXiv preprint arXiv:2410.09671, 2024
Cited by: 13; Year: 2024
On realization of intelligent decision-making in the real world: A foundation decision model perspective
Y Wen, Z Wan, M Zhou, S Hou, Z Cao, C Le, J Chen, Z Tian, W Zhang, ...
arXiv preprint arXiv:2212.12669, 2022
Cited by: 11; Year: 2022
Reinforcing LLM Agents via Policy Optimization with Action Decomposition
M Wen, Z Wan, J Wang, W Zhang, Y Wen
The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024
Cited by: 10*; Year: 2024
Natural language reinforcement learning
X Feng, Z Wan, H Fu, B Liu, M Yang, GA Koushik, Z Hu, Y Wen, J Wang
arXiv preprint arXiv:2411.14251, 2024
Cited by: 3; Year: 2024
Language Games as the Pathway to Artificial Superhuman Intelligence
Y Wen, Z Wan, S Zhang
arXiv preprint arXiv:2501.18924, 2025
Year: 2025
Articles 1–9