Ziyu Wan
Verified email at sjtu.edu.cn
Title
Cited by
Year
Alphazero-like tree-search can guide large language model decoding and training
Z Wan, X Feng, M Wen, SM McAleer, Y Wen, W Zhang, J Wang
Forty-first International Conference on Machine Learning, 2024
Cited by: 83, Year: 2024
Malib: A parallel framework for population-based multi-agent reinforcement learning
M Zhou, Z Wan, H Wang, M Wen, R Wu, Y Wen, Y Yang, Y Yu, J Wang, ...
Journal of Machine Learning Research 24 (150), 1-12, 2023
Cited by: 59, Year: 2023
Neural auto-curricula in two-player zero-sum games
X Feng, O Slumbers, Z Wan, B Liu, S McAleer, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems 34, 3504-3517, 2021
Cited by: 52*, Year: 2021
Order matters: Agent-by-agent policy optimization
X Wang, Z Tian, Z Wan, Y Wen, J Wang, W Zhang
arXiv preprint arXiv:2302.06205, 2023
Cited by: 23, Year: 2023
On realization of intelligent decision-making in the real world: A foundation decision model perspective
Y Wen, Z Wan, M Zhou, S Hou, Z Cao, C Le, J Chen, Z Tian, W Zhang, ...
arXiv preprint arXiv:2212.12669, 2022
Cited by: 9, Year: 2022
Openr: An open source framework for advanced reasoning with large language models
J Wang, M Fang, Z Wan, M Wen, J Zhu, A Liu, Z Gong, Y Song, L Chen, ...
arXiv preprint arXiv:2410.09671, 2024
Cited by: 8, Year: 2024
Reinforcing LLM Agents via Policy Optimization with Action Decomposition
M Wen, Z Wan, J Wang, W Zhang, Y Wen
The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024
Cited by: 7*, Year: 2024
Natural language reinforcement learning
X Feng, Z Wan, H Fu, B Liu, M Yang, GA Koushik, Z Hu, Y Wen, J Wang
arXiv preprint arXiv:2411.14251, 2024
Cited by: 2, Year: 2024
Language Games as the Pathway to Artificial Superhuman Intelligence
Y Wen, Z Wan, S Zhang
arXiv preprint arXiv:2501.18924, 2025
Year: 2025