Muning Wen
Verified email at sjtu.edu.cn
Title
Cited by
Year
Trust region policy optimisation in multi-agent reinforcement learning
JG Kuba, R Chen, M Wen, Y Wen, F Sun, J Wang, Y Yang
10th International Conference on Learning Representations, 2021
Cited by: 281, Year: 2021
Multi-agent reinforcement learning is a sequence modeling problem
M Wen, J Kuba, R Lin, W Zhang, Y Wen, J Wang, Y Yang
Advances in Neural Information Processing Systems 35, 16509-16521, 2022
Cited by: 215, Year: 2022
Offline pre-trained multi-agent decision transformer
L Meng, M Wen, C Le, X Li, D Xing, W Zhang, Y Wen, H Zhang, J Wang, ...
Machine Intelligence Research 20 (2), 233-248, 2023
Cited by: 117, Year: 2023
Alphazero-like tree-search can guide large language model decoding and training
X Feng, Z Wan, M Wen, Y Wen, W Zhang, J Wang
ICML 2024, 2023
Cited by: 109, Year: 2023
Settling the variance of multi-agent policy gradients
JG Kuba, M Wen, L Meng, H Zhang, D Mguni, J Wang, Y Yang
Advances in Neural Information Processing Systems 34, 13458-13470, 2021
Cited by: 71, Year: 2021
Malib: A parallel framework for population-based multi-agent reinforcement learning
M Zhou, Z Wan, H Wang, M Wen, R Wu, Y Wen, Y Yang, W Zhang, ...
JMLR, 2021
Cited by: 62, Year: 2021
Multi-agent constrained policy optimisation
S Gu, JG Kuba, M Wen, R Chen, Z Wang, Z Tian, J Wang, A Knoll, Y Yang
arXiv preprint arXiv:2110.02793, 2021
Cited by: 56, Year: 2021
Large sequence models for sequential decision-making: a survey
M Wen, R Lin, H Wang, Y Yang, Y Wen, L Mai, J Wang, H Zhang, ...
Frontiers of Computer Science 17 (6), 176349, 2023
Cited by: 38, Year: 2023
Openr: An open source framework for advanced reasoning with large language models
J Wang, M Fang, Z Wan, M Wen, J Zhu, A Liu, Z Gong, Y Song, L Chen, ...
arXiv preprint arXiv:2410.09671, 2024
Cited by: 15, Year: 2024
Reinforcing LLM Agents via Policy Optimization with Action Decomposition
M Wen, Z Wan, J Wang, W Zhang, Y Wen
The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024
Cited by: 10*, Year: 2024
Hammer: Robust function-calling for on-device language models via function masking
Q Lin, M Wen, Q Peng, G Nie, J Liao, J Wang, X Mo, J Zhou, C Cheng, ...
The Thirteenth International Conference on Learning Representations, 2024
Cited by: 6, Year: 2024
Entropy-Regularized Token-Level Policy Optimization for Large Language Models
M Wen, C Deng, J Wang, W Zhang, Y Wen
arXiv preprint arXiv:2402.06700, 2024
Cited by: 5, Year: 2024
RoMAT: Role-based multi-agent transformer for generalizable heterogeneous cooperation
D Wang, F Zhong, M Wen, M Li, Y Peng, T Li, Y Yang
Neural Networks, 106129, 2024
Cited by: 5, Year: 2024
Safe multiagent learning with soft constrained policy optimization in real robot control
S Gu, D Huang, M Wen, G Chen, A Knoll
IEEE Transactions on Industrial Informatics, 2024
Cited by: 4, Year: 2024
Autonomous goal detection and cessation in reinforcement learning: A case study on source term estimation
Y Shi, M Wen, Q Zhang, W Zhang, C Liu, W Liu
arXiv preprint arXiv:2409.09541, 2024
Cited by: 3, Year: 2024
TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision
R Zhou, Y Yang, M Wen, Y Wen, W Wang, C Xi, G Xu, Y Yu, W Zhang
Proceedings of the 47th International ACM SIGIR Conference on Research and …, 2024
Cited by: 3, Year: 2024
Hammerbench: Fine-grained function-calling evaluation in real mobile device scenarios
J Wang, J Zhou, M Wen, X Mo, H Zhang, Q Lin, C Jin, X Wang, W Zhang, ...
arXiv preprint arXiv:2412.16516, 2024
Cited by: 1, Year: 2024
Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement
M Wen, J Liao, C Deng, J Wang, W Zhang, Y Wen
arXiv preprint arXiv:2402.06700, 2024
Cited by: 1, Year: 2024
Robust gymnasium: A unified modular benchmark for robust reinforcement learning
S Gu, L Shi, M Wen, M Jin, E Mazumdar, Y Chi, A Wierman, C Spanos
The Thirteenth International Conference on Learning Representations, 2024
Cited by: 1, Year: 2024
PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning
K Hu, M Wen, X Wang, S Zhang, Y Shi, M Li, M Li, Y Wen
arXiv preprint arXiv:2502.16496, 2025
Year: 2025