Wenzhe Li
Verified email at princeton.edu
Title
Cited by
Year
Rethinking goal-conditioned supervised learning and its connection to offline RL
R Yang, Y Lu, W Li, H Sun, M Fang, Y Du, X Li, L Han, C Zhang
arXiv preprint arXiv:2202.04478, 2022
Cited by: 77; Year: 2022
A survey on transformers in reinforcement learning
W Li, H Luo, Z Lin, C Zhang, Z Lu, D Ye
arXiv preprint arXiv:2301.03044, 2023
Cited by: 75; Year: 2023
Offline reinforcement learning with reverse model-based imagination
J Wang, W Li, H Jiang, G Zhu, S Li, C Zhang
Advances in Neural Information Processing Systems 34, 29420-29432, 2021
Cited by: 68; Year: 2021
LAPO: Latent-variable advantage-weighted policy optimization for offline reinforcement learning
X Chen, A Ghadirzadeh, T Yu, J Wang, AY Gao, W Li, L Bin, C Finn, ...
Advances in Neural Information Processing Systems 35, 36902-36913, 2022
Cited by: 55*; Year: 2022
Estimating high order gradients of the data distribution by denoising
C Meng, Y Song, W Li, S Ermon
Advances in Neural Information Processing Systems 34, 25359-25369, 2021
Cited by: 44; Year: 2021
Flow to control: Offline reinforcement learning with lossless primitive discovery
Y Yang, H Hu, W Li, S Li, J Yang, Q Zhao, C Zhang
Proceedings of the AAAI Conference on Artificial Intelligence 37 (9), 10843 …, 2023
Cited by: 16; Year: 2023
Tractable computation of expected kernels
W Li, Z Zeng, A Vergari, G Van den Broeck
Uncertainty in Artificial Intelligence, 1163-1173, 2021
Cited by: 9; Year: 2021
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations
K Huang, J Guo, Z Li, X Ji, J Ge, W Li, Y Guo, T Cai, H Yuan, R Wang, ...
arXiv preprint arXiv:2502.06453, 2025
Cited by: 2; Year: 2025
FightLadder: A benchmark for competitive multi-agent reinforcement learning
W Li, Z Ding, S Karten, C Jin
arXiv preprint arXiv:2406.02081, 2024
Cited by: 2; Year: 2024
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?
W Li, Y Lin, M Xia, C Jin
arXiv preprint arXiv:2502.00674, 2025
Year: 2025
Towards Principled Superhuman AI for Multiplayer Symmetric Games
J Ge, Y Wang, W Li, C Jin
arXiv e-prints, arXiv:2406.04201, 2024
Year: 2024
Articles 1–11