Counterfactual conservative Q learning for offline multi-agent reinforcement learning J Shao*, Y Qu*, C Chen, H Zhang, X Ji NeurIPS 2023, 2024 | 23 | 2024 |
Hokoff: real game dataset from Honor of Kings and its offline reinforcement learning benchmarks Y Qu, B Wang, J Shao, Y Jiang, C Chen, Z Ye, L Linc, Y Feng, L Lai, H Qin, ... NeurIPS 2023, 2024 | 8 | 2024 |
Complementary attention for multi-agent reinforcement learning J Shao, H Zhang, Y Qu, C Liu, S He, Y Jiang, X Ji ICML 2023, 2023 | 8 | 2023 |
LLM-Empowered State Representation for Reinforcement Learning B Wang*, Y Qu*, Y Jiang, J Shao, C Liu, W Yang, X Ji ICML 2024, 2024 | 5 | 2024 |
Offline reinforcement learning with OOD state correction and OOD action suppression Y Mao, C Wang, C Chen, Y Qu, X Ji NeurIPS 2024, 2024 | 4 | 2024 |
Robust fast adaptation from adversarially explicit task distribution generation C Wang, Y Lv, Y Mao, Y Qu, Y Xu, X Ji KDD 2025, 2024 | 4 | 2024 |
Choices are more important than efforts: LLM enables efficient multi-agent exploration Y Qu, B Wang, Y Jiang, J Shao, Y Mao, C Wang, C Liu, X Ji arXiv preprint arXiv:2410.02511, 2024 | 3 | 2024 |
Doubly Mild Generalization for Offline Reinforcement Learning Y Mao, Q Wang, Y Qu, Y Jiang, X Ji NeurIPS 2024, 2024 | 2 | 2024 |
Beyond Any-Shot Adaptation: Predicting Optimization Outcome for Robustness Gains without Extra Pay C Wang*, Z Xiao*, Y Mao*, Y Qu*, J Shen, Y Lv, X Ji arXiv preprint arXiv:2501.11039, 2025 | | 2025 |
Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning Y Qu, Y Jiang, B Wang, Y Mao, C Wang, C Liu, X Ji AAAI 2025 | | |
Consciousness-Aware Multi-Agent Reinforcement Learning J Shao, H Zhang, Y Qu, C Liu, S He, Y Jiang, X Ji | | |
HoK3v3: an Environment for Generalization in Heterogeneous Multi-agent Reinforcement Learning L Liu, J Shao, X Chen, Y Qu, B Wang, Z Ye, Y Tu, H Qin, YJ Feng, L Lai, ... | | |