LLM-empowered state representation for reinforcement learning

B Wang, Y Qu, Y Jiang, J Shao, C Liu, W Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Conventional state representations in reinforcement learning often omit critical task-related
details, presenting a significant challenge for value networks in establishing accurate …

Doubly mild generalization for offline reinforcement learning

Y Mao, Q Wang, Y Qu, Y Jiang, X Ji - arXiv preprint arXiv:2411.07934, 2024 - arxiv.org
Offline Reinforcement Learning (RL) suffers from the extrapolation error and value
overestimation. From a generalization perspective, this issue can be attributed to the over …

Theoretical investigations and practical enhancements on tail task risk minimization in meta learning

Y Lv, Q Wang, D Liang, Z Xie - arXiv preprint arXiv:2410.22788, 2024 - arxiv.org
Meta learning is a promising paradigm in the era of large models and task distributional
robustness has become an indispensable consideration in real-world scenarios. Recent …

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning

Y Qu, Y Jiang, B Wang, Y Mao, C Wang, C Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement learning (RL) often encounters delayed and sparse feedback in real-world
applications, even with only episodic rewards. Previous approaches have made some …

Offline Fictitious Self-Play for Competitive Games

J Chen, W Xie, W Zhang, Y Wen - arXiv preprint arXiv:2403.00841, 2024 - arxiv.org
Offline Reinforcement Learning (RL) has received significant interest due to its ability to
improve policies in previously collected datasets without online interactions. Despite its …

A Survey of Games Based on Multi-Agent Reinforcement Learning

李艺春, 刘泽娇, 洪艺天, 王继超, 王健瑞, 李毅, 唐漾 - Acta Automatica Sinica, 2024 - aas.net.cn
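Multi-agent reinforcement learning, an interdisciplinary research area spanning game theory, control theory, and multi-agent learning, is a frontier direction in multi-agent systems research.
It gives agents the ability to complete diverse tasks through interaction and decision-making in dynamic, multi-dimensional, complex environments …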

Enhancing Decision-Making in Offline Reinforcement Learning: Adaptive, Multi-Agent, and Online Perspectives

Y Zhang - 2024 - ses.library.usyd.edu.au
Inspired by the successful application of large models in natural language processing and
computer vision, both the research community and industry have increasingly focused on …