Tian Xu
Verified email at lamda.nju.edu.cn - Homepage
Title
Cited by
Year
A survey on model-based reinforcement learning
FM Luo, T Xu, H Lai, XH Chen, W Zhang, Y Yu
Science China Information Sciences 67 (2), 121101, 2024
Cited by 141*, 2024
Error bounds of imitating policies and environments
T Xu, Z Li, Y Yu
Advances in Neural Information Processing Systems 33, 15737-15749, 2020
Cited by 116, 2020
ReMax: A simple, effective, and efficient reinforcement learning method for aligning large language models
Z Li, T Xu, Y Zhang, Z Lin, Y Yu, R Sun, ZQ Luo
Forty-first International Conference on Machine Learning, 2023
Cited by 44*, 2023
Error bounds of imitating policies and environments for reinforcement learning
T Xu, Z Li, Y Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (10), 6968 …, 2021
Cited by 40, 2021
Rethinking ValueDice: Does it really improve performance?
Z Li, T Xu, Y Yu, ZQ Luo
arXiv preprint arXiv:2202.02468, 2022
Cited by 14, 2022
Reward-consistent dynamics models are strongly generalizable for offline reinforcement learning
FM Luo, T Xu, X Cao, Y Yu
arXiv preprint arXiv:2310.05422, 2023
Cited by 12, 2023
Policy optimization in RLHF: The impact of out-of-preference data
Z Li, T Xu, Y Yu
arXiv preprint arXiv:2312.10584, 2023
Cited by 11, 2023
Imitation learning from imperfection: Theoretical justifications and algorithms
Z Li, T Xu, Z Qin, Y Yu, ZQ Luo
Advances in Neural Information Processing Systems 36, 18404-18443, 2023
Cited by 10, 2023
Provably efficient adversarial imitation learning with unknown transitions
T Xu, Z Li, Y Yu, ZQ Luo
Uncertainty in Artificial Intelligence, 2367-2378, 2023
Cited by 9, 2023
Policy optimization in RLHF: The impact of out-of-preference data
Z Li, T Xu, Y Yu
arXiv preprint arXiv:2312.10584, 2023
Cited by 8, 2023
Understanding adversarial imitation learning in small sample regime: A stage-coupled analysis
T Xu, Z Li, Y Yu, ZQ Luo
arXiv preprint arXiv:2208.01899, 2022
Cited by 6, 2022
Testing and evaluation of autonomous vehicles based on safety of the intended functionality
J Hu, T Xu, R Zhang
2021 6th International Conference on Transportation Information and Safety …, 2021
Cited by 6, 2021
On generalization of adversarial imitation learning and beyond
T Xu, Z Li, Y Yu, ZQ Luo
arXiv preprint arXiv:2106.10424, 2021
Cited by 5, 2021
Model gradient: unified model and policy learning in model-based reinforcement learning
C Jia, F Zhang, T Xu, JC Pang, Z Zhang, Y Yu
Frontiers of Computer Science 18 (4), 184339, 2024
Cited by 4, 2024
Policy Rehearsing: Training Generalizable Policies for Reinforcement Learning
C Jia, C Gao, H Yin, F Zhang, XH Chen, T Xu, L Yuan, Z Zhang, ZH Zhou, ...
The Twelfth International Conference on Learning Representations, 2024
Cited by 3, 2024
Entropic distribution matching in supervised fine-tuning of LLMs: Less overfitting and better diversity
Z Li, C Chen, T Xu, Z Qin, J Xiao, R Sun, ZQ Luo
arXiv preprint arXiv:2408.16673, 2024
Cited by 2, 2024
Theoretical analysis of offline imitation with supplementary dataset
Z Li, T Xu, Y Yu, ZQ Luo
arXiv preprint arXiv:2301.11687, 2023
Cited by 2, 2023
Validation on safety of the intended functionality of automated vehicles: Concept development
J Hu, T Xu, X Yan, R Zhang
SAE International Journal of Connected and Automated Vehicles 6 (12-06-01 …, 2022
Cited by 2, 2022
Nearly Minimax Optimal Adversarial Imitation Learning with Known and Unknown Transitions
T Xu, Z Li, Y Yu
CoRR abs/2106.10424, 2021
Cited by 2, 2021
Offline Imitation Learning without Auxiliary High-quality Behavior Data
JJ Shao, HS Shi, T Xu, LZ Guo, Y Yu, YF Li
Cited by 2
Articles 1–20