Chenlu Ye

Cited by

	All	Since 2020
Citations	210	210
h-index	5	5
i10-index	4	4

Citations per year: 2023: 14, 2024: 176, 2025: 20

Co-authors

Tong Zhang, UIUC. Verified email at tongzhang-ml.org
Wei Xiong, Computer Science, University of Illinois Urbana-Champaign. Verified email at illinois.edu
Nan Jiang, Associate Professor of Computer Science, UIUC. Verified email at illinois.edu
Hanze Dong, Salesforce Research. Verified email at salesforce.com
Quanquan Gu, Associate Professor of Computer Science, UCLA. Verified email at cs.ucla.edu
Han Zhong, Peking University. Verified email at stu.pku.edu.cn
Heng Ji, Professor, Siebel School of Computing and Data Science, AICE Director, UIUC, Amazon Scholar. Verified email at illinois.edu
Ziqi Wang, University of Illinois. Verified email at illinois.edu
Yuheng Zhang, UIUC. Verified email at illinois.edu
Rui Yang, University of Illinois Urbana-Champaign. Verified email at illinois.edu
Jiafan He, PhD student, Department of Computer Science, UCLA. Verified email at ucla.edu
Yong Lin, Princeton University. Verified email at princeton.edu
Chen Liu, Hong Kong University of Science and Technology. Verified email at connect.ust.hk
Qing Lian, HKUST. Verified email at connect.ust.hk
Yuan Yao, Institute of Physics, Chinese Academy of Science. Verified email at iphy.ac.cn
Jianqing Fan, Professor of Statistics, Professor of Finance, Princeton University. Verified email at princeton.edu
Heyang Zhao, UCLA. Verified email at cs.ucla.edu
Yuan YAO, Hong Kong University of Science and Technology. Verified email at ust.hk
Zhuoran Yang, Yale University. Verified email at yale.edu
Zhaoran Wang, Associate Professor at Northwestern University. Verified email at northwestern.edu

Chenlu Ye

Computer Science, University of Illinois Urbana-Champaign

Verified email at illinois.edu

Reinforcement learning, post-training, robust learning


Title	Cited by	Year
Iterative preference learning from human feedback: Bridging theory and practice for RLHF under KL-constraint. W Xiong, H Dong, C Ye, Z Wang, H Zhong, H Ji, N Jiang, T Zhang. Forty-first International Conference on Machine Learning, 2024	122*	2024
Online iterative reinforcement learning from human feedback with general preference model. C Ye, W Xiong, Y Zhang, H Dong, N Jiang, T Zhang. The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024	30*	2024
Corruption-robust algorithms with uncertainty weighting for nonlinear contextual bandits and Markov decision processes. C Ye, W Xiong, Q Gu, T Zhang. International Conference on Machine Learning, 39834-39863, 2023	28	2023
Corruption-Robust Offline Reinforcement Learning with General Function Approximation. C Ye, R Yang, Q Gu, T Zhang. Neural Information Processing Systems, 2023	19	2023
Towards robust model-based reinforcement learning against adversarial corruption. C Ye, J He, Q Gu, T Zhang. arXiv preprint arXiv:2402.08991, 2024	5	2024
Optimal sample selection through uncertainty estimation and its application in deep learning. Y Lin, C Liu, C Ye, Q Lian, Y Yao, T Zhang. arXiv preprint arXiv:2309.02476, 2023	4	2023
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF. H Zhao, C Ye, Q Gu, T Zhang. arXiv preprint arXiv:2411.04625, 2024	1	2024
Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks. J Fan, Z Wang, Z Yang, C Ye. arXiv preprint arXiv:2311.13180, 2023	1	2023
Catoni Contextual Bandits are Robust to Heavy-tailed Rewards. C Ye, Y Jin, A Agarwal, T Zhang. arXiv preprint arXiv:2502.02486, 2025		2025
