Hao Sun

Cited by

	All	Since 2020
Citations	719	718
h-index	14	14
i10-index	18	18

Citations per year:
Year	Citations
2020	9
2021	39
2022	105
2023	206
2024	327
2025	31

Open access

4 articles available
0 articles not available
Based on funding mandates

Co-authors

Mihaela van der Schaar - University of Cambridge, The Alan Turing Institute - Verified email at ee.ucla.edu
Bolei Zhou - Assistant Professor at UCLA - Verified email at ucla.edu
Qianggang Ding - University of Montreal / Mila - Quebec AI Institute - Verified email at umontreal.ca
Rui Yang - University of Illinois Urbana-Champaign - Verified email at illinois.edu
Bo Dai - The University of Hong Kong - Verified email at hku.hk
Alihan Hüyük - Harvard University - Verified email at seas.harvard.edu
Meng Fang - University of Liverpool - Verified email at liverpool.ac.uk
Dahua Lin - The Chinese University of Hong Kong - Verified email at ie.cuhk.edu.hk
Ziping Xu - Postdoc Fellow at Harvard University - Verified email at fas.harvard.edu
Zhenghao Peng - University of California, Los Angeles - Verified email at cs.ucla.edu
Boris van Breugel - Senior ML Researcher, Qualcomm | PhD candidate, University of Cambridge - Verified email at cam.ac.uk
Samuel Holt - University of Cambridge - Verified email at cam.ac.uk
Xiaoteng Ma (马骁腾) - Dept. Automation, Tsinghua University, Beijing, China - Verified email at mails.tsinghua.edu.cn
Alex J. Chan - Convergence - Verified email at convergence.ai
Nabeel Seedat - University of Cambridge - Verified email at cam.ac.uk
Thomas POUPLIN - Ph.D. researcher, University of Cambridge - Verified email at cam.ac.uk
Daniel Jarrett - Research Scientist at DeepMind - Verified email at deepmind.com
Yunyi Shen - MIT - Verified email at mit.edu
Jean-Francois Ton - ByteDance Research - Verified email at bytedance.com


Hao Sun

PhD Candidate, DAMTP, University of Cambridge

Verified email at cam.ac.uk

Research interests: Reinforcement Learning, Inverse RL, RLHF, Large Language Models


Title	Cited by	Year
Hierarchical Multi-Scale Gaussian Transformer for Stock Movement Prediction. Q Ding, S Wu, H Sun, J Guo, J Guo. IJCAI 2020 (Proceedings of the Twenty-Ninth International Joint Conference …, 2020	219	2020
Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL. R Yang, Y Lu, W Li, H Sun, M Fang, Y Du, X Li, L Han, C Zhang. ICLR 2022 (The Tenth International Conference on Learning Representations), 2022	75	2022
Membership Inference Attacks against Synthetic Data through Overfitting Detection. B van Breugel, H Sun, Z Qian, M van der Schaar. AISTATS 2023 (The 26th International Conference on Artificial Intelligence …, 2023	49	2023
Policy Continuation with Hindsight Inverse Dynamics. H Sun, Z Li, X Liu, D Lin, B Zhou. NeurIPS 2019 (Advances in Neural Information Processing Systems 32), 2019	42	2019
Adaptive regularization of labels. Q Ding, S Wu, H Sun, J Guo, ST Xia. arXiv preprint arXiv:1908.05474, 2019	40	2019
Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL. H Sun, A Hüyük, M van der Schaar. ICLR 2024 (The Twelfth International Conference on Learning Representations), 2024	39*	2024
Exploit Reward Shifting in Value-Based Deep-RL: Optimistic Curiosity-Based Exploration and Conservative Exploitation via Linear Reward Shaping. H Sun, L Han, R Yang, X Ma, J Guo, B Zhou. NeurIPS 2022 (Advances in Neural Information Processing Systems) 35, 37719-37734, 2022	39*	2022
Dense reward for free in reinforcement learning from human feedback. AJ Chan, H Sun, S Holt, M van der Schaar. ICML 2024 (The Forty-first International Conference on Machine Learning), 2024	28	2024
Reinforcement Learning in the Era of LLMs: What is Essential? What is needed? An RL Perspective on RLHF, Prompting, and Beyond. H Sun. arXiv preprint arXiv:2310.06147, 2023	18	2023
Safe exploration by solving early terminated mdp. H Sun, Z Xu, M Fang, Z Peng, J Guo, B Dai, B Zhou. arXiv preprint arXiv:2107.04200, 2021	18*	2021
Adaptive regularization of labels. Q Ding, S Wu, H Sun, J Guo, ST Xia. AAAI 2021 (The Thirty-Fifth AAAI Conference on Artificial Intelligence), 2021	18*	2021
Novel policy seeking with constrained optimization. H Sun, Z Peng, B Dai, D Lin, B Zhou. NeurIPS 2022 Deep RL Workshop, 2022	17	2022
Supervised Q-Learning can be a Strong Baseline for Continuous Control. H Sun, Z Xu, M Fang, B Zhou. NeurIPS 2022 Foundation Models for Decision Making Workshop, 2022	17*	2022
Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment. H Sun, M van der Schaar. arXiv preprint arXiv:2405.15624, 2024	15*	2024
Non-local policy optimization via diversity-regularized collaborative exploration. Z Peng, H Sun, B Zhou. arXiv preprint arXiv:2006.07781, 2020	14	2020
What is Flagged in Uncertainty Quantification? Latent Density Models for Uncertainty Categorization. H Sun, B van Breugel, J Crabbe, N Seedat, M van der Schaar. NeurIPS 2023, 2023	12	2023
Neural Laplace Control for Continuous-time Delayed Systems. S Holt, A Hüyük, Z Qian, H Sun, M van der Schaar. AISTATS 2023 (The 26th International Conference on Artificial Intelligence …, 2023	12	2023
Accountability in offline reinforcement learning: Explaining decisions with a corpus of examples. H Sun, A Hüyük, D Jarrett, M van der Schaar. Advances in Neural Information Processing Systems 36, 2023	11*	2023
On the guaranteed almost equivalence between imitation learning from observation and demonstration. Z Cheng, L Liu, A Liu, H Sun, M Fang, D Tao. TNNLS (IEEE Transactions on Neural Networks and Learning Systems), 2021	9	2021
Retrieval-augmented thought process as sequential decision making. T Pouplin, H Sun, S Holt, M Van der Schaar. arXiv preprint arXiv:2402.07812, 2024	7	2024
