Hongyi Guo

Citations

	All	Since 2020
Citations	475	475
h-index	9	9
i10-index	9	9

Citations per year: 2020: 11; 2021: 44; 2022: 76; 2023: 105; 2024: 212; 2025: 27

Public access

6 articles available, 0 articles not available

Based on funding mandates

Co-authors

Yang Liu: Computer Science and Engineering, UC Santa Cruz (verified email at ucsc.edu)
Zhaoran Wang: Associate Professor at Northwestern University (verified email at northwestern.edu)
Zhihan Liu: Northwestern University (verified email at u.northwestern.edu)
Chenjia Bai: The Institute of AI (TeleAI), China Telecom (verified email at chinatelecom.cn)
Yufeng Zhang: Ph.D. Student, Northwestern University (verified email at u.northwestern.edu)
Qi Cai: Northwestern University (verified email at u.northwestern.edu)
Jingkang Wang: University of Toronto (verified email at cs.toronto.edu)
Zhaowei Zhu: Docta.ai; University of California, Santa Cruz (verified email at docta.ai)
Zuyue Fu: Northwestern University (verified email at u.northwestern.edu)
Haifeng Zhang: Institute of Automation, Chinese Academy of Sciences (verified email at ia.ac.cn)
Zhuoran Yang: Yale University (verified email at yale.edu)

Hongyi Guo

Other names: 郭洪一

Northwestern University

Verified email at u.northwestern.edu

Large Language Model, Reinforcement Learning


Title	Cited by	Year
Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates. Y Liu, H Guo. International Conference on Machine Learning, 2020	279	2020
Reason for future, act for now: A principled framework for autonomous llm agents with provable sample efficiency. Z Liu, H Hu, S Zhang, H Guo, S Ke, B Liu, Z Wang. arXiv preprint arXiv:2309.17382, 2023	45*	2023
Provably mitigating overoptimization in rlhf: Your sft loss is implicitly an adversarial regularizer. Z Liu, M Lu, S Zhang, B Liu, H Guo, Y Yang, J Blanchet, Z Wang. arXiv preprint arXiv:2405.16436, 2024	34	2024
Human-instruction-free llm self-alignment with limited samples. H Guo, Y Yao, W Shen, J Wei, X Zhang, Z Wang, Y Liu. arXiv preprint arXiv:2401.06785, 2024	18	2024
Provably efficient offline reinforcement learning for partially observable markov decision processes. H Guo, Q Cai, Y Zhang, Z Yang, Z Wang. International Conference on Machine Learning (Spotlight), 8016-8038, 2022	18	2022
Behavior Contrastive Learning for Unsupervised Skill Discovery. R Yang, C Bai, H Guo, S Li, B Zhao, Z Wang, P Liu, X Li. International Conference on Machine Learning, 2023	17	2023
Measuring and Reducing LLM Hallucination without Gold-Standard Answers. J Wei, Y Yao, JF Ton, H Guo, A Estornell, Y Liu. arXiv preprint arXiv:2402.10412, 2024	16	2024
Policy learning using weak supervision. J Wang, H Guo, Z Zhu*, Y Liu. Advances in Neural Information Processing Systems 34, 19960-19973, 2021	13	2021
Decentralized single-timescale actor-critic on zero-sum two-player stochastic games. H Guo, Z Fu, Z Yang, Z Wang. International Conference on Machine Learning (Spotlight), 3899-3909, 2021	11	2021
Improving reinforcement learning from human feedback using contrastive rewards. W Shen, X Zhang, Y Yao, R Zheng, H Guo, Y Liu. arXiv preprint arXiv:2403.07708, 2024	8	2024
Can large language models play games? a case study of a self-play approach. H Guo, Z Liu, Y Zhang, Z Wang. arXiv preprint arXiv:2403.05632, 2024	8	2024
Signal instructed coordination in cooperative multi-agent reinforcement learning. L Chen, H Guo, Y Du, F Fang, H Zhang, W Zhang, Y Yu. Distributed Artificial Intelligence: Third International Conference, DAI …, 2022	6	2022
Toward Optimal LLM Alignments Using Two-Player Games. R Zheng, H Guo, Z Liu, X Zhang, Y Yao, X Xu, Z Wang, Z Xi, T Gui, ... arXiv preprint arXiv:2406.10977, 2024	2	2024
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning. H Zhong, Y Yin, S Zhang, X Xu, Y Liu, Y Zuo, Z Liu, B Liu, S Zheng, H Guo, ... arXiv preprint arXiv:2501.18858, 2025		2025
Machine learning model alignment. Y Yao, H Guo, X Zhang, Y Liu. US Patent App. 18/900,432, 2025		2025
Machine learning model evaluation. Y Yao, J Wei, TON Jean-Francois, H Guo, A Estornell, Y Liu. US Patent App. 18/885,135, 2025		2025
Diverse randomized value functions: A provably pessimistic approach for offline reinforcement learning. X Yu, C Bai, H Guo, C Wang, Z Wang. Information Sciences 680, 121146, 2024		2024
Progressive LLM Alignments Using Two-Player Games. R Zheng, H Guo, Z Liu, X Zhang, Y Yao, X Xu, Z Wang, Z Xi, T Gui, ...
Robust RLHF with Noisy Rewards. W Shen, X Zhang, Y Yao, R Zheng, H Guo, Y Liu
Lightweight Uncertainty for Offline Reinforcement Learning via Bayesian Posterior. X Yu, C Bai, H Guo, L Wang, C Wang, Z Wang
