Hongyi Guo
Other names: 郭洪一
Verified email at u.northwestern.edu
Title
Cited by
Year
Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates
Y Liu, H Guo
International Conference on Machine Learning, 2020
Cited by: 279; Year: 2020
Reason for future, act for now: A principled framework for autonomous LLM agents with provable sample efficiency
Z Liu, H Hu, S Zhang, H Guo, S Ke, B Liu, Z Wang
arXiv preprint arXiv:2309.17382, 2023
Cited by: 45*; Year: 2023
Provably mitigating overoptimization in RLHF: Your SFT loss is implicitly an adversarial regularizer
Z Liu, M Lu, S Zhang, B Liu, H Guo, Y Yang, J Blanchet, Z Wang
arXiv preprint arXiv:2405.16436, 2024
Cited by: 34; Year: 2024
Human-instruction-free LLM self-alignment with limited samples
H Guo, Y Yao, W Shen, J Wei, X Zhang, Z Wang, Y Liu
arXiv preprint arXiv:2401.06785, 2024
Cited by: 18; Year: 2024
Provably efficient offline reinforcement learning for partially observable Markov decision processes
H Guo, Q Cai, Y Zhang, Z Yang, Z Wang
International Conference on Machine Learning (Spotlight), 8016-8038, 2022
Cited by: 18; Year: 2022
Behavior Contrastive Learning for Unsupervised Skill Discovery
R Yang, C Bai, H Guo, S Li, B Zhao, Z Wang, P Liu, X Li
International Conference on Machine Learning, 2023
Cited by: 17; Year: 2023
Measuring and Reducing LLM Hallucination without Gold-Standard Answers
J Wei, Y Yao, JF Ton, H Guo, A Estornell, Y Liu
arXiv preprint arXiv:2402.10412, 2024
Cited by: 16; Year: 2024
Policy learning using weak supervision
J Wang*, H Guo*, Z Zhu*, Y Liu
Advances in Neural Information Processing Systems 34, 19960-19973, 2021
Cited by: 13; Year: 2021
Decentralized single-timescale actor-critic on zero-sum two-player stochastic games
H Guo, Z Fu, Z Yang, Z Wang
International Conference on Machine Learning (Spotlight), 3899-3909, 2021
Cited by: 11; Year: 2021
Improving reinforcement learning from human feedback using contrastive rewards
W Shen, X Zhang, Y Yao, R Zheng, H Guo, Y Liu
arXiv preprint arXiv:2403.07708, 2024
Cited by: 8; Year: 2024
Can large language models play games? A case study of a self-play approach
H Guo, Z Liu, Y Zhang, Z Wang
arXiv preprint arXiv:2403.05632, 2024
Cited by: 8; Year: 2024
Signal instructed coordination in cooperative multi-agent reinforcement learning
L Chen, H Guo, Y Du, F Fang, H Zhang, W Zhang, Y Yu
Distributed Artificial Intelligence: Third International Conference, DAI …, 2022
Cited by: 6; Year: 2022
Toward Optimal LLM Alignments Using Two-Player Games
R Zheng*, H Guo*, Z Liu*, X Zhang*, Y Yao, X Xu, Z Wang, Z Xi, T Gui, ...
arXiv preprint arXiv:2406.10977, 2024
Cited by: 2; Year: 2024
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning
H Zhong, Y Yin, S Zhang, X Xu, Y Liu, Y Zuo, Z Liu, B Liu, S Zheng, H Guo, ...
arXiv preprint arXiv:2501.18858, 2025
2025
Machine learning model alignment
Y Yao, H Guo, X Zhang, Y Liu
US Patent App. 18/900,432, 2025
2025
Machine learning model evaluation
Y Yao, J Wei, JF Ton, H Guo, A Estornell, Y Liu
US Patent App. 18/885,135, 2025
2025
Diverse randomized value functions: A provably pessimistic approach for offline reinforcement learning
X Yu, C Bai, H Guo, C Wang, Z Wang
Information Sciences 680, 121146, 2024
2024
Progressive LLM Alignments Using Two-Player Games
R Zheng, H Guo, Z Liu, X Zhang, Y Yao, X Xu, Z Wang, Z Xi, T Gui, ...
Robust RLHF with Noisy Rewards
W Shen, X Zhang, Y Yao, R Zheng, H Guo, Y Liu
Lightweight Uncertainty for Offline Reinforcement Learning via Bayesian Posterior
X Yu, C Bai, H Guo, L Wang, C Wang, Z Wang