Mitigating the Alignment Tax of RLHF. Y Lin, H Lin, W Xiong, S Diao, J Liu, J Zhang, R Pan, H Wang, W Hu, ... EMNLP-2024, 2023 | 114* | 2023 |
R-Tuning: Instructing Large Language Models to Say ‘I Don’t Know’. H Zhang, S Diao, Y Lin, Y Fung, Q Lian, X Wang, Y Chen, H Ji, T Zhang. NAACL-2024 (Outstanding Paper Award), 7106-7132, 2024 | 79* | 2024 |
Entropy-Regularized Process Reward Model. H Zhang, P Wang, S Diao, Y Lin, R Pan, H Dong, D Zhang, P Molchanov, ... arXiv preprint arXiv:2412.11006, 2024 | 10* | 2024 |
RAG-Reward: Optimizing RAG with Reward Modeling and RLHF. H Zhang, J Song, J Zhu, Y Wu, T Zhang, C Niu. arXiv preprint arXiv:2501.13264, 2025 | | 2025 |
InfoPattern: Unveiling Information Propagation Patterns in Social Media. C Han, J Xu, M Li, H Zhang, T Abdelzaher, H Ji. arXiv preprint arXiv:2311.15642, 2023 | | 2023 |