Title | Authors | Venue | Cited by | Year |
Beyond one-preference-fits-all alignment: Multi-objective direct preference optimization | Z Zhou, J Liu, J Shao, X Yue, C Yang, W Ouyang, Y Qiao | Findings of the Association for Computational Linguistics ACL 2024, 10586-10613, 2024 | 54* | 2024 |
Inception convolution with efficient dilation search | J Liu, C Li, F Liang, C Lin, M Sun, J Yan, W Ouyang, D Xu | CVPR 2021 (Oral), 2021 | 45 | 2021 |
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues | G Bai*, J Liu*, X Bu, Y He, J Liu, Z Zhou, Z Lin, W Su, T Ge, B Zheng, ... | ACL 2024, 2024 | 41 | 2024 |
Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors | L Wang, J Liu, H Shao, W Wang, R Chen, Y Liu, SL Waslander | Robotics: Science and Systems (RSS 2023), 2023 | 31 | 2023 |
ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency | C Li*, J Liu*, Y Zhang, Y Wei, Y Niu, Y Yang, Y Liu, W Ouyang | Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2023), 2023 | 26 | 2023 |
Map-neo: Highly capable and transparent bilingual large language model series | G Zhang, S Qu, J Liu, C Zhang, C Lin, CL Yu, D Pan, E Cheng, J Liu, ... | arXiv preprint arXiv:2405.19327, 2024 | 24 | 2024 |
ConceptMath: A Bilingual Concept-wise Benchmark for Measuring Mathematical Reasoning of Large Language Models | Y Wu*, J Liu*, X Bu, J Liu, Z Zhou, Y Zhang, C Zhang, Z Bai, H Chen, T Ge, ... | ACL 2024 (Findings), 2024 | 13* | 2024 |
GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models | S Li, Y He, H Guo, X Bu, G Bai, J Liu, J Liu, X Qu, Y Li, W Ouyang, W Su, ... | EMNLP 2024 (Findings), 2024 | 12 | 2024 |
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning | Y Zhang*, J Liu*, C Li, Y Niu, Y Yang, Y Liu, W Ouyang | Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2024), 2023 | 12 | 2023 |
Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level | J Liu, Z Zhou, J Liu, X Bu, C Yang, HS Zhong, W Ouyang | arXiv preprint arXiv:2406.11817, 2024 | 8 | 2024 |
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! | Z Zhou, J Liu, Z Dong, J Liu, C Yang, W Ouyang, Y Qiao | ACL 2024 (🏆Outstanding Paper Award), 2024 | 8 | 2024 |
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models | Z Zhou, Z Liu, J Liu, Z Dong, C Yang, Y Qiao | NeurIPS 2024, 2024 | 7 | 2024 |
DDK: Distilling Domain Knowledge for Efficient Large Language Models | J Liu, C Zhang, J Guo, Y Zhang, H Que, K Deng, Z Bai, J Liu, G Zhang, ... | NeurIPS 2024, 2024 | 4 | 2024 |
Masked Pretraining for Multi-Agent Decision Making | J Liu, Y Zhang, C Li, C Yang, Y Yang, Y Liu, W Ouyang | TMLR 2024, 2024 | 3 | 2024 |
Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning | C Li, R Jia, J Liu, Y Zhang, Y Niu, Y Yang, Y Liu, W Ouyang | Proceedings of the European Conference on Artificial Intelligence, 2023 | 3 | 2023 |
Adaptive pessimism via target Q-value for offline reinforcement learning | J Liu, Y Zhang, C Li, Y Yang, Y Liu, W Ouyang | Neural Networks 180, 106588, 2024 | 2 | 2024 |
Improving Video Generation with Human Feedback | J Liu, G Liu, J Liang, Z Yuan, X Liu, M Zheng, X Wu, Q Wang, W Qin, ... | arXiv preprint arXiv:2501.13918, 2025 | | 2025 |
Adaptive Gradient Method with Resilience and Momentum | J Liu, C Lin, C Li, L Sheng, M Sun, J Yan, W Ouyang | arXiv preprint arXiv:2010.11041, 2020 | | 2020 |