P-tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks. X Liu, K Ji, Y Fu, WL Tam, Z Du, Z Yang, J Tang. arXiv preprint arXiv:2110.07602, 2021. Cited by 1456.
Self-play fine-tuning converts weak language models to strong language models. Z Chen, Y Deng, H Yuan, K Ji, Q Gu. arXiv preprint arXiv:2401.01335, 2024. Cited by 189.
Self-play preference optimization for language model alignment. Y Wu, Z Sun, H Yuan, K Ji, Y Yang, Q Gu. arXiv preprint arXiv:2405.00675, 2024. Cited by 66.
Parameter-efficient prompt tuning makes generalized and calibrated neural text retrievers. WL Tam, X Liu, K Ji, L Xue, X Zhang, Y Dong, J Liu, M Hu, J Tang. arXiv preprint arXiv:2207.07087, 2022. Cited by 35.
Reinforcement learning from human feedback with active queries. K Ji, J He, Q Gu. arXiv preprint arXiv:2402.09401, 2024. Cited by 15.
Self-play fine-tuning of diffusion models for text-to-image generation. H Yuan, Z Chen, K Ji, Q Gu. arXiv preprint arXiv:2402.10210, 2024. Cited by 9.
Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment. J Qi, K Ji, X Wang, J Yu, K Zeng, L Hou, J Li, B Xu. arXiv preprint arXiv:2310.10590, 2023. Cited by 5.
Enhancing multi-step reasoning abilities of language models through direct Q-function optimization. G Liu, K Ji, R Zheng, Z Wu, C Dun, Q Gu, L Yan. arXiv preprint arXiv:2410.09302, 2024. Cited by 3.
Horizon-free reinforcement learning in adversarial linear mixture MDPs. K Ji, Q Zhao, J He, W Zhang, Q Gu. arXiv preprint arXiv:2305.08359, 2023. Cited by 3.
BiLL-VTG: Bridging Large Language Models and Lightweight Visual Tools for Video-based Texts Generation. J Qi, K Ji, J Yu, D Wang, B Xu, L Hou, J Li. arXiv preprint arXiv:2310.10586, 2023.