Kaixuan Ji
Verified email at cs.ucla.edu
Title
Cited by
Year
P-tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks
X Liu, K Ji, Y Fu, WL Tam, Z Du, Z Yang, J Tang
arXiv preprint arXiv:2110.07602, 2021
Cited by: 1514, Year: 2021
Self-play fine-tuning converts weak language models to strong language models
Z Chen, Y Deng, H Yuan, K Ji, Q Gu
arXiv preprint arXiv:2401.01335, 2024
Cited by: 455, Year: 2024
Self-play preference optimization for language model alignment
Y Wu, Z Sun, H Yuan, K Ji, Y Yang, Q Gu
arXiv preprint arXiv:2405.00675, 2024
Cited by: 76, Year: 2024
Parameter-efficient prompt tuning makes generalized and calibrated neural text retrievers
WL Tam, X Liu, K Ji, L Xue, X Zhang, Y Dong, J Liu, M Hu, J Tang
arXiv preprint arXiv:2207.07087, 2022
Cited by: 35, Year: 2022
Reinforcement learning from human feedback with active queries
K Ji, J He, Q Gu
arXiv preprint arXiv:2402.09401, 2024
Cited by: 19, Year: 2024
Self-play fine-tuning of diffusion models for text-to-image generation
H Yuan, Z Chen, K Ji, Q Gu
Advances in Neural Information Processing Systems 37, 73366-73398, 2025
Cited by: 10, Year: 2025
Enhancing multi-step reasoning abilities of language models through direct q-function optimization
G Liu, K Ji, R Zheng, Z Wu, C Dun, Q Gu, L Yan
arXiv preprint arXiv:2410.09302, 2024
Cited by: 5, Year: 2024
Mastering the task of open information extraction with large language models and consistent reasoning environment
J Qi, K Ji, X Wang, J Yu, K Zeng, L Hou, J Li, B Xu
arXiv preprint arXiv:2310.10590, 2023
Cited by: 5, Year: 2023
Horizon-free reinforcement learning in adversarial linear mixture MDPs
K Ji, Q Zhao, J He, W Zhang, Q Gu
arXiv preprint arXiv:2305.08359, 2023
Cited by: 4, Year: 2023
Nearly Optimal Sample Complexity of Offline KL-Regularized Contextual Bandits under Single-Policy Concentrability
Q Zhao, K Ji, H Zhao, T Zhang, Q Gu
arXiv preprint arXiv:2502.06051, 2025
Year: 2025
VidCoM: Fast Video Comprehension through Large Language Models with Multimodal Tools
J Qi, K Ji, J Yu, D Wang, B Xu, L Hou, J Li
arXiv preprint arXiv:2310.10586, 2023
Year: 2023
Articles 1–11