Advancing LLM reasoning generalists with preference trees L Yuan, G Cui, H Wang, N Ding, X Wang, J Deng, B Shan, H Chen, R Xie, ... arXiv preprint arXiv:2404.02078, 2024 | 77 | 2024 |
Imitate, explore, and self-improve: A reproduction report on slow-thinking reasoning systems Y Min, Z Chen, J Jiang, J Chen, J Deng, Y Hu, Y Tang, J Wang, X Cheng, ... arXiv preprint arXiv:2412.09413, 2024 | 16 | 2024 |
Tell me more! Towards implicit user intention understanding of language model driven agents C Qian, B He, Z Zhuang, J Deng, Y Qin, X Cong, Z Zhang, J Zhou, Y Lin, ... arXiv preprint arXiv:2402.09205, 2024 | 15 | 2024 |
Zero-shot generalization during instruction tuning: insights from similarity and granularity B He, N Ding, C Qian, J Deng, G Cui, L Yuan, H Gao, H Chen, Z Liu, ... arXiv preprint arXiv:2406.11721, 2024 | 3 | 2024 |
Technical report: Enhancing LLM reasoning with reward-guided tree search J Jiang, Z Chen, Y Min, J Chen, X Cheng, J Wang, Y Tang, H Sun, J Deng, ... arXiv preprint arXiv:2411.11694, 2024 | 1 | 2024 |
YuLan-Mini: An Open Data-efficient Language Model Y Hu, H Song, J Deng, J Wang, J Chen, K Zhou, Y Zhu, J Jiang, Z Dong, ... arXiv preprint arXiv:2412.17743, 2024 | | 2024 |
Neuron-based Personality Trait Induction in Large Language Models J Deng, T Tang, Y Yin, W Yang, WX Zhao, JR Wen arXiv preprint arXiv:2410.12327, 2024 | | 2024 |