Yi: Open foundation models by 01.AI. A Young, B Chen, C Li, C Huang, G Zhang, G Zhang, H Li, J Zhu, J Chen, et al. arXiv preprint arXiv:2403.04652, 2024. Cited by 397.
Data engineering for scaling language models to 128K context. Y Fu, R Panda, X Niu, X Yue, H Hajishirzi, Y Kim, H Peng. arXiv preprint arXiv:2402.10171, 2024. Cited by 77.
Machine unlearning of pre-trained large language models. J Yao, E Chien, M Du, X Niu, T Wang, Z Cheng, X Yue. arXiv preprint arXiv:2402.15159, 2024. Cited by 35.
MAP-Neo: Highly capable and transparent bilingual large language model series. G Zhang, S Qu, J Liu, C Zhang, C Lin, CL Yu, D Pan, E Cheng, J Liu, et al. arXiv preprint arXiv:2405.19327, 2024. Cited by 32.
Aria: An open multimodal native mixture-of-experts model. D Li, Y Liu, H Wu, Y Wang, Z Shen, B Qu, X Niu, G Wang, B Chen, J Li. arXiv preprint arXiv:2410.05993, 2024. Cited by 20.
AutoKaggle: A multi-agent framework for autonomous data science competitions. Z Li, Q Zang, D Ma, J Guo, T Zheng, M Liu, X Niu, Y Wang, J Yang, J Liu, et al. arXiv preprint arXiv:2410.20424, 2024. Cited by 6.
Yi-Lightning technical report. A Wake, B Chen, CX Lv, C Li, C Huang, C Cai, C Zheng, D Cooper, et al. arXiv preprint arXiv:2412.01253, 2024. Cited by 3.
Demystifying long chain-of-thought reasoning in LLMs. E Yeo, Y Tong, M Niu, G Neubig, X Yue. arXiv preprint arXiv:2502.03373, 2025. Cited by 2.