Internlm2 technical report Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ... arXiv preprint arXiv:2403.17297, 2024 | 251 | 2024 |
Internlm: A multilingual language model with progressively enhanced capabilities ILM Team | 193 | 2023 |
Internlm-math: Open math large language models toward verifiable reasoning H Ying, S Zhang, L Li, Z Zhou, Y Shao, Z Fei, Y Ma, J Hong, K Liu, Z Wang, ... arXiv preprint arXiv:2402.06332, 2024 | 63 | 2024 |
Towards more effective and economic sparsely-activated model H Jiang, K Zhan, J Qu, Y Wu, Z Fei, X Zhang, L Chen, Z Dou, X Qiu, Z Guo, ... arXiv preprint arXiv:2110.07431, 2021 | 13 | 2021 |
Internlm-math: Open math large language models toward verifiable reasoning, 2024 H Ying, S Zhang, L Li, Z Zhou, Y Shao, Z Fei, Y Ma, J Hong, K Liu, Z Wang, ... URL https://arxiv.org/abs/2402.06332 | 8 | |
Balanced data sampling for language model training with clustering Y Shao, L Li, Z Fei, H Yan, D Lin, X Qiu arXiv preprint arXiv:2402.14526, 2024 | 7 | 2024 |
Wanjuan-cc: A safe and high-quality open-sourced english webtext dataset J Qiu, H Lv, Z Jin, R Wang, W Ning, J Yu, CB Zhang, Z Li, P Chu, Y Qu, ... arXiv preprint arXiv:2402.19282, 2024 | 6 | 2024 |
Pre-training for Information Retrieval: Are Hyperlinks Fully Explored? J Wu, X Zhang, Y Zhu, Z Liu, Z Guo, Z Fei, R Lai, Y Wu, Z Cao, Z Dou arXiv preprint arXiv:2209.06583, 2022 | 6 | 2022 |
Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding Z Fei, Y Tian, Y Wu, X Zhang, Y Zhu, Z Liu, J Wu, D Kong, R Lai, Z Cao, ... the 29th International Conference on Computational Linguistics, 2022 | 4 | 2022 |
Query of cc: unearthing large scale domain-specific knowledge from public corpora Z Fei, Y Shao, L Li, Z Zeng, C He, H Yan, D Lin, X Qiu arXiv preprint arXiv:2401.14624, 2024 | 3 | 2024 |
Turn Waste into Worth: Rectifying Top-k Router of MoE Z Zeng, Q Guo, Z Fei, Z Yin, Y Zhou, L Li, T Sun, H Yan, D Lin, X Qiu arXiv preprint arXiv:2402.12399, 2024 | 2 | 2024 |
VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks S Zhang, Z Xu, P Liu, X Yu, Y Li, Q Gao, Z Fei, Z Yin, Z Wu, YG Jiang, ... arXiv preprint arXiv:2412.18194, 2024 | 1 | 2024 |
基座模型训练中的数据与模型架构 (Data and Model Architecture in Base Model Training) H Yan, Y Gao, C Fei, X Yang, X Qiu Proceedings of the 22nd Chinese National Conference on Computational …, 2023 | | 2023 |
Unearthing Large Scale Domain-Specific Knowledge from Public Corpora Z Fei, Y Shao, L Li, Z Zeng, C He, H Yan, D Lin, X Qiu | | |