InternLM: A multilingual language model with progressively enhanced capabilities ILM Team https://github.com/InternLM/InternLM, 2023 | 195 | 2023 |
AnyGPT: Unified multimodal LLM with discrete sequence modeling J Zhan, J Dai, J Ye, Y Zhou, D Zhang, Z Liu, X Zhang, R Yuan, G Zhang, ... ACL 2024, 2024 | 84 | 2024 |
BBTv2: Towards a gradient-free future with large language models T Sun, Z He, H Qian, Y Zhou, X Huang, X Qiu EMNLP 2022, 2022 | 83 | 2022 |
MOSS: An Open Conversational Large Language Model T Sun, X Zhang, Z He, P Li, Q Cheng, X Liu, H Yan, Y Shao, Q Tang, ... Machine Intelligence Research, 1-18, 2024 | 80* | 2024 |
KNN-contrastive learning for out-of-domain intent classification Y Zhou, P Liu, X Qiu Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 77 | 2022 |
Early exiting with ensemble internal classifiers T Sun, Y Zhou, X Liu, X Zhang, H Jiang, Z Cao, X Huang, X Qiu arXiv preprint arXiv:2105.13792, 2021 | 33 | 2021 |
Mention recommendation in Twitter with cooperative multi-agent reinforcement learning T Gui, P Liu, Q Zhang, L Zhu, M Peng, Y Zhou, X Huang Proceedings of the 42nd International ACM SIGIR Conference on Research and …, 2019 | 23 | 2019 |
Data mixing laws: Optimizing data mixtures by predicting language modeling performance J Ye, P Liu, T Sun, Y Zhou, J Zhan, X Qiu arXiv preprint arXiv:2403.16952, 2024 | 22 | 2024 |
A probabilistic framework for discovering new intents Y Zhou, G Quan, X Qiu Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023 | 14* | 2023 |
UTC-IE: A unified token-pair classification architecture for information extraction H Yan, Y Sun, X Li, Y Zhou, XJ Huang, X Qiu Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023 | 12 | 2023 |
Two birds one stone: Dynamic ensemble for OOD intent classification Y Zhou, J Yang, P Wang, X Qiu Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023 | 9 | 2023 |
Llama Scope: Extracting millions of features from Llama-3.1-8B with sparse autoencoders Z He, W Shu, X Ge, L Chen, J Wang, Y Zhou, F Liu, Q Guo, X Huang, ... arXiv preprint arXiv:2410.20526, 2024 | 3 | 2024 |
DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning X Wang, J He, P Wang, Y Zhou, T Sun, X Qiu Proceedings of the AAAI Conference on Artificial Intelligence 38 (17), 19180 …, 2024 | 3 | 2024 |
Code Needs Comments: Enhancing Code LLMs with Comment Augmentation D Song, H Guo, Y Zhou, S Xing, Y Wang, Z Song, W Zhang, Q Guo, H Yan, ... ACL 2024 Findings, 2024 | 3 | 2024 |
What dense graph do you need for self-attention? Y Wang, CT Lee, Q Guo, Z Yin, Y Zhou, X Huang, X Qiu International Conference on Machine Learning, 22752-22768, 2022 | 3 | 2022 |
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective Z Zeng, Q Cheng, Z Yin, B Wang, S Li, Y Zhou, Q Guo, X Huang, X Qiu arXiv preprint arXiv:2412.14135, 2024 | 2 | 2024 |
Memorize step by step: Efficient long-context prefilling with incremental memory and decremental chunk Z Zeng, Q Guo, X Liu, Z Yin, W Shu, M Huang, B Wang, Y Zhou, L Li, Q Liu, ... Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024 | 2 | 2024 |
Towards Universality: Studying mechanistic similarity across language model architectures J Wang, X Ge, W Shu, Q Tang, Y Zhou, Z He, X Qiu arXiv preprint arXiv:2410.06672, 2024 | 2 | 2024 |
Towards Open Environment Intent Prediction Y Zhou, J Hong, X Qiu Findings of the Association for Computational Linguistics: ACL 2023, 2226-2240, 2023 | 2 | 2023 |
BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments X Wang, P Wang, B Wang, D Zhang, Y Zhou, X Qiu arXiv preprint arXiv:2410.23918, 2024 | 1 | 2024 |