WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou, Y Min, B Zhang, J Zhang, et al. A survey of large language models. arXiv preprint arXiv:2303.18223, 2023. Cited by 4350*.
Y Huo, M Zhang, G Liu, H Lu, Y Gao, G Yang, J Wen, H Zhang, B Xu, et al. WenLan: Bridging vision and language by large-scale multi-modal pre-training. arXiv preprint arXiv:2103.06561, 2021. Cited by 145.
ZF Gao, P Liu, WX Zhao, ZY Lu, JR Wen. Parameter-efficient mixture-of-experts architecture for pre-trained language models. arXiv preprint arXiv:2203.01104, 2022. Cited by 41.
P Liu, ZF Gao, WX Zhao, ZY Xie, ZY Lu, JR Wen. Enabling lightweight fine-tuning for pre-trained language model compression based on matrix product operators. arXiv preprint arXiv:2106.02205, 2021. Cited by 31.
P Liu, Z Liu, ZF Gao, D Gao, WX Zhao, Y Li, B Ding, JR Wen. Do emergent abilities exist in quantized large language models: An empirical study. arXiv preprint arXiv:2307.08072, 2023. Cited by 29.
ZF Gao, K Zhou, P Liu, WX Zhao, JR Wen. Small pre-trained language models can be fine-tuned as large models via over-parameterization. Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023. Cited by 13.
H Lin, L Ruan, W Xia, P Liu, J Wen, Y Xu, D Hu, R Song, WX Zhao, Q Jin, et al. TikTalk: A video-based dialogue dataset for multi-modal chitchat in real world. Proceedings of the 31st ACM International Conference on Multimedia, 1303-1313, 2023. Cited by 9.
P Liu, ZF Gao, Y Chen, WX Zhao, JR Wen. Enhancing scalability of pre-trained language models via efficient parameter sharing. Findings of the Association for Computational Linguistics: EMNLP 2023, 13771 …, 2023. Cited by 5.
P Liu, ZF Gao, WX Zhao, Y Ma, T Wang, JR Wen. Unlocking data-free low-bit quantization with matrix decomposition for KV cache compression. arXiv preprint arXiv:2405.12591, 2024. Cited by 4.
H Lin, L Ruan, W Xia, P Liu, J Wen, Y Xu, D Hu, R Song, WX Zhao, Q Jin. TikTalk: A multi-modal dialogue dataset for real-world chitchat. arXiv preprint arXiv:2301, 2023. Cited by 2.
ZF Gao, P Liu, WX Zhao, ZY Xie, JR Wen, ZY Lu. Compression image dataset based on multiple matrix product states. Future of Information and Communication Conference, 621-638, 2024. Cited by 1.
P Liu, ZF Gao, Y Chen, WX Zhao, JR Wen. Scaling pre-trained language models to deeper via parameter-efficient architecture. arXiv preprint arXiv:2303.16753, 2023. Cited by 1.
X Hu, X Cheng, P Liu, W Liu, J Luan, B Wang, Y Liu. DoTA: Weight-decomposed tensor adaptation for large language models. arXiv preprint arXiv:2412.20891, 2024.
P Liu, ZF Gao, X Zhang, WX Zhao, JR Wen. Enhancing parameter-efficient fine-tuning with simple calibration based on stable rank. Proceedings of the 2024 Joint International Conference on Computational …, 2024.
ZF Gao, P Liu, XH Zhang, X Zhao, ZY Xie, ZY Lu, JR Wen. Image dataset compression based on matrix product states.