Understanding the RoPE extensions of long-context LLMs: An attention perspective M Zhong, C Zhang, Y Lei, X Liu, Y Gao, Y Hu, K Chen, M Zhang arXiv preprint arXiv:2406.13282, 2024 | 4 | 2024 |
MoDification: Mixture of depths made easy C Zhang, M Zhong, Q Wang, X Lu, Z Ye, C Lu, Y Gao, Y Hu, K Chen, ... arXiv preprint arXiv:2410.14268, 2024 | 2 | 2024 |
Context consistency between training and inference in simultaneous machine translation M Zhong, L Liu, K Chen, M Yang, M Zhang Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024 | 1 | 2024 |
ZigZagKV: Dynamic KV Cache Compression for Long-context Modeling based on Layer Uncertainty M Zhong, X Liu, C Zhang, Y Lei, Y Gao, Y Hu, K Chen, M Zhang arXiv preprint arXiv:2412.09036, 2024 | | 2024 |
On the Hallucination in Simultaneous Machine Translation M Zhong, K Chen, Z Xue, L Liu, M Yang, M Zhang arXiv preprint arXiv:2406.07239, 2024 | | 2024 |
Context Consistency between Training and Testing in Simultaneous Machine Translation M Zhong, L Liu, K Chen, M Yang, M Zhang arXiv preprint arXiv:2311.07066, 2023 | | 2023 |