KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches J Yuan, H Liu, S Zhong, YN Chuang, S Li, G Wang, D Le, H Jin, ... arXiv preprint arXiv:2407.01527, 2024 | 11 | 2024 |
Understanding Different Design Choices in Training Large Time Series Models YN Chuang, S Li, J Yuan, G Wang, KH Lai, L Yu, S Ding, CY Chang, ... arXiv preprint arXiv:2406.14045, 2024 | 2 | 2024 |