SCBench: A KV cache-centric analysis of long-context methods

Y Li, H Jiang, Q Wu, X Luo, S Ahn, C Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Long-context LLMs have enabled numerous downstream applications but also introduced
significant challenges related to computational and memory efficiency. To address these …