RULER: What's the Real Context Size of Your Long-Context Language Models?

CP Hsieh, S Sun, S Kriman, S Acharya… - arXiv preprint arXiv …, 2024 - arxiv.org
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …
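The snippet above defines the NIAH setup only in prose; below is a minimal Python sketch of how such a probe can be built, not the RULER benchmark itself. The ask_model callable, the filler text, and the hard-coded needle and question are all illustrative assumptions.

    # Minimal NIAH probe sketch. `ask_model` is a hypothetical stand-in for any
    # LLM client that maps a prompt string to a completion string.
    FILLER = "The grass is green. The sky is blue. The sun is bright. "   # distractor text
    NEEDLE = "The secret passcode is 7421."                               # fact to retrieve
    QUESTION = "What is the secret passcode? Answer with the number only."

    def build_haystack(total_chars: int, depth: float) -> str:
        # Repeat filler up to ~total_chars, then insert the needle at a relative depth in [0, 1].
        haystack = (FILLER * (total_chars // len(FILLER) + 1))[:total_chars]
        pos = int(len(haystack) * depth)
        return haystack[:pos] + " " + NEEDLE + " " + haystack[pos:]

    def run_probe(ask_model, total_chars=200_000, depths=(0.0, 0.25, 0.5, 0.75, 1.0)):
        # Place the needle at several depths and record whether the model retrieves it.
        return {
            depth: "7421" in ask_model(build_haystack(total_chars, depth) + "\n\n" + QUESTION)
            for depth in depths
        }

Sweeping the depth and total context length in this way is what produces the familiar retrieval-accuracy heatmaps reported for long-context models.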

Video-XL: Extra-long vision language model for hour-scale video understanding

Y Shu, P Zhang, Z Liu, M Qin, J Zhou, T Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
Although current Multi-modal Large Language Models (MLLMs) demonstrate promising
results in video understanding, processing extremely long videos remains an ongoing …

LongWriter: Unleashing 10,000+ word generation from long context LLMs

Y Bai, J Zhang, X Lv, L Zheng, S Zhu, L Hou… - arXiv preprint arXiv …, 2024 - arxiv.org
Current long context large language models (LLMs) can process inputs up to 100,000
tokens, yet struggle to generate outputs exceeding even a modest length of 2,000 words …

LongAlign: A recipe for long context alignment of large language models

Y Bai, X Lv, J Zhang, Y He, J Qi, L Hou, J Tang… - arXiv preprint arXiv …, 2024 - arxiv.org
Extending large language models to effectively handle long contexts requires instruction fine-
tuning on input sequences of similar length. To address this, we present LongAlign--a recipe …

Found in the middle: How language models use long contexts better via plug-and-play positional encoding

Z Zhang, R Chen, S Liu, Z Yao, O Ruwase… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper aims to overcome the "lost-in-the-middle" challenge of large language models
(LLMs). While recent advancements have successfully enabled LLMs to perform stable …

Eigen attention: Attention in low-rank space for KV cache compression

U Saxena, G Saha, S Choudhary, K Roy - arXiv preprint arXiv:2408.05646, 2024 - arxiv.org
Large language models (LLMs) represent a groundbreaking advancement in the domain of
natural language processing due to their impressive reasoning abilities. Recently, there has …

NovelQA: A benchmark for long-range novel question answering

C Wang, R Ning, B Pan, T Wu, Q Guo, C Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid advancement of Large Language Models (LLMs) has introduced a new frontier in
natural language processing, particularly in understanding and processing long-context …

Training-free long-context scaling of large language models

C An, F Huang, J Zhang, S Gong, X Qiu, C Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org
The ability of Large Language Models (LLMs) to process and generate coherent text is
markedly weakened when the number of input tokens exceeds their pretraining length …

TriForce: Lossless acceleration of long sequence generation with hierarchical speculative decoding

H Sun, Z Chen, X Yang, Y Tian, B Chen - arXiv preprint arXiv:2404.11912, 2024 - arxiv.org
With large language models (LLMs) widely deployed in long content generation recently,
there has emerged an increasing demand for efficient long-sequence inference support …

CodeS: Natural Language to Code Repository via Multi-Layer Sketch

D Zan, A Yu, W Liu, D Chen, B Shen, W Li… - arXiv preprint arXiv …, 2024 - arxiv.org
The impressive performance of large language models (LLMs) on code-related tasks has
shown the potential of fully automated software development. In light of this, we introduce a …