History, development, and principles of large language models: an introductory survey
Language models serve as a cornerstone in natural language processing, utilizing
mathematical methods to generalize language laws and knowledge for prediction and …
Fine-tuning aligned language models compromises safety, even when users do not intend to!
Optimizing large language models (LLMs) for downstream use cases often involves the
customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama …
Large language monkeys: Scaling inference compute with repeated sampling
Scaling the amount of compute used to train language models has dramatically improved
their capabilities. However, when it comes to inference, we often limit the amount of compute …
Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency
The growing number of cases that require digital forensic analysis raises concerns about the
ability of law enforcement to conduct investigations promptly. Consequently, this paper …
InfiAgent-DABench: Evaluating agents on data analysis tasks
In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to
evaluate LLM-based agents on data analysis tasks. These tasks require agents to end-to …
Distilling mathematical reasoning capabilities into small language models
This work addresses the challenge of democratizing advanced Large Language Models
(LLMs) by compressing their mathematical reasoning capabilities into sub-billion parameter …
CLEX: Continuous length extrapolation for large language models
Transformer-based Large Language Models (LLMs) are pioneering advances in many
natural language processing tasks; however, their exceptional capabilities are restricted …
MagicPIG: LSH sampling for efficient LLM generation
Large language models (LLMs) with long context windows have gained significant attention.
However, the KV cache, stored to avoid re-computation, becomes a bottleneck. Various …
Making harmful behaviors unlearnable for large language models
Large language models (LLMs) have shown great potential as general-purpose AI
assistants in various domains. To meet the requirements of different applications, LLMs are …
Just read twice: closing the recall gap for recurrent language models
Recurrent large language models that compete with Transformers in language modeling
perplexity are emerging at a rapid rate (e.g., Mamba, RWKV). Excitingly, these architectures …