History, development, and principles of large language models: an introductory survey

Z Wang, Z Chu, TV Doan, S Ni, M Yang, W Zhang - AI and Ethics, 2024 - Springer
Language models serve as a cornerstone of natural language processing, using
mathematical methods to capture the regularities and knowledge of language for prediction and …

Fine-tuning aligned language models compromises safety, even when users do not intend to!

X Qi, Y Zeng, T Xie, PY Chen, R Jia, P Mittal… - arXiv preprint arXiv …, 2023 - arxiv.org
Optimizing large language models (LLMs) for downstream use cases often involves the
customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama …

Large language monkeys: Scaling inference compute with repeated sampling

B Brown, J Juravsky, R Ehrlich, R Clark, QV Le… - arXiv preprint arXiv …, 2024 - arxiv.org
Scaling the amount of compute used to train language models has dramatically improved
their capabilities. However, when it comes to inference, we often limit the amount of compute …
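
The snippet cuts off before the method; as a rough illustration of the general repeated-sampling recipe the title names (draw many independent samples per problem and keep any that passes an automatic verifier), here is a minimal Python sketch. `generate` and `verify` are hypothetical stand-ins, not the paper's code.

```python
import random

def generate(problem, temperature=0.8):
    # Hypothetical stand-in for one stochastic LLM sample.
    return f"candidate-{random.randint(0, 9)} for {problem}"

def verify(problem, answer):
    # Hypothetical stand-in for an automatic checker
    # (e.g. unit tests for code, an exact-match grader for math).
    return answer.startswith("candidate-7")

def solve_by_repeated_sampling(problem, k=100):
    """Draw k independent samples; succeed if any one verifies."""
    for _ in range(k):
        answer = generate(problem)
        if verify(problem, answer):
            return answer
    return None  # coverage failed at this sampling budget

print(solve_by_repeated_sampling("prove the identity", k=100))
```

The point of the recipe is that coverage (the chance at least one of k samples is correct) can keep improving with k even when any single sample is unreliable.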

Exploring the Potential of Large Language Models for Improving Digital Forensic Investigation Efficiency

A Wickramasekara, F Breitinger, M Scanlon - arXiv preprint arXiv …, 2024 - arxiv.org
The growing number of cases that require digital forensic analysis raises concerns about the
ability of law enforcement to conduct investigations promptly. Consequently, this paper …

InfiAgent-DABench: Evaluating agents on data analysis tasks

X Hu, Z Zhao, S Wei, Z Chai, Q Ma, G Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to
evaluate LLM-based agents on data analysis tasks. These tasks require agents to end-to …
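
The abstract truncates before the task format; the sketch below shows only the general shape of such an evaluation harness (an agent answers closed-form questions over data files, and accuracy is the score). All names here (`agent_solve`, the task fields) are hypothetical, not the benchmark's actual API.

```python
def agent_solve(task: dict) -> str:
    """Hypothetical stand-in for an LLM agent that writes and executes
    analysis code against task["csv_path"] and returns a final answer."""
    return "0.42"

def evaluate(tasks: list[dict]) -> float:
    """Score an agent on data-analysis tasks with closed-form answers."""
    correct = sum(agent_solve(t) == t["answer"] for t in tasks)
    return correct / len(tasks)

tasks = [{"csv_path": "sales.csv",
          "question": "mean of column x?",
          "answer": "0.42"}]
print(f"accuracy = {evaluate(tasks):.2f}")
```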

Distilling mathematical reasoning capabilities into small language models

X Zhu, J Li, Y Liu, C Ma, W Wang - Neural Networks, 2024 - Elsevier
This work addresses the challenge of democratizing advanced Large Language Models
(LLMs) by compressing their mathematical reasoning capabilities into sub-billion parameter …
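
As a hedged sketch of the standard distillation recipe implied here (a large teacher writes step-by-step rationales, and a small student is fine-tuned to imitate them); every function is an illustrative stub, not the paper's pipeline:

```python
def teacher_solve(question: str) -> str:
    """Stand-in for prompting a large teacher model for a
    step-by-step rationale ending in a final answer."""
    return f"Step 1: parse '{question}'. Step 2: compute. Answer: 4"

def build_distillation_set(questions):
    """Pair each question with the teacher's chain-of-thought rationale."""
    return [{"prompt": q, "completion": teacher_solve(q)} for q in questions]

def finetune_student(dataset):
    """Stand-in for supervised fine-tuning of a sub-billion-parameter
    student on the (prompt, completion) pairs, i.e. distillation by
    imitating teacher rationales."""
    for example in dataset:
        pass  # gradient step on log p_student(completion | prompt)

train_set = build_distillation_set(["If x+3=7, what is x?"])
finetune_student(train_set)
```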

CLEX: Continuous length extrapolation for large language models

G Chen, X Li, Z Meng, S Liang, L Bing - arXiv preprint arXiv:2310.16450, 2023 - arxiv.org
Transformer-based Large Language Models (LLMs) are pioneering advances in many
natural language processing tasks; however, their exceptional capabilities are restricted …
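
The snippet ends before the method. For orientation only, the sketch below shows the simpler fixed-scale baseline in this family, linear RoPE position interpolation (rescaling position indices so long inputs reuse the trained range); per the title, CLEX's contribution is making such scaling continuous rather than fixed, which this sketch does not implement.

```python
import numpy as np

def rope_angles(positions, dim=64, base=10000.0, scale=1.0):
    """Rotary-embedding angles; scale < 1 compresses position indices
    (linear position interpolation), a fixed-scale baseline that
    continuous extrapolation methods generalize."""
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)       # (dim/2,)
    pos = np.asarray(positions, dtype=np.float64) * scale  # rescaled indices
    return np.outer(pos, inv_freq)                         # (len(pos), dim/2)

# Train length 4096, target length 16384 -> scale = 4096/16384 = 0.25,
# so position 16000 lands where position 4000 did during training.
angles = rope_angles(range(16384), scale=4096 / 16384)
```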

MagicPIG: LSH sampling for efficient LLM generation

Z Chen, R Sadhukhan, Z Ye, Y Zhou, J Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) with long context windows have gained significant attention.
However, the KV cache, stored to avoid re-computation, becomes a bottleneck. Various …
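
As a toy illustration of the general LSH idea in this setting (hash cached keys against random hyperplanes and attend only over the bucket matching the query), not the paper's actual sampling algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_keys, n_bits = 64, 10000, 12

keys = rng.standard_normal((n_keys, d))    # cached attention keys
values = rng.standard_normal((n_keys, d))  # cached attention values
planes = rng.standard_normal((n_bits, d))  # random hyperplanes (SimHash)

def simhash(x):
    """Sign pattern against the random hyperplanes, packed into an int."""
    bits = (x @ planes.T) > 0
    return bits @ (1 << np.arange(n_bits))

key_codes = simhash(keys)  # hashed once when entries enter the KV cache

def sparse_attention(query):
    """Attend only over cached keys whose hash code matches the query's,
    instead of over all n_keys entries (single-table LSH, toy version)."""
    idx = np.nonzero(key_codes == simhash(query))[0]
    if idx.size == 0:
        idx = np.arange(n_keys)  # fall back to dense attention
    scores = keys[idx] @ query / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ values[idx]

out = sparse_attention(rng.standard_normal(d))
```

Because similar vectors tend to share sign patterns, the bucket is biased toward high-scoring keys, so only a small fraction of the cache is touched per decode step.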

Making harmful behaviors unlearnable for large language models

X Zhou, Y Lu, R Ma, T Gui, Q Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have shown great potential as general-purpose AI
assistants in various domains. To meet the requirements of different applications, LLMs are …

Just read twice: closing the recall gap for recurrent language models

S Arora, A Timalsina, A Singhal, B Spector… - arXiv preprint arXiv …, 2024 - arxiv.org
Recurrent large language models that compete with Transformers in language modeling
perplexity are emerging at a rapid rate (e.g., Mamba, RWKV). Excitingly, these architectures …
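
Taking the title literally, the prompt-side trick is to repeat the context so a fixed-state recurrent model gets a second pass over it after having already seen what the question will require; a minimal, assumed rendering of that prompt strategy:

```python
def read_twice_prompt(document: str, question: str) -> str:
    """Repeat the document before asking the question, giving a
    fixed-state recurrent model a second pass over the context."""
    return (
        f"Document:\n{document}\n\n"
        f"Document (repeated):\n{document}\n\n"
        f"Question: {question}\nAnswer:"
    )

print(read_twice_prompt("Alice pays Bob 5 coins.", "Who receives the coins?"))
```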