A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Evaluating large language models: A comprehensive survey

Z Guo, R Jin, C Liu, Y Huang, D Shi, L Yu, Y Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …

Sheared LLaMA: Accelerating language model pre-training via structured pruning

M Xia, T Gao, Z Zeng, D Chen - arXiv preprint arXiv:2310.06694, 2023 - arxiv.org
The popularity of LLaMA (Touvron et al., 2023a; b) and other recently emerged moderate-
sized large language models (LLMs) highlights the potential of building smaller yet powerful …

AGIEval: A human-centric benchmark for evaluating foundation models

W Zhong, R Cui, Y Guo, Y Liang, S Lu, Y Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Evaluating the general abilities of foundation models to tackle human-level tasks is a vital
aspect of their development and application in the pursuit of Artificial General Intelligence …

The RefinedWeb dataset for Falcon LLM: Outperforming curated corpora with web data only

G Penedo, Q Malartic, D Hesslow… - Advances in …, 2023 - proceedings.neurips.cc
Large language models are commonly trained on a mixture of filtered web data and
curated "high-quality" corpora, such as social media conversations, books, or technical …

Making language models better reasoners with step-aware verifier

Y Li, Z Lin, S Zhang, Q Fu, B Chen… - Proceedings of the …, 2023 - aclanthology.org
Few-shot learning is a challenging task that requires language models to generalize from
limited examples. Large language models like GPT-3 and PaLM have made impressive …

OLMo: Accelerating the science of language models

D Groeneveld, I Beltagy, P Walsh, A Bhagia… - arXiv preprint arXiv …, 2024 - arxiv.org
Language models (LMs) have become ubiquitous in both NLP research and in commercial
product offerings. As their commercial importance has surged, the most powerful models …

Datasets for large language models: A comprehensive survey

Y Liu, J Cao, C Liu, K Ding, L Jin - arXiv preprint arXiv:2402.18041, 2024 - arxiv.org
This paper embarks on an exploration into the Large Language Model (LLM) datasets,
which play a crucial role in the remarkable advancements of LLMs. The datasets serve as …

Measuring coding challenge competence with APPS

D Hendrycks, S Basart, S Kadavath, M Mazeika… - arXiv preprint arXiv …, 2021 - arxiv.org
While programming is one of the most broadly applicable skills in modern society, modern
machine learning models still cannot code solutions to basic problems. Despite its …

Leveraging large language models for multiple choice question answering

J Robinson, CM Rytting, D Wingate - arXiv preprint arXiv:2210.12353, 2022 - arxiv.org
While large language models (LLMs) like GPT-3 have achieved impressive results on
multiple choice question answering (MCQA) tasks in the zero, one, and few-shot settings …