A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Gemma 2: Improving open language models at a practical size

G Team, M Riviere, S Pathak, PG Sessa… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-
of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

PaLM 2 technical report

R Anil, AM Dai, O Firat, M Johnson, D Lepikhin… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and
reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is …

The Llama 3 herd of models

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

Reflexion: Language agents with verbal reinforcement learning

N Shinn, F Cassano, A Gopinath… - Advances in …, 2024 - proceedings.neurips.cc
Large language models (LLMs) have been increasingly used to interact with external
environments (e.g., games, compilers, APIs) as goal-driven agents. However, it remains …

Llama: Open and efficient foundation language models

H Touvron, T Lavril, G Izacard, X Martinet… - arXiv preprint arXiv …, 2023 - arxiv.org
We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B
parameters. We train our models on trillions of tokens, and show that it is possible to train …

Qwen2.5 technical report

A Yang, B Yang, B Zhang, B Hui, B Zheng, B Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
In this report, we introduce Qwen2.5, a comprehensive series of large language models
(LLMs) designed to meet diverse needs. Compared to previous iterations, Qwen2.5 has …

Phi-3 technical report: A highly capable language model locally on your phone

M Abdin, J Aneja, H Awadalla, A Awadallah… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion
tokens, whose overall performance, as measured by both academic benchmarks and …

Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond

J Yang, H Jin, R Tang, X Han, Q Feng, H Jiang… - ACM Transactions on …, 2024 - dl.acm.org
This article presents a comprehensive and practical guide for practitioners and end-users
working with Large Language Models (LLMs) in their downstream Natural Language …