A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

Understanding LLMs: A comprehensive overview from training to inference

Y Liu, H He, T Han, X Zhang, M Liu, J Tian, Y Zhang… - Neurocomputing, 2024 - Elsevier
The introduction of ChatGPT has led to a significant increase in the utilization of Large
Language Models (LLMs) for addressing downstream tasks. There is an increasing focus on …

Qwen technical report

J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have revolutionized the field of artificial intelligence,
enabling natural language processing tasks that were previously thought to be exclusive to …

Yi: Open foundation models by 01.AI

A Young, B Chen, C Li, C Huang, G Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce the Yi model family, a series of language and multimodal models that
demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and …

LlamaFactory: Unified efficient fine-tuning of 100+ language models

Y Zheng, R Zhang, J Zhang, Y Ye, Z Luo… - arXiv preprint arXiv …, 2024 - arxiv.org
Efficient fine-tuning is vital for adapting large language models (LLMs) to downstream tasks.
However, it requires non-trivial efforts to implement these methods on different models. We …

DeepSeek LLM: Scaling open-source language models with longtermism

X Bi, D Chen, G Chen, S Chen, D Dai, C Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development of open-source large language models (LLMs) has been truly
remarkable. However, the scaling law described in previous literature presents varying …

Aligning large language models with human: A survey

Y Wang, W Zhong, L Li, F Mi, X Zeng, W Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) trained on extensive textual corpora have emerged as
leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite …

DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models

D Dai, C Deng, C Zhao, RX Xu, H Gao, D Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for
managing computational costs when scaling up model parameters. However, conventional …

Datasets for large language models: A comprehensive survey

Y Liu, J Cao, C Liu, K Ding, L Jin - arXiv preprint arXiv:2402.18041, 2024 - arxiv.org
This paper embarks on an exploration into the Large Language Model (LLM) datasets,
which play a crucial role in the remarkable advancements of LLMs. The datasets serve as …

ShortGPT: Layers in large language models are more redundant than you expect

X Men, M Xu, Q Zhang, B Wang, H Lin, Y Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
As Large Language Models (LLMs) continue to advance in performance, their size has
escalated significantly, with current LLMs containing billions or even trillions of parameters …