A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Evaluating large language models: A comprehensive survey

Z Guo, R Jin, C Liu, Y Huang, D Shi, L Yu, Y Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang… - arXiv preprint arXiv …, 2023 - paper-notes.zhjwpku.com
Ever since the Turing Test was proposed in the 1950s, humans have explored how machines
can master language intelligence. Language is essentially a complex, intricate system of …

C-Pack: Packed resources for general Chinese embeddings

S Xiao, Z Liu, P Zhang, N Muennighoff, D Lian… - Proceedings of the 47th …, 2024 - dl.acm.org
We introduce C-Pack, a package of resources that significantly advances the field of general
text embeddings for Chinese. C-Pack includes three critical resources. 1) C-MTP is a …

NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji, GI Winata… - arXiv preprint arXiv …, 2022 - arxiv.org
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

DeepSeek LLM: Scaling open-source language models with longtermism

X Bi, D Chen, G Chen, S Chen, D Dai, C Deng… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development of open-source large language models (LLMs) has been truly
remarkable. However, the scaling law described in previous literature presents varying …

Crosslingual generalization through multitask finetuning

N Muennighoff, T Wang, L Sutawika, A Roberts… - arXiv preprint arXiv …, 2022 - arxiv.org
Multitask prompted finetuning (MTF) has been shown to help large language models
generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused …

GLM-130B: An open bilingual pre-trained model

A Zeng, X Liu, Z Du, Z Wang, H Lai, M Ding… - arXiv preprint arXiv …, 2022 - arxiv.org
We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model
with 130 billion parameters. It is an attempt to open-source a 100B-scale model at least as …

DeepSeekMoE: Towards ultimate expert specialization in mixture-of-experts language models

D Dai, C Deng, C Zhao, RX Xu, H Gao, D Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for
managing computational costs when scaling up model parameters. However, conventional …

C-Eval: A multi-level multi-discipline Chinese evaluation suite for foundation models

Y Huang, Y Bai, Z Zhu, J Zhang… - Advances in …, 2023 - proceedings.neurips.cc
New NLP benchmarks are urgently needed to align with the rapid development of large
language models (LLMs). We present C-Eval, the first comprehensive Chinese evaluation …