A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji, GI Winata… - arXiv preprint arXiv …, 2022 - arxiv.org
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

End-to-end transformer-based models in textual-based NLP

A Rahali, MA Akhloufi - AI, 2023 - mdpi.com
Transformer architectures are highly expressive because they use self-attention
mechanisms to encode long-range dependencies in the input sequences. In this paper, we …
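
For readers who want to see concretely what the self-attention mechanism mentioned in this entry looks like, below is a minimal single-head scaled dot-product attention sketch in NumPy; the function name, dimensions, and random inputs are illustrative only and are not taken from the paper.

```python
import numpy as np

def scaled_dot_product_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence of token embeddings.

    x: (seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_k) projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v       # queries, keys, values
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)           # pairwise attention scores
    # Softmax over keys: every query attends to every position in the sequence,
    # which is how long-range dependencies are encoded.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                        # (seq_len, d_k) context vectors

# Toy usage with random embeddings and projections.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                  # 5 tokens, d_model = 16
w_q, w_k, w_v = (rng.normal(size=(16, 8)) for _ in range(3))
print(scaled_dot_product_self_attention(x, w_q, w_k, w_v).shape)  # (5, 8)
```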

SemEval-2022 Task 11: Multilingual complex named entity recognition (MultiCoNER)

S Malmasi, A Fang, B Fetahu, S Kar… - Proceedings of the …, 2022 - aclanthology.org
We present the findings of SemEval-2022 Task 11 on Multilingual Complex Named Entity
Recognition MULTICONER. Divided into 13 tracks, the task focused on methods to identify …

JGLUE: Japanese general language understanding evaluation

K Kurihara, D Kawahara, T Shibata - Proceedings of the Thirteenth …, 2022 - aclanthology.org
To develop high-performance natural language understanding (NLU) models, it is
necessary to have a benchmark to evaluate and analyze NLU ability from various …

Language models are few-shot multilingual learners

GI Winata, A Madotto, Z Lin, R Liu, J Yosinski… - arXiv preprint arXiv …, 2021 - arxiv.org
General-purpose language models have demonstrated impressive capabilities, performing
on par with state-of-the-art approaches on a range of downstream natural language …

BLOOM+1: Adding language support to BLOOM for zero-shot prompting

ZX Yong, H Schoelkopf, N Muennighoff, AF Aji… - arXiv preprint arXiv …, 2022 - arxiv.org
The BLOOM model is a large publicly available multilingual language model, but its
pretraining was limited to 46 languages. To extend the benefits of BLOOM to other …

What changes can large-scale language models bring? Intensive study on HyperCLOVA: Billions-scale Korean generative pretrained transformers

B Kim, HS Kim, SW Lee, G Lee, D Kwak… - arXiv preprint arXiv …, 2021 - arxiv.org
GPT-3 shows the remarkable in-context learning ability of large-scale language models (LMs)
trained on hundreds-of-billions-scale data. Here we address some remaining issues less …
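
Because several of these entries center on in-context (zero- and few-shot) learning, the following minimal sketch shows how a few-shot prompt is typically assembled; the task, examples, and labels are invented for illustration and are not drawn from any of the papers above.

```python
# Few-shot (in-context) prompting: labeled examples are placed directly in the
# prompt, and the model is asked to complete the label for a new input.
examples = [
    ("The movie was wonderful.", "positive"),
    ("I would not recommend this product.", "negative"),
]
query = "The service was quick and friendly."

prompt = "\n".join(f"Review: {text}\nSentiment: {label}" for text, label in examples)
prompt += f"\nReview: {query}\nSentiment:"

print(prompt)
# A sufficiently large LM completing this prompt is expected to answer
# "positive" without any gradient updates -- that is what "in-context
# learning" refers to in these abstracts.
```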

On the effect of pretraining corpora on in-context learning by a large-scale language model

S Shin, SW Lee, H Ahn, S Kim, HS Kim, B Kim… - arXiv preprint arXiv …, 2022 - arxiv.org
Many recent studies on large-scale language models have reported successful in-context
zero- and few-shot learning ability. However, the in-depth analysis of when in-context …