A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

A survey of knowledge enhanced pre-trained language models

L Hu, Z Liu, Z Zhao, L Hou, L Nie… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Pre-trained Language Models (PLMs), which are trained on large text corpora via self-supervised learning, have yielded promising performance on various tasks in …

NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji, GI Winata… - arXiv preprint arXiv …, 2022 - arxiv.org
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

Power hungry processing: Watts driving the cost of AI deployment?

S Luccioni, Y Jernite, E Strubell - … of the 2024 ACM Conference on …, 2024 - dl.acm.org
Recent years have seen a surge in the popularity of commercial AI products based on
generative, multi-purpose AI systems promising a unified approach to building machine …

Evaluating large language models: A comprehensive survey

Z Guo, R Jin, C Liu, Y Huang, D Shi, L Yu, Y Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …

ByT5: Towards a token-free future with pre-trained byte-to-byte models

L Xue, A Barua, N Constant, R Al-Rfou… - Transactions of the …, 2022 - direct.mit.edu
Most widely used pre-trained language models operate on sequences of tokens
corresponding to word or subword units. By comparison, token-free models that operate …

mT5: A massively multilingual pre-trained text-to-text transformer

L Xue, N Constant, A Roberts, M Kale… - arXiv preprint arXiv …, 2020 - arxiv.org
The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and
scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this …

KLUE: Korean language understanding evaluation

S Park, J Moon, S Kim, WI Cho, J Han, J Park… - arXiv preprint arXiv …, 2021 - arxiv.org
We introduce the Korean Language Understanding Evaluation (KLUE) benchmark. KLUE is a
collection of 8 Korean natural language understanding (NLU) tasks, including Topic …

XTREME: A massively multilingual multi-task benchmark for evaluating cross-lingual generalisation

J Hu, S Ruder, A Siddhant, G Neubig… - International …, 2020 - proceedings.mlr.press
Much recent progress in applications of machine learning models to NLP has been driven
by benchmarks that evaluate models across a wide variety of tasks. However, these broad …

IndicNLPSuite: Monolingual corpora, evaluation benchmarks and pre-trained multilingual language models for Indian languages

D Kakwani, A Kunchukuttan, S Golla… - Findings of the …, 2020 - aclanthology.org
In this paper, we introduce NLP resources for 11 major Indian languages from two major
language families. These resources include: (a) large-scale sentence-level monolingual …