[HTML][HTML] A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

Cyberbullying detection for low-resource languages and dialects: Review of the state of the art

T Mahmud, M Ptaszynski, J Eronen, F Masui - Information Processing & …, 2023 - Elsevier
The struggle of social media platforms to moderate content in a timely manner, encourages
users to abuse such platforms to spread vulgar or abusive language, which, when …

NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji, GI Winata… - arxiv preprint arxiv …, 2022 - arxiv.org
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

Chatgpt beyond english: Towards a comprehensive evaluation of large language models in multilingual learning

VD Lai, NT Ngo, APB Veyseh, H Man… - arxiv preprint arxiv …, 2023 - arxiv.org
Over the last few years, large language models (LLMs) have emerged as the most important
breakthroughs in natural language processing (NLP) that fundamentally transform research …

A survey of text representation and embedding techniques in nlp

R Patil, S Boit, V Gudivada, J Nandigam - IEEE Access, 2023 - ieeexplore.ieee.org
Natural Language Processing (NLP) is a research field where a language in consideration
is processed to understand its syntactic, semantic, and sentimental aspects. The …

Ammus: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arxiv preprint arxiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

Klue: Korean language understanding evaluation

S Park, J Moon, S Kim, WI Cho, J Han, J Park… - arxiv preprint arxiv …, 2021 - arxiv.org
We introduce Korean Language Understanding Evaluation (KLUE) benchmark. KLUE is a
collection of 8 Korean natural language understanding (NLU) tasks, including Topic …

mgpt: Few-shot learners go multilingual

O Shliazhko, A Fenogenova, M Tikhonova… - Transactions of the …, 2024 - direct.mit.edu
This paper introduces mGPT, a multilingual variant of GPT-3, pretrained on 61 languages
from 25 linguistically diverse language families using Wikipedia and the C4 Corpus. We …

When is multilinguality a curse? language modeling for 250 high-and low-resource languages

TA Chang, C Arnett, Z Tu, BK Bergen - arxiv preprint arxiv:2311.09205, 2023 - arxiv.org
Multilingual language models are widely used to extend NLP systems to low-resource
languages. However, concrete evidence for the effects of multilinguality on language …

Benchmarks for automated commonsense reasoning: A survey

E Davis - ACM Computing Surveys, 2023 - dl.acm.org
More than one hundred benchmarks have been developed to test the commonsense
knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems …