AMMUS: A survey of transformer-based pretrained models in natural language processing

KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …

Pre-trained language models for text generation: A survey

J Li, T Tang, WX Zhao, JY Nie, JR Wen - ACM Computing Surveys, 2024 - dl.acm.org
Text Generation aims to produce plausible and readable text in human language from input
data. The resurgence of deep learning has greatly advanced this field, in particular, with the …

A multitask, multilingual, multimodal evaluation of ChatGPT on reasoning, hallucination, and interactivity

Y Bang, S Cahyawijaya, N Lee, W Dai, D Su… - arXiv preprint arXiv …, 2023 - arxiv.org
This paper proposes a framework for quantitatively evaluating interactive LLMs such as
ChatGPT using publicly available data sets. We carry out an extensive technical evaluation …

NusaCrowd: Open source initiative for Indonesian NLP resources

S Cahyawijaya, H Lovenia, AF Aji… - Findings of the …, 2023 - aclanthology.org
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …

Language models are few-shot multilingual learners

GI Winata, A Madotto, Z Lin, R Liu, J Yosinski… - arXiv preprint arXiv …, 2021 - arxiv.org
General-purpose language models have demonstrated impressive capabilities, performing
on par with state-of-the-art approaches on a range of downstream natural language …

End-to-end transformer-based models in textual-based NLP

A Rahali, MA Akhloufi - AI, 2023 - mdpi.com
Transformer architectures are highly expressive because they use self-attention
mechanisms to encode long-range dependencies in the input sequences. In this paper, we …

A systematic review of transformer-based pre-trained language models through self-supervised learning

E Kotei, R Thirunavukarasu - Information, 2023 - mdpi.com
Transfer learning is a technique utilized in deep learning applications to transmit learned
inference to a different target domain. The approach is mainly to solve the problem of a few …

One country, 700+ languages: NLP challenges for underrepresented languages and dialects in Indonesia

AF Aji, GI Winata, F Koto, S Cahyawijaya… - arXiv preprint arXiv …, 2022 - arxiv.org
NLP research is impeded by a lack of resources and awareness of the challenges presented
by underrepresented languages and dialects. Focusing on the languages spoken in …

Negative object presence evaluation (NOPE) to measure object hallucination in vision-language models

H Lovenia, W Dai, S Cahyawijaya, Z Ji… - arXiv preprint arXiv …, 2023 - arxiv.org
Object hallucination poses a significant challenge in vision-language (VL) models, often
leading to the generation of nonsensical or unfaithful responses with non-existent objects …

A few thousand translations go a long way! Leveraging pre-trained models for African news translation

DI Adelani, JO Alabi, A Fan, J Kreutzer, X Shen… - arXiv preprint arXiv …, 2022 - arxiv.org
Recent advances in the pre-training of language models leverage large-scale datasets to
create multilingual models. However, low-resource languages are mostly left out in these …