Survey of hallucination in natural language generation

Z Ji, N Lee, R Frieske, T Yu, D Su, Y Xu, E Ishii… - ACM Computing …, 2023 - dl.acm.org
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of sequence-to-sequence deep learning technologies such as Transformer …

Repairing the cracked foundation: A survey of obstacles in evaluation practices for generated text

S Gehrmann, E Clark, T Sellam - Journal of Artificial Intelligence Research, 2023 - jair.org
Evaluation practices in natural language generation (NLG) have many known flaws,
but improved evaluation approaches are rarely widely adopted. This issue has become …

GLM-130B: An open bilingual pre-trained model

A Zeng, X Liu, Z Du, Z Wang, H Lai, M Ding… - arXiv preprint arXiv …, 2022 - arxiv.org
We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model
with 130 billion parameters. It is an attempt to open-source a 100B-scale model at least as …

PaLM: Scaling language modeling with Pathways

A Chowdhery, S Narang, J Devlin, M Bosma… - Journal of Machine …, 2023 - jmlr.org
Large language models have been shown to achieve remarkable performance across a
variety of natural language tasks using few-shot learning, which drastically reduces the …

Finetuned language models are zero-shot learners

J Wei, M Bosma, VY Zhao, K Guu, AW Yu… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper explores a simple method for improving the zero-shot learning abilities of
language models. We show that instruction tuning--finetuning language models on a …
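The instruction tuning named in this snippet amounts to finetuning a language model on tasks rephrased as natural-language instructions. Below is a minimal sketch of how such instruction-formatted training pairs might be assembled; the `Example` dataclass, the template wording, and the toy tasks are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: rendering tasks as instruction-formatted (prompt, target) pairs.
# Template wording and examples are illustrative, not from the paper.

from dataclasses import dataclass

@dataclass
class Example:
    instruction: str   # natural-language description of the task
    input_text: str    # task input
    target: str        # expected output

def to_training_pair(ex: Example) -> dict:
    """Render one example as a supervised finetuning pair."""
    prompt = f"{ex.instruction}\n\nInput: {ex.input_text}\nOutput:"
    return {"prompt": prompt, "target": " " + ex.target}

examples = [
    Example("Classify the sentiment of the sentence as positive or negative.",
            "The film was a delight from start to finish.", "positive"),
    Example("Translate the sentence to French.",
            "The weather is nice today.", "Il fait beau aujourd'hui."),
]

for pair in (to_training_pair(e) for e in examples):
    print(pair["prompt"], pair["target"])
```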

Adapting large language models via reading comprehension

D Cheng, S Huang, F Wei - The Twelfth International Conference on …, 2023 - openreview.net
We explore how continued pre-training on domain-specific corpora influences large
language models, revealing that training on the raw corpora endows the model with domain …

SPoT: Better frozen model adaptation through soft prompt transfer

T Vu, B Lester, N Constant, R Al-Rfou, D Cer - arXiv preprint arXiv …, 2021 - arxiv.org
There has been growing interest in parameter-efficient methods to apply pre-trained
language models to downstream tasks. Building on the Prompt Tuning approach of Lester et …
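Prompt Tuning, which SPoT builds on, keeps the pretrained model frozen and trains only a small matrix of soft prompt embeddings; SPoT additionally initializes the target-task prompt from one trained on source tasks. A minimal PyTorch sketch under those assumptions follows, with a toy frozen encoder standing in for a large language model; the class and variable names are illustrative.

```python
# Minimal sketch of prompt tuning with soft prompt transfer (assumes PyTorch).
# The frozen encoder is a toy stand-in for a large pretrained language model.

import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Learnable prompt embeddings prepended to the input embeddings."""
    def __init__(self, prompt_len: int, d_model: int):
        super().__init__()
        self.embeddings = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, d_model)
        batch = input_embeds.size(0)
        prompt = self.embeddings.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Toy frozen "pretrained" encoder; its weights are never updated.
frozen_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True), num_layers=2
)
for p in frozen_encoder.parameters():
    p.requires_grad_(False)

# Transfer step: initialize the target-task prompt from a source-task prompt.
source_prompt = SoftPrompt(prompt_len=20, d_model=64)   # assume already trained
target_prompt = SoftPrompt(prompt_len=20, d_model=64)
target_prompt.embeddings.data.copy_(source_prompt.embeddings.data)

# Only the soft prompt parameters are optimized on the target task.
optimizer = torch.optim.AdamW(target_prompt.parameters(), lr=1e-3)
x = torch.randn(8, 32, 64)                               # dummy input embeddings
hidden = frozen_encoder(target_prompt(x))
```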

ExT5: Towards extreme multi-task scaling for transfer learning

V Aribandi, Y Tay, T Schuster, J Rao, HS Zheng… - arXiv preprint arXiv …, 2021 - arxiv.org
Despite the recent success of multi-task learning and transfer learning for natural language
processing (NLP), few works have systematically studied the effect of scaling up the number …

Preventing verbatim memorization in language models gives a false sense of privacy

D Ippolito, F Tramèr, M Nasr, C Zhang… - arXiv preprint arXiv …, 2022 - arxiv.org
Studying data memorization in neural language models helps us understand the risks (e.g., to
privacy or copyright) associated with models regurgitating training data and aids in the …

Unified demonstration retriever for in-context learning

X Li, K Lv, H Yan, T Lin, W Zhu, Y Ni, G Xie… - arXiv preprint arXiv …, 2023 - arxiv.org
In-context learning is a new learning paradigm where a language model conditions on a few
input-output pairs (demonstrations) and a test input, and directly outputs the prediction. It has …
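As the snippet describes, in-context learning conditions the model on a few input-output demonstrations followed by the test input, and the choice of demonstrations matters. A minimal sketch of assembling such a prompt is shown below, using a crude lexical-overlap scorer as a stand-in for a learned demonstration retriever; the function names and toy examples are hypothetical.

```python
# Minimal sketch of in-context learning with retrieved demonstrations.
# The overlap score is a simple stand-in, not the paper's unified retriever.

def score(demo_input: str, test_input: str) -> float:
    """Crude lexical-overlap proxy for demonstration relevance."""
    a, b = set(demo_input.lower().split()), set(test_input.lower().split())
    return len(a & b) / max(len(a | b), 1)

def build_prompt(pool, test_input: str, k: int = 2) -> str:
    """Select the k most relevant demonstrations and format an ICL prompt."""
    demos = sorted(pool, key=lambda d: score(d["input"], test_input), reverse=True)[:k]
    lines = [f"Input: {d['input']}\nOutput: {d['output']}" for d in demos]
    lines.append(f"Input: {test_input}\nOutput:")
    return "\n\n".join(lines)

pool = [
    {"input": "The soup was cold and bland.", "output": "negative"},
    {"input": "A thrilling ride with a great cast.", "output": "positive"},
    {"input": "The battery lasts all day.", "output": "positive"},
]

print(build_prompt(pool, "The acting was great but the plot was bland."))
```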