Survey of hallucination in natural language generation

Z Ji, N Lee, R Frieske, T Yu, D Su, Y Xu, E Ishii… - ACM computing …, 2023 - dl.acm.org
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of sequence-to-sequence deep learning technologies such as Transformer …

Evaluating large language models: A comprehensive survey

Z Guo, R Jin, C Liu, Y Huang, D Shi, L Yu, Y Liu… - arxiv preprint arxiv …, 2023 - arxiv.org
Large language models (LLMs) have demonstrated remarkable capabilities across a broad
spectrum of tasks. They have attracted significant attention and been deployed in numerous …

Trusting your evidence: Hallucinate less with context-aware decoding

W Shi, X Han, M Lewis, Y Tsvetkov… - Proceedings of the …, 2024 - aclanthology.org
Language models (LMs) often struggle to pay enough attention to the input context,
and generate texts that are unfaithful or contain hallucinations. To mitigate this issue, we …
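
The core idea behind this kind of context-aware decoding is to contrast the model's next-token distribution with and without the grounding context and to upweight tokens that the context makes more likely. A minimal greedy-decoding sketch of that contrastive reweighting follows; the model choice, prompt concatenation, and alpha weight here are illustrative assumptions, not the paper's exact recipe.

    # Sketch: amplify the influence of a grounding context on next-token logits.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # placeholder causal LM for illustration
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()

    def cad_generate(context, query, alpha=0.5, max_new_tokens=50):
        """Greedy decoding with context-amplified logits."""
        with_ctx = tok(context + "\n" + query, return_tensors="pt").input_ids
        no_ctx = tok(query, return_tensors="pt").input_ids
        generated = []
        for _ in range(max_new_tokens):
            with torch.no_grad():
                logits_ctx = model(with_ctx).logits[:, -1, :]
                logits_plain = model(no_ctx).logits[:, -1, :]
            # Contrastive adjustment: boost tokens the context favors,
            # penalize tokens the model would produce without it.
            adjusted = (1 + alpha) * logits_ctx - alpha * logits_plain
            next_id = adjusted.argmax(dim=-1, keepdim=True)
            if next_id.item() == tok.eos_token_id:
                break
            generated.append(next_id.item())
            with_ctx = torch.cat([with_ctx, next_id], dim=-1)
            no_ctx = torch.cat([no_ctx, next_id], dim=-1)
        return tok.decode(generated)

Setting alpha to 0 recovers ordinary greedy decoding; larger values push generation harder toward tokens supported by the context.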

Factuality enhanced language models for open-ended text generation

N Lee, W Ping, P Xu, M Patwary… - Advances in …, 2022 - proceedings.neurips.cc
Pretrained language models (LMs) are susceptible to generating text with nonfactual
information. In this work, we measure and improve the factual accuracy of large-scale LMs …
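
One decoding-side idea in this line of work is to let the nucleus (top-p) threshold decay as a sentence progresses, trading sampling diversity for factual precision, and to reset it at sentence boundaries. A minimal sketch of such a schedule is below; the decay rate, floor, and sentence-reset heuristic are illustrative assumptions.

    # Sketch: a decaying top-p ("factual-nucleus"-style) schedule.
    def nucleus_p_schedule(token_texts, p=0.9, lam=0.9, omega=0.3):
        """Return a top-p value per decoding step: p decays within a sentence
        (limiting unlikely, often non-factual continuations) and resets after
        sentence-ending punctuation."""
        ps, step_in_sentence = [], 0
        for text in token_texts:
            ps.append(max(omega, p * (lam ** step_in_sentence)))
            step_in_sentence += 1
            if text.strip() in {".", "!", "?"}:
                step_in_sentence = 0  # new sentence: restore full diversity
        return ps

    # Example: p shrinks over the first sentence, then resets.
    print(nucleus_p_schedule(["The", " sky", " is", " blue", ".", " It", " rains"]))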

Evaluating human-language model interaction

M Lee, M Srivastava, A Hardy, J Thickstun… - arxiv preprint arxiv …, 2022 - arxiv.org
Many real-world applications of language models (LMs), such as writing assistance and
code autocomplete, involve human-LM interaction. However, most benchmarks are non …

Minicheck: Efficient fact-checking of llms on grounding documents

L Tang, P Laban, G Durrett - arxiv preprint arxiv:2404.10774, 2024 - arxiv.org
Recognizing if LLM output can be grounded in evidence is central to many tasks in NLP:
retrieval-augmented generation, summarization, document-grounded dialogue, and more …

Factually consistent summarization via reinforcement learning with textual entailment feedback

P Roit, J Ferret, L Shani, R Aharoni, G Cideron… - arxiv preprint arxiv …, 2023 - arxiv.org
Despite the seeming success of contemporary grounded text generation systems, they often
tend to generate factually inconsistent text with respect to their input. This phenomenon is …
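
The reward signal in this kind of setup can be illustrated with an off-the-shelf NLI model that scores whether the source document entails the generated summary; the entailment probability then feeds a policy-gradient or PPO update. A minimal sketch follows; the roberta-large-mnli checkpoint and label lookup are illustrative assumptions, not necessarily the authors' configuration.

    # Sketch: entailment probability of a summary given its source,
    # usable as a faithfulness reward during RL fine-tuning.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    nli_name = "roberta-large-mnli"  # off-the-shelf NLI model, illustrative choice
    tok = AutoTokenizer.from_pretrained(nli_name)
    nli = AutoModelForSequenceClassification.from_pretrained(nli_name).eval()

    def entailment_reward(source: str, summary: str) -> float:
        """P(source entails summary) under the NLI model; higher = more faithful."""
        inputs = tok(source, summary, return_tensors="pt", truncation=True)
        with torch.no_grad():
            probs = nli(**inputs).logits.softmax(dim=-1)[0]
        entail_id = nli.config.label2id.get("ENTAILMENT", int(probs.argmax()))
        return probs[entail_id].item()

    # Example usage: the returned score would be the per-sample reward.
    print(entailment_reward("The meeting was moved to Friday.",
                            "The meeting is now on Friday."))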

Contrastive learning reduces hallucination in conversations

W Sun, Z Shi, S Gao, P Ren, M de Rijke… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
Pre-trained language models (LMs) store knowledge in their parameters and can generate
informative responses when used in conversational systems. However, LMs suffer from the …

SummEdits: Measuring LLM ability at factual reasoning through the lens of summarization

P Laban, W Kryściński, D Agarwal… - Proceedings of the …, 2023 - aclanthology.org
With the recent appearance of LLMs in practical settings, having methods that can effectively
detect factual inconsistencies is crucial to reduce the propagation of misinformation and …

FaithDial: A Faithful Benchmark for Information-Seeking Dialogue

N Dziri, E Kamalloo, S Milton, O Zaiane… - Transactions of the …, 2022 - direct.mit.edu
The goal of information-seeking dialogue is to respond to seeker queries with natural
language utterances that are grounded on knowledge sources. However, dialogue systems …