Survey of hallucination in natural language generation

Z Ji, N Lee, R Frieske, T Yu, D Su, Y Xu, E Ishii… - ACM Computing …, 2023 - dl.acm.org
Natural Language Generation (NLG) has improved exponentially in recent years thanks to
the development of sequence-to-sequence deep learning technologies such as Transformer …

A review: Knowledge reasoning over knowledge graph

X Chen, S Jia, Y Xiang - Expert systems with applications, 2020 - Elsevier
Mining valuable hidden knowledge from large-scale data relies on the support of reasoning
technology. Knowledge graphs, as a new type of knowledge representation, have gained …

SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization

P Laban, T Schnabel, PN Bennett… - Transactions of the …, 2022 - direct.mit.edu
In the summarization domain, a key requirement for summaries is to be factually consistent
with the input document. Previous work has found that natural language inference (NLI) …

On faithfulness and factuality in abstractive summarization

J Maynez, S Narayan, B Bohnet… - arXiv preprint arXiv …, 2020 - arxiv.org
It is well known that the standard likelihood training and approximate decoding objectives in
neural text generation models lead to less human-like responses for open-ended tasks such …

Re3: Generating longer stories with recursive reprompting and revision

K Yang, Y Tian, N Peng, D Klein - arXiv preprint arXiv:2210.06774, 2022 - arxiv.org
We consider the problem of automatically generating longer stories of over two thousand
words. Compared to prior work on shorter stories, long-range plot coherence and relevance …

How can we know what language models know?

Z Jiang, FF Xu, J Araki, G Neubig - Transactions of the Association for …, 2020 - direct.mit.edu
Recent work has presented intriguing results examining the knowledge contained in
language models (LMs) by having the LM fill in the blanks of prompts such as “Obama is a …

Understanding factuality in abstractive summarization with FRANK: A benchmark for factuality metrics

A Pagnoni, V Balachandran, Y Tsvetkov - arXiv preprint arXiv:2104.13346, 2021 - arxiv.org
Modern summarization models generate highly fluent but often factually unreliable outputs.
This motivated a surge of metrics attempting to measure the factuality of automatically …

Knowledge graphs

A Hogan, E Blomqvist, M Cochez, C d'Amato… - ACM Computing …, 2021 - dl.acm.org
In this article, we provide a comprehensive introduction to knowledge graphs, which have
recently garnered significant attention from both industry and academia in scenarios that …

Matching the blanks: Distributional similarity for relation learning

LB Soares, N FitzGerald, J Ling… - arXiv preprint arXiv …, 2019 - arxiv.org
General purpose relation extractors, which can model arbitrary relations, are a core
aspiration in information extraction. Efforts have been made to build general purpose …

DROP: A reading comprehension benchmark requiring discrete reasoning over paragraphs

D Dua, Y Wang, P Dasigi, G Stanovsky, S Singh… - arXiv preprint arXiv …, 2019 - arxiv.org
Reading comprehension has recently seen rapid progress, with systems matching humans
on the most popular datasets for the task. However, a large body of work has highlighted the …