TRUE: Re-evaluating factual consistency evaluation

O Honovich, R Aharoni, J Herzig, H Taitelbaum… - arxiv preprint arxiv …, 2022 - arxiv.org
Grounded text generation systems often generate text that contains factual inconsistencies,
hindering their real-world applicability. Automatic factual consistency evaluation may help …

Faithfulness in natural language generation: A systematic survey of analysis, evaluation and optimization methods

W Li, W Wu, M Chen, J Liu, X Xiao, H Wu - arxiv preprint arxiv:2203.05227, 2022 - arxiv.org
Natural Language Generation (NLG) has made great progress in recent years due to the
development of deep learning techniques such as pre-trained language models. This …

DialFact: A benchmark for fact-checking in dialogue

P Gupta, CS Wu, W Liu, C Xiong - arxiv preprint arxiv:2110.08222, 2021 - arxiv.org
Fact-checking is an essential tool to mitigate the spread of misinformation and
disinformation. We introduce the task of fact-checking in dialogue, which is a relatively …

Autoregressive entity generation for end-to-end task-oriented dialog

G Huang, X Quan, Q Wang - arxiv preprint arxiv:2209.08708, 2022 - arxiv.org
Task-oriented dialog (TOD) systems often require interaction with an external knowledge
base to retrieve necessary entity (e.g., restaurant) information to support the response …

CDConv: A benchmark for contradiction detection in Chinese conversations

C Zheng, J Zhou, Y Zheng, L Peng, Z Guo… - arxiv preprint arxiv …, 2022 - arxiv.org
Dialogue contradiction is a critical issue in open-domain dialogue systems. The
contextualization nature of conversations makes dialogue contradiction detection rather …

A plug-and-play adapter for consistency identification in task-oriented dialogue systems

Z Ding, Z Yang, H Lin - Information Processing & Management, 2024 - Elsevier
Task-oriented Dialogue system (ToD) has gained significant attention due to its aim
to assist users in accomplishing various tasks. However, the neural network-based dialogue …

Instruct once, chat consistently in multiple rounds: An efficient tuning framework for dialogue

J Wang, CT Leong, J Wang, D Lin, W Li… - arxiv preprint arxiv …, 2024 - arxiv.org
Tuning language models for dialogue generation has been a prevalent paradigm for
building capable dialogue agents. Yet, traditional tuning narrowly views dialogue generation …

Can We Catch the Elephant? A Survey of the Evolvement of Hallucination Evaluation on Natural Language Generation

S Qi, Y He, Z Yuan - arxiv preprint arxiv:2404.12041, 2024 - arxiv.org
Hallucination in Natural Language Generation (NLG) is like the elephant in the room,
obvious but often overlooked until recent achievements significantly improved the fluency …

MPFToD: a modularized pre-training framework for consistency identification in task-oriented dialogue

L Qin, S Huang, Q Chen, Q Liu, W Che, R Xu - Frontiers of Computer …, 2025 - Springer
Consistency identification in task-oriented dialogue (CI-ToD) can prevent inconsistent
dialogue response generation, which has recently emerged as an important and growing …

Revealing user familiarity bias in task-oriented dialogue via interactive evaluation

T Kim, J Shin, YH Kim, S Bae, S Kim - arxiv preprint arxiv:2305.13857, 2023 - arxiv.org
Most task-oriented dialogue (TOD) benchmarks assume users that know exactly how to use
the system by constraining the user behaviors within the system's capabilities via strict user …