Recent advances in deep learning based dialogue systems: A systematic survey

J Ni, T Young, V Pandelea, F Xue… - Artificial intelligence review, 2023 - Springer
Dialogue systems are a popular natural language processing (NLP) task as it is promising in
real-life applications. It is also a complicated task since many NLP tasks deserving study are …

A survey of evaluation metrics used for NLG systems

AB Sai, AK Mohankumar, MM Khapra - ACM Computing Surveys (CSUR …, 2022 - dl.acm.org
In the last few years, a large number of automatic evaluation metrics have been proposed for
evaluating Natural Language Generation (NLG) systems. The rapid development and …

Q²: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering

O Honovich, L Choshen, R Aharoni, E Neeman… - arXiv preprint arXiv …, 2021 - arxiv.org
Neural knowledge-grounded generative models for dialogue often produce content that is
factually inconsistent with the source text they rely on. As a consequence, such models are …

InstructDial: Improving zero and few-shot generalization in dialogue through instruction tuning

P Gupta, C Jiao, YT Yeh, S Mehri, M Eskenazi… - arXiv preprint arXiv …, 2022 - arxiv.org
Instruction tuning is an emergent paradigm in NLP wherein natural language instructions
are leveraged with language models to induce zero-shot performance on unseen tasks …

State-of-the-art generalisation research in NLP: a taxonomy and review

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - arXiv preprint arXiv …, 2022 - arxiv.org
The ability to generalise well is one of the primary desiderata of natural language
processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is …

Masked graph learning with recurrent alignment for multimodal emotion recognition in conversation

T Meng, F Zhang, Y Shou, H Shao… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Since Multimodal Emotion Recognition in Conversation (MERC) can be applied to public
opinion monitoring, intelligent dialogue robots, and other fields, it has received extensive …

Simple LLM prompting is state-of-the-art for robust and multilingual dialogue evaluation

J Mendonça, P Pereira, H Moniz, JP Carvalho… - arXiv preprint arXiv …, 2023 - arxiv.org
Despite significant research effort in the development of automatic dialogue evaluation
metrics, little thought is given to evaluating dialogues other than in English. At the same time …

Evaluating open-domain dialogues in latent space with next sentence prediction and mutual information

K Zhao, B Yang, C Lin, W Rong, A Villavicencio… - arXiv preprint arXiv …, 2023 - arxiv.org
The long-standing one-to-many issue of the open-domain dialogues poses significant
challenges for automatic evaluation methods, i.e., there may be multiple suitable responses …

Automatic evaluation and moderation of open-domain dialogue systems

C Zhang, J Sedoc, LF D'Haro, R Banchs… - arXiv preprint arXiv …, 2021 - arxiv.org
The development of Open-Domain Dialogue Systems (ODS) is a trending topic due to the
large number of research challenges, large societal and business impact, and advances in …

Synthesizing adversarial negative responses for robust response ranking and evaluation

P Gupta, Y Tsvetkov, JP Bigham - arXiv preprint arXiv:2106.05894, 2021 - arxiv.org
Open-domain neural dialogue models have achieved high performance in response ranking
and evaluation tasks. These tasks are formulated as a binary classification of responses …