- Academic Search

Recent advances in deep learning based dialogue systems: A systematic survey

J Ni, T Young, V Pandelea, F Xue… - Artificial intelligence review, 2023 - Springer

Dialogue systems are a popular natural language processing (NLP) task as it is promising in
real-life applications. It is also a complicated task since many NLP tasks deserving study are …

Cited by 302 (15 versions)

A survey of evaluation metrics used for NLG systems

AB Sai, AK Mohankumar, MM Khapra - ACM Computing Surveys (CSUR …, 2022 - dl.acm.org

In the last few years, a large number of automatic evaluation metrics have been proposed for
evaluating Natural Language Generation (NLG) systems. The rapid development and …

Cited by 285 (4 versions)

Q²: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering

O Honovich, L Choshen, R Aharoni, E Neeman… - arXiv preprint arXiv …, 2021 - arxiv.org

Neural knowledge-grounded generative models for dialogue often produce content that is
factually inconsistent with the source text they rely on. As a consequence, such models are …

Cited by 189 (8 versions)

InstructDial: Improving zero and few-shot generalization in dialogue through instruction tuning

P Gupta, C Jiao, YT Yeh, S Mehri, M Eskenazi… - arXiv preprint arXiv …, 2022 - arxiv.org

Instruction tuning is an emergent paradigm in NLP wherein natural language instructions
are leveraged with language models to induce zero-shot performance on unseen tasks …

Cited by 77 (5 versions)

State-of-the-art generalisation research in NLP: a taxonomy and review

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - arXiv preprint arXiv …, 2022 - arxiv.org

The ability to generalise well is one of the primary desiderata of natural language
processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is …

Cited by 61 (7 versions)

Masked graph learning with recurrent alignment for multimodal emotion recognition in conversation

T Meng, F Zhang, Y Shou, H Shao… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org

Since Multimodal Emotion Recognition in Conversation (MERC) can be applied to public
opinion monitoring, intelligent dialogue robots, and other fields, it has received extensive …

Cited by 16 (5 versions)

Simple LLM prompting is state-of-the-art for robust and multilingual dialogue evaluation

J Mendonça, P Pereira, H Moniz, JP Carvalho… - arXiv preprint arXiv …, 2023 - arxiv.org

Despite significant research effort in the development of automatic dialogue evaluation
metrics, little thought is given to evaluating dialogues other than in English. At the same time …

Cited by 16 (3 versions)

Evaluating open-domain dialogues in latent space with next sentence prediction and mutual information

K Zhao, B Yang, C Lin, W Rong, A Villavicencio… - arXiv preprint arXiv …, 2023 - arxiv.org

The long-standing one-to-many issue of the open-domain dialogues poses significant
challenges for automatic evaluation methods, i.e., there may be multiple suitable responses …

Cited by 16 (7 versions)

Automatic evaluation and moderation of open-domain dialogue systems

C Zhang, J Sedoc, LF D'Haro, R Banchs… - arXiv preprint arXiv …, 2021 - arxiv.org

The development of Open-Domain Dialogue Systems (ODS) is a trending topic due to the
large number of research challenges, large societal and business impact, and advances in …

Cited by 28 (3 versions)

Synthesizing adversarial negative responses for robust response ranking and evaluation

P Gupta, Y Tsvetkov, JP Bigham - arXiv preprint arXiv:2106.05894, 2021 - arxiv.org

Open-domain neural dialogue models have achieved high performance in response ranking
and evaluation tasks. These tasks are formulated as a binary classification of responses …

Cited by 27 (7 versions)

Designing precise and robust dialogue response evaluators

Recent advances in deep learning based dialogue systems: A systematic survey
