Report from the NSF Future Directions Workshop on automatic evaluation of dialog: Research directions and challenges

S Mehri, J Choi, LF D'Haro, J Deriu, M Eskenazi… - arXiv preprint arXiv …, 2022 - arxiv.org
This is a report on the NSF Future Directions Workshop on Automatic Evaluation of Dialog.
The workshop explored the current state of the art along with its limitations and suggested …

BiSyn-GAT+: Bi-syntax aware graph attention network for aspect-based sentiment analysis

S Liang, W Wei, XL Mao, F Wang, Z He - arXiv preprint arXiv:2204.03117, 2022 - arxiv.org
Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that aims
to align aspects and corresponding sentiments for aspect-specific sentiment polarity …

DynaEval: Unifying turn and dialogue level evaluation

C Zhang, Y Chen, LF D'Haro, Y Zhang… - arXiv preprint arXiv …, 2021 - arxiv.org
A dialogue is essentially a multi-turn interaction among interlocutors. Effective evaluation
metrics should reflect the dynamics of such interaction. Existing automatic metrics are …

A comprehensive assessment of dialog evaluation metrics

YT Yeh, M Eskenazi, S Mehri - arXiv preprint arXiv:2106.03706, 2021 - arxiv.org
Automatic evaluation metrics are a crucial component of dialog systems research. Standard
language evaluation metrics are known to be ineffective for evaluating dialog. As such …

Deep learning for dialogue systems: Chit-chat and beyond

R Yan, J Li, Z Yu - Foundations and Trends® in Information …, 2022 - nowpublishers.com
With the rapid progress of deep neural models and the explosion of available data
resources, dialogue systems that support extensive topics and chit-chat conversations are …

A novel adaptive marker segmentation graph convolutional network for aspect-level sentiment analysis

P Wang, L Tao, M Tang, M Zhao, L Wang, Y Xu… - Knowledge-Based …, 2023 - Elsevier
Aspect-level sentiment analysis is a fine-grained sentiment classification task that aims to
identify the sentiment polarity of specific aspects in online reviews. Attention mechanisms …

Approximating online human evaluation of social chatbots with prompting

E Svikhnushina, P Pu - arXiv preprint arXiv:2304.05253, 2023 - arxiv.org
As conversational models become increasingly available to the general public, users are
engaging with this technology in social interactions. Such unprecedented interaction …

Deconstruct to reconstruct a configurable evaluation metric for open-domain dialogue systems

V Phy, Y Zhao, A Aizawa - arXiv preprint arXiv:2011.00483, 2020 - arxiv.org
Many automatic evaluation metrics have been proposed to score the overall quality of a
response in open-domain dialogue. Generally, the overall quality comprises various …

Which prompts make the difference? Data prioritization for efficient human LLM evaluation

M Boubdir, E Kim, B Ermis, M Fadaee… - arXiv preprint arXiv …, 2023 - arxiv.org
Human evaluation is increasingly critical for assessing large language models, capturing
linguistic nuances, and reflecting user preferences more accurately than traditional …

xDial-Eval: A multilingual open-domain dialogue evaluation benchmark

C Zhang, LF D'Haro, C Tang, K Shi, G Tang… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent advancements in reference-free learned metrics for open-domain dialogue
evaluation have been driven by the progress in pre-trained language models and the …