Who Says What to Whom: A Survey of Multi-Party Conversations.

JC Gu, C Tao, ZH Ling - IJCAI, 2022 - ijcai.org
Multi-party conversations (MPCs) are a more practical and challenging scenario involving
more than two interlocutors. This research topic has drawn significant attention from both …

Overview of the ninth dialog system technology challenge: DSTC9

C Gunasekara, S Kim, LF D'haro… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
This paper introduces the Ninth Dialog System Technology Challenge (DSTC-9). This
edition of the DSTC focuses on applying end-to-end dialog technologies for four distinct …

Question-aware global-local video understanding network for audio-visual question answering

Z Chen, L Wang, P Wang, P Gao - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
As a newly emerging task, audio-visual question answering (AVQA) has attracted research
attention. Compared with traditional single-modality (e.g., audio or visual) QA tasks, it poses …

Overview of the Tenth Dialog System Technology Challenge: DSTC10

K Yoshino, YN Chen, P Crook, S Kottur… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
This article introduces the Tenth Dialog System Technology Challenge (DSTC-10). This
edition of the DSTC focuses on applying end-to-end dialog technologies for five distinct …

DialogMCF: Multimodal context flow for audio visual scene-aware dialog

Z Chen, H Liu, Y Wang - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
In recent years, Audio Visual Scene-Aware Dialog (AVSD) has been an active research task
in the multimodal dialogue community and has also been a core part of the Dialog System …

Audio-visual scene-aware dialog and reasoning using audio-visual transformers with joint student-teacher learning

A Shah, S Geng, P Gao, A Cherian… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
In previous work, we have proposed the Audio-Visual Scene-Aware Dialog (AVSD) task,
collected an AVSD dataset, developed AVSD technologies, and hosted an AVSD challenge …

Investigation on transformer-based multi-modal fusion for audio-visual scene-aware dialog

X Huang, HL Tan, MC Leong, Y Sun, L Li… - Proc. DSTC10 …, 2022 - oar.a-star.edu.sg
In this report, we present our submissions to the DSTC10 Audio Visual Scene Dialog
(AVSD) challenge. We investigated variants of an encoder-decoder model, including those …

Humans in (digital) space: representing humans in virtual environments

M Lycett, A Reppel - Proceedings of the 2022 International Conference …, 2022 - dl.acm.org
Technology continues to pervade social and organizational life (e.g., immersive, and artificial
intelligence) and our environments become increasingly virtual. In this context we examine …

Multi-View Contrastive Parsing Network for Emotion Recognition in Multi-Party Conversations

Y Xie, C Sun, B Liu, Z Ji - 2024 International Joint Conference …, 2024 - ieeexplore.ieee.org
Recent Emotion Recognition in Conversation (ERC) works significantly outperform large
language models, represented by ChatGPT, in the dyadic conversation environment by …

Overview of audio visual scene-aware dialog with reasoning track for natural language generation in DSTC10

C Hori, AP Shah, S Geng, P Gao… - Proc. DSTC10 …, 2022 - shadow.merl.com
The Audio-Visual Scene-Aware Dialog (AVSD) task was proposed in the Dialog
System Technology Challenge (DSTC), where an AVSD dataset was collected and AVSD …