A survey on recent advances and challenges in reinforcement learning methods for task-oriented dialogue policy learning

WC Kwan, HR Wang, HM Wang, KF Wong - Machine Intelligence …, 2023 - Springer
Dialogue policy learning (DPL) is a key component in a task-oriented dialogue (TOD)
system. Its goal is to decide the next action of the dialogue system, given the dialogue state …

Building and evaluating open-domain dialogue corpora with clarifying questions

M Aliannejadi, J Kiseleva, A Chuklin, J Dalton… - ar…

… a successful dialogue policy for a multi-domain task-oriented dialogue (MDTD)
system is a challenging task. Basically, a desirable dialogue policy acts as the decision …

Learning knowledge bases with parameters for task-oriented dialogue systems

A Madotto, S Cahyawijaya, GI Winata, Y Xu… - arxiv preprint arxiv …, 2020 - arxiv.org
Task-oriented dialogue systems are either modularized with separate dialogue state
tracking (DST) and management steps or end-to-end trainable. In either case, the …

Fantastic rewards and how to tame them: A case study on reward learning for task-oriented dialogue systems

Y Feng, S Yang, S Zhang, J Zhang, C Xiong… - arxiv preprint arxiv …, 2023 - arxiv.org
When learning task-oriented dialogue (ToD) agents, reinforcement learning (RL) techniques
can naturally be utilized to train dialogue strategies to achieve user-specific goals. Prior …

Hierarchical reinforcement learning with guidance for multi-domain dialogue policy

M Rohmatillah, JT Chien - IEEE/ACM Transactions on Audio …, 2023 - ieeexplore.ieee.org
Achieving high performance in a multi-domain dialogue system with low computation is
undoubtedly challenging. Previous works applying an end-to-end approach have been very …

Transforming human-centered AI collaboration: Redefining embodied agents capabilities through interactive grounded language instructions

S Mohanty, N Arabzadeh, J Kiseleva, A Zholus… - arxiv preprint arxiv …, 2023 - arxiv.org
Human intelligence's adaptability is remarkable, allowing us to adjust to new tasks and multi-
modal environments swiftly. This skill is evident from a young age as we acquire new …

Alleviating the long-tail problem in conversational recommender systems

Z Zhao, K Zhou, X Wang, WX Zhao, F Pan… - Proceedings of the 17th …, 2023 - dl.acm.org
Conversational recommender systems (CRS) aim to provide the recommendation service
via natural language conversations. To develop an effective CRS, high-quality CRS datasets …

JoTR: A Joint Transformer and Reinforcement Learning Framework for Dialog Policy Learning

WC Kwan, H Wang, H Wang, Z Wang, X Wu… - arxiv preprint arxiv …, 2023 - arxiv.org
Dialogue policy learning (DPL) is a crucial component of dialogue modelling. Its primary role
is to determine the appropriate abstract response, commonly referred to as the "dialogue …

An emotion-sensitive dialogue policy for task-oriented dialogue system

H Zhu, X Wang, Z Wang, K Xv - Scientific Reports, 2024 - nature.com
Reinforcement learning (RL) is an effective method in training dialogue policies to steer the
conversation towards successful task completion. However, most RL-based methods only …