Offline reinforcement learning: Tutorial, review, and perspectives on open problems

S Levine, A Kumar, G Tucker, J Fu - arXiv preprint arXiv:2005.01643, 2020 - arxiv.org
In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …

Neural approaches to conversational AI

J Gao, M Galley, L Li - The 41st international ACM SIGIR conference on …, 2018 - dl.acm.org
This tutorial surveys neural approaches to conversational AI that were developed in the last
few years. We group conversational systems into three categories: (1) question answering …

Batch policy learning under constraints

H Le, C Voloshin, Y Yue - International Conference on …, 2019 - proceedings.mlr.press
When learning policies for real-world domains, two important questions arise: (i) how to
efficiently use pre-collected off-policy, non-optimal behavior data; and (ii) how to mediate …

Deal or no deal? End-to-end learning for negotiation dialogues

M Lewis, D Yarats, YN Dauphin, D Parikh… - arXiv preprint arXiv …, 2017 - arxiv.org
Much of human dialogue occurs in semi-cooperative settings, where agents with different
goals attempt to agree on common decisions. Negotiations require complex communication …

A survey of available corpora for building data-driven dialogue systems

IV Serban, R Lowe, P Henderson, L Charlin… - arXiv preprint arXiv …, 2015 - arxiv.org
During the past decade, several areas of speech and language understanding have
witnessed substantial breakthroughs from the use of data-driven models. In the area of …

RL Unplugged: A suite of benchmarks for offline reinforcement learning

C Gulcehre, Z Wang, A Novikov… - Advances in …, 2020 - proceedings.neurips.cc
Offline methods for reinforcement learning have a potential to help bridge the gap between
reinforcement learning research and real-world applications. They make it possible to learn …

POMDP-based statistical spoken dialog systems: A review

S Young, M Gašić, B Thomson… - Proceedings of the …, 2013 - ieeexplore.ieee.org
Statistical dialog systems (SDSs) are motivated by the need for a data-driven framework that
reduces the cost of laboriously handcrafting complex dialog managers and that provides …

Benchmarking batch deep reinforcement learning algorithms

S Fujimoto, E Conti, M Ghavamzadeh… - arXiv preprint arXiv …, 2019 - arxiv.org
Widely-used deep reinforcement learning algorithms have been shown to fail in the batch
setting--learning from a fixed data set without interaction with the environment. Following this …

Frames: A corpus for adding memory to goal-oriented dialogue systems

LE Asri, H Schulz, S Sharma, J Zumer, J Harris… - arXiv preprint arXiv …, 2017 - arxiv.org
This paper presents the Frames dataset (Frames is available at
http://datasets.maluuba.com/Frames), a corpus of 1369 human-human dialogues with an average of 15 turns per …

CHAI: A chatbot AI for task-oriented dialogue with offline reinforcement learning

S Verma, J Fu, M Yang, S Levine - arXiv preprint arXiv:2204.08426, 2022 - arxiv.org
Conventionally, generation of natural language for dialogue agents may be viewed as a
statistical learning problem: determine the patterns in human-provided data and generate …