Open Access System for Information Sharing

Thesis

Recurrent Action Policy Optimization for Multi-domain Task-oriented Dialogue System

Abstract: In this thesis, we propose Dialogue System with Optimizing a Recurrent Action Policy using Efficient Context (DORA), a multi-domain task-oriented dialogue system that is optimized with supervised learning (SL) followed by reinforcement learning (RL), using a recurrent dialogue policy. The policy recurrently generates explicit system actions, acting as both a word-level and a high-level policy. Because the system actions are interpretable and controllable, we propose approaches that use them for rewards and that control them. As a result, DORA is clearly optimized in both the SL and RL steps through an explicit system action policy that considers an efficient input context instead of the entire dialogue history. In the experiments, DORA achieved a state-of-the-art success rate, improving by 6.6 points on MultiWOZ 2.0 and by 10.9 points on MultiWOZ 2.1.

URI: http://postech.dcollection.net/common/orgView/200000506809
https://oasis.postech.ac.kr/handle/2014.oak/114201
