Research data supporting "On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems"


Type

Dataset

Authors

Su, Pei-Hao 
Gasic, Milica 
Mrksic, Nikola 
Rojas-Barahona, Lina 
Ultes, Stefan 

Description

This repository contains the data presented in the paper "On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems" (ACL 2016). Two separate datasets, described in section 4 of the paper, are provided:

1. DialogueEmbedding/ contains the [train|valid|test] data for unsupervised dialogue embedding creation, each split with *.[feature|reward|turn|subjsuc] files. Note that *.turn lists the lines to be read for each dialogue in *.[feature|reward|subjsuc], and *.subjsuc holds the user's subjective rating. The feature size is 74.

2. DialoguePolicy/ contains four contrasting systems with different reward models: [GP|RNN|ObjSubj|Subj]. Each system directory holds the data obtained in interaction with Amazon Mechanical Turk users while training three policies with the same configuration (policy_[1|2|3]/), together with a .csv file of the evaluation results recorded during the training process. Each policy_[1|2|3]/ directory contains a list of calls, time-stamped in the name, each with a session.xml file for the dialogue log and a feedback.xml file for the user feedback.
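As a minimal sketch of how the embedding data could be read, the snippet below groups per-turn feature lines into dialogues using the *.turn index. It assumes (not confirmed by this README) that each line of a *.turn file gives the number of consecutive lines in the matching *.feature file belonging to one dialogue, and that each feature line is whitespace-separated floats of size 74; the file names passed in are hypothetical.

```python
def load_dialogues(feature_path, turn_path, feature_size=74):
    """Group per-turn feature vectors into dialogues.

    Assumed layout: each line of the *.turn file holds an integer count of
    lines in the *.feature file that belong to one dialogue, in order.
    """
    with open(feature_path) as f:
        features = [[float(x) for x in line.split()] for line in f if line.strip()]
    with open(turn_path) as f:
        turns = [int(line) for line in f if line.strip()]

    # The turn counts should partition the feature lines exactly.
    assert sum(turns) == len(features), "turn counts must cover all feature lines"

    dialogues, start = [], 0
    for n in turns:
        dialogue = features[start:start + n]
        assert all(len(v) == feature_size for v in dialogue), "unexpected feature size"
        dialogues.append(dialogue)
        start += n
    return dialogues
```

The same slicing pattern would apply to the *.reward and *.subjsuc files, since *.turn indexes all three.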


This research data supports "On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems" which has been published in "Proceedings of Association for Computational Linguistics (ACL)".

Software / Usage instructions

csv, xml, README

Keywords

spoken dialogue systems, deep learning, reward modelling, reinforcement learning, Gaussian process

Publisher

University of Cambridge

Sponsorship

This work was supported by the EPSRC [grant number Cambridge Trust].