Research Data Supporting "Multi-domain Neural Network Language Generation for Spoken Dialogue Systems"

Name: Research Data Supporting "Multi-domain Neural Network Language Generation for Spoken Dialogue Systems"
Published: 2016-05-09T11:54:15Z
Keywords: NLG, natural language generation, dialogue system

Wen, Tsung-Hsien; Gasic, Milica; Mrksic, Nikola; R.-Barahona, Lina M.; Su, Pei-Hao; Vandyke, David; Young, Steve

Repository URI

https://www.repository.cam.ac.uk/handle/1810/255930

Files

naacl16-multi-nlg.zip (1.02 MB)

Type

Dataset

Description

This natural language generation dataset was collected on Amazon Mechanical Turk (AMT) and used in the paper "Multi-domain Neural Network Language Generation for Spoken Dialogue Systems" (NAACL-HLT 2016). It covers two consumer electronics domains: laptop and TV. Each file is in JSON format and contains a list of tuples. Each tuple has three elements: the dialogue act (DA, a semantic representation), a sentence written by an AMT worker, and a sentence produced by our handcrafted template generator. The laptop and TV domains contain 13K and 7K distinct DA-sentence pairs, respectively. All products are anonymised.
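As a rough illustration of the file layout described above, the following Python sketch reads a list of three-element entries and names the fields. It assumes each file is plain JSON with nothing before the opening bracket. The names Example, parse_examples and load_examples, and the sample entry itself, are hypothetical and are not taken from the dataset.

```python
import json
from typing import List, NamedTuple


class Example(NamedTuple):
    dialogue_act: str  # semantic representation (DA)
    human: str         # sentence written by an AMT worker
    template: str      # sentence from the handcrafted template generator


def parse_examples(records: list) -> List[Example]:
    """Turn a decoded JSON list of 3-element entries into Example tuples."""
    examples = []
    for i, record in enumerate(records):
        if len(record) != 3:
            raise ValueError(f"entry {i} has {len(record)} elements, expected 3")
        examples.append(Example(*record))
    return examples


def load_examples(path: str) -> List[Example]:
    """Load one dataset file; assumes the file is plain JSON (an assumption)."""
    with open(path, encoding="utf-8") as f:
        return parse_examples(json.load(f))


# Made-up entry for illustration only; real entries come from the archive.
sample = json.loads(
    '[["inform(name=x;type=tv)", "x is a tv .", "x is a television ."]]'
)
examples = parse_examples(sample)
print(examples[0].dialogue_act)
```

To read a real file, pass its path inside the extracted archive to load_examples. The file names are not listed in this record, so none are assumed here.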

Software / Usage instructions

JSON

Keywords

NLG, natural language generation, dialogue system

Publisher

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 2.0 UK: England & Wales

Sponsorship

Toshiba Research Europe Ltd, Cambridge Research Laboratory

Collections

Research Data - Engineering
Symplectic mapped items for data match
