Research data supporting "Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing"
Citation
Budzianowski, P., Ramadan, O., & Gasic, M. (2018). Research data supporting "Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing" [Dataset]. https://doi.org/10.17863/CAM.26059
Description
The Multi-Domain Wizard-of-Oz dataset (MultiWOZ) is a collection of human-human written conversations spanning multiple domains and topics. The dataset was collected using the Wizard-of-Oz setup on Amazon MTurk. Each dialogue contains a goal label and several exchanges between a visitor and the system. Each system turn is labelled with a set of slot-value pairs that gives a coarse representation of the dialogue state. There are 9855 dialogues in total.
Format
The dataset contains the following seven JSON files:
1. data.json: the WOZ dialogue dataset, which contains the conversations between users and wizards, as well as a set of coarse labels for each user turn.
2. restaurant_db.json: the Cambridge restaurant database file, containing restaurants in the Cambridge UK area and a set of attributes.
3. attraction_db.json: the Cambridge attraction database file, containing attractions in the Cambridge UK area and a set of attributes.
4. hotel_db.json: the Cambridge hotel database file, containing hotels in the Cambridge UK area and a set of attributes.
5. train_db.json: the Cambridge train database file (with artificial connections), containing trains in the Cambridge UK area and a set of attributes.
6. valListFile.json: list of dialogues for validation.
7. testListFile.json: list of dialogues for testing.
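The file list above implies a three-way split: dialogues named in valListFile.json go to validation, those named in testListFile.json go to testing, and the remainder are left for training. The following Python sketch shows one way this could be done. It rests on assumptions this record does not confirm: that data.json is a single JSON object keyed by dialogue ID, and that each list file holds dialogue IDs either as a JSON array or as plain text with one ID per line. The function names read_id_list, split_dialogues and load_splits are hypothetical helpers, not part of the dataset.

```python
import json
from pathlib import Path


def read_id_list(path):
    """Read dialogue IDs from a list file.

    Assumption: the file is either a JSON array of IDs or plain text
    with one ID per line. Both layouts are handled.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        ids = json.loads(text)
        if isinstance(ids, list):
            return {str(i) for i in ids}
    except json.JSONDecodeError:
        pass  # not valid JSON, fall back to one ID per line
    return {line.strip() for line in text.splitlines() if line.strip()}


def split_dialogues(dialogues, val_ids, test_ids):
    """Partition a {dialogue_id: dialogue} mapping into train/val/test.

    Any dialogue not named in either ID set is treated as training data.
    """
    splits = {"train": {}, "val": {}, "test": {}}
    for dialogue_id, dialogue in dialogues.items():
        if dialogue_id in test_ids:
            splits["test"][dialogue_id] = dialogue
        elif dialogue_id in val_ids:
            splits["val"][dialogue_id] = dialogue
        else:
            splits["train"][dialogue_id] = dialogue
    return splits


def load_splits(data_dir):
    """Load data.json from data_dir and split it using the two list files."""
    data_dir = Path(data_dir)
    with open(data_dir / "data.json", encoding="utf-8") as f:
        dialogues = json.load(f)  # assumed: object keyed by dialogue ID
    val_ids = read_id_list(data_dir / "valListFile.json")
    test_ids = read_id_list(data_dir / "testListFile.json")
    return split_dialogues(dialogues, val_ids, test_ids)
```

Calling load_splits on the directory holding the unpacked files would return a dictionary with "train", "val" and "test" entries. If the actual layout of data.json or the list files differs from the assumptions above, read_id_list and the json.load call are the places to adapt.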
Keywords
dialogue system, belief tracking, wizard of oz
Relationships
Publication Reference: http://aclweb.org/anthology/P18-2069
Sponsorship
The data collection was funded through a Google Faculty Award.
Identifiers
This record's DOI: https://doi.org/10.17863/CAM.26059
Rights
Attribution 4.0 International (CC BY 4.0)
Licence URL: https://creativecommons.org/licenses/by/4.0/