A hierarchical attention based model for off-topic spontaneous spoken response detection
Publication Date
2018-01-24
Journal Title
2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings
ISBN
9781509047888
Publisher
IEEE
Volume
2018-January
Pages
397-403
Type
Conference Object
Citation
Malinin, A., Knill, K., & Gales, M. (2018). A hierarchical attention based model for off-topic spontaneous spoken response detection. 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings, 2018-January, 397-403. https://doi.org/10.1109/ASRU.2017.8268963
Abstract
Automatic spoken language assessment and training systems are becoming increasingly popular to handle the growing demand to learn languages. However, current systems often assess only fluency and pronunciation, with limited content-based features being used. This paper examines one particular aspect of content-assessment, off-topic response detection. This is important for deployed systems as it ensures that candidates understood the prompt, and are able to generate an appropriate answer. Previously proposed approaches typically require a set of prompt-response training pairs, which limits flexibility as example responses are required whenever a new test prompt is introduced. Recently, the attention based neural topic model (ATM) was presented, which can assess the relevance of prompt-response pairs regardless of whether the prompt was seen in training. This model uses a bidirectional Recurrent Neural Network (BiRNN) embedding of the prompt combined with an attention mechanism to attend over the hidden states of a BiRNN embedding of the response to compute a fixed-length embedding used to predict relevance. Unfortunately, performance on prompts not seen in the training data is lower than on seen prompts.
Thus, this paper makes the following contributions: several improvements to the ATM are examined; a hierarchical variant of the ATM (HATM) is proposed, which explicitly uses prompt similarity to further improve performance on unseen prompts by interpolating over prompts seen in training data given a prompt of interest via a second attention mechanism; an in-depth analysis of both models is conducted and the main failure mode is identified. On spontaneous spoken data, taken from BULATS tests, these systems are able to assess relevance to both seen and unseen prompts.
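The two attention steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the BiRNN encoders are replaced by hand-written toy vectors, dot-product scoring is assumed for both attention mechanisms, the interpolated prompt vector is assumed to serve directly as the query over the response, and the logistic output layer with its weights (`out_w`, `out_b`) is hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attend(query, vectors):
    """Softmax attention: return weights over `vectors` and their weighted sum.

    Dot-product scoring is an assumption made for illustration.
    """
    scores = [dot(query, v) for v in vectors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, pooled

# Toy stand-ins for BiRNN outputs (hypothetical values).
seen_prompts = [[1.0, 0.0], [0.0, 1.0]]  # embeddings of prompts seen in training
new_prompt = [3.0, 0.0]                  # embedding of the prompt of interest
response_states = [[2.0, 0.0], [0.0, 2.0], [2.0, 0.0]]  # response hidden states

# Hierarchical step (assumed form): interpolate over seen prompts,
# weighting each by its similarity to the prompt of interest.
prompt_weights, prompt_mix = attend(new_prompt, seen_prompts)

# ATM step (assumed form): attend over the response hidden states using the
# prompt vector as the query, giving a fixed-length response embedding.
state_weights, response_vec = attend(prompt_mix, response_states)

# Hypothetical logistic output layer mapping the embedding to P(on-topic).
out_w, out_b = [1.0, -1.0], 0.0
p_relevant = 1.0 / (1.0 + math.exp(-(dot(out_w, response_vec) + out_b)))
```

In this toy setting the prompt of interest is closest to the first seen prompt, so `prompt_weights` concentrates on it, and the response states aligned with that prompt receive the larger attention weights. A trained model would learn the encoders and the scoring function rather than use fixed vectors.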
Sponsorship
Cambridge Assessment (unknown)
EPSRC (1464018)
Identifiers
External DOI: https://doi.org/10.1109/ASRU.2017.8268963
This record's URL: https://www.repository.cam.ac.uk/handle/1810/279179
Rights
Licence:
http://www.rioxx.net/licenses/all-rights-reserved