Increasing Context for Estimating Confidence Scores in Automatic Speech Recognition
Publication Date
2022
Journal Title
IEEE/ACM Transactions on Audio Speech and Language Processing
ISSN
2329-9290
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Type
Article
This Version
AM
Citation
Ragni, A., Gales, M., Rose, O., Knill, K., Kastanos, A., Li, Q., & Ness, P. (2022). Increasing Context for Estimating Confidence Scores in Automatic Speech Recognition. IEEE/ACM Transactions on Audio Speech and Language Processing. https://doi.org/10.1109/TASLP.2022.3161153
Abstract
Accurate confidence measures for predictions from machine learning techniques play a critical role in the deployment and training of many speech and language processing applications. For example, confidence scores are important when making use of automatically generated transcriptions in training automatic speech recognition (ASR) systems, as well as down-stream applications, such as information retrieval and conversational assistants. Previous work on improving confidence scores for these systems has focused on two main directions: designing features correlated with improved confidence prediction; and employing sequence models to account for the importance of contextual information. Few studies, however, have explored incorporating contextual information more broadly, such as from the future, in addition to the past, or making use of alternative multiple hypotheses in addition to the most likely one. This article introduces two general approaches for encapsulating contextual information from lattices. Experimental results illustrating the importance of increasing contextual information for estimating confidence scores are presented on a range of limited resource languages where word error rates range between 30% and 60%. The results show that the novel approaches provide significant gains in the accuracy of confidence estimation.
Keywords
Speech recognition, confidence, recurrent neural network, attention, graph structures
Sponsorship
All authors were supported in part by the ALTA Institute, Cambridge University. A. Ragni and M. Gales were also supported in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via Air Force Research Laboratory (AFRL) contract # FA8650-17-C-9117.
Funder references
Cambridge Assessment (unknown)
Identifiers
External DOI: https://doi.org/10.1109/TASLP.2022.3161153
This record's URL: https://www.repository.cam.ac.uk/handle/1810/335029