Automatic speech recognition system development in the “wild”
Authors
Ragni, A
Gales, MJF
Publication Date
2018
Journal Title
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Conference Name
Interspeech 2018
ISSN
2308-457X
ISBN
978-1-5108-7221-9
Publisher
ISCA
Volume
2018-September
Pages
2217-2221
Type
Conference Object
Citation
Ragni, A., & Gales, M. (2018). Automatic speech recognition system development in the “wild”. Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2018-September, 2217-2221. https://doi.org/10.21437/Interspeech.2018-1085
Abstract
The standard framework for developing an automatic speech recognition (ASR) system is to generate training and development data for building the system, and evaluation data for the final performance analysis. All the data is assumed to come from the domain of interest. Though this framework is matched to some tasks, it is more challenging for systems that are required to operate over broad domains, or where the ability to collect the required data is limited. This paper discusses ASR work performed under the IARPA MATERIAL program, which is aimed at cross-language information retrieval, and examines this challenging scenario. In terms of available data, only limited narrow-band conversational telephone speech data was provided. However, the system is required to operate over a range of domains, including broadcast data. As no data is available for the broadcast domain, this paper proposes an approach for system development based on scraping "related" data from the web, and using ASR system confidence scores as the primary metric for developing the acoustic and language model components. As an initial evaluation of the approach, the Swahili development language is used, with the final system performance assessed on the IARPA MATERIAL Analysis Pack 1 data.
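As a rough illustration of the idea of using confidence scores as a development metric when no transcribed in-domain data exists, the following minimal sketch picks between candidate system configurations by their mean word-level confidence on untranscribed audio. It is not the authors' implementation: the function names, the configuration names, and the confidence values are all hypothetical, and a real system would take its confidences from the decoder's output.

```python
from statistics import mean


def mean_confidence(word_confidences):
    """Average per-word confidence over a set of decoded utterances.

    `word_confidences` is a list of utterances, each a list of
    per-word confidence scores in [0, 1] (assumed format).
    """
    scores = [c for utterance in word_confidences for c in utterance]
    if not scores:
        raise ValueError("no confidence scores supplied")
    return mean(scores)


def select_system(candidates):
    """Return the name of the candidate with the highest mean confidence.

    `candidates` maps a configuration name to the confidences that
    configuration produced on the same untranscribed audio.
    """
    return max(candidates, key=lambda name: mean_confidence(candidates[name]))


# Made-up confidences for two hypothetical configurations decoding
# the same two untranscribed utterances.
candidates = {
    "lm_telephone_only": [[0.5, 0.75], [0.25, 0.5]],
    "lm_plus_web_text": [[0.75, 0.75], [0.5, 1.0]],
}

best = select_system(candidates)
print(best, mean_confidence(candidates[best]))
```

Mean confidence only stands in for accuracy to the extent that the confidence scores are well calibrated on the target domain, so a selection made this way would normally be checked against whatever transcribed evaluation data eventually becomes available.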
Keywords
cross domain development, confidence, web data, speech recognition
Sponsorship
The Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via Air Force Research Laboratory (AFRL)
Identifiers
External DOI: https://doi.org/10.21437/Interspeech.2018-1085
This record's URL: https://www.repository.cam.ac.uk/handle/1810/286261
Rights
Licence:
http://www.rioxx.net/licenses/all-rights-reserved