
Morph-to-word transduction for accurate and efficient automatic speech recognition and keyword search

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Ragni, A 
Zahemszky, P 
Vasilakes, J 
Gales, MJF 

Abstract

© 2017 IEEE. Word units are a popular choice in statistical language modelling. For inflective and agglutinative languages, this choice may result in a high out-of-vocabulary rate. Subword units, such as morphs, provide an interesting alternative to words. These units can be derived in an unsupervised fashion and empirically show lower out-of-vocabulary rates. This paper proposes a morph-to-word transduction to convert morph sequences into word sequences. This enables powerful word language models to be applied. In addition, it is expected that techniques such as pruning, confusion network decoding, keyword search and many others may benefit from word-level rather than morph-level decision making. However, word or morph systems alone may not achieve optimal performance in tasks such as keyword search, so a combination is typically employed. This paper proposes a single-index approach that enables word, morph and phone searches to be performed over a single morph index. Experiments are conducted on IARPA Babel program languages, including the surprise languages of the OpenKWS 2015 and 2016 competitions.

Keywords

morph-to-word transduction, speech recognition, keyword search, single index

Journal Title

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Conference Name

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Journal ISSN

1520-6149

Publisher

IEEE

Sponsorship

IARPA (4912046943)