A language space representation for speech recognition
Accepted version
Peer-reviewed
Abstract
© 2015 IEEE. The number of languages for which speech recognition systems are available grows each year. This paper proposes to view languages as points in a rich space, termed language space, whose bases are eigen-languages and in which each language is a point determined by its projection onto those bases. Such an approach could not only reduce the development cost of each new language but also provide an automatic means of language analysis. As an initial proof of concept, this paper adopts cluster adaptive training (CAT), which is known to induce similar spaces for speaker adaptation. The CAT approach used in this paper builds on previous work on language adaptation in speech synthesis and extends it to Gaussian mixture modelling, which is more appropriate for speech recognition. Experiments conducted on IARPA Babel program languages show that such language space representations can outperform language-independent models and can discover closely related languages automatically.
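The language space idea can be illustrated with a minimal sketch. It assumes the usual CAT formulation, in which the mean of a Gaussian component is a weighted sum of cluster means, so that the weight vector acts as the coordinates of a language. The abstract does not give these details: the cluster means, the weights, the language names, and the use of cosine similarity to compare languages are all hypothetical choices made for illustration, not the paper's models or results.

```python
import math

# Hypothetical eigen-language bases: one cluster mean vector per row,
# for a single Gaussian component with toy 3-dimensional features.
CLUSTER_MEANS = [
    [1.0, 0.0, 2.0],
    [0.5, 1.0, -1.0],
    [-1.0, 0.5, 0.0],
]

# Made-up coordinates of three languages in the language space.
LANGUAGE_WEIGHTS = {
    "lang_a": [1.0, 0.8, 0.1],
    "lang_b": [1.0, 0.7, 0.2],
    "lang_c": [1.0, -0.5, 0.9],
}


def language_mean(weights, cluster_means=CLUSTER_MEANS):
    """Interpolate the cluster means: mu = sum_p weights[p] * cluster_means[p]."""
    if len(weights) != len(cluster_means):
        raise ValueError("one weight per cluster is required")
    dim = len(cluster_means[0])
    return [
        sum(w * mean[d] for w, mean in zip(weights, cluster_means))
        for d in range(dim)
    ]


def cosine_similarity(a, b):
    """Cosine of the angle between two weight vectors (an assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def closest_language(name, weights=LANGUAGE_WEIGHTS):
    """Return the other language whose weight vector is most similar."""
    others = (key for key in weights if key != name)
    return max(others, key=lambda key: cosine_similarity(weights[name], weights[key]))


if __name__ == "__main__":
    print(language_mean(LANGUAGE_WEIGHTS["lang_a"]))
    print(closest_language("lang_a"))
```

In this toy setup, a new language would need only a new weight vector rather than a full set of model parameters, which is one way to read the claim about reduced development cost.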