Log-linear system combination using structured support vector machines

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Yang, J 
Ragni, A 
Gales, MJF 
Knill, KM 

Abstract

Building high-accuracy speech recognition systems with limited language resources is a highly challenging task. Although the use of multi-language data for acoustic models yields improvements, performance is often unsatisfactory with highly limited acoustic training data. In these situations, it is possible to use multiple well-trained acoustic models and combine their outputs. Unfortunately, the computational cost associated with these approaches is high, as multiple decoding runs are required. To address this problem, this paper examines schemes based on log-linear score combination. This has a number of advantages over standard combination schemes. Even with limited acoustic training data, it is possible to train, for example, phone-specific combination weights, allowing detailed relationships between the available well-trained models to be obtained. To ensure robust parameter estimation, this paper casts log-linear score combination into a structured support vector machine (SSVM) learning task. This yields a method to train model parameters with good generalisation properties. Here the SSVM feature space is a set of scores from well-trained individual systems. The SSVM approach is compared to lattice rescoring and confusion network combination using language packs released within the IARPA Babel program.
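
As a rough illustration of the log-linear score combination with phone-specific weights mentioned in the abstract, the following minimal sketch scores each hypothesis as a weighted sum of per-system log scores, using a separate weight vector for each phone. It is not the authors' implementation: the function names, the two-system setup, the phone labels, and all scores and weights are hypothetical, and the SSVM training that would estimate the weights is not shown.

```python
def combined_score(log_scores, weights):
    """Log-linear combination: weighted sum of per-system log scores."""
    return sum(w * s for w, s in zip(weights, log_scores))


def hypothesis_score(segments, phone_weights):
    """Sum the combined scores over the phone segments of one hypothesis.

    segments: list of (phone, [log score from each system]) pairs.
    phone_weights: dict mapping phone -> [weight for each system].
    """
    return sum(
        combined_score(scores, phone_weights[phone])
        for phone, scores in segments
    )


def best_hypothesis(hypotheses, phone_weights):
    """Return the name of the hypothesis with the highest combined score."""
    return max(
        hypotheses,
        key=lambda name: hypothesis_score(hypotheses[name], phone_weights),
    )


# Hypothetical phone-specific weights for two systems (illustrative values,
# not trained; in the paper's setting these would be learned).
phone_weights = {"a": [0.7, 0.3], "b": [0.2, 0.8]}

# Hypothetical competing hypotheses with made-up per-system log scores.
hypotheses = {
    "hyp1": [("a", [-1.0, -3.0]), ("b", [-2.0, -1.0])],
    "hyp2": [("a", [-2.0, -1.0]), ("b", [-1.0, -4.0])],
}

print(best_hypothesis(hypotheses, phone_weights))
```

Because the weights are indexed by phone, the same system can be trusted more on some phones than on others, which is the kind of detailed relationship between models that the abstract refers to.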

Description

Keywords

system combination, structured support vector machines, speech recognition, keyword spotting

Journal Title

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

Conference Name

Interspeech 2016

Journal ISSN

2308-457X
1990-9772

Volume Title

08-12-September-2016

Publisher

ISCA

Sponsorship

IARPA (4912046943)