Show simple item record

dc.contributor.author: Yang, J
dc.contributor.author: Zhang, C
dc.contributor.author: Ragni, Anton
dc.contributor.author: Gales, Mark
dc.contributor.author: Woodland, Philip
dc.date.accessioned: 2016-01-22T16:52:40Z
dc.date.available: 2016-01-22T16:52:40Z
dc.date.issued: 2016-05-19
dc.identifier.citation: Proceedings of ICASSP 2016.
dc.identifier.isbn: 9781479999880
dc.identifier.issn: 1520-6149
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/253447
dc.description.abstract: Improved speech recognition performance can often be obtained by combining multiple systems together. Joint decoding, where scores from multiple systems are combined during decoding rather than combining hypotheses, is one efficient approach for system combination. In standard joint decoding the frame log-likelihoods from each system are used as the scores. These scores are then weighted and summed to yield the final score for a frame. The system combination weights for this process are usually empirically set. In this paper, a recently proposed scheme for learning these system weights is investigated for a standard noise-robust speech recognition task, AURORA 4. High performance tandem and hybrid systems for this task are described. By applying state-of-the-art training approaches and configurations for the bottleneck features of the tandem system, the difference in performance between the tandem and hybrid systems is significantly smaller than usually observed on this task. A log-linear model is then used to estimate system weights between these systems. Training the system weights yields additional gains over empirically set system weights when used for decoding. Furthermore, when used in a lattice rescoring fashion, further gains can be obtained.
dc.language: English
dc.language.iso: en
dc.title: System combination with log-linear models
dc.type: Conference Object
prism.endingPage: 5679
prism.publicationDate: 2016
prism.publicationName: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
prism.startingPage: 5675
prism.volume: 2016-May
datacite.cites.url: https://www.repository.cam.ac.uk/handle/1810/253409
dcterms.dateAccepted: 2015-12-21
rioxxterms.versionofrecord: 10.1109/ICASSP.2016.7472764
rioxxterms.version: AM
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved
rioxxterms.licenseref.startdate: 2016-05-19
dc.contributor.orcid: Gales, Mark [0000-0002-5311-8219]
dc.contributor.orcid: Woodland, Philip [0000-0001-9069-0225]
dc.identifier.eissn: 2379-190X
rioxxterms.type: Conference Paper/Proceeding/Abstract
pubs.funder-project-id: EPSRC (EP/I006583/1)
pubs.funder-project-id: IARPA (4912046943)
pubs.conference-name: Acoustics, Speech, and Signal Processing (ICASSP), International Conference on
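The abstract describes standard joint decoding as a weighted sum of the per-system frame log-likelihoods. A minimal sketch of that combination step is given below; the numbers and the two-system (tandem/hybrid) setup are illustrative assumptions, not values from the paper, and the learned log-linear weight estimation the paper investigates is not reproduced here.

```python
import numpy as np

# Hypothetical frame log-likelihoods from two systems (e.g. tandem and
# hybrid) scored against the same frames: shape (num_systems, num_frames).
frame_loglik = np.array([
    [-4.2, -3.1, -5.0],   # system 1
    [-3.8, -3.5, -4.6],   # system 2
])

# System combination weights; in standard joint decoding these are set
# empirically, while the paper learns them with a log-linear model.
weights = np.array([0.4, 0.6])

# Joint-decoding score for each frame: weighted sum over systems.
combined = weights @ frame_loglik

# The score of a path is then accumulated over its frames as usual.
total = combined.sum()
```

Because the per-frame scores are log-likelihoods, this weighted sum in the log domain corresponds to a log-linear (product-of-experts style) combination of the systems' likelihoods.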