Show simple item record

dc.contributor.author: Drugman, Thomas [en]
dc.contributor.author: Stylianou, Yannis [en]
dc.contributor.author: Chen, Langzhou [en]
dc.contributor.author: Chen, Xie [en]
dc.contributor.author: Gales, Mark [en]
dc.date.accessioned: 2015-04-22T13:07:26Z
dc.date.available: 2015-04-22T13:07:26Z
dc.date.issued: 2015 [en]
dc.identifier.citation: Drugman et al. 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2015) pp. 4664-4668. doi: 10.1109/ICASSP.2015.7178855 [en]
dc.identifier.issn: 1520-6149
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/247427
dc.description.abstract: In this paper we investigate the use of noise-robust features characterizing the speech excitation signal as a complement to the vocal-tract-based features usually considered for automatic speech recognition (ASR). The features are tested in a state-of-the-art hybrid acoustic model based on a Deep Neural Network (DNN). The suggested features expand the set of excitation features previously considered for ASR, in the expectation that they help to better discriminate the broad phonetic classes (e.g., fricatives, nasals, vowels). Relative improvements in the word error rate are observed in the AMI meeting transcription system, with greater gains (about 5%) when PLP features are combined with the suggested excitation features. For Aurora 4, significant improvements are observed as well. Combining the suggested excitation features with filter banks yields a word error rate of 9.96%.
dc.language: English [en]
dc.language.iso: en [en]
dc.publisher: IEEE
dc.subject: neural networks [en]
dc.subject: automatic speech recognition [en]
dc.subject: speech excitation signal [en]
dc.title: Robust Excitation-based Feature for Automatic Speech Recognition [en]
dc.type: Conference Object
dc.description.version: This is the author accepted manuscript. The final version is available from IEEE via http://dx.doi.org/10.1109/ICASSP.2015.7178855 [en]
prism.endingPage: 4668
prism.publicationDate: 2015 [en]
prism.publicationName: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) [en]
prism.startingPage: 4664
rioxxterms.versionofrecord: 10.1109/ICASSP.2015.7178855 [en]
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved [en]
rioxxterms.licenseref.startdate: 2015 [en]
dc.contributor.orcid: Gales, Mark [0000-0002-5311-8219]
rioxxterms.type: Conference Paper/Proceeding/Abstract [en]

