Fast, low-artifact speech synthesis considering global variance
IEEE (Institute of Electrical and Electronics Engineers)
Shannon, M., & Byrne, W. (2013). Fast, low-artifact speech synthesis considering global variance. http://mi.eng.cam.ac.uk/~sms46/papers/shannon2013fast.pdf
Copyright 2013 IEEE.
Speech parameter generation considering global variance (GV generation) is widely acknowledged to dramatically improve the quality of synthetic speech generated by HMM-based systems. However, it is slower and has higher latency than the standard speech parameter generation algorithm. In addition, it is known to produce artifacts, though existing approaches to prevent artifacts are effective. We present a simple new theoretical analysis of speech parameter generation considering global variance based on Lagrange multipliers. This analysis sheds light on one source of artifacts and suggests a way to reduce their occurrence. It also suggests an approximation to exact GV generation that allows fast, low-latency synthesis. In a subjective evaluation, our fast approximation shows no degradation in naturalness compared to conventional GV generation.
speech synthesis, speech parameter generation considering global variance, artifact, low latency
This work was supported in part by EPSRC Programme Grant EP/I031022/1 (Natural Speech Technology).
External link: http://mi.eng.cam.ac.uk/~sms46/papers/shannon2013fast.pdf
This record's URL: http://www.dspace.cam.ac.uk/handle/1810/244408
All Rights Reserved
Licence URL: https://www.rioxx.net/licenses/all-rights-reserved/