A Spectrally Weighted Mixture of Least Square Error and Wasserstein Discriminator Loss for Generative SPSS

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Degottex, G 

Abstract

Generative networks can create an artificial spectrum based on its conditional distribution estimate instead of predicting only the mean value, as the Least Square (LS) solution does. This is promising since the LS predictor is known to oversmooth features, leading to muffling effects. However, modeling a whole distribution instead of a single mean value requires more data and thus also more computational resources. With only one hour of recording, as often used with LS approaches, the resulting spectrum is noisy and sounds full of artifacts. In this paper, we suggest a new loss function that mixes the LS error and the loss of a discriminator trained with Wasserstein GAN, while weighting this mixture differently across the frequency domain. Using listening tests, we show that, with this mixed loss, the generated spectrum is smooth enough to obtain a decent perceived quality. While making our source code available online, we also hope to make generative networks more accessible by lowering the necessary resources.
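
To make the idea of a spectrally weighted mixture concrete, the following is a minimal sketch and not the authors' implementation. It assumes a per-bin weight vector in [0, 1], a hypothetical sigmoid shape for that weight (favouring the LS term at low frequency bins, which is an illustrative choice only), and a critic that returns one score per time-frequency bin. The names `spectral_weights`, `mixed_generator_loss`, `crossover`, and `slope` are all hypothetical.

```python
import numpy as np


def spectral_weights(n_bins, crossover=0.5, slope=10.0):
    """Hypothetical per-bin weight in [0, 1].

    Assumption for illustration: weights near 1 (favouring the LS term)
    at low bins, decaying towards 0 (favouring the critic term) at high bins.
    """
    f = np.linspace(0.0, 1.0, n_bins)  # normalised frequency axis
    return 1.0 / (1.0 + np.exp(slope * (f - crossover)))


def mixed_generator_loss(target, generated, critic_scores, weights):
    """Frequency-weighted mix of a squared error and a Wasserstein-style term.

    target, generated, critic_scores: arrays of shape (frames, bins).
    weights: array of shape (bins,), broadcast over frames.
    Assumption: the critic yields one score per bin; a scalar critic
    would need a different weighting scheme.
    """
    ls_term = (generated - target) ** 2   # least-square error per bin
    wgan_term = -critic_scores            # generator tries to raise critic scores
    mixed = weights * ls_term + (1.0 - weights) * wgan_term
    return float(mixed.mean())
```

With all weights set to 1 the function reduces to the plain mean squared error, and with all weights set to 0 it reduces to the negated mean critic score, so the weight vector controls how much each frequency region relies on each term.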

Keywords

46 Information and Computing Sciences, 4006 Communications Engineering, 40 Engineering

Journal Title

2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings

Conference Name

2018 IEEE Spoken Language Technology Workshop (SLT)

Publisher

IEEE

Rights

All rights reserved

Sponsorship

European Commission Horizon 2020 (H2020) Marie Skłodowska-Curie actions (655764)