Comparison of effects on subjective intelligibility and quality of speech in babble for two algorithms: A deep recurrent neural network and spectral subtraction.

Keshavarzi, Mahmoud; Goehring, Tobias; Turner, Richard E; Moore, Brian CJ

Accepted version

Peer-reviewed

Repository URI

https://www.repository.cam.ac.uk/handle/1810/304436

Repository DOI

https://doi.org/10.17863/CAM.51516

Files

Accepted version (366.5 KB)

Type

Article

Authors

Keshavarzi, Mahmoud

Goehring, Tobias

https://orcid.org/0000-0002-9038-3310

Turner, Richard E

Moore, Brian CJ

Abstract

The effects on speech intelligibility and sound quality of two noise-reduction algorithms were compared: a deep recurrent neural network (RNN) and spectral subtraction (SS). The RNN was trained using sentences spoken by a large number of talkers with a variety of accents, presented in babble. Different talkers were used for testing. Participants with mild-to-moderate hearing loss were tested. Stimuli were given frequency-dependent linear amplification to compensate for the individual hearing losses. A paired-comparison procedure was used to compare all possible combinations of three conditions. The conditions were: speech in babble with no processing (NP) or processed using the RNN or SS. In each trial, the same sentence was played twice using two different conditions. The participants indicated which one was better and by how much in terms of speech intelligibility and (in separate blocks) sound quality. Processing using the RNN was significantly preferred over NP and over SS processing for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. SS processing was not significantly preferred over NP for either subjective intelligibility or sound quality. Objective computational measures of speech intelligibility predicted better intelligibility for RNN than for SS or NP.
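The paired-comparison design described in the abstract can be illustrated with a minimal sketch. The condition labels (NP, RNN, SS) come from the abstract. Everything else is an assumption made for illustration: the function names, the signed rating scale (positive means the first-listed condition was preferred), and the simple averaging are hypothetical and are not the authors' scoring or analysis method.

```python
from itertools import combinations

# Condition labels taken from the abstract.
CONDITIONS = ["NP", "RNN", "SS"]


def all_pairs(conditions):
    """Return every unordered pair of conditions (hypothetical helper)."""
    return list(combinations(conditions, 2))


def mean_preference(ratings):
    """Average signed preference per condition (hypothetical aggregation).

    `ratings` is a list of (first, second, score) tuples. A positive score
    means `first` was preferred, a negative score means `second` was
    preferred; the magnitude stands for "by how much". The scale itself is
    an assumption, since the abstract does not specify it.
    """
    totals = {}
    counts = {}
    for first, second, score in ratings:
        # Credit the score to the first condition and its negation to the second.
        for cond, signed in ((first, score), (second, -score)):
            totals[cond] = totals.get(cond, 0.0) + signed
            counts[cond] = counts.get(cond, 0) + 1
    return {cond: totals[cond] / counts[cond] for cond in totals}


# Three conditions give three unordered pairs to compare.
PAIRS = all_pairs(CONDITIONS)
```

With three conditions there are three unordered pairs, so each sentence would be judged in NP versus RNN, NP versus SS, and RNN versus SS comparisons; the presentation order within a trial is not stated in the abstract and is left out of the sketch.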

Keywords

Aged, Female, Hearing Aids, Humans, Male, Middle Aged, Neural Networks, Computer, Speech Intelligibility, Speech Perception, Speech Recognition Software

Journal Title

J Acoust Soc Am

Journal ISSN

0001-4966
1520-8524

Volume Title

145

Publisher

Acoustical Society of America (ASA)

Publisher DOI

https://doi.org/10.1121/1.5094765

Sponsorship

Engineering and Physical Sciences Research Council (EP/L000776/1)
Engineering and Physical Sciences Research Council (EP/M026957/1)

Collections

Cambridge University Research Outputs