
Evaluating OpenAI's Whisper ASR: Performance analysis across diverse accents and speaker traits.

Accepted version
Peer-reviewed


Abstract

This study investigates the performance of Whisper's automatic speech recognition (ASR) system across diverse native and non-native English accents. Results reveal superior recognition of American compared to British and Australian English accents, with similar performance for Canadian English. Overall, native English accents are recognized more accurately than non-native accents. Exploring connections between speaker traits [sex, native language (L1) typology, and second language (L2) proficiency] and word error rate uncovers notable associations. Furthermore, Whisper performs better on read speech than on conversational speech, with modifications based on speaker gender. The implications of these findings are discussed.

Journal Title

JASA Express Letters

Journal ISSN

2691-1191

Publisher

Acoustical Society of America (ASA)

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International

Relationships

Is derived from: