Evaluating OpenAI's Whisper ASR: Performance analysis across diverse accents and speaker traits.
Accepted version
Peer-reviewed
Abstract
This study investigates the performance of Whisper's automatic speech recognition (ASR) system across diverse native and non-native English accents. Results reveal superior recognition of American English compared to British and Australian English accents, with similar performance for Canadian English. Overall, native English accents demonstrate higher accuracy than non-native accents. Exploring connections between speaker traits [sex, native language (L1) typology, and second language (L2) proficiency] and word error rate uncovers notable associations. Furthermore, Whisper exhibits enhanced performance in read speech over conversational speech, with modifications based on speaker gender. The implications of these findings are discussed.
Journal Title
JASA Express Lett
Journal ISSN
2691-1191
Publisher
Acoustical Society of America (ASA)
Rights and licensing
Except where otherwise noted, this item's license is described as Attribution 4.0 International

