Impact of ASR performance on free speaking language assessment

Conference Object
Knill, KM 
Gales, MJF 
Kyriakopoulos, Konstantinos
Malinin, A 
Ragni, A 

In free speaking tests, candidates respond in spontaneous speech to prompts. This form of test allows the spoken language proficiency of a non-native speaker of English to be assessed more fully than read-aloud tests. As the candidate's responses are unscripted, transcription by automatic speech recognition (ASR) is essential for automated assessment. ASR will never be 100% accurate, so any assessment system must seek to minimise and mitigate ASR errors. This paper considers the impact of ASR errors on the performance of free speaking test auto-marking systems. First, rich linguistically related features, based on part-of-speech tags from statistical parse trees, are investigated for assessment. Then, the impact of ASR errors on how well the system can detect whether a learner's answer is relevant to the question asked is evaluated. Finally, the impact that these errors may have on the ability of the system to provide detailed feedback to the learner is analysed. In particular, pronunciation and grammatical errors are considered, as these are important in helping a learner to make progress. As feedback resulting from an ASR error would be highly confusing, an approach to mitigate this problem using confidence scores is also analysed.
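
The confidence-based mitigation mentioned at the end of the abstract can be pictured with a minimal sketch. This is an illustration only, not the method used in the paper: the `FeedbackItem` type, the `filter_feedback` function, the example data and the 0.8 threshold are all hypothetical, and it assumes the ASR system exposes a per-word confidence score between 0 and 1.

```python
from dataclasses import dataclass


@dataclass
class FeedbackItem:
    """Hypothetical container for one piece of learner feedback."""
    word: str          # word as recognised by the ASR system
    confidence: float  # assumed ASR word-level confidence in [0, 1]
    message: str       # candidate feedback to show the learner


def filter_feedback(items, threshold=0.8):
    """Keep only feedback attached to words the recogniser is confident about.

    The threshold is an arbitrary illustrative value, not taken from the paper.
    """
    return [item for item in items if item.confidence >= threshold]


# Made-up example: the second item sits on a low-confidence word, so any
# feedback on it may stem from an ASR error rather than a learner error.
items = [
    FeedbackItem("goed", 0.95, "possible grammatical error"),
    FeedbackItem("ship", 0.40, "possible pronunciation error"),
]
kept = filter_feedback(items)
```

Raising the threshold suppresses more feedback that might be caused by recognition errors, at the cost of withholding some feedback that was in fact correct.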

speech recognition, spoken language assessment
Journal Title
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Conference Name
Interspeech 2018
Cambridge Assessment (unknown)