Investigating the effect of auxiliary objectives for the automated grading of learner English speech transcriptions

Craighead, H; Caines, A; Buttery, P; Yannakoudakis, H

Accepted version

Peer-reviewed

Repository URI

https://www.repository.cam.ac.uk/handle/1810/305082

Repository DOI

https://doi.org/10.17863/CAM.52164

Files

Accepted version (512.93 KB)

Type

Conference Object

Authors

Craighead, H

Caines, Andrew

https://orcid.org/0000-0001-9647-4902

Buttery, P

Yannakoudakis, H

Abstract

We address the task of automatically grading the language proficiency of spontaneous speech based on textual features from automatic speech recognition transcripts. Motivated by recent advances in multi-task learning, we develop neural networks trained in a multi-task fashion that learn to predict the proficiency level of non-native English speakers by taking advantage of inductive transfer between the main task (grading) and auxiliary prediction tasks: morpho-syntactic labeling, language modeling, and native language (L1) identification. We encode the transcriptions both with bi-directional recurrent neural networks and with bi-directional representations from transformers, compare against a feature-rich baseline, and analyse performance at different proficiency levels and with transcriptions of varying error rates. Our best performance comes from a transformer encoder with L1 prediction as an auxiliary task. We discuss areas for improvement and potential applications for text-only speech scoring.
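As a rough illustration of the multi-task setup the abstract describes, the sketch below combines a main grading loss with weighted auxiliary losses. This record does not give the paper's actual objective, so the weighted-sum form, the function name `multitask_loss`, the task names, and all numeric values are assumptions chosen for illustration only.

```python
from typing import Dict


def multitask_loss(
    main_loss: float,
    aux_losses: Dict[str, float],
    aux_weights: Dict[str, float],
) -> float:
    """Return the main loss plus a weighted sum of auxiliary losses.

    Hypothetical helper: the weighted-sum form is an assumption, not the
    objective confirmed by the source. An auxiliary task with no entry in
    aux_weights contributes nothing (weight 0.0).
    """
    total = main_loss
    for task_name, loss in aux_losses.items():
        total += aux_weights.get(task_name, 0.0) * loss
    return total


# Illustrative values only; they are not results from the paper.
example = multitask_loss(
    main_loss=0.5,  # grading (main task)
    aux_losses={
        "morpho_syntactic_labeling": 2.0,
        "language_modeling": 4.0,
        "l1_identification": 1.0,
    },
    aux_weights={
        "morpho_syntactic_labeling": 0.1,
        "language_modeling": 0.05,
        "l1_identification": 0.2,
    },
)
print(f"combined loss: {example:.3f}")
```

In a sketch like this, setting an auxiliary weight to zero recovers single-task training on grading alone, which makes it straightforward to compare the effect of each auxiliary objective in isolation.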

Journal Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

Conference Name

2020 Annual Conference of the Association for Computational Linguistics

Journal ISSN

0736-587X

Publisher DOI

https://doi.org/10.17863/CAM.52164

Sponsorship

Cambridge Assessment

Collections

Cambridge University Research Outputs