Deep Learning for Automatic Assessment and Feedback of Spoken English

Kyriakopoulos, Konstantinos

cam.depositDate	2022-03-12
cam.restriction	thesis_access_open
cam.supervisor	Gales, Mark JF
cam.supervisor.orcid	Gales, Mark [0000-0002-5311-8219]
dc.contributor.author	Kyriakopoulos, Konstantinos
dc.contributor.orcid	Kyriakopoulos, Konstantinos [0000-0002-7659-4541]
dc.date.accessioned	2022-03-30T13:42:43Z
dc.date.available	2022-03-30T13:42:43Z
dc.date.submitted	2021-10
dc.date.updated	2022-03-12T09:52:51Z
dc.description.abstract	Growing global demand for learning a second language (L2), particularly English, has led to considerable interest in automatic spoken language assessment, whether for use in computer-assisted language learning (CALL) tools or for grading candidates for formal qualifications. This thesis presents research into the automatic assessment of spontaneous non-native English speech, with a view to providing meaningful feedback to learners. One of the challenges in automatic spoken language assessment is giving candidates feedback on particular aspects, or views, of their spoken language proficiency, in addition to the overall holistic score normally provided. Another is detecting pronunciation and other types of errors at the word or utterance level and feeding them back to the learner in a useful way. It is usually difficult to obtain accurate training data with separate scores for different views and, as examiners are often trained to give holistic grades, single-view scores can suffer from consistency issues. Conversely, holistic scores are available for various standard assessment tasks such as Linguaskill. An investigation is thus conducted into whether assessment scores linked to particular views of the speaker’s ability can be obtained from systems trained using only holistic scores. End-to-end neural systems are designed with structures and forms of input tuned to single views, specifically pronunciation, rhythm, intonation and text. By training each system on large quantities of candidate data, it should be possible to extract individual-view information. The relationships between the predictions of each system are evaluated to examine whether they are, in fact, extracting different information about the speaker. Three methods of combining the systems to predict the holistic score are investigated: averaging their predictions, concatenating their intermediate representations, and attending over their intermediate representations.
The combined graders are compared to each other and to baseline approaches. The tasks of error detection and error tendency diagnosis are particularly challenging when the speech in question is spontaneous, especially given the inconsistency of human annotation of pronunciation errors. An approach to these tasks is presented that distinguishes between lexical errors, wherein the speaker does not know how a particular word is pronounced, and accent errors, wherein the candidate’s speech exhibits consistent patterns of phone substitution, deletion and insertion. Three annotated corpora of non-native English speech by speakers of multiple L1s are analysed, the consistency of human annotation is investigated, and a method is presented for detecting individual accent and lexical errors and diagnosing accent error tendencies at the speaker level.
dc.identifier.doi	10.17863/CAM.82947
dc.identifier.uri	https://www.repository.cam.ac.uk/handle/1810/335514
dc.language.iso	eng
dc.publisher.college	Queens
dc.publisher.institution	University of Cambridge
dc.rights	All Rights Reserved
dc.rights.uri	https://www.rioxx.net/licenses/all-rights-reserved/
dc.subject	deep learning
dc.subject	CALL
dc.subject	automatic assessment
dc.subject	pronunciation assessment
dc.subject	rhythm assessment
dc.subject	prosody assessment
dc.subject	text assessment
dc.subject	spontaneous speech
dc.subject	attention mechanisms
dc.title	Deep Learning for Automatic Assessment and Feedback of Spoken English
dc.type	Thesis
dc.type.qualificationlevel	Doctoral
dc.type.qualificationname	Doctor of Philosophy (PhD)
pubs.licence-display-name	Apollo Repository Deposit Licence Agreement
pubs.licence-identifier	apollo-deposit-licence-2-1
rioxxterms.licenseref.uri	https://www.rioxx.net/licenses/all-rights-reserved/
rioxxterms.type	Thesis

Files

Original bundle

Name: PhD_Thesis_Deep_Learning_for_Automatic_Assessment_and_Feedback_of_Spoken_English.pdf
Size: 4.06 MB
Format: Adobe Portable Document Format
Description: Thesis
Licence: https://www.rioxx.net/licenses/all-rights-reserved/

Collections

Theses - Engineering