Relating Dynamic Brain States to Dynamic Machine States: Human and Machine Solutions to the Speech Recognition Problem
PLoS Computational Biology
Public Library of Science (PLoS)
Wingfield, C., Su, L., Liu, X., Zhang, C., Woodland, P., Thwaites, A., Fonteneau, E., et al. (2017). Relating Dynamic Brain States to Dynamic Machine States: Human and Machine Solutions to the Speech Recognition Problem. PLoS Computational Biology, 13(9), e1005617. https://doi.org/10.1371/journal.pcbi.1005617
There is widespread interest in the relationship between the neurobiological systems supporting human cognition and emerging computational systems capable of emulating these capacities. Human speech comprehension, poorly understood as a neurobiological process, is an important case in point. Automatic Speech Recognition (ASR) systems with near-human levels of performance are now available, which provide a computationally explicit solution for the recognition of words in continuous speech. This research aims to bridge the gap between speech recognition processes in humans and machines, using novel multivariate techniques to compare incremental ‘machine states’, generated as the ASR analysis progresses over time, to the incremental ‘brain states’, measured using combined electro- and magneto-encephalography (EMEG), generated as the same inputs are heard by human listeners. This direct comparison of dynamic human and machine internal states, as they respond to the same incrementally delivered sensory input, revealed a significant correspondence between neural response patterns in human superior temporal cortex and the structural properties of ASR-derived phonetic models. Spatially coherent patches in human temporal cortex responded selectively to individual phonetic features defined on the basis of machine-extracted regularities in the speech-to-lexicon mapping process. These results demonstrate the feasibility of relating human and ASR solutions to the problem of speech recognition, and suggest the potential for further studies relating complex neural computations in human speech comprehension to the rapidly evolving ASR systems that address the same problem domain.
language, neuroimaging, multivariate pattern analysis, representational similarity analysis, automatic speech recognition, electroencephalography, magnetoencephalography
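The keywords name representational similarity analysis as the kind of multivariate technique used to compare machine states with brain states. The following is a minimal sketch of that general idea, not the authors' pipeline: it builds a dissimilarity matrix over stimuli for each system and rank-correlates the two. All names (rdm, rsa_score), the array shapes, and the random data are illustrative assumptions; the correlation-distance and rank-correlation choices are common defaults rather than details taken from this record.

```python
import numpy as np

def rdm(patterns):
    """Dissimilarity matrix over stimuli: 1 - Pearson correlation between rows."""
    return 1.0 - np.corrcoef(patterns)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, one per stimulus pair."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def rank(values):
    """Ordinal ranks (no tie handling; adequate for continuous values)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks

def rsa_score(model_patterns, brain_patterns):
    """Rank correlation between the two systems' dissimilarity structures."""
    a = rank(upper_triangle(rdm(model_patterns)))
    b = rank(upper_triangle(rdm(brain_patterns)))
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical data: 8 stimuli, 20 model features, 30 sensor measurements.
rng = np.random.default_rng(0)
model_states = rng.normal(size=(8, 20))
brain_states = rng.normal(size=(8, 30))

score = rsa_score(model_states, brain_states)
self_score = rsa_score(model_states, model_states)
print(f"model vs. unrelated brain data: {score:.3f}")
print(f"model vs. itself: {self_score:.3f}")
```

Because the comparison is made between dissimilarity matrices rather than raw patterns, the two systems need not share a feature space; they only need responses to the same set of stimuli.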
Is supplemented by: https://doi.org/10.6084/m9.figshare.5313484
This research was supported financially by an Advanced Investigator grant to WMW from the European Research Council (AdG 230570 NEUROLEX), by MRC Cognition and Brain Sciences Unit (CBSU) funding to WMW (U.1055.04.002.00001.01), and by a European Research Council Advanced Investigator grant under the European Community’s Horizon 2020 Research and Innovation Programme (2014-2020 ERC Grant agreement no 669820) to Lorraine K. Tyler. LS was partly supported by the NIHR Biomedical Research Centre and Biomedical Unit in Dementia based at Cambridge University Hospital NHS Foundation Trust.
EPSRC (via University of Edinburgh) (EP/I031022/1 ERI016379)
European Research Council (230570)
ECH2020 EUROPEAN RESEARCH COUNCIL (ERC) (669820)
Cambridge University Hospitals NHS Foundation Trust (CUH) (unknown)
External DOI: https://doi.org/10.1371/journal.pcbi.1005617
This record's URL: https://www.repository.cam.ac.uk/handle/1810/267410
Attribution 4.0 International