Problem-solving recognition in scientific text


Type

Thesis

Authors

Heffernan, Kevin 

Abstract

As far back as Aristotle, problems and solutions have been recognised as a core pattern of thought, and in particular of the scientific method. Therefore, they play a significant role in the understanding of academic texts from the scientific domain. Capturing knowledge of such problem-solving utterances would provide a deep insight into text understanding. In this dissertation, I present the task of problem-solving recognition in scientific text.

To date, work on problem-solving recognition has received both theoretical and computational treatment. However, theories of problem-solving put forward by applied linguists lack practical adaptation to the domain of scientific text, and computational analyses have been narrow in scope.

This dissertation provides a new model of problem-solving. It is an adaptation of Hoey's (2001) model, tailored to the scientific domain. As far as modelling problems is concerned, I divided the text string expressing the statement of a problem into sub-components; this is one of my main contributions. I have mapped these sub-components to functional roles, and thus operationalised the model in such a way that it can be annotated by humans reliably. As far as the problem-solving relationship between problems and solutions is concerned, my model takes into account the local network of relationships existing between problems.

To validate this new model, a large-scale annotation study was conducted. The annotation study shows significant agreement amongst the annotators. The model is automated in two stages using a blend of classical machine learning and state-of-the-art deep learning methods. The first stage involves the implementation of problem and solution recognisers which operate at the sentence level. The second stage is more complex in that it recognises problems and solutions jointly at the token level, and also establishes whether there is a problem-solving relationship between each of them. One of the best performers at this stage was a Neural Relational Topic Model. The results from automation show that the model is able to recognise problem-solving utterances in text with a high degree of accuracy.

My work has already shown a positive impact in both industry and academia. One start-up is currently using the model for representing academic articles, and a Japanese collaborator has received a grant to adapt my model to Japanese text.

Date

2020-10-01

Advisors

Teufel, Simone

Keywords

machine learning

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
Sponsorship
EPSRC (1641528)
Relationships
Is supplemented by: