Machine learning methods for vector-based compositional semantics

Maillard, Jean

Repository URI

https://www.repository.cam.ac.uk/handle/1810/294356

Repository DOI

https://doi.org/10.17863/CAM.41454

Files

Thesis (2.67 MB)

Type

Thesis

Authors

Maillard, Jean

https://orcid.org/0000-0003-0025-1021

Abstract

Rich semantic representations of linguistic data are essential to the development of machine learning algorithms for natural language processing. This thesis explores techniques to model the meaning of phrases and sentences as dense vectors, which can then be further analysed and manipulated to perform any number of tasks involving the understanding of human language. Rather than treating this task purely as an engineering problem, the thesis focuses on linguistically motivated approaches, based on the principle of compositionality.

The first half of the thesis is dedicated to categorial compositional models, which are based on the observation that certain types of grammars share the structure of the algebra of vector spaces. This leads to an approach where the meanings of words are modelled as multilinear maps, encoded as tensors. In this framework, the meaning of a composite linguistic phrase can be computed via the tensor multiplication of its constituents, according to the phrase's syntactic structure. I contribute two categorial compositional models: the first, an extension of a popular method for learning semantic representations of words, models the meanings of adjective-noun phrases as matrix-vector multiplications; the second uses higher-order tensors to represent the meaning of relative clauses.
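
As a rough illustration of the tensor-based composition described above, the following minimal sketch treats an adjective as a matrix applied to a noun vector, and a word taking two arguments as an order-3 tensor contracted with two vectors. The function names (`matvec`, `contract3`), the dimensions, and all numbers are assumptions made up for this example; they are not taken from the thesis, whose models learn these tensors from data.

```python
def matvec(matrix, vector):
    """Apply a matrix (a list of rows) to a vector: the adjective acts on the noun."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def contract3(tensor, left, right):
    """Contract an order-3 tensor T[i][j][k] with left[i] and right[k], keeping index j."""
    dim_j = len(tensor[0])
    return [
        sum(
            tensor[i][j][k] * left[i] * right[k]
            for i in range(len(left))
            for k in range(len(right))
        )
        for j in range(dim_j)
    ]


# Toy 2-dimensional vectors and tensors (hypothetical values, for illustration only).
noun = [1.0, 2.0]
adjective = [[0.5, 0.0],
             [1.0, 1.0]]
phrase = matvec(adjective, noun)  # vector for the adjective-noun phrase

order3 = [[[1.0, 0.0], [0.0, 1.0]],
          [[0.0, 1.0], [1.0, 0.0]]]
left_arg = [1.0, 2.0]
right_arg = [3.0, 4.0]
combined = contract3(order3, left_arg, right_arg)

print(phrase)
print(combined)
```

The composed phrase lives in the same space as the noun, so it can be compared with other noun vectors or passed on to further composition steps.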

In contrast, the models presented in the second half of the thesis do away with traditional syntactic structures. Rather than using the standard syntax trees of linguistics to drive the compositional process, these models treat the compositional structure as a latent variable. I contribute two models that automatically induce trees for a downstream task, without ever being shown a 'real' syntax tree: one model based on chart parsing, and one based on shift-reduce parsing. While these proposed approaches induce trees that do not resemble traditional syntax trees, they do lead to models with higher performance on downstream tasks – opening up avenues for future research.
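
To make the role of tree structure in composition concrete, here is a minimal sketch of shift-reduce composition over word vectors. It assumes a hand-written action sequence and a placeholder composition function (`compose`, an element-wise average); both are hypothetical stand-ins. In the models summarised above, the structure is induced rather than given, and the composition function would be learned.

```python
def compose(left, right):
    """Hypothetical composition function: element-wise average of two child vectors."""
    return [(a + b) / 2 for a, b in zip(left, right)]


def shift_reduce(word_vectors, actions):
    """Build one phrase vector by following a sequence of SHIFT/REDUCE actions."""
    buffer = list(word_vectors)  # words not yet consumed, left to right
    stack = []                   # partially built constituents
    for action in actions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "REDUCE":
            right = stack.pop()
            left = stack.pop()
            stack.append(compose(left, right))
        else:
            raise ValueError(f"unknown action: {action}")
    if buffer or len(stack) != 1:
        raise ValueError("actions do not describe a complete binary tree")
    return stack[0]


# Three toy word vectors (made-up values).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# ((w1 w2) w3): left-branching tree.
left_branching = shift_reduce(words, ["SHIFT", "SHIFT", "REDUCE", "SHIFT", "REDUCE"])

# (w1 (w2 w3)): right-branching tree.
right_branching = shift_reduce(words, ["SHIFT", "SHIFT", "SHIFT", "REDUCE", "REDUCE"])

print(left_branching)
print(right_branching)
```

The two action sequences yield different sentence vectors for the same words, which is why the choice of tree can affect performance on a downstream task.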

Date

2019-01-30

Advisors

Clark, Stephen
Vlachos, Andreas

Keywords

natural language processing, nlp, computational linguistics, compositionality, distributional semantics, compositional semantics

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

EPSRC

Collections

Theses - Computer Science and Technology