Specialising Distributional Vectors of All Words for Lexical Entailment
Abstract
Semantic specialization methods fine-tune distributional word vectors using lexical knowledge from external resources (e.g., WordNet) to accentuate a particular relation between words. However, such post-processing methods suffer from limited coverage, as they affect only the vectors of words \textit{seen} in the external resources. We present the first post-processing method that specializes vectors of \textit{all vocabulary words} -- including those \textit{unseen} in the resources -- for the \textit{asymmetric} relation of lexical entailment (\textsc{le}), i.e., the hyponymy-hypernymy relation. Leveraging a partially \textsc{le}-specialized distributional space, our \textsc{postle} (\textit{post-specialization} for \textsc{le}) model learns an explicit global specialization function, which allows it to specialize vectors of unseen words, as well as word vectors from other languages via cross-lingual transfer. We capture the function as a deep feed-forward neural network: its objective re-scales vector norms to reflect the concept hierarchy while simultaneously attracting hyponymy-hypernymy pairs to better reflect semantic similarity. An extended model variant augments the basic architecture with an adversarial discriminator. We demonstrate the usefulness and versatility of \textsc{postle} models with different input distributional spaces in different scenarios (monolingual \textsc{le} and zero-shot cross-lingual \textsc{le} transfer) and tasks (binary and graded \textsc{le}). We report consistent gains over state-of-the-art \textsc{le}-specialization methods, and successfully \textsc{le}-specialize word vectors for languages without any external lexical knowledge.
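A minimal sketch may help make the learned specialization function concrete. The PyTorch code below is illustrative only, not the paper's released implementation: the layer sizes, the particular combination of a cosine term with a norm-matching term, and all names (\texttt{PostLEMapper}, \texttt{le\_mapping\_loss}) are our assumptions. It trains a deep feed-forward mapper on pairs of original and partially \textsc{le}-specialized vectors of \textit{seen} words, so that at inference time the same mapper can \textsc{le}-specialize vectors of \textit{unseen} words.

\begin{verbatim}
# Hedged sketch of a post-specialization mapping (not the paper's code).
import torch
import torch.nn as nn

class PostLEMapper(nn.Module):
    """Deep feed-forward net mapping original distributional vectors
    to LE-specialized vectors (the global specialization function)."""
    def __init__(self, dim=300, hidden=512, depth=5):
        super().__init__()
        layers, in_dim = [], dim
        for _ in range(depth):
            layers += [nn.Linear(in_dim, hidden), nn.Tanh()]
            in_dim = hidden
        layers.append(nn.Linear(in_dim, dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

def le_mapping_loss(pred, gold):
    # Attract predictions to their gold LE-specialized targets
    # (cosine term) and match vector norms, since the norm is meant
    # to encode position in the concept hierarchy (norm term).
    cos = 1.0 - nn.functional.cosine_similarity(pred, gold, dim=-1).mean()
    norm = (pred.norm(dim=-1) - gold.norm(dim=-1)).abs().mean()
    return cos + norm

# Toy training step on random stand-ins for (distributional,
# LE-specialized) vector pairs of words *seen* in the resource.
model = PostLEMapper()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
x_seen = torch.randn(64, 300)   # original distributional vectors
y_seen = torch.randn(64, 300)   # partially LE-specialized targets
opt.zero_grad()
loss = le_mapping_loss(model(x_seen), y_seen)
loss.backward()
opt.step()
# At inference, model(x_unseen) LE-specializes unseen words' vectors.
\end{verbatim}

The adversarial variant described in the abstract would add a discriminator trained to distinguish mapped vectors from genuine \textsc{le}-specialized ones; that component is omitted from this sketch.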