Non-linear instance-based cross-lingual mapping for non-isomorphic embedding spaces

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Glavaš, G 
Vulić, I 

Abstract

We present InstaMap, an instance-based method for learning projection-based cross-lingual word embeddings. Unlike prior work, it deviates from learning a single global linear projection. InstaMap is a non-parametric model that learns a non-linear projection by iteratively: (1) finding a globally optimal rotation of the source embedding space relying on the Kabsch algorithm, and then (2) moving each point along an instance-specific translation vector estimated from the translation vectors of the point's nearest neighbours in the training dictionary. We report performance gains with InstaMap over four representative state-of-the-art projection-based models on bilingual lexicon induction across a set of 28 diverse language pairs. We note prominent improvements, especially for more distant language pairs (i.e., languages with non-isomorphic monolingual spaces).
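
The two alternating steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the absence of mean-centering, the use of cosine similarity for neighbour search, the unweighted averaging of neighbour translation vectors, and the defaults `k=3` and `n_iter=2` are all assumptions made for illustration.

```python
import numpy as np


def kabsch_rotation(src, tgt):
    """Return the rotation R (det = +1) minimising ||src @ R - tgt||_F."""
    h = src.T @ tgt                          # d x d cross-covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Flip the last singular direction if U @ Vt would be a reflection.
    flip = np.ones(src.shape[1])
    if np.linalg.det(u @ vt) < 0:
        flip[-1] = -1.0
    return (u * flip) @ vt


def knn_translate(points, dict_src, dict_tgt, k=3):
    """Shift each point by the mean translation vector of its k nearest
    dictionary source vectors (cosine similarity; an assumption)."""
    shifts = dict_tgt - dict_src             # one translation vector per entry
    p = points / np.linalg.norm(points, axis=1, keepdims=True)
    d = dict_src / np.linalg.norm(dict_src, axis=1, keepdims=True)
    sims = p @ d.T                           # n_points x n_dict similarities
    nn = np.argsort(-sims, axis=1)[:, :k]    # indices of the k most similar
    return points + shifts[nn].mean(axis=1)


def instance_based_map(src_all, dict_idx, dict_tgt, k=3, n_iter=2):
    """Alternate a global rotation and instance-specific translations.

    src_all  : (n, d) source-language embeddings
    dict_idx : indices of the source words that are in the training dictionary
    dict_tgt : (m, d) target-language embeddings of their translations
    """
    x = src_all.copy()
    for _ in range(n_iter):
        # Step 1: global rotation fitted on the dictionary pairs.
        x = x @ kabsch_rotation(x[dict_idx], dict_tgt)
        # Step 2: move every point along its own translation vector.
        x = knn_translate(x, x[dict_idx], dict_tgt, k=k)
    return x
```

Because every point receives its own translation vector, the overall mapping is no longer a single linear projection. In this sketch, setting `k=1` moves each dictionary word exactly onto its translation, since its nearest dictionary neighbour is itself; larger `k` smooths the translation over neighbouring dictionary entries.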

Journal Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

Conference Name

ACL 2020: 58th Annual Meeting of the Association for Computational Linguistics

Journal ISSN

0736-587X

Publisher

Association for Computational Linguistics

Rights

All rights reserved

Sponsorship

European Research Council (648909)