Non-Linear Instance-Based Cross-Lingual Mapping for Non-Isomorphic Embedding Spaces

Accepted version
Peer-reviewed

Abstract

We present InstaMap, an instance-based method for learning projection-based cross-lingual word embeddings. Unlike prior work, it deviates from learning a single global linear projection. InstaMap is a non-parametric model that learns a non-linear projection by iteratively: (1) finding a globally optimal rotation of the source embedding space relying on the Kabsch algorithm, and then (2) moving each point along an instance-specific translation vector estimated from the translation vectors of the point's nearest neighbours in the training dictionary. We report performance gains with InstaMap over four representative state-of-the-art projection-based models on bilingual lexicon induction across a set of 28 diverse language pairs. We note prominent improvements, especially for more distant language pairs (i.e., languages with non-isomorphic monolingual spaces).
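
The two steps named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the cosine-based neighbour search, the unweighted mean over neighbour translation vectors, and the defaults `k=2` and `n_iter=3` are all assumptions made for illustration, and the abstract does not specify whether the spaces are mean-centred or how neighbours are weighted.

```python
import numpy as np

def kabsch_rotation(X, Y):
    """Return the proper rotation R (det = +1) minimising ||X @ R - Y||_F."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    # Flip the last singular direction if needed so that R is not a reflection.
    D = np.eye(X.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    return U @ D @ Vt

def instance_translate(points, dict_src, dict_tgt, k=2):
    """Shift each point by the mean translation vector of its k nearest
    dictionary source entries (cosine similarity; unweighted mean is an assumption)."""
    translations = dict_tgt - dict_src            # one vector per dictionary pair
    pn = points / np.linalg.norm(points, axis=1, keepdims=True)
    dn = dict_src / np.linalg.norm(dict_src, axis=1, keepdims=True)
    nn = np.argsort(-(pn @ dn.T), axis=1)[:, :k]  # indices of the k nearest entries
    return points + translations[nn].mean(axis=1)

def instamap_sketch(src, dict_idx, dict_tgt, k=2, n_iter=3):
    """Alternate global rotation and instance-specific translation (hypothetical loop)."""
    cur = src.copy()
    for _ in range(n_iter):
        R = kabsch_rotation(cur[dict_idx], dict_tgt)   # step (1): global rotation
        cur = cur @ R
        cur = instance_translate(cur, cur[dict_idx], dict_tgt, k=k)  # step (2)
    return cur
```

Here `src` holds one source-language vector per row, `dict_idx` indexes the rows that appear in the training dictionary, and `dict_tgt` holds the corresponding target-language vectors. Because each point receives its own translation vector, the overall mapping is non-linear even though the rotation itself is a single global linear map.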

Journal Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

Conference Name

ACL 2020: 58th Annual Meeting of the Association for Computational Linguistics

Journal ISSN

0736-587X

Publisher

Association for Computational Linguistics

Rights and licensing

Except where otherwise noted, this item's license is described as All rights reserved

Sponsorship

European Research Council (648909)