Classification-based self-learning for weakly supervised bilingual lexicon induction

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Karan, M 
Vulić, I 
Korhonen, A 
Glavaš, G 

Abstract

Effective projection-based cross-lingual word embedding (CLWE) induction critically relies on an iterative self-learning procedure, which gradually expands the initial small seed dictionary to learn improved cross-lingual mappings. In this work, we present ClassyMap, a classification-based approach to self-learning that yields more robust and more effective induction of projection-based CLWEs. Unlike prior self-learning methods, our approach allows for the integration of diverse features into the iterative process. We show the benefits of ClassyMap for bilingual lexicon induction: we report consistent improvements in a weakly supervised setup (500 seed translation pairs) on a benchmark with 28 language pairs.
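
To make the self-learning idea in the abstract concrete, the following is a minimal sketch of a generic projection-based self-learning loop. It is not ClassyMap and not the authors' implementation: it assumes an orthogonal (Procrustes) mapping and a mutual-nearest-neighbour rule for choosing new dictionary entries, where ClassyMap would instead use a classifier over diverse features. The names `procrustes`, `self_learning`, `n_iters`, and `add_per_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np


def procrustes(X, Y):
    """Return the orthogonal matrix W that minimises ||X W - Y||_F."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def self_learning(src, tgt, seed_pairs, n_iters=5, add_per_iter=50):
    """Generic self-learning sketch (assumption: NOT the ClassyMap procedure).

    src, tgt   : (n_words, dim) arrays of monolingual word embeddings
    seed_pairs : list of (source_index, target_index) seed translations
    Returns the final mapping W and the expanded dictionary.
    """
    pairs = list(seed_pairs)
    for _ in range(n_iters):
        s_idx = [s for s, _ in pairs]
        t_idx = [t for _, t in pairs]
        # 1) Learn a mapping from the current dictionary.
        W = procrustes(src[s_idx], tgt[t_idx])
        # 2) Score all source-target pairs in the shared space.
        sims = (src @ W) @ tgt.T
        fwd = sims.argmax(axis=1)  # best target for each source word
        bwd = sims.argmax(axis=0)  # best source for each target word
        known_s, known_t = set(s_idx), set(t_idx)
        # 3) Keep mutual nearest neighbours that are not yet in the dictionary.
        candidates = [
            (float(sims[s, fwd[s]]), s, int(fwd[s]))
            for s in range(len(src))
            if bwd[fwd[s]] == s and s not in known_s and int(fwd[s]) not in known_t
        ]
        # 4) Expand the dictionary with the most confident candidates.
        candidates.sort(reverse=True)
        pairs.extend((s, t) for _, s, t in candidates[:add_per_iter])
    # Final mapping learned from the fully expanded dictionary.
    W = procrustes(src[[s for s, _ in pairs]], tgt[[t for _, t in pairs]])
    return W, pairs
```

In this sketch the candidate-selection rule in step 3 is the part a classification-based approach would replace: instead of a single similarity score, each candidate pair could be scored by a classifier that combines several features.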

Journal Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

Conference Name

ACL 2020: 58th Annual Meeting of the Association for Computational Linguistics

Journal ISSN

0736-587X

Publisher

Association for Computational Linguistics

Rights

All rights reserved

Sponsorship

European Research Council (648909)