Classification-based self-learning for weakly supervised bilingual lexicon induction
Accepted version
Peer-reviewed
Authors
Karan, M
Vulić, I
Korhonen, A
Glavaš, G
Abstract
Effective projection-based cross-lingual word embedding (CLWE) induction relies critically on an iterative self-learning procedure, which gradually expands a small initial seed dictionary to learn improved cross-lingual mappings. In this work, we present ClassyMap, a classification-based approach to self-learning that yields a more robust and more effective induction of projection-based CLWEs. Unlike prior self-learning methods, our approach allows diverse features to be integrated into the iterative process. We show the benefits of ClassyMap for bilingual lexicon induction: we report consistent improvements in a weakly supervised setup (500 seed translation pairs) on a benchmark with 28 language pairs.
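The self-learning loop that the abstract refers to can be illustrated with a minimal sketch. The code below is not ClassyMap and is not taken from the paper. It assumes a generic projection-based setup in which the mapping is an orthogonal Procrustes solution and the dictionary is re-induced from mutual nearest neighbours, whereas ClassyMap, per the abstract, uses a classifier in the self-learning step. The function names `procrustes` and `self_learning`, the NumPy implementation, and the fixed iteration count are all illustrative assumptions.

```python
import numpy as np


def procrustes(X, Y):
    """Orthogonal W minimizing ||X @ W - Y||_F (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def self_learning(X, Y, seed_pairs, iterations=5):
    """Generic self-learning sketch (assumption: not the paper's method).

    X, Y: row-normalized source and target embedding matrices.
    seed_pairs: list of (source_index, target_index) translation pairs.
    """
    seed = set(seed_pairs)
    pairs = sorted(seed)
    for _ in range(iterations):
        # 1) Fit the mapping on the current dictionary.
        src = [s for s, _ in pairs]
        tgt = [t for _, t in pairs]
        W = procrustes(X[src], Y[tgt])

        # 2) Score every source word against every target word.
        sims = (X @ W) @ Y.T
        fwd = sims.argmax(axis=1)  # best target for each source word
        bwd = sims.argmax(axis=0)  # best source for each target word

        # 3) Expand the dictionary with mutual nearest neighbours.
        #    (ClassyMap would apply a classifier here instead.)
        induced = {(i, int(j)) for i, j in enumerate(fwd) if bwd[j] == i}
        pairs = sorted(seed | induced)
    return W, pairs
```

On real embeddings the seed dictionary would hold a few hundred pairs (500 in the setup described above) and the induced pairs would be noisy, which is where a learned selection step such as a classifier can help.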
Journal Title
Proceedings of the Annual Meeting of the Association for Computational Linguistics
Conference Name
ACL 2020: 58th Annual Meeting of the Association for Computational Linguistics
Journal ISSN
0736-587X
Publisher
Association for Computational Linguistics
Rights
All rights reserved
Sponsorship
European Research Council (648909)