Classification-based self-learning for weakly supervised bilingual lexicon induction
Accepted version
Peer-reviewed
Abstract
Effective projection-based cross-lingual word embedding (CLWE) induction relies critically on an iterative self-learning procedure, which gradually expands a small initial seed dictionary to learn improved cross-lingual mappings. In this work, we present ClassyMap, a classification-based approach to self-learning that yields more robust and more effective induction of projection-based CLWEs. Unlike prior self-learning methods, our approach allows diverse features to be integrated into the iterative process. We show the benefits of ClassyMap for bilingual lexicon induction, reporting consistent improvements in a weakly supervised setup (500 seed translation pairs) on a benchmark with 28 language pairs.
Journal Title
Proceedings of the Annual Meeting of the Association for Computational Linguistics
Conference Name
ACL 2020: 58th Annual Meeting of the Association for Computational Linguistics
Journal ISSN
0736-587X
Publisher
Association for Computational Linguistics
Rights and licensing
Except where otherwise noted, this item's license is described as All rights reserved
Sponsorship
European Research Council (648909)