From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers

Published version
Peer-reviewed

Type

Conference Object

Authors

Lauscher, Anne 
Ravishankar, Vinit 
Vulić, Ivan
Glavaš, Goran

Abstract

Massively multilingual transformers (MMTs) pretrained via language modeling (e.g., mBERT, XLM-R) have become a default paradigm for zero-shot language transfer in NLP, offering unmatched transfer performance. Current evaluations, however, verify their efficacy in transfers (a) to languages with sufficiently large pretraining corpora, and (b) between close languages. In this work, we analyze the limitations of downstream language transfer with MMTs, showing that, much like cross-lingual word embeddings, they are substantially less effective in resource-lean scenarios and for distant languages. Our experiments, encompassing three lower-level tasks (POS tagging, dependency parsing, NER) and two high-level tasks (NLI, QA), empirically correlate transfer performance with linguistic proximity between source and target languages, but also with the size of target language corpora used in MMT pretraining. Most importantly, we demonstrate that the inexpensive few-shot transfer (i.e., additional fine-tuning on a few target-language instances) is surprisingly effective across the board, warranting more research efforts reaching beyond the limiting zero-shot conditions.
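
The few-shot transfer that the abstract contrasts with zero-shot transfer (fine-tune an MMT on source-language data, then continue fine-tuning on a handful of target-language instances before evaluating on the target language) can be sketched roughly as below. This is a minimal illustration built on the Hugging Face Transformers API, not the authors' code; the dataset variables, label count, and hyperparameters are placeholder assumptions.

    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    # mBERT; "xlm-roberta-base" would select XLM-R instead.
    model_name = "bert-base-multilingual-cased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)  # used to tokenize the datasets below
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

    def fine_tune(model, train_dataset, output_dir, epochs=3):
        # Standard supervised fine-tuning over an already tokenized dataset.
        args = TrainingArguments(output_dir=output_dir,
                                 num_train_epochs=epochs,
                                 per_device_train_batch_size=32)
        Trainer(model=model, args=args, train_dataset=train_dataset).train()
        return model

    # 1) Zero-shot transfer: fine-tune on the full source-language (e.g., English)
    #    training set and evaluate directly on the target language.
    # model = fine_tune(model, english_train_set, "out/source")
    # 2) Few-shot transfer: additionally fine-tune on a few (e.g., 10-1000)
    #    target-language instances before evaluating on the target language.
    # model = fine_tune(model, target_language_few_shot_set, "out/few_shot", epochs=10)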

Journal Title

Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)

Conference Name

Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)

Publisher

Association for Computational Linguistics

Rights

All rights reserved

Sponsorship

European Research Council (648909)