Zero-Shot Language Transfer for Cross-Lingual Sentence Retrieval Using Bidirectional Attention Model

Accepted version
Peer-reviewed

Abstract

We present a neural architecture for cross-lingual mate sentence retrieval that encodes sentences in a joint multilingual space and learns to distinguish true translation pairs from semantically related sentences across languages. The proposed model combines a recurrent sequence encoder with a bidirectional attention layer and an intra-sentence attention mechanism. This way, the final fixed-size sentence representations in each training sentence pair depend on the selection of contextualized token representations from the other sentence. The representations of both sentences are then combined using a bilinear product to predict the relevance score. We show that, coupled with a shared multilingual word embedding space, the proposed model strongly outperforms unsupervised cross-lingual ranking functions, and that further boosts can be achieved by combining the two approaches. Most importantly, we demonstrate the model's effectiveness in zero-shot language transfer settings: our multilingual framework boosts cross-lingual sentence retrieval performance for unseen language pairs without any training examples. This enables robust cross-lingual sentence retrieval for pairs of resource-lean languages as well, without any parallel data.
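
The scoring pipeline described above can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example and not the authors' released code: a shared bidirectional GRU stands in for the recurrent sequence encoder, a token-level similarity matrix softmax-normalized in both directions stands in for the bidirectional attention layer, simple mean pooling replaces the intra-sentence attention mechanism, and nn.Bilinear produces the relevance score for a sentence pair. All names, layer choices, and dimensions are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MateRetrievalScorer(nn.Module):
    # Hypothetical sketch of the scoring pipeline from the abstract; hyperparameters are assumptions.
    def __init__(self, emb_dim=300, hidden=256):
        super().__init__()
        self.encoder = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.bilinear = nn.Bilinear(2 * hidden, 2 * hidden, 1)  # bilinear product -> relevance score

    def forward(self, src_emb, tgt_emb):
        # src_emb, tgt_emb: (batch, seq_len, emb_dim) word vectors taken from a
        # shared multilingual embedding space (assumed to be precomputed).
        h_src, _ = self.encoder(src_emb)  # contextualized tokens: (batch, len_s, 2*hidden)
        h_tgt, _ = self.encoder(tgt_emb)  # (batch, len_t, 2*hidden)
        # Bidirectional (cross-sentence) attention: each sentence is pooled from
        # token representations weighted by similarity to the other sentence.
        sim = torch.bmm(h_src, h_tgt.transpose(1, 2))       # (batch, len_s, len_t)
        src_ctx = torch.bmm(F.softmax(sim, dim=2), h_tgt)   # source tokens attend over target
        tgt_ctx = torch.bmm(F.softmax(sim, dim=1).transpose(1, 2), h_src)  # and vice versa
        v_src = src_ctx.mean(dim=1)  # mean pooling stands in for intra-sentence attention
        v_tgt = tgt_ctx.mean(dim=1)
        return self.bilinear(v_src, v_tgt).squeeze(-1)       # (batch,) relevance scores

scorer = MateRetrievalScorer()
scores = scorer(torch.randn(4, 12, 300), torch.randn(4, 15, 300))
print(scores.shape)  # torch.Size([4])

In this sketch, the bilinear product is what makes the score depend jointly on both pooled sentence representations, mirroring the relevance-prediction step described in the abstract.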

Description

Journal Title

Lecture Notes in Computer Science

Conference Name

Proceedings of the 41st European Conference on Information Retrieval (ECIR 2019)

Journal ISSN

0302-9743
1611-3349

Volume Title

11437 LNCS

Publisher

Springer Nature

Rights and licensing

Except where otherwise noted, this item's license is described as All rights reserved

Sponsorship

European Research Council (648909)