On NMT Search Errors and Model Errors: Cat Got Your Tongue?

Accepted version
Peer-reviewed

Type

Conference Object

Authors

Stahlberg, Felix
Byrne, Bill

Abstract

We report on search errors and model errors in neural machine translation (NMT). We present an exact inference procedure for neural sequence models based on a combination of beam search and depth-first search. We use our exact search to find the global best model scores under a Transformer base model for the entire WMT15 English-German test set. Surprisingly, beam search fails to find these global best model scores in most cases, even with a very large beam size of 100. For more than 50% of the sentences, the model in fact assigns its global best score to the empty translation, revealing a massive failure of neural models in properly accounting for adequacy. By constraining search with a minimum translation length, we show that at the root of the problem of empty translations lies an inherent bias towards shorter translations. We conclude that vanilla NMT in its current form requires just the right amount of beam search errors to counteract its modelling errors, which, from a modelling perspective, is a highly unsatisfactory conclusion indeed, as the model often prefers the empty translation.
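
The exact inference procedure described in the abstract can be illustrated with a minimal sketch: beam search supplies a lower bound on the best achievable model score, and a depth-first search then enumerates continuations, pruning any prefix whose score already falls at or below that bound. The sketch below is an illustration of this idea only, not the paper's actual implementation; the `log_probs` interface, function name, and parameters are hypothetical stand-ins, and the search is exact only up to the `max_len` cap.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical model interface: given a token prefix, return the
# log-probability of every vocabulary item as the next token.
LogProbFn = Callable[[List[int]], List[float]]

def exact_search(
    log_probs: LogProbFn,
    eos_id: int,
    vocab_size: int,
    lower_bound: float = -math.inf,  # e.g. score of the best beam search hypothesis
    max_len: int = 100,
) -> Tuple[List[int], float]:
    """Return the globally best-scoring sequence (up to max_len) under the model."""
    best_seq: List[int] = []
    best_score = lower_bound

    def dfs(prefix: List[int], score: float) -> None:
        nonlocal best_seq, best_score
        if len(prefix) >= max_len:
            return
        step = log_probs(prefix)
        # Expand high-probability continuations first so the bound tightens early.
        for tok in sorted(range(vocab_size), key=lambda t: -step[t]):
            new_score = score + step[tok]
            if new_score <= best_score:
                break  # remaining tokens score even lower: prune this subtree
            if tok == eos_id:
                best_seq, best_score = prefix + [tok], new_score
            else:
                dfs(prefix + [tok], new_score)

    dfs([], 0.0)
    return best_seq, best_score
```

The pruning is admissible because each step adds a non-positive log-probability, so a prefix's score can only decrease as it grows; nothing discarded could have beaten the current best. Seeding `lower_bound` with the beam search score tightens pruning from the start (in that case the caller should keep the beam hypothesis as the fallback result if nothing scores higher). Note that the empty translation, just the end-of-sentence token, is among the candidates considered, which is exactly how the global optima reported in the abstract were found.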

Keywords

cs.CL

Conference Name

2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Rights

All rights reserved

Sponsorship

EPSRC (1632937)
This work was supported by the U.K. Engineering and Physical Sciences Research Council (EPSRC) grant EP/L027623/1 and has been performed using resources provided by the Cambridge Tier-2 system operated by the University of Cambridge Research Computing Service funded by EPSRC Tier-2 capital grant EP/P020259/1.