The Geometry of Statistical Machine Translation

Waite, Aurelian; Byrne, William

The Geometry of Statistical Machine Translation

Repository URI

https://www.repository.cam.ac.uk/handle/1810/247940

Files

Waite & Byrne 2015 Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies.pdf (2.8 MB)

Type

Article

Authors

Waite, Aurelian

Byrne, William

Abstract

Most modern statistical machine translation systems are based on linear statistical models. One extremely effective method for estimating the model parameters is minimum error rate training (MERT), which is an efficient form of line optimisation adapted to the highly nonlinear objective functions used in machine translation. We describe a polynomial-time generalisation of line optimisation that computes the error surface over a plane embedded in parameter space. The description of this algorithm relies on convex geometry, which is the mathematics of polytopes and their faces. Using this geometric representation of MERT we investigate whether the optimisation of linear models is tractable in general. Previous work on finding optimal solutions in MERT (Galley and Quirk, 2011) established a worstcase complexity that was exponential in the number of sentences, in contrast we show that exponential dependence in the worst-case complexity is mainly in the number of features. Although our work is framed with respect to MERT, the convex geometric description is also applicable to other error-based training methods for linear models. We believe our analysis has important ramifications because it suggests that the current trend in building statistical machine translation systems by introducing a very large number of sparse features is inherently not robust.

Description

This is the author accepted manuscript. The final version is available from ACL via http://www.aclweb.org/anthology/N15-1041

Journal Title

Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Publisher

ACL

Publisher URL

http://www.aclweb.org/anthology/N15-1041

Rights

Sponsorship

This research was supported by a doctoral training account from the Engineering and Physical Sciences Research Council.

Collections

Scholarly Works - Engineering
Symplectic mapped items for data match