The Geometry of Statistical Machine Translation

cam.orpheus.counter: 58
dc.contributor.author: Byrne, WJ
dc.contributor.author: Waite, Aurelien
dc.date.accessioned: 2019-01-15T00:30:30Z
dc.date.available: 2019-01-15T00:30:30Z
dc.description.abstract: Most modern statistical machine translation systems are based on linear statistical models. One extremely effective method for estimating the model parameters is minimum error rate training (MERT), which is an efficient form of line optimisation adapted to the highly non-linear objective functions used in machine translation. We describe a polynomial-time generalisation of line optimisation that computes the error surface over a plane embedded in parameter space. The description of this algorithm relies on convex geometry, which is the mathematics of polytopes and their faces. Using this geometric representation of MERT, we investigate whether the optimisation of linear models is tractable in general. Previous work on finding optimal solutions in MERT (Galley and Quirk, 2011) established a worst-case complexity that was exponential in the number of sentences; in contrast, we show that the exponential dependence in the worst-case complexity is mainly in the number of features. Although our work is framed with respect to MERT, the convex geometric description is also applicable to other error-based training methods for linear models. We believe our analysis has important ramifications because it suggests that the current trend in building statistical machine translation systems by introducing a very large number of sparse features is inherently not robust.
dc.identifier.doi: 10.17863/CAM.35285
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/287965
dc.language.iso: eng
dc.title: The Geometry of Statistical Machine Translation
dc.type: Conference Object
dcterms.dateAccepted: 2015-02-20
pubs.conference-finish-date: 2015-06-15
pubs.conference-name: Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the ACL
pubs.conference-start-date: 2015-05-31
rioxxterms.licenseref.startdate: 2015-02-20
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved
rioxxterms.type: Conference Paper/Proceeding/Abstract
rioxxterms.version: AM
rioxxterms.versionofrecord: 10.17863/CAM.35285

Files

Original bundle
Name: N15-1041.pdf
Size: 2.8 MB
Format: Adobe Portable Document Format
Description: Accepted version
Licence: http://www.rioxx.net/licenses/all-rights-reserved
License bundle
Name: DepositLicenceAgreementv2.1.pdf
Size: 150.9 KB
Format: Adobe Portable Document Format