
dc.contributor.author: Stahlberg, Felix [en]
dc.contributor.author: Byrne, William [en]
dc.date.accessioned: 2017-09-25T11:27:02Z
dc.date.available: 2017-09-25T11:27:02Z
dc.date.issued: 2017-09-09 [en]
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/267367
dc.description.abstract: Ensembling is a well-known technique in neural machine translation (NMT) to improve system performance. Instead of a single neural net, multiple neural nets with the same topology are trained separately, and the decoder generates predictions by averaging over the individual models. Ensembling often improves the quality of the generated translations drastically. However, it is not suitable for production systems because it is cumbersome and slow. This work aims to reduce the runtime to be on par with a single system without compromising the translation quality. First, we show that the ensemble can be unfolded into a single large neural network which imitates the output of the ensemble system. We show that unfolding can already improve the runtime in practice since more work can be done on the GPU. We proceed by describing a set of techniques to shrink the unfolded network by reducing the dimensionality of layers. On Japanese-English we report that the resulting network has the size and decoding speed of a single NMT network but performs on the level of a 3-ensemble system.
dc.description.sponsorship: This work was supported by the U.K. Engineering and Physical Sciences Research Council (EPSRC grant EP/L027623/1).
dc.language.iso: en [en]
dc.publisher: Association for Computational Linguistics
dc.title: Unfolding and Shrinking Neural Machine Translation Ensembles [en]
dc.type: Conference Object
prism.publicationDate: 2017 [en]
dc.identifier.doi: 10.17863/CAM.12004
dcterms.dateAccepted: 2017-06-30 [en]
rioxxterms.version: AM [en]
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved [en]
rioxxterms.licenseref.startdate: 2017-09-09 [en]
dc.contributor.orcid: Stahlberg, Felix [0000-0002-0430-5704]
rioxxterms.type: Conference Paper/Proceeding/Abstract [en]
pubs.funder-project-id: EPSRC (EP/L027623/1)
pubs.conference-name: Conference on Empirical Methods in Natural Language Processing [en]
pubs.conference-start-date: 2017-09-09 [en]

