A Joint Model of Orthography and Morphological Segmentation
Accepted version
Peer-reviewed
Authors
Cotterell, Ryan
Vieira, Tim
Schütze, Hinrich
Abstract
We present a model of morphological segmentation that jointly learns to segment and restore orthographic changes, e.g., funniest → fun-y-est. We term this form of analysis canonical segmentation and contrast it with the traditional surface segmentation, which segments a surface form into a sequence of substrings, e.g., funniest → funn-i-est. We derive an importance sampling algorithm for approximate inference in the model and report experimental results on English, German and Indonesian.
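The distinction between the two kinds of segmentation in the abstract can be illustrated with a short sketch. It is not the paper's model. It only checks the defining property stated above: surface segments are substrings that concatenate back to the word, whereas canonical segments may restore orthographic changes and so need not. The function name `is_surface_segmentation` is a hypothetical helper introduced for illustration.

```python
def is_surface_segmentation(word: str, segments: list[str]) -> bool:
    """Return True when the segments concatenate back to the surface form.

    Hypothetical helper: it tests only the substring property from the
    abstract, not whether the segmentation is linguistically correct.
    """
    return "".join(segments) == word


word = "funniest"
surface = ["funn", "i", "est"]   # example from the abstract: funn-i-est
canonical = ["fun", "y", "est"]  # example from the abstract: fun-y-est

print(is_surface_segmentation(word, surface))
print(is_surface_segmentation(word, canonical))
```

The canonical analysis fails the check because restoring the underlying forms changes the spelling, which is why the two tasks are treated separately.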
Journal Title
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Conference Name
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Publisher
Association for Computational Linguistics
Rights
All rights reserved