Automatic detection and correction of context-dependent dt-mistakes using neural networks


Type
Article
Authors
Heyman, G 
Vulić, I 
Laevaert, Y 
Moens, MF 
Abstract

We introduce a novel approach to correcting context-dependent dt-mistakes, one of the most frequent spelling errors in the Dutch language. We show that by using a neural network to estimate the probability distribution of a verb's suffix conditioned jointly on its stem and context, we obtain large improvements over state-of-the-art spell checkers on three different benchmarking datasets, achieving a perfect score on a verb spelling test from De Standaard, a Flemish newspaper. The method is unsupervised and relies only on basic preprocessing tools to tokenize the text and identify verbs, which enables training on millions of sentences. Furthermore, we propose a method to determine which words in a sentence cause the system to make corrections, which is valuable for providing feedback to the user.
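
The core idea in the abstract, choosing a verb's suffix from a probability distribution conditioned on its stem and context, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the article: the suffix inventory SUFFIXES, the hand-written toy_scores function (a stand-in for the trained neural network), the function names, and the 0.9 threshold are all hypothetical.

```python
import math

# Assumed candidate suffixes; the article's actual inventory may differ.
SUFFIXES = ["", "t"]


def toy_scores(stem, context):
    """Hypothetical stand-in for a trained network: one score per suffix.

    Favours the bare stem when "ik" is in the context, "-t" otherwise.
    """
    if "ik" in context:
        return [2.0, -2.0]
    return [-2.0, 2.0]


def suffix_distribution(stem, context, score_fn=toy_scores):
    """Turn raw scores into a probability distribution with a softmax."""
    scores = score_fn(stem, context)
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {suffix: e / total for suffix, e in zip(SUFFIXES, exps)}


def suggest(stem, written_suffix, context, threshold=0.9):
    """Return a corrected verb form, or None if no correction is proposed.

    A correction is proposed only when the most probable suffix differs
    from the written one and its probability reaches the threshold.
    """
    dist = suffix_distribution(stem, context)
    best = max(dist, key=dist.get)
    if best != written_suffix and dist[best] >= threshold:
        return stem + best
    return None


# "ik wordt ziek": the written suffix "t" disagrees with the toy model.
print(suggest("word", "t", ["ik", "ziek"]))
# "hij wordt ziek": the written suffix agrees, so nothing is proposed.
print(suggest("word", "t", ["hij", "ziek"]))
```

In this sketch the threshold keeps the checker from overriding the writer when the model is unsure; a real system would replace toy_scores with a network trained on tokenized text in which verbs have been identified, as the abstract describes.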

Journal Title
Computational Linguistics in the Netherlands Journal
Journal ISSN
2211-4009
Volume Title
8
Sponsorship
European Research Council (648909)