Automatic detection and correction of context-dependent dt-mistakes using neural networks

Accepted version
Peer-reviewed

Type

Article

Authors

Heyman, G 
Vulić, I 
Laevaert, Y 
Moens, MF 

Abstract

We introduce a novel approach to correcting context-dependent dt-mistakes, one of the most frequent spelling errors in the Dutch language. We show that by using a neural network to estimate the probability distribution of a verb's suffix conditioned jointly on its stem and context, we obtain large improvements over state-of-the-art spell checkers on three different benchmarking datasets, achieving a perfect score on a verb spelling test from de Standaard, a Flemish newspaper. The method is unsupervised and only relies on basic preprocessing tools to tokenize the text and identify verbs, which enables training on millions of sentences. Furthermore, we propose a method to determine which words in a sentence cause the system to make corrections, which is valuable for providing feedback to the user.
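
The core idea in the abstract, choosing a verb's suffix from a probability distribution conditioned on its stem and context, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `toy_model` is a hand-written stand-in for the trained neural network, and the suffix inventory, the `<verb>` placeholder, the `margin` parameter, and the function names do not come from the paper.

```python
from typing import Callable, Dict, List

# Hypothetical interface: given a stem and the surrounding tokens, return
# a probability for each candidate suffix. In the paper this role is played
# by a neural network; here it is only a placeholder.
SuffixModel = Callable[[str, List[str]], Dict[str, float]]


def toy_model(stem: str, context: List[str]) -> Dict[str, float]:
    """Hand-written stand-in for a trained model (assumption, not the paper's)."""
    third_person = {"hij", "zij", "het"}
    if any(token in third_person for token in context):
        return {"": 0.1, "t": 0.9}
    return {"": 0.9, "t": 0.1}


def correct_verb(
    stem: str,
    written_suffix: str,
    context: List[str],
    model: SuffixModel,
    margin: float = 0.0,
) -> str:
    """Return the verb with the most probable suffix.

    The written form is kept unless another suffix is more probable
    by more than `margin` (a hypothetical confidence threshold).
    """
    probs = model(stem, context)
    best = max(probs, key=probs.get)
    written_prob = probs.get(written_suffix, 0.0)
    if best != written_suffix and probs[best] - written_prob > margin:
        return stem + best
    return stem + written_suffix


# "hij word ziek" contains a dt-mistake; the sketch proposes "wordt".
print(correct_verb("word", "", ["hij", "<verb>", "ziek"], toy_model))
```

Raising `margin` in this sketch would make it more conservative, changing the written form only when the stand-in model strongly prefers a different suffix.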

Journal Title

Computational Linguistics in the Netherlands Journal

Journal ISSN

2211-4009

Volume Title

8

Sponsorship
European Research Council (648909)