Building natural language processing tools for Runyakitara

Published version
Peer-reviewed

Type

Article

Authors

Katushemererwe, F 
Buttery, P 

Abstract

This paper describes an endeavour to build natural language processing (NLP) tools for Runyakitara, a group of four closely related Bantu languages spoken in western Uganda. In contrast with major world languages such as English, for which corpora are comparatively abundant and NLP tools are well developed, computational linguistic resources for Runyakitara are in short supply. First therefore, we need to collect corpora for these languages, before we can proceed to the design of a spell-checker, grammar-checker and applications for computer-assisted language learning (CALL). We explain how we are collecting primary data for a new Runya Corpus of speech and writing, we outline the design of a morphological analyser, and discuss how we can use these new resources to build NLP tools. We are initially working with Runyankore–Rukiga, a closely-related pair of Runyakitara languages, and we frame our project in the context of NLP for low-resource languages, as well as CALL for the preservation of endangered languages. We put our project forward as a test case for the revitalization of endangered languages through education and technology.

Keywords

natural language processing, endangered languages, language corpus, morphological analyser, CALL

Journal Title

Applied Linguistics Review

Journal ISSN

1868-6303
1868-6311

Publisher

Walter de Gruyter GmbH

Rights

All rights reserved