A statistical approach to error correction for isiZulu spellcheckers.

Mjaria, Frida and Keet, C. Maria (2018) A statistical approach to error correction for isiZulu spellcheckers. Proceedings of IST-Africa 2018, 9-11 May 2018, Gaborone, Botswana, 1-9, IIMC International Information Management Corporation.

Spellcheckers have become important with the increase of text-based communication, both at work and in society through social media. There is, however, very little support for spellchecking in agglutinating Sub-Saharan African (Bantu) languages. While error detection has been shown to yield acceptable results for at least isiZulu, the correction of spelling errors has not yet been investigated. The aim of this paper is to solve the spelling correction problem by means of a statistical approach that can accurately provide candidate corrections for misspelled isiZulu words (non-word error correction). Trigrams learned from a corpus, their probabilities, minimum edit distance, and additional optimisations are used to construct the error corrector. The corrector was evaluated on the four types of non-word errors (substitutions, insertions, deletions, and transpositions). It achieved a language recall rate of 89%, an error recall of 84%, a language precision of 85%, and an error precision of 88% for error correction. The error corrector was found to have an overall suggestion accuracy rate of 95% and relevance of 61%, performing best on transposition errors (99% and 89%, respectively). The error corrector has been added to an existing open source isiZulu error detector. This facilitates uptake and, moreover, fills a feature gap, with numerous benefits for society, both for isiZulu speakers and learners in South Africa and for bootstrapping spellcheckers for related languages.
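
The abstract names character trigrams, their probabilities, and minimum edit distance as the building blocks of the corrector. The following is a minimal illustrative sketch of how such pieces could fit together, not the implementation described in the paper: the toy word list, the function names (`char_trigrams`, `osa_distance`, `suggest`), the add-one smoothing, and the ranking rule (edit distance first, trigram score second) are all assumptions made for the example. The edit distance used is the optimal string alignment variant, chosen here because it counts substitutions, insertions, deletions, and adjacent transpositions as one edit each.

```python
import math
from collections import Counter

# Toy word list standing in for a corpus (an assumption for this sketch).
TOY_CORPUS = ["ngiyabonga", "siyabonga", "ngiyahamba", "uyahamba", "sawubona"]


def char_trigrams(word):
    """Return the character trigrams of a word, padded with '#' boundaries."""
    padded = f"##{word}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train(corpus):
    """Count character trigrams over all corpus words."""
    counts = Counter(t for w in corpus for t in char_trigrams(w))
    return counts, sum(counts.values())


def log_score(word, counts, total):
    """Mean add-one-smoothed log probability of the word's trigrams."""
    vocab = len(counts) + 1  # +1 reserves mass for unseen trigrams
    grams = char_trigrams(word)
    return sum(math.log((counts[g] + 1) / (total + vocab)) for g in grams) / len(grams)


def osa_distance(a, b):
    """Optimal string alignment distance between strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def suggest(word, corpus, max_dist=2, top_n=3):
    """Rank corpus words within max_dist edits: closest first, then by trigram score."""
    counts, total = train(corpus)
    scored = []
    for cand in set(corpus):
        dist = osa_distance(word, cand)
        if dist <= max_dist:
            scored.append((dist, -log_score(cand, counts, total), cand))
    return [cand for _, _, cand in sorted(scored)[:top_n]]
```

With this toy list, `suggest("ngiyabnoga", TOY_CORPUS)` returns `["ngiyabonga"]`, since the misspelling is a single adjacent transposition away from that word and no other entry is within two edits. Because isiZulu is agglutinating, a fixed word list like this covers only a small fraction of valid word forms, so a real corrector would need a different way of producing candidates.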

Item Type: Conference paper
Uncontrolled Keywords: spellchecker
Subjects: Applied computing > Document management and text processing
Information systems
Alternate Locations: http://www.meteck.org/files/CorrectorisiZuluISTAfrica18.pdf, https://ieeexplore.ieee.org/document/8417192
Date Deposited: 09 Nov 2018
Last Modified: 10 Oct 2019 15:31
URI: http://pubs.cs.uct.ac.za/id/eprint/1269
