Pluralizing Nouns across Agglutinating Bantu Languages

Byamugisha, Joan and Keet, C. Maria and DeRenzi, Brian (2018) Pluralizing Nouns across Agglutinating Bantu Languages, Proceedings of the 27th International Conference on Computational Linguistics (COLING'18), August 20-26, 2018, Santa Fe, New Mexico, USA, 2633-2643, ACL.


Abstract

Text generation may require the pluralization of nouns, for example in context-sensitive user interfaces and in natural language generation more broadly. While this problem has been solved for widely used languages, it remains a challenge for languages in the Bantu language family. Pluralization results obtained for isiZulu and Runyankore showed similarities in approach, including the need to combine morphology with syntax and semantics, even though the two languages belong to different language zones. This suggests that bootstrapping and generalizability might be feasible. We investigated this systematically for seven languages across three different Guthrie language zones. The first outcome is that Meinhof’s 1948 specification of the noun classes is indeed inadequate for computational purposes for all examined languages, due to non-determinism in prefixes, and we thus redefined the characteristic noun class tables of 29 noun classes into 53. The second main result is that the generic pluralizer achieved over 93% accuracy in coverage testing and over 94% on a random sample. This is comparable to the language-specific isiZulu and Runyankore pluralizers.
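
The non-determinism in prefixes mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's pluralizer or its noun class tables: the table `PREFIX_TABLE` and the functions `candidate_classes` and `pluralize` are hypothetical names introduced here, and the table lists only three well-known isiZulu singular classes with their usual plural prefixes. It assumes the noun class is supplied by the caller, since the prefix alone does not always determine it.

```python
# Hypothetical, heavily simplified table for illustration only:
# singular noun class -> (singular prefix, plural prefix).
# Three isiZulu classes are listed; this is NOT the table from the paper.
PREFIX_TABLE = {
    1: ("umu", "aba"),  # e.g. umuntu (person) -> abantu
    3: ("umu", "imi"),  # e.g. umuthi (tree)   -> imithi
    7: ("isi", "izi"),  # e.g. isitsha (dish)  -> izitsha
}


def candidate_classes(noun: str) -> list[int]:
    """Return every class in the table whose singular prefix matches the noun."""
    return [cls for cls, (sg, _) in PREFIX_TABLE.items() if noun.startswith(sg)]


def pluralize(noun: str, noun_class: int) -> str:
    """Swap the singular prefix for the plural prefix of the given class."""
    singular_prefix, plural_prefix = PREFIX_TABLE[noun_class]
    if not noun.startswith(singular_prefix):
        raise ValueError(f"{noun!r} does not start with {singular_prefix!r}")
    return plural_prefix + noun[len(singular_prefix):]


# The prefix "umu" matches two classes, so the class must be known in advance.
print(candidate_classes("umuntu"))
print(pluralize("umuntu", 1))
print(pluralize("umuthi", 3))
```

Because `umu` maps to two different plural prefixes, a purely prefix-driven lookup cannot choose between them, which is one plausible reading of why additional information beyond surface morphology is needed.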

Item Type: Conference paper
Uncontrolled Keywords: computational linguistics, Bantu languages, pluraliser
Subjects: Applied computing > Document management and text processing
Alternate Locations: http://www.meteck.org/files/COLING18GenericPluralizer.pdf, http://aclweb.org/anthology/C18-1223
Date Deposited: 09 Nov 2018
Last Modified: 10 Oct 2019 15:31
URI: http://pubs.cs.uct.ac.za/id/eprint/1271
