Creating a Handwriting Recognition Corpus for Bushman Languages

Williams, Kyle and Suleman, Hussein (2011) Creating a Handwriting Recognition Corpus for Bushman Languages, Proceedings of the 13th International Conference on Asia-Pacific Digital Libraries, 24-27 October 2011, Beijing, P.R. China, 222-231, Springer.



Handwriting recognition systems rely on the existence of a corpus for training recognition models and evaluating accuracy. Creating a handwriting recognition corpus for the Bushman languages of southern Africa is difficult due to the complexities of the script used to represent them and the fact that this script cannot be represented using Unicode. To solve this problem, a semi-automatic Web-based tool was developed to segment, capture and encode the Bushman text. A case study demonstrated how the tool could be used to create a Bushman handwriting corpus with few errors.

Item Type: Conference paper
Uncontrolled Keywords: Corpus creation, transcription, digital libraries
Subjects: Applied computing > Document management and text processing; Information systems
Date Deposited: 18 Nov 2011
Last Modified: 10 Oct 2019 15:33
