UCT CS Research Document Archive

Using A Hidden Markov Model to Transcribe Handwritten Bushman Texts

Williams, Kyle and Hussein Suleman (2011) Using A Hidden Markov Model to Transcribe Handwritten Bushman Texts. In Proceedings of the 11th Annual ACM/IEEE Joint Conference on Digital Libraries, pages 445-446, Ottawa, Canada.

Abstract

The Bushman texts in the Bleek and Lloyd Collection contain complex diacritics that make automatic transcription difficult. Transcriptions of these texts would allow enhanced digital library services to be created for interacting with the collection. This study investigated automatic transcription of the Bushman texts using a popular method: a Hidden Markov Model for text line recognition. The results show that while this technique may be well suited to well-constrained and well-understood scripts, applying it to more complex scripts introduces a number of difficulties that need to be overcome.

EPrint Type: Conference Poster
Keywords: OCR, handwriting recognition, Hidden Markov Model, digital libraries
Subjects: I Computing Methodologies: I.7 DOCUMENT AND TEXT PROCESSING; H Information Systems: H.3 INFORMATION STORAGE AND RETRIEVAL
ID Code: 696
Deposited By: Williams, Kyle Mark
Deposited On: 22 June 2011