2024-03-28T12:27:39Z
https://pubs.cs.uct.ac.za/cgi/oai2
oai:pubs.cs.uct.ac.za:913
2019-10-10T15:33:10Z
7375626A656374733D3130303032393531:3130303033323237
7375626A656374733D3130303032393531:3130303032393532
7375626A656374733D3130303032393531:3130303032393532:3130303032393533
7375626A656374733D3130303033313230
7375626A656374733D3130303032393531:3130303033333137
74797065733D746865736973
https://pubs.cs.uct.ac.za/id/eprint/913/
Transcription of the Bleek and Lloyd Collection using the Bossa Volunteer Thinking Framework
Munyaradzi, Mr Ngoni
Information systems applications
Data management systems
Database design and models
Human-centered computing
Information retrieval
The digital Bleek and Lloyd Collection is a rare collection that contains artwork, notebooks and dictionaries of the earliest inhabitants of Southern Africa. Previous attempts have been made to recognize the complex text in the notebooks using machine learning techniques, but due to the complexity of the manuscripts the recognition accuracy was low. In this research, a crowdsourcing-based method is proposed to transcribe the historical handwritten manuscripts, where volunteers transcribe the notebooks online. An online crowdsourcing transcription tool was developed and deployed. Experiments were conducted to determine the quality of transcriptions and the accuracy of the volunteers compared with a gold standard. The results show that volunteers are able to produce reliable transcriptions of high quality. The inter-transcriber agreement is 80% for |Xam text and 95% for English text. When the |Xam text transcriptions produced by the volunteers are compared with the gold standard, the volunteers achieve an average accuracy of 69.69%. Findings show that there is a positive linear correlation between the inter-transcriber agreement and the accuracy of transcriptions. The user survey revealed that volunteers found the transcription process enjoyable, though difficult. Results indicate that volunteer thinking can be used to crowdsource intellectually intensive tasks in digital libraries, such as the transcription of handwritten manuscripts. Volunteer thinking outperforms machine learning techniques at the task of transcribing notebooks from the Bleek and Lloyd Collection.
2013
Electronic thesis or dissertation
application/pdf
en
https://pubs.cs.uct.ac.za/id/eprint/913/1/Ngoni_Munyaradzi_Thesis_2013.pdf
Munyaradzi, Mr Ngoni (2013) Transcription of the Bleek and Lloyd Collection using the Bossa Volunteer Thinking Framework, MSc.