BantuWeb: A Digital Library for Resource Scarce South African Languages

von Holy, Andreas and Bresler, Alon and Shuman, Osher and Chavula, Catherine and Suleman, Hussein (2017) BantuWeb: A Digital Library for Resource Scarce South African Languages, Proceedings of Annual Conference of the South African Institute of Computer Scientists and Information Technologists (SAICSIT 2017), 26-28 September 2017, Thaba 'Nchu, South Africa, ACM.

Abstract

South Africa is a linguistically diverse country: it is home to 11 official languages, of which nine (all except English and Afrikaans) are Resource Scarce Languages (RSLs). Accordingly, many South Africans struggle to access information written in their native languages on the Web. Unfortunately, lack of access to information hinders socio-economic growth. This paper proposes a Web-based digital library to act as a central repository for content written in these languages, whether crawled from the Web or generated or contributed by a community of users. Gamification features have been incorporated into the digital library to motivate users to contribute content, to strengthen the collection of resources and to increase community participation. Specifically, the paper: (i) proposes a ranking algorithm, smart interleaving, to aggregate and rank multilingual search results effectively from collections of varying size; and (ii) investigates which gamification features, among leaderboards, notifications, virtual points and levels, motivate users to contribute content in the context of South African RSLs. The results show that users were more motivated to contribute content in order to reach the next level than to improve their leaderboard ranking or earn virtual points. Further, the overall results on merging and ranking multilingual search results show no significant improvement from using smart interleaving.

Item Type: Conference paper
Uncontrolled Keywords: Digital Libraries, Gamification, Crowdsourcing, Multilingual Information Retrieval, Search Engines, Information Retrieval Evaluation, Web Crawling, Language Preservation
Subjects: Information systems > Information retrieval
Alternate Locations: https://dl.acm.org/citation.cfm?id=3129446
Date Deposited: 25 Nov 2017
Last Modified: 10 Oct 2019 15:32
URI: http://pubs.cs.uct.ac.za/id/eprint/1226
