Benchmarking IsiXhosa Automatic Speech Recognition and Machine Translation for Digital Health Provision

Blocker, Abby and Meyer, Francois and Biyabani, Ahmed and Mwangama, Joyce and Datay, Mohammed Ishaaq and Malila, Bessie (2025) Benchmarking IsiXhosa Automatic Speech Recognition and Machine Translation for Digital Health Provision, Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health).


Abstract

As digital health becomes more widespread, it connects people from different geographic regions, creating a need for accurate language translation services. South Africa presents both an opportunity and a need for digital health innovation, but implementing translation systems for indigenous languages in digital health is difficult because language resources are scarce. Understanding the accuracy of current models for medical translation of indigenous languages is crucial for designers looking to build quality digital health solutions. This paper presents a new dataset with audio and text of primary health consultations for automatic speech recognition and machine translation in South African English and the indigenous South African language of isiXhosa. We then evaluate the performance of well-established pretrained models on this dataset. We found that isiXhosa had limited support in speech recognition models, which showed high and variable character error rates for transcription (26-70%). For translation tasks, Google Cloud Translate and ChatGPT outperformed the other evaluated models, indicating that large language models can perform comparably to dedicated machine translation models for low-resource language translation.
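
For readers unfamiliar with the transcription metric quoted in the abstract, character error rate is conventionally the character-level edit distance between a reference transcript and a model's output, divided by the reference length. The sketch below illustrates that standard definition only. The function names are hypothetical, and the paper's own text normalization and scoring setup are not described on this page, so this should not be read as the authors' evaluation code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by reference length (assumed convention)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Under this definition, a reference of "molo" transcribed as "mulo" has one substitution over four characters, giving a rate of 0.25. The value can exceed 1.0 when the output contains many inserted characters.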

Item Type:	Conference paper
Subjects:	Computing methodologies > Artificial intelligence > Natural language processing > Machine translation
Date Deposited:	28 Jul 2025 12:37
Last Modified:	28 Jul 2025 12:37
URI:	https://pubs.cs.uct.ac.za/id/eprint/1735
