Story Generation with Large Language Models for African Languages

Essuman, Catherine Nana Nyaah and Buys, Jan (2025) Story Generation with Large Language Models for African Languages, Proceedings of the Sixth Workshop on African Natural Language Processing, July 31, 2025, Vienna, Austria, 115-125, Association for Computational Linguistics.

2025.africanlp-1.16.pdf - Published Version


Abstract

The development of Large Language Models (LLMs) for African languages has been hindered by the lack of large-scale textual data. Previous research has shown that relatively small language models, when trained on synthetic data generated by larger models, can produce fluent, short English stories, providing a data-efficient alternative to large-scale pretraining. In this paper, we apply a similar approach to develop and evaluate small language models for generating children’s stories in isiZulu and Yoruba, using synthetic datasets created through translation and multilingual prompting. We train six language-specific models based on the GPT-2 architecture, varying the size and source of the training dataset. Our results show that models trained on synthetic low-resource-language data can produce coherent and fluent short stories in isiZulu and Yoruba. Models trained on larger synthetic datasets generally perform better in terms of coherence and grammar, and also tend to generalize better, as indicated by their lower evaluation perplexities. Models trained on datasets generated through prompting rather than translation generate similarly or more coherent stories and display more creativity, but generalize worse to unseen data. In addition to the potential educational applications of automated story generation, our approach could serve as the foundation for more data-efficient low-resource language models.
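
The abstract uses evaluation perplexity as its measure of generalization. As a minimal sketch of what that quantity is, assuming the standard definition (the exponential of the mean negative log-likelihood per token), the following computes perplexity from per-token log-probabilities. The function name and the example values are hypothetical and are not taken from the paper, whose exact evaluation setup is not described in this record.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) over natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical log-probabilities that two models assign to the same
# four held-out tokens (illustrative values only, not results from the paper).
model_a = [math.log(0.25)] * 4  # every token receives probability 0.25
model_b = [math.log(0.5)] * 4   # every token receives probability 0.5

ppl_a = perplexity(model_a)
ppl_b = perplexity(model_b)

print(f"model A perplexity: {ppl_a:.2f}")
print(f"model B perplexity: {ppl_b:.2f}")
```

Under this definition, the model that assigns higher probability to held-out text gets the lower perplexity, which is the sense in which lower evaluation perplexity is read as better generalization.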

Item Type: Conference paper
Subjects: Computing methodologies > Artificial intelligence > Natural language processing
Computing methodologies > Artificial intelligence > Natural language processing > Natural language generation
Date Deposited: 17 Oct 2025 06:18
Last Modified: 17 Oct 2025 06:18
URI: https://pubs.cs.uct.ac.za/id/eprint/1761
