University of Cape Town’s WMT22 System: Multilingual Machine Translation for Southern African Languages

Elmadani, Khalid N. and Meyer, Francois and Buys, Jan (2022) University of Cape Town’s WMT22 System: Multilingual Machine Translation for Southern African Languages, Proceedings of the Seventh Conference on Machine Translation (WMT), December 2022, Abu Dhabi, Association for Computational Linguistics.

2210.11757.pdf - Accepted Version

Abstract

The paper describes the University of Cape Town's submission to the constrained track of the WMT22 Shared Task: Large-Scale Machine Translation Evaluation for African Languages. Our system is a single multilingual translation model that translates between English and eight South / South East African languages, as well as between specific pairs of the African languages. We used several techniques suited to low-resource machine translation (MT), including overlap BPE, back-translation, synthetic training data generation, and adding more translation directions during training. Our results show the value of these techniques, especially for directions where very little or no bilingual training data is available.
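
As a rough illustration of the back-translation idea named in the abstract, the sketch below builds synthetic training pairs from monolingual target-side text and prefixes each source sentence with a target-language tag, a common way of steering a single multilingual model. It is not the authors' pipeline: the function names, the `<2xx>` tag format, the language code `zul`, and the toy reverse model are all assumptions made for the example.

```python
from typing import Callable, List, Tuple


def tag_source(src: str, tgt_lang: str) -> str:
    """Prefix a source sentence with a target-language tag (hypothetical format)."""
    return f"<2{tgt_lang}> {src}"


def back_translate(
    mono_tgt: List[str],
    reverse_translate: Callable[[str], str],
    tgt_lang: str,
) -> List[Tuple[str, str]]:
    """Pair each real target sentence with a machine-generated source sentence."""
    pairs = []
    for tgt in mono_tgt:
        # The reverse model translates target-language text back into the source language.
        synthetic_src = reverse_translate(tgt)
        # The synthetic source is tagged; the target side stays as genuine text.
        pairs.append((tag_source(synthetic_src, tgt_lang), tgt))
    return pairs


def toy_reverse_model(sentence: str) -> str:
    """Stand-in for a trained target-to-source model; it only uppercases the input."""
    return sentence.upper()


pairs = back_translate(["sawubona mhlaba"], toy_reverse_model, "zul")
```

In a real setting the stand-in would be replaced by a trained translation model, and the synthetic pairs would be mixed with whatever genuine bilingual data exists for that direction.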

Item Type: Conference paper
Subjects: Computing methodologies > Artificial intelligence > Natural language processing
Computing methodologies > Artificial intelligence > Natural language processing > Machine translation
Date Deposited: 15 Nov 2022 08:08
Last Modified: 15 Nov 2022 08:08
URI: https://pubs.cs.uct.ac.za/id/eprint/1549
