On the Feasibility of LLM-based Automated Generation and Filtering of Competency Questions for Ontologies

Mahlaza, Zola and Keet, C. Maria and Chahinian, Nanée and Haydar, Batoul (2025) On the Feasibility of LLM-based Automated Generation and Filtering of Competency Questions for Ontologies, Proceedings of the 5th Conference on Language, Data and Knowledge (LDK2025), 9-11 September 2025, Naples, Italy, 136-146, UniorPress, Naples.

Full text not available from this repository. (Use alternate locations listed below)

Abstract

Competency questions (CQs) for ontologies are used in a number of ontology development tasks. The sentence structure of these questions has been analysed to inform ontology authoring and validation. One obstacle to making this a seamless process is the difficulty of writing good CQs manually, or of offering automated assistance in writing them. In this paper, we propose an enhanced and automated pipeline in which each step can be traced meticulously. It uses a mini-corpus, T5, and the SQuAD dataset to generate questions, and the CLaRO controlled language, semantic similarity, and other steps for filtering. The pipeline was tested on two corpora of different genres in the same broad domain and evaluated with domain experts. Across the experiments, around 25% of the final output questions were judged to be in scope and relevant, and around 45% were of unproblematic quality. Technically, the work provided ample insight into trade-offs in generation and filtering: relaxing the filtering increased sentence structure diversity but also produced more spurious sentences that required additional processing.

Item Type: Conference paper
Subjects: Computing methodologies > Artificial intelligence > Knowledge representation and reasoning
Computing methodologies > Machine learning
Information systems > Information retrieval > Document representation > Ontologies
Date Deposited: 13 Oct 2025 12:26
Last Modified: 13 Oct 2025 12:26
URI: https://pubs.cs.uct.ac.za/id/eprint/1753
