UCT CS Research Document Archive

Software Traceability using Latent Semantic Analysis and Relevance Feedback

Kritzinger, PS and H Krüger (2008) Software Traceability using Latent Semantic Analysis and Relevance Feedback. Technical Report CS08-01-00, Department of Computer Science, University of Cape Town.

Abstract

Software traceability (ST), in its broadest sense, is the process of tracking changes in the corpus of documents created throughout the software development life-cycle. Traditional ST approaches, however, require considerable human effort to identify and consistently record inter-dependencies among software artifacts. In this paper we present an approach that reveals traceability links automatically using the information retrieval (IR) techniques of Latent Semantic Analysis (LSA) and Relevance Feedback, together with a software tool that implements these ideas. We discuss in detail how software artifacts can be represented in a vector space model and how term extraction and weighting can be accomplished for UML artifacts, such as use cases, interaction diagrams and state diagrams, as well as for source code and natural-language text documents. We also explain how the structural information inherent in software artifacts can be preserved in the term extraction and weighting phase of creating traceable artifacts. Unlike other tools, ours incorporates human knowledge into the traceability link recovery process through relevance feedback, with the aim of improving the quality of the recovered links. Finally, we illustrate the effectiveness of our tool-based approach and our proposals through a case study with a pilot software project and compare our results with those of a manual tracing process.
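
As a rough illustration of the ideas named in the abstract, the following sketch represents artifacts as weighted term vectors, ranks candidate links by cosine similarity, and refines a query with relevance feedback. It is not taken from the report: the artifact names and term lists are invented, the TF-IDF weighting and the Rocchio update (with its alpha, beta and gamma values) are assumptions chosen because they are common defaults, and the singular value decomposition step of LSA is omitted, so the vectors live in the plain term space.

```python
import math
from collections import Counter


def tf_idf_vectors(artifacts):
    """Turn {artifact name: list of terms} into sparse TF-IDF vectors (dicts)."""
    n = len(artifacts)
    # Document frequency: in how many artifacts each term occurs.
    df = Counter(t for terms in artifacts.values() for t in set(terms))
    vectors = {}
    for name, terms in artifacts.items():
        tf = Counter(terms)
        # The "+ 1.0" keeps terms that occur in every artifact from vanishing.
        vectors[name] = {t: tf[t] * (math.log(n / df[t]) + 1.0) for t in tf}
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()} if vectors else {}


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward relevant vectors and away from non-relevant ones."""
    pos, neg = centroid(relevant), centroid(non_relevant)
    updated = {}
    for t in set(query) | set(pos) | set(neg):
        w = alpha * query.get(t, 0.0) + beta * pos.get(t, 0.0) - gamma * neg.get(t, 0.0)
        if w > 0.0:  # negative weights are dropped, a common simplification
            updated[t] = w
    return updated


# Hypothetical artifacts: one use case and three source files (terms invented).
artifacts = {
    "UC1_login": ["user", "login", "password", "authenticate"],
    "LoginController.java": ["login", "authenticate", "session", "token"],
    "SessionStore.java": ["session", "token", "expire", "store"],
    "ReportPrinter.java": ["report", "print", "page", "format"],
}
vectors = tf_idf_vectors(artifacts)
query = vectors["UC1_login"]

# The analyst confirms one candidate link and rejects another.
refined = rocchio(
    query,
    relevant=[vectors["LoginController.java"]],
    non_relevant=[vectors["ReportPrinter.java"]],
)

for name in artifacts:
    if name != "UC1_login":
        before = cosine(query, vectors[name])
        after = cosine(refined, vectors[name])
        print(f"{name}: before={before:.3f} after={after:.3f}")
```

In this toy setting the use case shares no terms with `SessionStore.java`, so the initial query cannot link them; after the analyst confirms `LoginController.java`, the refined query picks up the session-related terms and the indirect link becomes visible. An LSA-based tool would aim for a similar effect earlier, by projecting the term vectors into a lower-dimensional space before comparing them.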

EPrint Type: Departmental Technical Report
Keywords: Requirements Engineering, Software Traceability, Software Development, Change Management, Latent Semantic Indexing, Relevance Feedback, Information Retrieval, UML
Subjects: D Software: D.2 SOFTWARE ENGINEERING
ID Code: 500
Deposited By: Pileggi, PP
Deposited On: 27 October 2008