Metadata Management Tool Component

Developed by Michael Ferguson

The Metadata Management Tool supports a wide variety of user tasks aimed at producing descriptive metadata. Notably, the tool offers batch modification of metadata and persistence of metadata sets through its save and load functionality.

Introduction

Meta-fy, a Metadata Management Tool, was developed to assist with the difficult task of creating metadata for a large amount of data. The tool was developed according to the requirements of the Zamani Project team. It handles a wide variety of user tasks and is documented with Javadoc to support future extension. The component needed to enable the creation of descriptive metadata for cultural heritage data, which can be exported as FOXML and then ingested into the archive.

Implementation

Java

An object-oriented approach was used when designing the Metadata Management Tool. Object-oriented programming is a programming paradigm that models a system as objects, which are instances of classes. This approach was used because it provided a clear modular structure for the application and made the code easy to maintain and extend. The Metadata Management Tool was implemented in Java and runs as a Java applet embedded in an HTML document.

Java Applet (JApplet)

One of the requirements of the Metadata Management Tool was that it be Web-based; to this end, a Java applet was chosen. An applet is launched from a user’s browser and executed by the JVM in a process separate from the Web browser. Applets are fast and can perform similarly to natively installed software. Additionally, Java applets are cross-platform and are supported by most Web browsers.

Fedora XML (FOXML) Design

Fedora XML (FOXML) is an XML format used to express the Fedora Digital Object Model. The model follows a compound digital object design, aggregating one or more datastreams into one digital object. Each Fedora datastream has a Control Group property that determines how the datastream's content is stored. The Control Group chosen was Internal XML Content, as it stores the XML in-line within a digital object's XML file. Fedora has a DC (Dublin Core) datastream that is used to store metadata about the digital object, and a RELS-EXT datastream that is used to describe relationships amongst digital objects. Additionally, custom datastreams were used to record further details about digital objects.
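The datastream layout described above can be made concrete with a minimal sketch. It builds a FOXML 1.1 document containing a DC and a RELS-EXT datastream, both stored as Internal XML Content (Control Group "X"). This is not the tool's actual export code, which is not shown here. The class name, the PIDs (demo:site1, demo:collection1), the labels and the title are hypothetical. The element and attribute names follow the FOXML 1.1 schema as commonly documented for Fedora 3.x.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class; not part of the Metadata Management Tool itself.
final class FoxmlSketch {
    static final String FOXML = "info:fedora/fedora-system:def/foxml#";
    static final String OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/";
    static final String DC = "http://purl.org/dc/elements/1.1/";
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String REL = "info:fedora/fedora-system:def/relations-external#";

    /** Appends an inline XML datastream (Control Group "X") and returns its xmlContent element. */
    static Element inlineDatastream(Document doc, Element object, String id, String label) {
        Element ds = doc.createElementNS(FOXML, "foxml:datastream");
        ds.setAttribute("ID", id);
        ds.setAttribute("STATE", "A");
        ds.setAttribute("CONTROL_GROUP", "X"); // X = Internal XML Content
        ds.setAttribute("VERSIONABLE", "true");
        Element version = doc.createElementNS(FOXML, "foxml:datastreamVersion");
        version.setAttribute("ID", id + ".0");
        version.setAttribute("LABEL", label);
        version.setAttribute("MIMETYPE", "text/xml");
        Element content = doc.createElementNS(FOXML, "foxml:xmlContent");
        version.appendChild(content);
        ds.appendChild(version);
        object.appendChild(ds);
        return content;
    }

    /** Builds a FOXML string for one object that is a member of parentPid. */
    static String build(String pid, String parentPid, String title) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            Element object = doc.createElementNS(FOXML, "foxml:digitalObject");
            object.setAttribute("VERSION", "1.1");
            object.setAttribute("PID", pid);
            doc.appendChild(object);

            Element props = doc.createElementNS(FOXML, "foxml:objectProperties");
            Element state = doc.createElementNS(FOXML, "foxml:property");
            state.setAttribute("NAME", "info:fedora/fedora-system:def/model#state");
            state.setAttribute("VALUE", "Active");
            props.appendChild(state);
            object.appendChild(props);

            // DC datastream: descriptive metadata about the object.
            Element dcRoot = doc.createElementNS(OAI_DC, "oai_dc:dc");
            Element dcTitle = doc.createElementNS(DC, "dc:title");
            dcTitle.setTextContent(title);
            dcRoot.appendChild(dcTitle);
            inlineDatastream(doc, object, "DC", "Dublin Core Record").appendChild(dcRoot);

            // RELS-EXT datastream: RDF statement linking the object to its parent.
            Element rdf = doc.createElementNS(RDF, "rdf:RDF");
            Element description = doc.createElementNS(RDF, "rdf:Description");
            description.setAttributeNS(RDF, "rdf:about", "info:fedora/" + pid);
            Element member = doc.createElementNS(REL, "rel:isMemberOf");
            member.setAttributeNS(RDF, "rdf:resource", "info:fedora/" + parentPid);
            description.appendChild(member);
            rdf.appendChild(description);
            inlineDatastream(doc, object, "RELS-EXT", "Relationships").appendChild(rdf);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build FOXML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("demo:site1", "demo:collection1", "Example site object"));
    }
}
```

A custom datastream could presumably be added in the same way, by calling inlineDatastream with a different ID and appending the custom XML to the returned element.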

FOXML was used to create a relational structure within the Fedora repository by means of the RELS-EXT datastream. This is illustrated below:

Fedora repository RELS-EXT relationship

A separate digital object (FOXML file) is used to represent each Collection and Site object. To create a hierarchy of relationships, the isMemberOf and isCollection RDF properties were used. This is illustrated in the two RELS-EXT relationships below:

RELS-EXT root and country relationship

All of the FOXML files produced for the Fedora repository contain two standard datastreams, namely DC and RELS-EXT. The DC datastream stores the majority of the metadata relating to a digital object. The RELS-EXT datastream is used to create relationships amongst the digital objects in the repository. A number of custom datastreams add further information to the digital objects. These datastreams are vital to the functioning of other core components of the system.

Site Collections have an additional SITE datastream used to store site-related information. This is illustrated below:

SITE datastream

Site Objects have two additional datastreams, namely FILE and POSITION. The FILE datastream stores information relating to the file that the digital object represents. This ensures that the DC datastream, which contains the core information about the file, is accompanied by the path to the file. The POSITION datastream stores additional information about where the file was captured. Site Objects can also contain a CALIBRATION datastream if they represent an image type. The Site Object datastreams are illustrated below:

Features

  • Add Files & Directories
  • Open Files & Directories
  • Associate Files
  • Import Camera Calibration Settings
  • Import Co-ordinates
  • Remove Camera Calibration Settings
  • Sort Metadata
    • Contains
    • Identity
    • File
    • Tag
    • Status
  • Edit Metadata
  • Automation Fields
  • Automation Dictionary
  • Image Preview
    • Zoom-in
    • Zoom-out
    • Best fit
    • Normal size
  • Console Feedback
  • Persistence
    • Load Metadata
    • Save Metadata
  • Batch Modification of metadata
  • Automatic Title Generation
  • Transform Aluka Metadata
  • Export Metadata

Look and Feel

Console Feedback

Contains Sort

Edit Panel - Field Dictionary

Export Metadata Popup

Menu Category Items

Preset Automation Fields and Field Dictionary

Save Feedback

Table Selection Mechanism

Tag Sort

Transform Aluka Metadata

Validation and Progress Feedback

Batch Modify Popup

Results

Performance Testing

Performance testing was carried out on the Metadata Management Tool because processing a large amount of data in a Java applet proved to be very slow. Processing data located on a shared drive is regarded as the tool's most expensive operation.

The tasks identified as slow used a recursive directory iterator to process data. Because data processing was the source of the performance issues, a number of alternative algorithms were considered in an attempt to optimize these tasks. One of the regularly used file directory count methods was examined and compared to these algorithms.

The algorithms were also compared in two different environments, namely a Java applet and a Java application, to identify whether the applet itself was the cause of the problem. Additionally, a simple heuristic was evaluated against the isDirectory() method of Java’s native File class. The heuristic treated digital objects with a “.” in their filename as files and the rest as directories.
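The comparison described above can be sketched as follows. This is an illustrative reconstruction, not the code that was benchmarked: the class name, method names and the sample directory tree are hypothetical. It shows a recursive file count that accepts either File.isDirectory() or the "." heuristic as the directory check, together with a simple mean-time measurement over a number of iterations.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Predicate;

// Hypothetical benchmark harness; names are illustrative only.
final class DirectoryCountSketch {
    /** Baseline check: asks the file system whether the entry is a directory. */
    static final Predicate<File> NATIVE_CHECK = File::isDirectory;

    /** Heuristic from the text: a "." in the name means file, anything else is a directory. */
    static final Predicate<File> DOT_HEURISTIC = f -> !f.getName().contains(".");

    /** Recursively counts the entries under root that the given check classifies as files. */
    static long countFiles(File root, Predicate<File> isDirectory) {
        File[] children = root.listFiles();
        if (children == null) {
            return 0; // root is not a readable directory
        }
        long count = 0;
        for (File child : children) {
            if (isDirectory.test(child)) {
                count += countFiles(child, isDirectory);
            } else {
                count++;
            }
        }
        return count;
    }

    /** Mean wall-clock time in milliseconds of countFiles over the given number of runs. */
    static double meanMillis(File root, Predicate<File> isDirectory, int iterations) {
        long totalNanos = 0;
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            countFiles(root, isDirectory);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / iterations;
    }

    /** Creates a small temporary tree: two files with extensions and one without. */
    static File sampleTree() {
        try {
            Path root = Files.createTempDirectory("metafy-demo");
            Path nested = Files.createDirectories(root.resolve("site").resolve("scans"));
            Files.createFile(root.resolve("overview.jpg"));
            Files.createFile(nested.resolve("scan1.xml"));
            Files.createFile(nested.resolve("README")); // no "." in the name
            return root.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File root = args.length > 0 ? new File(args[0]) : sampleTree();
        System.out.println("native count:    " + countFiles(root, NATIVE_CHECK));
        System.out.println("heuristic count: " + countFiles(root, DOT_HEURISTIC));
        System.out.println("native mean ms:    " + meanMillis(root, NATIVE_CHECK, 1000));
        System.out.println("heuristic mean ms: " + meanMillis(root, DOT_HEURISTIC, 1000));
    }
}
```

The heuristic avoids a file system query per entry, which is likely where any saving on a shared drive would come from. The trade-off is accuracy: in this sketch a file without an extension (README) is treated as a directory and not counted, and a directory with a "." in its name would be counted as a single file.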

The mean execution times obtained over 1000 iterations are as follows:

Performance Test Results

Usability Testing

A usability test was conducted with 15 participants, all of whom were University of Cape Town students. The System Usability Scale (SUS) was used to measure users’ perceived usability of the Metadata Management Tool.

The main results obtained are as follows:

Mean SUS Score | Cronbach Alpha | Std. Alpha | Average R
73.12          | 0.85           | 0.84       | 5.5
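For readers unfamiliar with how these figures are derived, the following sketch shows the standard SUS scoring rule and the usual formula for Cronbach's alpha, α = k/(k−1) · (1 − Σ item variances / variance of totals). It is not the analysis script used in the study. The class and method names are hypothetical, and the sample responses are made up for illustration rather than taken from the 15 participants.

```java
// Hypothetical helper class; illustrates standard SUS scoring and Cronbach's alpha.
final class SusSketch {
    /** SUS score (0 to 100) for one participant's ten responses, each on a 1 to 5 scale. */
    static double susScore(int[] responses) {
        if (responses.length != 10) {
            throw new IllegalArgumentException("SUS has exactly 10 items");
        }
        int sum = 0;
        for (int i = 0; i < 10; i++) {
            // Odd-numbered items (index 0, 2, ...) are positively worded: response - 1.
            // Even-numbered items (index 1, 3, ...) are negatively worded: 5 - response.
            sum += (i % 2 == 0) ? responses[i] - 1 : 5 - responses[i];
        }
        return sum * 2.5;
    }

    /** Unbiased sample variance (divides by n - 1). */
    static double variance(double[] values) {
        double mean = 0;
        for (double v : values) {
            mean += v;
        }
        mean /= values.length;
        double squares = 0;
        for (double v : values) {
            squares += (v - mean) * (v - mean);
        }
        return squares / (values.length - 1);
    }

    /**
     * Cronbach's alpha for a participants-by-items matrix. All items should point in the
     * same direction, e.g. the 0 to 4 per-item contributions used in susScore.
     */
    static double cronbachAlpha(double[][] scores) {
        int n = scores.length;
        int k = scores[0].length;
        if (n < 2 || k < 2) {
            throw new IllegalArgumentException("need at least 2 participants and 2 items");
        }
        double itemVarianceSum = 0;
        double[] totals = new double[n];
        for (int j = 0; j < k; j++) {
            double[] column = new double[n];
            for (int i = 0; i < n; i++) {
                column[i] = scores[i][j];
                totals[i] += scores[i][j];
            }
            itemVarianceSum += variance(column);
        }
        return (k / (k - 1.0)) * (1 - itemVarianceSum / variance(totals));
    }

    public static void main(String[] args) {
        int[] neutral = {3, 3, 3, 3, 3, 3, 3, 3, 3, 3};
        System.out.println(susScore(neutral)); // 50.0

        // Made-up data: three participants, two perfectly consistent items.
        double[][] consistent = {{1, 1}, {2, 2}, {3, 3}};
        System.out.println(cronbachAlpha(consistent));
    }
}
```

The mean SUS score is then the average of susScore over all participants.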

User Acceptance Testing

To demonstrate that the project requirements had been met, a User Acceptance Test was conducted with the Zamani Project team. A list of requirement-based test case criteria was compiled from the in-scope functionality of the Metadata Management Tool, which ensured that all in-scope functionality was demonstrated to the team. The test case criteria are shown below.

ID Criteria
1.1 Open Meta-fy in the browser from the Web-based Backend
1.2 Automate fields using the Automate Tab
1.3 View the Console Tab after executing actions
1.4 Select a metadata item to view its preview image and use the Toolbar to manipulate the image as desired
1.5 Sort metadata by: Identity, File, Status, Tag, and Contains
1.6 Demonstrate Persistence by saving and loading a .metafy file
1.7 Open files & open directories (clears previously open files)
1.8 Add files & add directories (adds to previously open/added files)
1.9 Import coordinate triplets from a text file
1.10 Import camera calibration settings from a text file
1.11 Remove already added camera calibration settings
1.12 Calculate metadata titles for all metadata present in the Metadata Table
1.13 Associate new files from an Aluka transformed .metafy file
1.14 Validate metadata after associating new files
1.15 Modify a batch of metadata
1.16 Transform an Aluka generated .xml file into a .metafy file
1.17 Export a set of metadata

After the Metadata Management Tool had been demonstrated, the team was asked to assess whether the requirements had been met. The completed Test Results section is shown below:

IDs        | Pass/Fail | Tested By   | Date Tested
1.1 - 1.17 | Pass      | Zamani team | 20/10/2014

Conclusion

The Metadata Management Tool component of the Zamani Data Archive was developed to assist with the difficult task of creating metadata for a large amount of data.

A subjective label for this component’s usability can be obtained by applying the adjective rating scale to the component’s mean SUS score of 73.12, which rates the Metadata Management Tool’s usability as ‘Good’. There is also a high level of internal consistency: the Cronbach’s alpha of 0.85 reported in the table in the Usability Testing section falls in the range 0.7 ≤ α < 0.9.

The system was evaluated and determined to be a success. All of the in-scope functionality required by the Zamani team was successfully implemented and approved by the team. The team responded positively to the system and is interested in seeing it deployed in a production environment in the future.