Using TDB in Greenstone to Support Scalable Digital Libraries

Thompson, John and Bainbridge, David and Suleman, Hussein (2011) Using TDB in Greenstone to Support Scalable Digital Libraries, Proceedings of the Fourth Workshop on Very Large Digital Libraries (VLDL) 2011, 29 September 2011, Berlin, Germany, ISTI.

Abstract

This paper reports on performance improvements in the open source Greenstone digital library software that resulted from a more detailed understanding of the demands placed on its database component when building large collections. The work was undertaken as part of a larger drive to support parallel processing during the ingest of Very Large Digital Library (VLDL) collections using this software. In its database requirements, Greenstone is set apart from many other digital library solutions: by default it operates using only a flat-file database component, as opposed to a full-blown relational database. However, despite the simplicity of this type of database, our review of the literature revealed that little is known about how it performs in a digital library context when a high volume of data is processed. Through the work presented here we make some inroads into addressing this gap; we also show how utilizing a transaction-based flat-file database (one that supports parallel reader/writer access) dramatically improves performance not only in parallel processing, but in the current non-parallel process as well.

Item Type: Conference paper
Date Deposited: 12 Dec 2011
Last Modified: 10 Oct 2019 15:33
URI: http://pubs.cs.uct.ac.za/id/eprint/745
