UCT CS Research Document Archive

Panopticon: A Scalable Monitoring System

Clough, Duncan, Stefano Riviera, Michelle Kuttel, Vincent Geddes and Patrick Marais (2010) Panopticon: A Scalable Monitoring System. In Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference (SAICSIT 2010).

Abstract

Monitoring systems are necessary for managing anything beyond the smallest networks of computers. While specialised monitoring systems can be deployed to detect specific problems, more general systems are required to detect unexpected issues and track performance trends.
Although large fleets of computers are becoming more common, few existing general monitoring systems can scale to monitor such large networks. The literature also lacks systems that cater for visualising monitoring information on a large scale.
Scale is an issue in both the design and presentation of large-scale monitoring systems. We discuss Panopticon, a monitoring system that we have developed, which can scale to monitor tens of thousands of nodes, using only commodity equipment. In addition, we propose a novel method for visualising monitoring information on a large scale, based on general techniques for visualising massive multi-dimensional datasets.
The monitoring system is shown to be able to collect information from up to 100 000 nodes. The storage system is able to record and output information from up to 25 000 nodes, and the visualisation is able to simultaneously display all this information for up to 20 000 nodes. Optimisations to our storage system could allow it to scale a little further, but a distributed storage approach combined with intelligent filtering algorithms would be necessary for significant improvements in scalability.

EPrint Type: Conference Paper
Subjects: K Computing Milieux: K.6 MANAGEMENT OF COMPUTING AND INFORMATION SYSTEMS
C Computer Systems Organization: C.5 COMPUTER SYSTEM IMPLEMENTATION
C Computer Systems Organization: C.4 PERFORMANCE OF SYSTEMS
ID Code: 618
Deposited By: Kuttel, Michelle
Deposited On: 23 September 2010