UCT CS Research Document Archive

Efficient Compression of Molecular Dynamics Trajectory Files

Marais, Patrick, Julian Kenwood, Keegan Carruthers Smith, Michelle M Kuttel and James Gain (2012) Efficient Compression of Molecular Dynamics Trajectory Files. Journal of Computational Chemistry 33(27):2131-2141.

Abstract

We investigate whether specific properties of molecular dynamics trajectory files can be exploited to achieve effective file compression. We explore two classes of lossy, quantized compression schemes: "interframe" predictors, which exploit temporal coherence between successive frames in a simulation, and more complex "intraframe" schemes, which compress each frame independently. Our interframe predictors are fast, memory-efficient and well suited to on-the-fly compression of massive simulation data sets, and significantly outperform the benchmark BZip2 application. Our schemes are configurable: atomic positional accuracy can be sacrificed to achieve greater compression. For high-fidelity compression, our linear interframe predictor gives the best results at very little computational cost: at moderate levels of approximation (12-bit quantization, maximum error ~10⁻² Å), we can compress a 1–2 fs trajectory file to 5–8% of its original size. For 200 fs time steps, typically used in fine-grained water diffusion experiments, we can compress files to ~25% of their input size, still substantially better than BZip2. While compression performance degrades with high levels of quantization, the simulation error is typically much greater than the associated approximation error in such cases.
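
To make the idea of quantized interframe prediction concrete, the following is a minimal sketch and not the authors' implementation. It assumes uniform scalar quantization over a known coordinate range and a constant-velocity (linear) predictor that extrapolates each quantized coordinate from the two previous frames. All function names and the `lo`/`hi` range parameters are hypothetical, and the entropy-coding stage (the record's keywords mention arithmetic coding) is omitted.

```python
def quantize(values, lo, hi, bits=12):
    """Map floats in [lo, hi] to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    scale = levels / (hi - lo)
    # Clamp so that out-of-range inputs still yield valid codes.
    return [min(levels, max(0, round((v - lo) * scale))) for v in values]


def dequantize(codes, lo, hi, bits=12):
    """Map integer codes back to floats; error is at most half a step."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + c * step for c in codes]


def _predict(previous, n):
    """Linear extrapolation from up to two previously seen frames."""
    if not previous:
        return [0] * n                # first frame: nothing to predict from
    if len(previous) == 1:
        return list(previous[0])      # second frame: repeat the first
    return [2 * a - b for a, b in zip(previous[-1], previous[-2])]


def linear_residuals(qframes):
    """Residuals (actual minus predicted) for a list of quantized frames."""
    residuals = []
    for t, frame in enumerate(qframes):
        pred = _predict(qframes[:t], len(frame))
        residuals.append([q - p for q, p in zip(frame, pred)])
    return residuals


def reconstruct(residuals):
    """Invert linear_residuals exactly (the predictor step is lossless)."""
    frames = []
    for res in residuals:
        pred = _predict(frames, len(res))
        frames.append([r + p for r, p in zip(res, pred)])
    return frames
```

In this sketch the only loss comes from `quantize`; the prediction step is exactly invertible on the integer codes. For coordinates that change smoothly between frames, the residuals cluster near zero, which is the kind of redundancy an entropy coder could then exploit.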

EPrint Type: Journal (Paginated)
Keywords: MD trajectory files, compression, arithmetic coding, interframe prediction
Subjects: J Computer Applications: J.3 LIFE AND MEDICAL SCIENCES; I Computing Methodologies: I.m MISCELLANEOUS; E Data: E.4 CODING AND INFORMATION THEORY
ID Code: 803
Deposited By: Marais, Patrick
Deposited On: 22 October 2012