I redid the tests from Konstantin for our MEG experiment at PSI. The event structure is different, so it
is interesting how the two different experiments compare. We have an event size of 2.4 MB and a trigger
rate of ~10 Hz, so we produce a raw data rate of 24 MB/sec. A typical run contains 2000 events, so has a
size of 5 GB. Here are the results:
cat : time 7.8s, size 4960156030 4960156030, comp 0%, rate 639M/s 639M/s
gzip -1 : time 147.2s, size 4960156030 2468073901, comp 50%, rate 33M/s 16M/s
pbzip2 -p1 : time 679.6s, size 4960156030 1738127829, comp 65%, rate 7M/s 2M/s (1 CPU)
pbzip2 -p8 : time 96.1s, size 4960156030 1738127829, comp 65%, rate 51M/s 18M/s (8 CPU)
As one can see, our compression ratio is poorer (due to the quasi random noise in our waveforms), but the
difference between gzip -1 and pbzip2 is larger (15% instead 10% for DEAP). The single CPU version of
pbzip cannot sustain our DAQ rate of 24 MB, but the parallel version can. Actually we have a somehow old
dual-core dual-CPU board 2.5 GHz Xenon box, and make 8 hyper-threading CPUs out of the total 4 cores.
Interestingly the compression rate scales with 7.3 for 8 virtual cores, so hyper-threading does its job.
So we take all our data with the pbzip2 compression. The additional 15% as compared with gzip does
not sound much, but we produce raw 250 TB/year. So gzip gives us 132 TB/year and pbzip2 gives
us 98 TB/year, and we save quite some disks.
Note that you can run bzip2 (as all the other methods) already now with the current logger, if you specify
an external compression program in the ODB using the pipe functionality:
local:MEG:S]/>cd Logger/Channels/0/Settings/
[local:MEG:S]Settings>ls
Active y
Type Disk
Filename |pbzip2>/megdata/run%06d.mid.bz2
Format MIDAS
Compression 0
ODB dump y
Log messages 0
Buffer SYSTEM
Event ID -1
Trigger mask -1
Event limit 0
Byte limit 0
Subrun Byte limit 0
Tape capacity 0
Subdir format
Current filename /megdata/run197090.mid.bz2
</pre> |