Back Midas Rome Roody Rootana
  Midas DAQ System  Not logged in ELOG logo
Entry  24 Jan 2013, Konstantin Olchanski, Info, Compression benchmarks 
    Reply  06 Feb 2013, Stefan Ritt, Info, Compression benchmarks 
Message ID: 854     Entry time: 24 Jan 2013     Reply to this: 858
Author: Konstantin Olchanski 
Topic: Info 
Subject: Compression benchmarks 
In the DEAP experiment, the normal MIDAS mlogger gzip compression  is not fast enough for some data 
taking modes, so I am doing tests of other compression programs. Here is the results.

Executive summary:

fastest compression is no compression (cat at 1800 Mbytes/sec - memcpy speed), next best are:
"lzf" at 300 Mbytes/sec and  "lzop" at 250 Mbytes/sec with 50% compression
"gzip -1" at around 70 Mbytes/sec with around 70% compression
"bzip2" at around 12 Mbytes/sec with around 80% compression
"pbzip2", as advertised, scales bzip2 compression linearly with the number of CPUs to 46 Mbytes/sec (4 
real CPUs), then slower to a maximum 60 Mbytes/sec (8 hyper-threaded CPUs).

This confirms that our original choice of "gzip -1" method for compression using zlib inside mlogger is 
still a good choice. bzip2 can gain an additional 10% compression at the cost of 6 times more CPU 
utilization. lzo/lzf can do 50% compression at GigE network speed and at "normal" disk speed.

I think these numbers make a good case for adding lzo/lzf compression to mlogger.

Comments about the data:

- time measured is the "elapsed" time of the compression program. it excludes the time spent flushing 
the compressed output file to disk.
- the relevant number is the first rate number (input data rate)
- test machine has 32GB of RAM, so all I/O is cached, disk speed does not affect these results
- "cat" gives a measure of overall machine "speed" (but test file is too small to give precise measurement)
- "gzip -1" is the recommended MIDAS mlogger compression setting
- "pbzip2 -p8" uses 8 "hyper-threaded" CPUs, but machine only has 4 "real" CPU cores

<pre>
cat                 : time   0.2s, size    431379371    431379371, comp   0%, rate 1797M/s 1797M/s
cat                 : time   0.6s, size   1013573981   1013573981, comp   0%, rate 1809M/s 1809M/s
cat                 : time   1.1s, size   2027241617   2027241617, comp   0%, rate 1826M/s 1826M/s

gzip -1             : time   6.4s, size    431379371    141008293, comp  67%, rate  67M/s  22M/s
gzip                : time  30.3s, size    431379371    131017324, comp  70%, rate  14M/s   4M/s
gzip -9             : time  94.2s, size    431379371    133071189, comp  69%, rate   4M/s   1M/s

gzip -1             : time  15.2s, size   1013573981    347820209, comp  66%, rate  66M/s  22M/s
gzip -1             : time  29.4s, size   2027241617    638495283, comp  69%, rate  68M/s  21M/s

bzip2 -1            : time  34.4s, size    431379371     91905771, comp  79%, rate  12M/s   2M/s
bzip2               : time  33.9s, size    431379371     86144682, comp  80%, rate  12M/s   2M/s
bzip2 -9            : time  34.2s, size    431379371     86144682, comp  80%, rate  12M/s   2M/s

pbzip2 -p1          : time  34.9s, size    431379371     86152857, comp  80%, rate  12M/s   2M/s (1 CPU)
pbzip2 -p1 -1       : time  34.6s, size    431379371     91935441, comp  79%, rate  12M/s   2M/s
pbzip2 -p1 -9       : time  34.8s, size    431379371     86152857, comp  80%, rate  12M/s   2M/s

pbzip2 -p2          : time  17.6s, size    431379371     86152857, comp  80%, rate  24M/s   4M/s (2 CPU)
pbzip2 -p3          : time  11.9s, size    431379371     86152857, comp  80%, rate  36M/s   7M/s (3 CPU)
pbzip2 -p4          : time   9.3s, size    431379371     86152857, comp  80%, rate  46M/s   9M/s (4 CPU)
pbzip2 -p4          : time  45.3s, size   2027241617    384406870, comp  81%, rate  44M/s   8M/s
pbzip2 -p8          : time  33.3s, size   2027241617    384406870, comp  81%, rate  60M/s  11M/s

lzop -1             : time   1.6s, size    431379371    213416336, comp  51%, rate 261M/s 129M/s
lzop                : time   1.7s, size    431379371    213328371, comp  51%, rate 249M/s 123M/s
lzop                : time   4.3s, size   1013573981    515317099, comp  49%, rate 234M/s 119M/s
lzop                : time   7.3s, size   2027241617    978374154, comp  52%, rate 277M/s 133M/s
lzop -9             : time 176.6s, size    431379371    157985635, comp  63%, rate   2M/s   0M/s

lzf                 : time   1.4s, size    431379371    210789363, comp  51%, rate 299M/s 146M/s
lzf                 : time   3.6s, size   1013573981    523007102, comp  48%, rate 282M/s 145M/s
lzf                 : time   6.7s, size   2027241617    972953255, comp  52%, rate 303M/s 145M/s

lzma -0             : time  27s, size    431379371    112406964, comp  74%, rate  15M/s   4M/s
lzma -1             : time  35s, size    431379371    111235594, comp  74%, rate  12M/s   3M/s
lzma: > 5 min, killed

xz -0               : time  28s, size    431379371    112424452, comp  74%, rate  15M/s   4M/s
xz -1               : time  35s, size    431379371    111252916, comp  74%, rate  12M/s   3M/s
xz: > 5 min, killed
</pre>

Columns are:
compression program
time: elapsed time of the compression program (excludes the time to flush output file to disk)
size: size of input file, size of output file
comp: compression ration (0%=no compression, 100%=file compresses into nothing)
rate: input data rate (size of input file divided by elapsed time), output data rate (size of output file 
divided by elapsed time)

Machine used for testing (from /proc/cpuinfo):
Intel(R) Core(TM) i7-3820 CPU @ 3.60GHz
quad core cpu with hyper-threading (8 CPU total)
32 GB quad-channel DDR3-1600.

Script used for testing:

#!/usr/bin/perl -w

my $x = join(" ", @ARGV);

my $in  = "test.mid";
my $out = "test.mid.out";
my $tout = "test.time";

my $cmd = "/usr/bin/time -o $tout -f \"%e\" /usr/bin/time $x < test.mid > test.mid.out";

print $cmd,"\n";

my $t0 = time();
system $cmd;
my $t1 = time();

my $c = `cat $tout`;
print "Elapsed time: $c";

my $t = $c;

#system "/bin/ls -l $in $out";

my $sin  = -s $in;
my $sout = -s $out;

my $xt = $t1-$t0;
$xt = 1 if $xt<1;

print "Total time: $xt\n";

print sprintf("%-20s: time %5.1fs, size %12d %12d, comp %3.0f%%, rate %3dM/s %3dM/s", $x, $t, $sin, 
$sout, 100*($sin-$sout)/$sin, ($sin/$t)/1e6, ($sout/$t)/1e6), "\n";

exit 0;
# end

Typical output:

[deap@deap00 pet]$ ./r.perl lzf    
/usr/bin/time -o test.time -f "%e" /usr/bin/time lzf < test.mid > test.mid.out
1.27user 0.15system 0:01.44elapsed 99%CPU (0avgtext+0avgdata 2800maxresident)k
0inputs+411704outputs (0major+268minor)pagefaults 0swaps
Elapsed time: 1.44
Total time: 3
lzf                 : time   1.4s, size    431379371    210789363, comp  51%, rate 299M/s 146M/s

K.O.
ELOG V3.1.4-2e1708b5