> > I am recording here the results from a test VME system using two VF48 waveform digitizers
(I now have 4 VF48 waveform digitizers, so the event rates are half of those reported before. Data rate
is up to 51 M/s - event size has doubled, per-event overhead is the same, so the effective data rate goes
up).
This message demonstrates the effects of tuning the MIDAS system for high rate data taking.
Attached is the history plot of the event rate counters which show the real-time performance of the MIDAS
system with better detail compared to the average event rate reported on the MIDAS status page. For an
ideal real-time system, the event rate should be a constant, without any drop-outs.
Seen on the plot:
run 75: the periodic dropouts in the event rate correspond to the lazylogger writing data into HADOOP
HDFS. Clearly the host computer cannot keep up with both data taking and data archiving at the same
time. (see the output of "top" "with HDFS" and "without HDFS" below)
run 76: SYSTEM buffer size increased from 100Mbytes to 300Mbytes. Maybe there is an improvement.
run 77-78: "event_buffer_size" inside the multithreaded (EQ_MULTITHREAD) VME frontend increased from
100Mbytes to 300Mbytes. (6 seconds of data at 50M/s). Much better, yes?
Conclusion: for improved real-time performance, there should be sufficient buffering between the VME
frontend readout thread and the mlogger data compression thread.
For this benchmark hardware, at 50M/s, 4 seconds of buffer space (100M in the SYSTEM buffer and 100M in
the frontend) is not enough. 12 seconds of buffer space (300+300) is much better. (Or buy a faster
backend computer).
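The "seconds of buffer space" figures above are just total buffer size divided by data rate. A minimal sketch of that arithmetic follows. The function name and the 50 Mbytes/s constant are only for illustration and are not part of MIDAS; the buffer sizes are the ones tried in runs 75-78.

```python
def buffer_seconds(total_buffer_mbytes: float, data_rate_mbytes_per_s: float) -> float:
    """Seconds of data the buffers can absorb while the consumer side is stalled."""
    if data_rate_mbytes_per_s <= 0:
        raise ValueError("data rate must be positive")
    return total_buffer_mbytes / data_rate_mbytes_per_s


RATE = 50.0  # Mbytes/s, approximate data rate in this test (assumed constant)

# (SYSTEM buffer, frontend event_buffer_size) in Mbytes, as in runs 75, 76, 77-78
for system_mb, frontend_mb in [(100, 100), (300, 100), (300, 300)]:
    total = system_mb + frontend_mb
    seconds = buffer_seconds(total, RATE)
    print(f"SYSTEM {system_mb}M + frontend {frontend_mb}M = {total}M "
          f"-> {seconds:.0f} s at {RATE:.0f}M/s")
```

This only counts raw buffer capacity. It ignores that mlogger keeps draining the SYSTEM buffer (more slowly) while HDFS is active, so the real tolerance to a stall is somewhat longer than these numbers.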
P.S. HDFS data rate as measured by lazylogger is around 20M/s for CDH3 HADOOP and around 30M/s for
CDH4 HADOOP.
P.S. Observe the ever-present unexplained event rate fluctuations between 130-140 events/sec.
K.O.
---- "top" output during normal data taking. Notice that mlogger data compression consumes 99% CPU at 51
M/s data rate.
top - 08:55:22 up 72 days, 17:00, 5 users, load average: 2.47, 2.32, 2.27
Tasks: 206 total, 2 running, 204 sleeping, 0 stopped, 0 zombie
Cpu(s): 52.2%us, 6.1%sy, 0.0%ni, 34.4%id, 0.8%wa, 0.1%hi, 6.2%si, 0.0%st
Mem: 3925556k total, 3064928k used, 860628k free, 3788k buffers
Swap: 32766900k total, 200704k used, 32566196k free, 2061048k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5826 trinat    20   0  437m 291m 287m R 97.6  7.6 636:39.63 mlogger
27617 trinat    20   0  310m 288m 288m S 24.6  7.5   6:59.28 mserver
 1806 ganglia   20   0  415m  62m 1488 S  0.9  1.6 668:43.55 gmond
--- "top" output during lazylogger/HDFS activity. Observe high CPU use by lazylogger and fuse_dfs (the
HADOOP HDFS client). Observe that CPU use adds up to 167% out of 200% available.
top - 08:57:16 up 72 days, 17:01, 5 users, load average: 2.65, 2.35, 2.29
Tasks: 206 total, 2 running, 204 sleeping, 0 stopped, 0 zombie
Cpu(s): 57.6%us, 23.1%sy, 0.0%ni, 8.1%id, 0.0%wa, 0.4%hi, 10.7%si, 0.0%st
Mem: 3925556k total, 3642136k used, 283420k free, 4316k buffers
Swap: 32766900k total, 200692k used, 32566208k free, 2597752k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5826 trinat    20   0  437m 291m 287m R 68.7  7.6 638:24.07 mlogger
23450 root      20   0 1849m 200m 4472 S 64.4  5.2  75:35.64 fuse_dfs
27617 trinat    20   0  310m 288m 288m S 18.5  7.5   7:22.06 mserver
26723 trinat    20   0 38720  11m 1172 S 17.9  0.3  22:37.38 lazylogger
 7268 trinat    20   0 1007m  35m 4004 D  1.3  0.9 187:14.52 nautilus
 1097 root      20   0     0    0    0 S  0.8  0.0 101:45.55 md3_raid1