Midas DAQ System
Message ID: 347     Entry time: 23 Feb 2007     Reply to this: 349   363
Author: Konstantin Olchanski 
Topic: Info 
Subject: RFC- history system improvements 
While running the ALPHA experiment at CERN, we stressed and broke the MIDAS history system. We 
generated about 0.5 GB of history data per day, and this killed the performance of the history plot 
system in mhttpd - we had to wait for *minutes* to look at any plots of any variables.

One way to address this problem could be by changing the way ALPHA slow controls data is collected.

Another way could be to improve the midas history system itself, removing some of the existing 
limitations and inefficiencies so that it can handle the ever-increasing data volumes we keep 
throwing at it.

I feel the second approach (improving midas) is more useful in general and it appears that big 
improvements can be made by small modifications of existing code. No rewrites of midas are required. 
Read on.

Issue 1: in the mlogger, history is recorded with fairly coarse granularity.

For an equipment, if any variable changes, *all* variables for that equipment are written into the history 
file.

Historically, this worked fairly well for experiments with low data rates (a few history changes per 
minute) and with variables equally distributed between the different equipments. But even for a 
modest-sized experiment like TRIUMF-E614-TWIST, recording many variables when only one has changed 
has been a visible inefficiency. Current experiments wish to record more history data more frequently, 
but even with the latest and greatest hardware, in the case of ALPHA, this inefficiency has become a 
performance killer.

One could solve this problem by refactoring the data (one variable per equipment/one equipment per 
variable). I find this approach inelegant and contrary to the "midas way" (whatever that is).

An alternative would be to change the mlogger to record history with per-variable granularity: when 
one variable changes, only that variable is recorded. Preliminary examination of the existing code 
indicates that history writing in the mlogger is already structured in a way that makes this change easy 
to implement, while the history reading code does not seem to need any changes at all.
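
To make this concrete, here is a minimal sketch of what per-variable granularity could look like on the 
writing side. This is not the actual mlogger code - the names (var_cache, write_history_record, 
log_equipment_history) are made up for illustration, and the real mlogger would keep its cache of 
previously written values in its own per-equipment structures.

#include <stdio.h>
#include <time.h>

#define MAX_VARS 100

static double var_cache[MAX_VARS];   /* last value written, one slot per variable */

/* stand-in for the real low-level writer: append one (time, value) record
   to the history of this one variable */
static void write_history_record(const char *equipment, const char *var_name,
                                 double value)
{
   printf("%ld %s/%s %g\n", (long) time(NULL), equipment, var_name, value);
}

/* called when the equipment's variables are updated in the ODB;
   old behaviour: dump all n_vars values every time,
   proposed behaviour: write only the variables that actually changed */
void log_equipment_history(const char *equipment, const char *names[],
                           const double values[], int n_vars)
{
   int i;

   for (i = 0; i < n_vars; i++)
      if (values[i] != var_cache[i]) {
         write_history_record(equipment, names[i], values[i]);
         var_cache[i] = values[i];
      }
}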

Issue 2: all history data is recorded into a single file.

Again, this has worked well historically. In fact, until not so long ago, it was the only sane way to record 
history data, because operating systems could not efficiently write data into multiple files at the same 
time: insufficient data buffering and suboptimal storage allocation strategies all led to bad 
performance. Recent Linux kernels have largely resolved these issues.

The present problem arises when recording large amounts of history data (say 100 variables) and then 
making a history plot of 1 variable. Because the data for the one variable of interest is spread across the 
whole file, effectively the whole file has to be read into memory, the data for 1 variable collected, and 
the data for the other 99 variables skipped.

In this case, a speed-up by a factor of 100 could be obtained by recording (say) one variable per history 
file. (Yes, the history code does use "lseek", but the seek granularity of modern disks is very coarse and, 
in my tests, reading the whole file sequentially (streaming) is nearly as fast as seeking through it.)
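
For those who want to repeat this comparison on their own hardware, a trivial stand-alone benchmark 
along the following lines is enough (plain POSIX I/O, nothing midas-specific; the 1-in-100 read pattern 
and the record size are assumptions made for illustration). Run it on a file larger than RAM, once in 
"stream" mode and once in "seek" mode, and compare the wall-clock times reported by "time".

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

#define RECORD_SIZE 1024          /* assumed size of one history record */

int main(int argc, char *argv[])
{
   char buf[RECORD_SIZE];
   int fd;
   long n = 0;

   if (argc < 3) {
      fprintf(stderr, "usage: %s stream|seek file.hst\n", argv[0]);
      return 1;
   }

   fd = open(argv[2], O_RDONLY);
   if (fd < 0) {
      perror(argv[2]);
      return 1;
   }

   if (strcmp(argv[1], "stream") == 0) {
      /* streaming: read every record sequentially */
      while (read(fd, buf, sizeof(buf)) == (ssize_t) sizeof(buf))
         n++;
   } else {
      /* seeking: read 1 record, then skip the next 99 ("plot 1 of 100 variables") */
      while (read(fd, buf, sizeof(buf)) == (ssize_t) sizeof(buf)) {
         n++;
         if (lseek(fd, 99L * RECORD_SIZE, SEEK_CUR) < 0)
            break;
      }
   }

   printf("%ld records read\n", n);
   close(fd);
   return 0;
}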

One has to be very careful when looking at these numbers and running benchmarks. A modern computer 
with fast disks and large RAM performs very well no matter how the history data is stored and organized. 
Performance problems surface only under load on the production system, when the disks are busy 
recording the main data stream and all RAM is consumed by user applications doing data analysis.

The obvious solution to this problem is to record each variable into a separate data file. This will 
require modifications to the history writing code in the mlogger and to the history reading code in 
mhttpd, mhist & co.

An extra challenge in this task is to minimize changes to the existing code and to keep compatibility 
with the existing data files - new code should be able to read existing data files.

I propose to organize the data into subdirectories:
history/equipmentNNN/variableVVV/YYMMDD.hst
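
As an illustration, the file name for one variable on one day could be built like this (a sketch only; the 
function name is made up, and whether VVV is the variable name or a variable index is still open - here 
it is taken to be the name):

#include <stdio.h>
#include <time.h>

/* sketch: build "history/equipmentNNN/variableVVV/YYMMDD.hst" for one day */
void variable_history_file_name(const char *history_dir, int equipment_id,
                                const char *var_name, time_t t,
                                char *path, int path_size)
{
   struct tm *tms = localtime(&t);

   snprintf(path, path_size, "%s/equipment%03d/variable%s/%02d%02d%02d.hst",
            history_dir, equipment_id, var_name,
            tms->tm_year % 100, tms->tm_mon + 1, tms->tm_mday);
}

To plot one variable over a time range, the reading code would then simply loop over the days in the 
range and open each of these files directly.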

This scheme does two good things for the history plotting in mhttpd:

1) note that mhttpd always plots one variable at a time, and variables are addressed by equipment 
(int) and variable name (string), plus the array index. In the proposed scheme, the code would know 
exactly which history file to open to get the data, with no scanning of directories and no seeking inside 
the history file.

2) when setting up mhttpd history plots, the code can easily see what equipment and variables exist 
and *ever existed*. The present code only examines the latest history file and cannot see variables that 
have been deleted (or not yet written into the current file). For example, one cannot see variables that 
existed in the 2005 history but were removed (or renamed) in 2006. (Yes, an expert can do this by using 
mhist to examine the 2005 history files and odbedit to manually set up the history plots.)
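
For point (2), the enumeration of everything that ever existed reduces to a directory scan. A sketch using 
plain POSIX opendir()/readdir() (the function name is made up; the "equipment"/"variable" prefixes 
follow the layout proposed above):

#include <stdio.h>
#include <string.h>
#include <dirent.h>

/* sketch: list every equipment/variable directory that ever wrote history */
void list_history_variables(const char *history_dir)
{
   DIR *d1, *d2;
   struct dirent *e1, *e2;
   char path[1024];

   d1 = opendir(history_dir);
   if (d1 == NULL)
      return;

   while ((e1 = readdir(d1)) != NULL) {
      if (strncmp(e1->d_name, "equipment", 9) != 0)
         continue;
      snprintf(path, sizeof(path), "%s/%s", history_dir, e1->d_name);
      d2 = opendir(path);
      if (d2 == NULL)
         continue;
      while ((e2 = readdir(d2)) != NULL)
         if (strncmp(e2->d_name, "variable", 8) == 0)
            printf("%s / %s\n", e1->d_name, e2->d_name);
      closedir(d2);
   }

   closedir(d1);
}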

Over the next few weeks, I will proceed with implementing these two improvements: (1) the mlogger will 
write history with per-variable granularity; (2) the history will be split into one file per variable. If my 
initial assessment is correct and the changes are indeed small, contained, non-intrusive and compatible 
with existing history files, I will submit them for inclusion into mainline midas.

K.O.