Back Midas Rome Roody Rootana
  Midas DAQ System  Not logged in ELOG logo
Entry  17 Oct 2008, Konstantin Olchanski, Info, mlogger async transitions, etc 
    Reply  18 Oct 2008, Stefan Ritt, Info, mlogger async transitions, etc 
Message ID: 508     Entry time: 18 Oct 2008     In reply to: 507
Author: Stefan Ritt 
Topic: Info 
Subject: mlogger async transitions, etc 
> I suspect mlogger uses ASYNC transactions exactly to avoid
> this type of deadlock (mlogger used ASYNC transactions since svn revision 2, the
> beginning of time).

That's exactly the case. If you would have asked me, I would have told you 
immediately, but it is also good that you re-confirmed the deadlock behavior with 
the SYNC flag. I didn't check this for the last ten years or so.

Making the buffers bigger is only a partial solution. Assume that the disk gets 
slow for some reason, then any buffer will fill up and you get the dead lock.

The only real solution is to put the logic into a separate thread. So the thread 
does all the RPC communication with the clients, while the main logger thread logs 
data as usual in parallel. The problem is that the RPC layer is not yet completely 
tested to be thread safe. I put some mutex and you correctly realized that these 
are system wide, but you want a local mutex just for the logger process. You need 
also some basic communication between the "run stop thread" and the "logger main 
thread". Maybe Pierre remembers that once there was the problem that the logger did 
not know when all events "came down the pipe" and could close the file. He added 
some delay which helped most of the time. But if we would have some communication 
from the "run stop thread" telling the main thread that all programs except the 
logger have stopped the run, then the logger only has to empty the local system 
buffer and knows 100% that everything is done.

In the MEG experiment we have the same problem. We need a certain sequence 
(basically because we have 9 front-ends and one event builder, which has to be 
called after the front-ends). We realized quickly that the logger cannot stop the 
run, so we wrote a little tool "RunSubmit", which is a run sequence with scripting 
facility. So you write a XML file, telling RunSubmit to start 10 runs, each with 
5000 events. RunSubmit now watches the run statistics and stops the run. Since it's 
outside the logger process, there is no dead lock. Unfortunately RunSubmit was 
written by one of our students and contains some MEG specific code. Otherwise it 
could be committed to the distribution.

So I feel that a separate thread for run stop (and maybe even start) would be a 
good thing, but I'm not sure when I will have time to address this issue.

- Stefan
ELOG V3.1.4-2e1708b5