In the ALPHA experiment at CERN I found a problem in mfe.c handling of multithreaded equipments. This problem was in
some forms introduced around May 2013 and around Aug 2013 (commit
https://bitbucket.org/tmidas/midas/src/45984c35b4f7/src/mfe.c) (I hope I got it right).
The effect was very odd - if event rate of multithreaded equipment was more than 100 Hz, the event counters on the midas
status page would not increment and the frontend will crash on end of run. Other than that, all the events from the
multithreaded equipment seem to appear in the SYSTEM buffer and in the data file normally.
This happened: in mfe.c::receive_trigger_event() a loop was introduced (previously,
there was no loop there - there was and still is a loop outside of receive_trigger_event()):
while (1)
wait 10 ms for an event
process event, loop back
if there is no event, exit
}
Obviously, if the event rate is more than 100 Hz (repetition rate less than 10 ms),
the 10 ms wait will always return an event and we will never exit this loop.
So the mfe.c main loop is now stuck here and will not process any periodic activity
such as updating the equipment statistics (event counters on the midas status page)
or running periodic equipments in the same front end program.
The crash at the end of run will be caused by a timeout in responding to the "end of run" RPC call.
I have a patch in testing that solves this problem by restoring receive_trigger_event() to the original configuration, i.e.
https://bitbucket.org/tmidas/midas/src/6899b96a4f8177d4af92035cd84aadf5a7cbc875/src/mfe.c?at=develop
K.O. |