Back Midas Rome Roody Rootana
  Midas DAQ System  Not logged in ELOG logo
Entry  23 Dec 2010, Konstantin Olchanski, Bug Report, odb corruption, odb race condition? 
    Reply  24 Dec 2010, Konstantin Olchanski, Bug Report, odb corruption, odb race condition? 
       Reply  24 Dec 2010, Konstantin Olchanski, Bug Report, odb corruption, odb race condition? 
          Reply  26 Dec 2010, Konstantin Olchanski, Bug Report, race condition and deadlock between ODB lock and SYSMSG lock in cm_msg() 
             Reply  29 Dec 2010, Konstantin Olchanski, Bug Report, use of nested locks in MIDAS 
          Reply  29 Dec 2010, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition? 
             Reply  11 Feb 2011, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition? 
                Reply  15 Feb 2011, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition? 
                   Reply  16 Feb 2011, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition? 
Message ID: 743     Entry time: 15 Feb 2011     In reply to: 741     Reply to this: 750
Author: Konstantin Olchanski 
Topic: Bug Report 
Subject: fixed. odb corruption, odb race condition? 
> Solution shall follow quickly, I have been hunting this deadlock for the last couple of weeks...

Over the last couple of days I made a series of commits to odb.c and midas.c to implement a buffer-based cm_msg()
and fix the latest deadlock problem, also to help with the race conditions in client creation and cleanup.

My torture test runs okey in my mac now, one remaining problem is spurious client removal caused
by semaphore starvation - I see 2-3-7-10 sec wait times for semaphores - probably caused by some
kind of unfairness in the MacOS SysV semaphore implementation (in a "fair" semaphore implementation,
the process that waited the longest would be woken up the first and one would never see semaphore wait
times measured in seconds). Probably worth investigating fairness of MacOS posix semaphores. On LInux
things are probably different and under normal running conditions one should not see any semaphore starvation.

I will be doing extensive tests of this update at TRIUMF, but I do not expect any problems. If you use this
version and see any anomalies, please report them as replies to this message or email me directly.

svn rev 4976
K.O.
ELOG V3.1.4-2e1708b5