29 Dec 2010, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition?
|
>
> The only remaining problem when running my script is some kind of deadlock between the ODB and SYSMSG semaphores...
>
|
11 Feb 2011, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition?
|
> >
> > The only remaining problem when running my script is some kind of deadlock between the ODB and SYSMSG semaphores...
> >
|
15 Feb 2011, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition?
|
> Solution shall follow quickly, I have been hunting this deadlock for the last couple of weeks...
Over the last couple of days I made a series of commits to odb.c and midas.c to implement a buffer-based cm_msg()
|
16 Feb 2011, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition?
|
> My torture test runs okay on my Mac now; one remaining problem is spurious client removal caused
> by semaphore starvation...
|
25 Oct 2013, Konstantin Olchanski, Bug Fix, fixed mlogger run auto restart bug
|
A problem existed in midas for some time: when recording long data sets of time (or event) limited runs
with logger run auto restart set to "yes", the runs will automatically stop and restart as expected, but
sometimes the run will stop and never restart and beam will be lost until the experiment operator on shift
|
25 Oct 2013, Stefan Ritt, Bug Fix, fixed mlogger run auto restart bug
|
> A problem existed in midas for some time: when recording long data sets of time (or event) limited runs
> with logger run auto restart set to "yes", the runs will automatically stop and restart as expected, but
> sometimes the run will stop and never restart and beam will be lost until the experiment operator on shift
|
28 Oct 2013, Konstantin Olchanski, Bug Fix, fixed mlogger run auto restart bug
|
>
> More generally, I kind of consider the mlogger auto restart facility as deprecated. It works in the background and the operator does not have a clue
> what is going on. We now use the sequencer to achieve exactly the same functionality.
|
28 Oct 2013, Stefan Ritt, Bug Fix, fixed mlogger run auto restart bug
|
> Does the sequencer survive a crash or a restart of mhttpd?
Yes. Of course runs will not be started/stopped when mhttpd is not running, but when you restart it, it gracefully continues where it stopped, since all variables
|
05 May 2005, Konstantin Olchanski, Bug Fix, fix: minor bit rot in the example experiment
|
I fixed some minor bit rot in the example experiment: a few minor Makefile
problems, make the analyzer use the current histogram creation macros, etc. I
also added startup and shutdown scripts. These will be documented as we work
|
18 Aug 2005, Konstantin Olchanski, Bug Fix, fix race condition between clients on run start/stop, pause/resume
|
It turns out that the new priority sequencing of run state transitions had a
flaw: the frontends, the analyzer and the logger all registered at priority 500
and were invoked in essentially a random order. For example the frontend could
|
01 Sep 2005, Stefan Ritt, Bug Fix, fix race condition between clients on run start/stop, pause/resume
|
> It turns out that the new priority sequencing of run state transitions had a
> flaw: the frontends, the analyzer and the logger all registered at priority 500
> and were invoked in essentially a random order. For example the frontend could
|
02 Aug 2005, Konstantin Olchanski, Bug Fix, fix odb corruption when running analyzer for the first time
|
I have been plagued by ODB corruption when I run the analyzer for the first time
after setting up a new experiment. Some time ago, I traced this to
mana.c::book_ttree() and now I have found and fixed the bug; the fix is now committed to
|
20 Nov 2009, Konstantin Olchanski, Bug Fix, fix odb corruption from too long client names
|
odb.c rev 4622 fixes ODB corruption by db_connect_database() if client_name is
too long. Also fixed is potential ODB corruption by too long key names in
db_create_key(). Problem kindly reported by Tim Nichols of T2K/ND280 experiment.
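
The class of bug described here, a name longer than the fixed-size field it is copied into, can be illustrated with a minimal sketch. This is not the code from odb.c rev 4622: `copy_name_bounded()` and `NAME_FIELD_SIZE` are hypothetical names, and the field size of 32 is an assumption chosen for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Assumed field size for this sketch; the real limit is defined by MIDAS. */
#define NAME_FIELD_SIZE 32

/* Hypothetical helper, not a MIDAS function.
 * Copies src into a fixed-size name field, truncating if necessary and
 * always NUL-terminating, so a long name can never overwrite whatever
 * follows the field in shared memory.
 * Returns 1 if the name fit, 0 if it had to be truncated. */
static int copy_name_bounded(char dst[NAME_FIELD_SIZE], const char *src)
{
   size_t len = strlen(src);
   int fits = (len < NAME_FIELD_SIZE);

   if (!fits)
      len = NAME_FIELD_SIZE - 1;   /* leave room for the terminator */

   memcpy(dst, src, len);
   dst[len] = '\0';
   return fits;
}
```

A caller could use the return value to log a warning or reject the name outright instead of silently truncating it; which policy is appropriate depends on the application.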
|
22 Mar 2022, Konstantin Olchanski, Bug Fix, fix for event buffer corruption in bm_flush_cache()
|
Multithreaded frontends have an unusual event buffer corruption if the write
cache is enabled. For a long time now I had to disable the write cache on
all multithreaded frontends in alpha-g, I was hitting this bug quite often.
|
22 Mar 2022, Stefan Ritt, Bug Fix, fix for event buffer corruption in bm_flush_cache()
|
Thanks Konstantin for your detailed description.
I wonder why we never saw this problem at PSI. Here is the reason: in multi-threaded environments, we never call bm_send_event() directly
|
23 Mar 2022, Ivo Schulthess, Bug Fix, fix for event buffer corruption in bm_flush_cache()
|
Thanks for the investigation. Back in 2020, we had some issues of losing data between the system buffer and the logger writing them to disk (https://daq00.triumf.ca/elog-midas/Midas/1966).
This was polled equipment but we had a multithreaded FE running at the same time. Could this be related to the same problem?
|
23 Mar 2022, Konstantin Olchanski, Bug Fix, fix for event buffer corruption in bm_flush_cache()
|
I confirm, there is no problem in single-threaded programs, and
there is no problem if all bm_send_event() and bm_flush_cache() are called
from the same thread.
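
The general hazard, one thread appending to a write cache while another flushes it, can be sketched with a mutex-protected cache. This is only an illustration under stated assumptions and does not reproduce the actual fix in midas.c: `write_cache_t`, `cache_send()` and `cache_flush()` are hypothetical stand-ins for the roles of bm_send_event() and bm_flush_cache(), and the "flush" here merely counts bytes instead of copying them to a shared buffer.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SIZE 4096

/* Hypothetical write cache, standing in for a per-buffer event cache. */
typedef struct {
   pthread_mutex_t lock;
   char   data[CACHE_SIZE];
   size_t used;            /* bytes currently held in the cache */
   size_t flushed_total;   /* bytes handed on to the "shared buffer" so far */
} write_cache_t;

static void cache_init(write_cache_t *c)
{
   pthread_mutex_init(&c->lock, NULL);
   c->used = 0;
   c->flushed_total = 0;
}

/* Empties the cache. The caller must already hold c->lock. */
static void flush_locked(write_cache_t *c)
{
   c->flushed_total += c->used;   /* stand-in for the copy to shared memory */
   c->used = 0;
}

/* Flush from any thread: takes the same lock as cache_send(). */
static void cache_flush(write_cache_t *c)
{
   pthread_mutex_lock(&c->lock);
   flush_locked(c);
   pthread_mutex_unlock(&c->lock);
}

/* Append one event; returns 0 on success, -1 if it can never fit. */
static int cache_send(write_cache_t *c, const void *event, size_t size)
{
   if (size > CACHE_SIZE)
      return -1;

   pthread_mutex_lock(&c->lock);
   if (c->used + size > CACHE_SIZE)
      flush_locked(c);            /* make room while still holding the lock */
   memcpy(c->data + c->used, event, size);
   c->used += size;
   pthread_mutex_unlock(&c->lock);
   return 0;
}
```

Because the fill level is read and updated only while the lock is held, a flush on one thread cannot reset the cache halfway through an append on another. Calling both functions from a single thread, as described above, avoids the race without any locking.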
|
24 Mar 2022, Stefan Ritt, Bug Fix, fix for event buffer corruption in bm_flush_cache()
|
> > ... instead of struggling with all your locks.
>
> it is better to have midas fully thread safe. ODB has been so for a long time,
|
24 Mar 2022, Konstantin Olchanski, Bug Fix, fix for event buffer corruption in bm_flush_cache()
|
> Thanks for the investigation. Back in 2020, we had some issues
> of losing data between the system buffer and the logger writing them
> to disk (https://daq00.triumf.ca/elog-midas/Midas/1966). This was polled equipment
|
25 May 2006, Konstantin Olchanski, Bug Fix, fix crash in xml odb load
|
There is a crash in odbedit when loading some xml odb files: a missing check for NULL pointer when
loading an array of strings and one of the array elements is blank. This check is present when loading
other string values. Here is the diff:
|