Back Midas Rome Roody Rootana
  Midas DAQ System, Page 43 of 142  Not logged in ELOG logo
    Reply  13 Jan 2020, Konstantin Olchanski, Forum, frontend issues with midas-2019-09 
(please use the "plain" text, much easier to answer).

Hi, Peter, I think you misread the error message. There is no error about initialize_equipment(), the error 
is about interrupt_configure(). initialize_equipment() is just one of the functions that calls it.

The cause of the error most likely is a mismatch between the declaration of interrupt_configure() in 
mfe.h and the definition of this function in your program (in feMotor.c, I guess).

Sometimes this mismatch is hard to identify just by looking at the code.

One fool-proof method to debug this is to extract the actual function prototypes from your object files, 
both the declaration ("U") in mfe.o and the definition ("T") in your program should be identical:

nm feMotor.o |  grep -i interrupt | c++filt
0000000000000f90 T interrupt_configure(int, int, long) <--- should be this

nm ~/packages/midas/lib/libmfe.a | grep -i interrupt | c++filt
U interrupt_configure(int, int, long)

If they are different, you adjust your program until they match. One way to ensure the match is to copy 
the declaration from mfe.h into your program.

K.O.




<p>&nbsp;</p>

<table align="center" cellspacing="1" style="border:1px solid #486090; width:98%">
	<tbody>
		<tr>
			<td style="background-color:#486090">Peter Kunz wrote:</td>
		</tr>
		<tr>
			<td style="background-color:#FFFFB0">
			<p>After upgrading to the lastes MIDAS version I got the DAQ frontend of my application 
running by changing all compiler directives from cc to g++ and using</p>

			<p>#include &quot;mfe.h&quot;</p>

			<p>extern HNDLE hDB</p>

			<p>&nbsp;extern &quot;C&quot; {&nbsp;<br />
			&nbsp;#include &lt;CAENComm.h&gt; &nbsp;<br />
			&nbsp;}</p>

			<p>With these changes everything seems to work fine.</p>

			<p>However, I&#39;m having trouble with a slow control frontend using a tcpip driver. It 
compiled well with the older MIDAS version. Even though all the functions in question are defined in the 
frontend code, the following error comes up:</p>

			<div style="background:#eee;border:1px solid #ccc;padding:5px 10px;">g++ -o feMotor 
-DOS_LINUX -Dextname -g -O2 -fPIC -Wall -Wuninitialized &nbsp;-fpermissive -
I/home/pkunz/packages/midas/include -I. -I/home/pkunz/packages/midas/drivers/bus &nbsp; 
/home/pkunz/packages/midas/lib/libmidas.a feMotor.o 
/home/pkunz/packages/midas/drivers/bus/tcpip.o cd_Galil.o 
/home/pkunz/packages/midas/lib/libmidas.a /home/pkunz/packages/midas/lib/mfe.o -lm -lz -lutil -lnsl -
lpthread -lrt<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
`initialize_equipment()&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:687: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
`readout_enable(unsigned int)&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:1236: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:1238: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `main&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:2791: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:2792: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			collect2: error: ld returned 1 exit status<br />
			make: *** [Makefile:36: feMotor] Error 1</div>

			<p>&nbsp;I guess the the aforementioned DAQ frontend compiles because its equipment 
definitions don&#39;t call on the function `initialize_equipment()&#39;, but I can&#39;t figure out why 
it doesn&#39;t work. Help is appreciated. P.K.</p>
			</td>
		</tr>
	</tbody>
</table>

<p>&nbsp;</p>
    Reply  13 Jan 2020, Konstantin Olchanski, Forum, frontend issues with midas-2019-09 
(please use the "plain" text, much easier to answer).

Hi, Peter, I think you misread the error message. There is no error about 
initialize_equipment(), the error is about interrupt_configure(). initialize_equipment() is just 
one of the functions that calls it.

The cause of the error most likely is a mismatch between the declaration of 
interrupt_configure() in mfe.h and the definition of this function in your program (in 
feMotor.c, I guess).

Sometimes this mismatch is hard to identify just by looking at the code.

One fool-proof method to debug this is to extract the actual function prototypes from your 
object files, both the declaration ("U") in mfe.o and the definition ("T") in your program 
should be identical:

nm feMotor.o |  grep -i interrupt | c++filt
0000000000000f90 T interrupt_configure(int, int, long) <--- should be this

nm ~/packages/midas/lib/libmfe.a | grep -i interrupt | c++filt
U interrupt_configure(int, int, long)

If they are different, you adjust your program until they match. One way to ensure the 
match is to copy the declaration from mfe.h into your program.

K.O.




<p>&nbsp;</p>

<table align="center" cellspacing="1" style="border:1px solid #486090; width:98%">
	<tbody>
		<tr>
			<td style="background-color:#486090">Peter Kunz wrote:</td>
		</tr>
		<tr>
			<td style="background-color:#FFFFB0">
			<p>After upgrading to the lastes MIDAS version I got the DAQ frontend of my 
application running by changing all compiler directives from cc to g++ and using</p>

			<p>#include &quot;mfe.h&quot;</p>

			<p>extern HNDLE hDB</p>

			<p>&nbsp;extern &quot;C&quot; {&nbsp;<br />
			&nbsp;#include &lt;CAENComm.h&gt; &nbsp;<br />
			&nbsp;}</p>

			<p>With these changes everything seems to work fine.</p>

			<p>However, I&#39;m having trouble with a slow control frontend using a 
tcpip driver. It compiled well with the older MIDAS version. Even though all the functions in 
question are defined in the frontend code, the following error comes up:</p>

			<div style="background:#eee;border:1px solid #ccc;padding:5px 10px;">g++ 
-o feMotor -DOS_LINUX -Dextname -g -O2 -fPIC -Wall -Wuninitialized &nbsp;-fpermissive 
-I/home/pkunz/packages/midas/include -I. -I/home/pkunz/packages/midas/drivers/bus 
&nbsp; /home/pkunz/packages/midas/lib/libmidas.a feMotor.o 
/home/pkunz/packages/midas/drivers/bus/tcpip.o cd_Galil.o 
/home/pkunz/packages/midas/lib/libmidas.a /home/pkunz/packages/midas/lib/mfe.o -lm -lz 
-lutil -lnsl -lpthread -lrt<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
`initialize_equipment()&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:687: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
`readout_enable(unsigned int)&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:1236: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:1238: undefined 
reference to `interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `main&#39;:
<br />
			/home/pkunz/packages/midas/src/mfe.cxx:2791: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:2792: undefined 
reference to `interrupt_configure(int, int, long)&#39;<br />
			collect2: error: ld returned 1 exit status<br />
			make: *** [Makefile:36: feMotor] Error 1</div>

			<p>&nbsp;I guess the the aforementioned DAQ frontend compiles because 
its equipment definitions don&#39;t call on the function `initialize_equipment()&#39;, but I 
can&#39;t figure out why it doesn&#39;t work. Help is appreciated. P.K.</p>
			</td>
		</tr>
	</tbody>
</table>

<p>&nbsp;</p>
    Reply  13 Jan 2020, Peter Kunz, Forum, frontend issues with midas-2019-09 
Thanks for explaining this, Konstantin. 
After updating the function to

INT interrupt_configure(INT cmd, INT source, POINTER_T adr)
{
  return CM_SUCCESS;
}

it compiled without errors. In the original code the "INT source" variable was missing.



> (please use the "plain" text, much easier to answer).
> 
> Hi, Peter, I think you misread the error message. There is no error about initialize_equipment(), the error 
> is about interrupt_configure(). initialize_equipment() is just one of the functions that calls it.
> 
> The cause of the error most likely is a mismatch between the declaration of interrupt_configure() in 
> mfe.h and the definition of this function in your program (in feMotor.c, I guess).
> 
> Sometimes this mismatch is hard to identify just by looking at the code.
> 
> One fool-proof method to debug this is to extract the actual function prototypes from your object files, 
> both the declaration ("U") in mfe.o and the definition ("T") in your program should be identical:
> 
> nm feMotor.o |  grep -i interrupt | c++filt
> 0000000000000f90 T interrupt_configure(int, int, long) <--- should be this
> 
> nm ~/packages/midas/lib/libmfe.a | grep -i interrupt | c++filt
> U interrupt_configure(int, int, long)
> 
> If they are different, you adjust your program until they match. One way to ensure the match is to copy 
> the declaration from mfe.h into your program.
> 
> K.O.
> 
> 
> 
> 
> <p>&nbsp;</p>
> 
> <table align="center" cellspacing="1" style="border:1px solid #486090; width:98%">
> 	<tbody>
> 		<tr>
> 			<td style="background-color:#486090">Peter Kunz wrote:</td>
> 		</tr>
> 		<tr>
> 			<td style="background-color:#FFFFB0">
> 			<p>After upgrading to the lastes MIDAS version I got the DAQ frontend of my application 
> running by changing all compiler directives from cc to g++ and using</p>
> 
> 			<p>#include &quot;mfe.h&quot;</p>
> 
> 			<p>extern HNDLE hDB</p>
> 
> 			<p>&nbsp;extern &quot;C&quot; {&nbsp;<br />
> 			&nbsp;#include &lt;CAENComm.h&gt; &nbsp;<br />
> 			&nbsp;}</p>
> 
> 			<p>With these changes everything seems to work fine.</p>
> 
> 			<p>However, I&#39;m having trouble with a slow control frontend using a tcpip driver. It 
> compiled well with the older MIDAS version. Even though all the functions in question are defined in the 
> frontend code, the following error comes up:</p>
> 
> 			<div style="background:#eee;border:1px solid #ccc;padding:5px 10px;">g++ -o feMotor 
> -DOS_LINUX -Dextname -g -O2 -fPIC -Wall -Wuninitialized &nbsp;-fpermissive -
> I/home/pkunz/packages/midas/include -I. -I/home/pkunz/packages/midas/drivers/bus &nbsp; 
> /home/pkunz/packages/midas/lib/libmidas.a feMotor.o 
> /home/pkunz/packages/midas/drivers/bus/tcpip.o cd_Galil.o 
> /home/pkunz/packages/midas/lib/libmidas.a /home/pkunz/packages/midas/lib/mfe.o -lm -lz -lutil -lnsl -
> lpthread -lrt<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
> `initialize_equipment()&#39;:<br />
> 			/home/pkunz/packages/midas/src/mfe.cxx:687: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
> `readout_enable(unsigned int)&#39;:<br />
> 			/home/pkunz/packages/midas/src/mfe.cxx:1236: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:1238: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `main&#39;:<br />
> 			/home/pkunz/packages/midas/src/mfe.cxx:2791: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:2792: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			collect2: error: ld returned 1 exit status<br />
> 			make: *** [Makefile:36: feMotor] Error 1</div>
> 
> 			<p>&nbsp;I guess the the aforementioned DAQ frontend compiles because its equipment 
> definitions don&#39;t call on the function `initialize_equipment()&#39;, but I can&#39;t figure out why 
> it doesn&#39;t work. Help is appreciated. P.K.</p>
> 			</td>
> 		</tr>
> 	</tbody>
> </table>
> 
> <p>&nbsp;</p>
    Reply  14 Jan 2020, Stefan Ritt, Forum, frontend issues with midas-2019-09 
We updated midas/examples/experiment/frontend.cxx to correctly contain

/*-- Interrupt configuration ---------------------------------------*/

INT interrupt_configure(INT cmd, INT source, POINTER_T adr)
{
   switch (cmd) {
   case CMD_INTERRUPT_ENABLE:
      break;
   case CMD_INTERRUPT_DISABLE:
      break;
   case CMD_INTERRUPT_ATTACH:
      break;
   case CMD_INTERRUPT_DETACH:
      break;
   }
   return SUCCESS;
}

but if you upgrade from C to C++ from your own old frontend code you might be hit by that issue. 

Maybe Konstantin can update elog:1526 to contain a hint about "INT source".

Stefan
    Reply  14 Jan 2020, Stefan Ritt, Forum, frontend issues with midas-2019-09 
Actually now I see that 

a4) poll_event() and interrupt_configure() have "source" as "int[]" instead of "int" (why did this work before?)

mention this already, but maybe it's not completely clear that one has to change "int[] source" to "int source"

Stefan


> We updated midas/examples/experiment/frontend.cxx to correctly contain
> 
> /*-- Interrupt configuration ---------------------------------------*/
> 
> INT interrupt_configure(INT cmd, INT source, POINTER_T adr)
> {
>    switch (cmd) {
>    case CMD_INTERRUPT_ENABLE:
>       break;
>    case CMD_INTERRUPT_DISABLE:
>       break;
>    case CMD_INTERRUPT_ATTACH:
>       break;
>    case CMD_INTERRUPT_DETACH:
>       break;
>    }
>    return SUCCESS;
> }
> 
> but if you upgrade from C to C++ from your own old frontend code you might be hit by that issue. 
> 
> Maybe Konstantin can update elog:1526 to contain a hint about "INT source".
> 
> Stefan
Entry  05 Feb 2024, Pavel Murat, Forum, forbidden equipment names ? 
Dear MIDAS experts,

I have multiple daq nodes with two data receiving FPGAs on the PCIe bus each. 
The FPGAs come under the names of DTC0 and DTC1. Both FPGAs are managed by the same slow control frontend. 
To distinguish FPGAs of different nodes from each other, I included the hostname to the equipment name, 
so for node=mu2edaq09 the FPGA names are 'mu2edaq09:DTC0' and 'mu2edaq09:DTC1'. 

The history system didn't like the names, complaining that 

21:26:06.334 2024/02/05 [Logger,ERROR] [mlogger.cxx:5142:open_history,ERROR] Equipment name 'mu2edaq09:DTC1' 
contains characters ':', this may break the history system

So the question is : what are the safe equipment/driver naming rules and what characters 
are not allowed in them? - I think this is worth documenting, and the current MIDAS docs at 
 
https://daq00.triumf.ca/MidasWiki/index.php/Equipment_List_Parameters#Equipment_Name 

don't say much about it.

-- many thanks, regards, Pasha
    Reply  13 Feb 2024, Konstantin Olchanski, Forum, forbidden equipment names ? 
> equipment names are 'mu2edaq09:DTC0' and 'mu2edaq09:DTC1'

I think all names permitted for ODB keys are allowed as equipment names, any valid UTF-8,
forbidden chars are "/" (ODB path separator) and "\0" (C string terminator). Maximum length
is 31 byte (plus "\0" string terminator). (Fixed length 32-byte names with implied terminator
are no longer permitted).

The ":" character is used in history plot definitions and we are likely eventually change that,
history event names used to be pairs of "equipment_name:tag_name" but these days with per-variable
history, they are triplets "equipment_name,variable_name,tag_name". The history plot editor
and the corresponding ODB entries need to be updated for this. Then, ":" will again be a valid
equipment name.

I think if you disable the history for your equipments, MIDAS will stop complaining about ":" in the name.

K.O.
Entry  19 Aug 2006, Konstantin Olchanski, Bug Fix, fixes for minor mhttpd problems 
I commited fix for minor mhttpd problems (rev 3314):
- for a newly created experiment, the "history" button gave the error [history
panel "" does not exist] (new problem introduced in revision 3150)
- for very long history panel names (close to the 32-character limit) history
plots produce the error "Cannot find /history/display/foo/bar/variables" (broke
in revision 3190 "use strlcpy()", in previous revisions, this bug was silent
stack corruption)
- elog attachments did not work for file names containing character plus (+)
(attachement URLs should be properly encoded to escape special CGI characters)
K.O.
    Reply  26 Aug 2006, Konstantin Olchanski, Bug Fix, fixes for minor mhttpd problems 
> I commited fix for minor mhttpd problems (rev 3314):
> - elog attachments did not work for file names containing character plus (+)
> (attachement URLs should be properly encoded to escape special CGI characters)

I accidentally indirectly learned that the above change produced incorrect URLs
when more than one experiment is defined. I now commited a fix to this problem.

K.O.
Entry  25 Feb 2005, Konstantin Olchanski, Bug Fix, fixed: double free in FORMAT_MIDAS ybos.c causing lazylogger crashes 
We stumbled upon and fixed a "double free" bug in src/ybos.c causing crashes in
lazylogger writing .mid files in the FORMAT_MIDAS format (why does it use
ybos.c? Pierre says- for generic file i/o). Why this code had ever worked before
remains a mystery. K.O.
    Reply  29 Dec 2010, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition? 
> 
> The only remaining problem when running my script is some kind of deadlock between the ODB and SYSMSG semaphores...
> 


I committed changes to odb.c and midas.c fixing a number of places that could corrupt ODB and SYSMSG data, and fixing a number of deadlocks. Without these 
changes, on my Mac, MIDAS will reliably corrupt ODB or deadlock while running my odbedit fork-bomb torture test script. These changes still need to be tested on 
Linux (but I do not expect any problems).

Because my changes do not fix the original race condition in client creation/removal/cleanup, you may still occasionally see messages like this:
13:35:14 [ODBEdit24,ERROR] [odb.c:2112:db_find_key,ERROR] hkey 169592 invalid key type 376
13:35:15 [ODBEdit28,ERROR] [odb.c:3268:db_get_value,ERROR] hkey 162072 entry "Name" is of type NULL, not STRING

For now, I am happy that we no longer corrupt ODB (nor deadlock) and I will work with Stefan on a permanent solution for this.

Special thanks go to the T2K/ND280 experiment, specifically, to Tim Nicholls and to the unnamed person who emailed me their script that executes many odbedit 
commands to setup midas history plots.


svn rev 4930
K.O.


P.S. Below is my torture test script, I usually run many of them in a sequence "./test1.perl >& xxx1; ./test1.perl >& xxx2; ... etc".

#!/usr/bin/perl -w
for (my $i=0; $i<50; $i++)
{
#my $cmd = "odbedit -c \'scl -w\' &";
#my $cmd = "odbedit -c \'ls -l /system/clients\' >& xxx$i &";
my $cmd = "odbedit -c \'ls -l /system/clients\' &";
system $cmd;
}
#end

svn rev 4930
K.O.
    Reply  11 Feb 2011, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition? 
> > 
> > The only remaining problem when running my script is some kind of deadlock between the ODB and SYSMSG semaphores...
> > 
> 
> For now, I am happy that we no longer corrupt ODB (nor deadlock) ...
>

Found one more deadlock between ODB and SYSMSG semaphores, this time through cm_watchdog():

If cm_watchdog somehow runs while we are holding the ODB semaphore, it will eventually try to lock SYSMSG (through bm_cleanup & co) in
violation of our semaphore locking order. If at the same time another application tries to lock stuff using the correct order (SYSMSG first, ODB last),
the two programs will deadlock (wait for each other forever). I presently have two copies of gdb attached to two copies of odbedit
waiting for each other in a deadlock through this cm_watchdog scenario...

Solution shall follow quickly, I have been hunting this deadlock for the last couple of weeks...

K.O.
    Reply  15 Feb 2011, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition? 
> Solution shall follow quickly, I have been hunting this deadlock for the last couple of weeks...

Over the last couple of days I made a series of commits to odb.c and midas.c to implement a buffer-based cm_msg()
and fix the latest deadlock problem, also to help with the race conditions in client creation and cleanup.

My torture test runs okey in my mac now, one remaining problem is spurious client removal caused
by semaphore starvation - I see 2-3-7-10 sec wait times for semaphores - probably caused by some
kind of unfairness in the MacOS SysV semaphore implementation (in a "fair" semaphore implementation,
the process that waited the longest would be woken up the first and one would never see semaphore wait
times measured in seconds). Probably worth investigating fairness of MacOS posix semaphores. On LInux
things are probably different and under normal running conditions one should not see any semaphore starvation.

I will be doing extensive tests of this update at TRIUMF, but I do not expect any problems. If you use this
version and see any anomalies, please report them as replies to this message or email me directly.

svn rev 4976
K.O.
    Reply  16 Feb 2011, Konstantin Olchanski, Bug Report, fixed. odb corruption, odb race condition? 
> My torture test runs okey in my mac now, one remaining problem is spurious client removal caused
> by semaphore starvation...

My torture test runs okey on Linux and I do not see any problems with spurious client removal - actually
I do not see any strange longs waits for semaphores that I was seeing on MacOS. Must be another
proof that MacOS is years behind Linux in kernel technology (but parsecs ahead in user experience)

K.O.
Entry  25 Oct 2013, Konstantin Olchanski, Bug Fix, fixed mlogger run auto restart bug 
A problem existed in midas for some time: when recording long data sets of time (or event) limited runs 
with logger run auto restart set to "yes", the runs will automatically stop and restart as expected, but 
sometimes the run will stop and never restart and beam will be lost until the experiment operator on shift 
wakes up and restarts the run manually.

I have now traced this problem to a race condition inside the mlogger - when a run is being stopped from 
the mlogger, the mlogger run transition handler (tr_stop) triggers an immediate attempt to start the next 
run, without waiting for the run-stop transition to actually complete. If the run-stop transition does not 
finish quickly enough, a safety check in start_the_run() will cause the run restart attempt to silently fail 
without any error message.

This race condition is pretty rare but somehow I managed to replicate it while debugging the 
multithreaded transitions. It is fixed by making mlogger wait until the run-stop transition completes.

https://bitbucket.org/tmidas/midas/commits/b2631fbed5f7b1ec80e8a6c8781ada0baed7702b

K.O.
    Reply  25 Oct 2013, Stefan Ritt, Bug Fix, fixed mlogger run auto restart bug 
> A problem existed in midas for some time: when recording long data sets of time (or event) limited runs 
> with logger run auto restart set to "yes", the runs will automatically stop and restart as expected, but 
> sometimes the run will stop and never restart and beam will be lost until the experiment operator on shift 
> wakes up and restarts the run manually.
> 
> I have now traced this problem to a race condition inside the mlogger - when a run is being stopped from 
> the mlogger, the mlogger run transition handler (tr_stop) triggers an immediate attempt to start the next 
> run, without waiting for the run-stop transition to actually complete. If the run-stop transition does not 
> finish quickly enough, a safety check in start_the_run() will cause the run restart attempt to silently fail 
> without any error message.
> 
> This race condition is pretty rare but somehow I managed to replicate it while debugging the 
> multithreaded transitions. It is fixed by making mlogger wait until the run-stop transition completes.
> 
> https://bitbucket.org/tmidas/midas/commits/b2631fbed5f7b1ec80e8a6c8781ada0baed7702b
> 
> K.O.

More generally I kind of consider the mlogger auto restart facility as deprecated. It works in the background and the operator does not have a clue 
what is going on. We use now the sequencer to achieve exactly the same functionality. It just requires a few lines of sequencer code:

LOOP INFINITE 
  TRANSITION start 
  WAIT events, 5000 
  TRANSITION stop 
ENDLOOP 

So the run start is only executed after the runs has been successfully stopped. You can do things in the sequencer like "stop run and sequence 
immediately" or "stop after current run has finished" which are a bit hard to do with the old method. So people should move to the sequencer.

/Stefan
    Reply  28 Oct 2013, Konstantin Olchanski, Bug Fix, fixed mlogger run auto restart bug 
> 
> More generally I kind of consider the mlogger auto restart facility as deprecated. It works in the background and the operator does not have a clue 
> what is going on. We use now the sequencer to achieve exactly the same functionality.
>

Before subruns were available, most experiments at TRIUMF have used the "auto restart" function. Now, I think most of them use subruns,
with the notable exception of PIENU where the analysis framework could not handle subruns. (PIENU is now shutdown and disassembled).

>
> It just requires a few lines of sequencer code:
> 
> LOOP INFINITE 
>   TRANSITION start 
>   WAIT events, 5000 
>   TRANSITION stop 
> ENDLOOP 
> 

Mouse click "auto restart" to "yes" is a little bit simpler than setting up a sequencer file, and it survives a crash of mhttpd.

Does the sequencer survive a crash or a restart of mhttpd?

K.O.
    Reply  28 Oct 2013, Stefan Ritt, Bug Fix, fixed mlogger run auto restart bug 
> Does the sequencer survive a crash or a restart of mhttpd?

Yes. Of course runs will not be started/stopped when mhttpd is not running, but when you restart it gracefully continues where it stopped, since all variables such as event count or current line number of 
the sequence are store in the ODB.

/Stefan
Entry  05 May 2005, Konstantin Olchanski, Bug Fix, fix: minor bit rot in the example experiment 
I fixed some minor bit rot in the example experiment: a few minor Makefile
problems, make the analyzer use the current histogram creation macros, etc. I
also added startup and shutdown scripts. These will be documented as we work
through them with our Summer student. K.O.
Entry  18 Aug 2005, Konstantin Olchanski, Bug Fix, fix race condition between clients on run start/stop, pause/resume 
It turns out that the new priority sequencing of run state transitions had a
flaw: the frontends, the analyzer and the logger all registered at priority 500
and were invoked in essentially a random order. For example the frontend could
get a begin-run transition before the logger and so start sending data before
the logger opened the output file. Same for the analyzer and same for the end of
run. Also the sequencing for pause/resume run and begin/end run was different
when the two pairs ought to have identical sequencing.

I now commited changes to mana.c and mlogger.c changing their transition sequencing:

start and resume:
200 - logger (mlogger.c, no change)
300 - analyzer (mana.c, was 500)
500 - frontends (mfe.c, no change)

stop and pause:
500 - frontends (mfe.c, no change)
700 - analyzer (mana.c, was 500)
800 - mlogger (mlogger.c, was 500)

P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
see an anomaly in the analyzer, where it receives data events after the
end-of-run transition.

K.O.
ELOG V3.1.4-2e1708b5