Back Midas Rome Roody Rootana
  Midas DAQ System, Page 42 of 138  Not logged in ELOG logo
ID Date Author Topic Subjectdown
  25   14 Jul 2004 Stefan Ritt future direction discussion?
Have changed your entry as Non-HTML (easier to reply to...)

Here are some "initial" comments, by no means complete...

> Are we ready for 2.0? 
> Stefan - do you have any ideas/enhancements?

A big thing along the horizon I see is the ROME environment
(http://midas.psi.ch/rome/). So we will move away from PAW to ROOT. Although
the DAQ part will stay untouched, the whole analysis back-end changes,
including some XML configuration and MySQL support. I guess that would justify
a 2.0. I will discuss this at TRIUMF when I come in September, see how useful
ROME is for other users...

> 1) For one I would like to explore memory mapping (mmap()) on Linux
> - I've used it once upon a time on DEC OSF/1 and I found it really
> nice compared to shared memory. 
> From a user standpoint it behaves as a shared memory but is mapped 
> to a real file that can be easily "removed" when neccessary. 
> One really annoying thing in MIDAS is when it goes ballistic
> the cleanup which is somewhat tricky. 
> The question if there is any performance penalty associated

I guess there are no performance penalties, since under the hood both
techniques are handled similarly. The problem is that besides share memories
you need also semaphores to controll exclusive access to the memory, either
shm() funcitons or mmap(), so this would only fix half of the problem. I seem
to remember that mmap() was not available on some Ultix systems or so, but I
guess that's obsolete by now...

> 2) Expanding hardware support: 
>    a) custom microcontrolers?
>    b) more hardware
>    c) how about a "standard" Linux device /dev/midas
>    for various PCI cards (PCI<->CAMAC) (PCI<->VME) 

Well, you cannot develop hardware support for hardware you don't have, so the
policy up to now was that everyone developing some special drivers or hardware
support contributed it to the package. About c), we have already a CAMAC
driver standard, but at the user space level, so I don't see the benefit of
having kernel-mode standardized drivers. The only difference will be that the
debugging will be harder. VME standard is there in a kind of poor start right
now, but I expect to finish it this fall.

As for a), there is the MSCB system (http://midas.psi.ch/mscb) which has midas
support on the device driver and bus driver level, but I learned that
distributing hardware (or PCB designs if you like) is much harder than sharing
software. 

> 3) I have never really seen a midas deployment that uses interrupts. 
> I do understand the ease of polling and the fact that these days
> CPU's are cheap but sometimes it is important to use interrupts.
> Any examples/experience?

It's not only the "ease" of polling, but also that it's faster (in almost all
cases) and less troublesome. But hey, interrupt support is included in mfe.c,
so if you are fanatic about interrupts, please feel free to use them.
  27   15 Jul 2004 Konstantin Olchanski future direction discussion?
> > Are we ready for 2.0? 

I disapprove of version number inflation. Why not go straight for midas version
3000-Pro-Z?

> A big thing along the horizon I see is the ROME environment
> (http://midas.psi.ch/rome/). So we will move away from PAW to ROOT. Although
> the DAQ part will stay untouched, the whole analysis back-end changes,
> including some XML configuration and MySQL support.

I looked at the ROME slides from Pierre, and it seems to suffer badly from the
second-system syndrome (read The Mythical Man-Month).

For us, it is important to get the data into a form where we can process it with
ROOT and I would prefer if we could concentrate in improving the (embryonic) ROOT
online analysis capabilities.

> > 1) For one I would like to explore memory mapping (mmap())

It is trivial to replace System-V shared memory with mmap(). I am surprised that
it has not been done yet. System-V semaphores are a little bit harder to get rid of.

> > 2) Expanding hardware support: 
> >    a) custom microcontrolers?
> >    b) more hardware
> >    c) how about a "standard" Linux device /dev/midas
> >    for various PCI cards (PCI<->CAMAC) (PCI<->VME) 

We cannot expect MIDAS Authors to provide drivers for all possible hardware.

At best, we can mount an effort to collect exisiting drivers from all MIDAS users
"out there" and to integrate them into MIDAS.

Even that is highly non-trivial- many drivers use non-portable native hardware
access interfaces (direct bit-banging on PPCs, VMIC library on VMIC Linux
machines, etc). We have already failed to create an efficient generic portable VME
access library.

> As for a), there is the MSCB system (http://midas.psi.ch/mscb)

I never saw the point of having tons of MIDAS code for MSCB hardware that nobody
has and nobody will ever have.

> > 3) I have never really seen a midas deployment that uses interrupts. 
>
> It's not only the "ease" of polling, but also that it's faster (in almost all
> cases) and less troublesome. But hey, interrupt support is included in mfe.c,
> so if you are fanatic about interrupts...

Interrupts are important when one cannot afford chewing up CPU cycles, memory
cycles, PCI cycles and VME cycles on polling.

As I understand, common CAMAC hardware does not generate interrupts- this explains
lack of examples and lack of interrupt use. At TRIUMF, our new VME hardware
supports interrupts and I have a VMIC-based setup where I can (and intend to) test
MIDAS support of interrupts.

K.O.
  1761   13 Jan 2020 Peter KunzForumfrontend issues with midas-2019-09

After upgrading to the lastes MIDAS version I got the DAQ frontend of my application running by changing all compiler directives from cc to g++ and using

#include "mfe.h"

extern HNDLE hDB

 extern "C" { 
 #include <CAENComm.h>  
 }

With these changes everything seems to work fine.

However, I'm having trouble with a slow control frontend using a tcpip driver. It compiled well with the older MIDAS version. Even though all the functions in question are defined in the frontend code, the following error comes up:

g++ -o feMotor -DOS_LINUX -Dextname -g -O2 -fPIC -Wall -Wuninitialized  -fpermissive -I/home/pkunz/packages/midas/include -I. -I/home/pkunz/packages/midas/drivers/bus   /home/pkunz/packages/midas/lib/libmidas.a feMotor.o /home/pkunz/packages/midas/drivers/bus/tcpip.o cd_Galil.o /home/pkunz/packages/midas/lib/libmidas.a /home/pkunz/packages/midas/lib/mfe.o -lm -lz -lutil -lnsl -lpthread -lrt
/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `initialize_equipment()':
/home/pkunz/packages/midas/src/mfe.cxx:687: undefined reference to `interrupt_configure(int, int, long)'
/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `readout_enable(unsigned int)':
/home/pkunz/packages/midas/src/mfe.cxx:1236: undefined reference to `interrupt_configure(int, int, long)'
/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:1238: undefined reference to `interrupt_configure(int, int, long)'
/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `main':
/home/pkunz/packages/midas/src/mfe.cxx:2791: undefined reference to `interrupt_configure(int, int, long)'
/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:2792: undefined reference to `interrupt_configure(int, int, long)'
collect2: error: ld returned 1 exit status
make: *** [Makefile:36: feMotor] Error 1

 I guess the the aforementioned DAQ frontend compiles because its equipment definitions don't call on the function `initialize_equipment()', but I can't figure out why it doesn't work. Help is appreciated. P.K.

  1765   13 Jan 2020 Konstantin OlchanskiForumfrontend issues with midas-2019-09
(please use the "plain" text, much easier to answer).

Hi, Peter, I think you misread the error message. There is no error about initialize_equipment(), the error 
is about interrupt_configure(). initialize_equipment() is just one of the functions that calls it.

The cause of the error most likely is a mismatch between the declaration of interrupt_configure() in 
mfe.h and the definition of this function in your program (in feMotor.c, I guess).

Sometimes this mismatch is hard to identify just by looking at the code.

One fool-proof method to debug this is to extract the actual function prototypes from your object files, 
both the declaration ("U") in mfe.o and the definition ("T") in your program should be identical:

nm feMotor.o |  grep -i interrupt | c++filt
0000000000000f90 T interrupt_configure(int, int, long) <--- should be this

nm ~/packages/midas/lib/libmfe.a | grep -i interrupt | c++filt
U interrupt_configure(int, int, long)

If they are different, you adjust your program until they match. One way to ensure the match is to copy 
the declaration from mfe.h into your program.

K.O.




<p>&nbsp;</p>

<table align="center" cellspacing="1" style="border:1px solid #486090; width:98%">
	<tbody>
		<tr>
			<td style="background-color:#486090">Peter Kunz wrote:</td>
		</tr>
		<tr>
			<td style="background-color:#FFFFB0">
			<p>After upgrading to the lastes MIDAS version I got the DAQ frontend of my application 
running by changing all compiler directives from cc to g++ and using</p>

			<p>#include &quot;mfe.h&quot;</p>

			<p>extern HNDLE hDB</p>

			<p>&nbsp;extern &quot;C&quot; {&nbsp;<br />
			&nbsp;#include &lt;CAENComm.h&gt; &nbsp;<br />
			&nbsp;}</p>

			<p>With these changes everything seems to work fine.</p>

			<p>However, I&#39;m having trouble with a slow control frontend using a tcpip driver. It 
compiled well with the older MIDAS version. Even though all the functions in question are defined in the 
frontend code, the following error comes up:</p>

			<div style="background:#eee;border:1px solid #ccc;padding:5px 10px;">g++ -o feMotor 
-DOS_LINUX -Dextname -g -O2 -fPIC -Wall -Wuninitialized &nbsp;-fpermissive -
I/home/pkunz/packages/midas/include -I. -I/home/pkunz/packages/midas/drivers/bus &nbsp; 
/home/pkunz/packages/midas/lib/libmidas.a feMotor.o 
/home/pkunz/packages/midas/drivers/bus/tcpip.o cd_Galil.o 
/home/pkunz/packages/midas/lib/libmidas.a /home/pkunz/packages/midas/lib/mfe.o -lm -lz -lutil -lnsl -
lpthread -lrt<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
`initialize_equipment()&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:687: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
`readout_enable(unsigned int)&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:1236: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:1238: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `main&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:2791: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:2792: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			collect2: error: ld returned 1 exit status<br />
			make: *** [Makefile:36: feMotor] Error 1</div>

			<p>&nbsp;I guess the the aforementioned DAQ frontend compiles because its equipment 
definitions don&#39;t call on the function `initialize_equipment()&#39;, but I can&#39;t figure out why 
it doesn&#39;t work. Help is appreciated. P.K.</p>
			</td>
		</tr>
	</tbody>
</table>

<p>&nbsp;</p>
  1766   13 Jan 2020 Konstantin OlchanskiForumfrontend issues with midas-2019-09
(please use the "plain" text, much easier to answer).

Hi, Peter, I think you misread the error message. There is no error about 
initialize_equipment(), the error is about interrupt_configure(). initialize_equipment() is just 
one of the functions that calls it.

The cause of the error most likely is a mismatch between the declaration of 
interrupt_configure() in mfe.h and the definition of this function in your program (in 
feMotor.c, I guess).

Sometimes this mismatch is hard to identify just by looking at the code.

One fool-proof method to debug this is to extract the actual function prototypes from your 
object files, both the declaration ("U") in mfe.o and the definition ("T") in your program 
should be identical:

nm feMotor.o |  grep -i interrupt | c++filt
0000000000000f90 T interrupt_configure(int, int, long) <--- should be this

nm ~/packages/midas/lib/libmfe.a | grep -i interrupt | c++filt
U interrupt_configure(int, int, long)

If they are different, you adjust your program until they match. One way to ensure the 
match is to copy the declaration from mfe.h into your program.

K.O.




<p>&nbsp;</p>

<table align="center" cellspacing="1" style="border:1px solid #486090; width:98%">
	<tbody>
		<tr>
			<td style="background-color:#486090">Peter Kunz wrote:</td>
		</tr>
		<tr>
			<td style="background-color:#FFFFB0">
			<p>After upgrading to the lastes MIDAS version I got the DAQ frontend of my 
application running by changing all compiler directives from cc to g++ and using</p>

			<p>#include &quot;mfe.h&quot;</p>

			<p>extern HNDLE hDB</p>

			<p>&nbsp;extern &quot;C&quot; {&nbsp;<br />
			&nbsp;#include &lt;CAENComm.h&gt; &nbsp;<br />
			&nbsp;}</p>

			<p>With these changes everything seems to work fine.</p>

			<p>However, I&#39;m having trouble with a slow control frontend using a 
tcpip driver. It compiled well with the older MIDAS version. Even though all the functions in 
question are defined in the frontend code, the following error comes up:</p>

			<div style="background:#eee;border:1px solid #ccc;padding:5px 10px;">g++ 
-o feMotor -DOS_LINUX -Dextname -g -O2 -fPIC -Wall -Wuninitialized &nbsp;-fpermissive 
-I/home/pkunz/packages/midas/include -I. -I/home/pkunz/packages/midas/drivers/bus 
&nbsp; /home/pkunz/packages/midas/lib/libmidas.a feMotor.o 
/home/pkunz/packages/midas/drivers/bus/tcpip.o cd_Galil.o 
/home/pkunz/packages/midas/lib/libmidas.a /home/pkunz/packages/midas/lib/mfe.o -lm -lz 
-lutil -lnsl -lpthread -lrt<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
`initialize_equipment()&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:687: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
`readout_enable(unsigned int)&#39;:<br />
			/home/pkunz/packages/midas/src/mfe.cxx:1236: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:1238: undefined 
reference to `interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `main&#39;:
<br />
			/home/pkunz/packages/midas/src/mfe.cxx:2791: undefined reference to 
`interrupt_configure(int, int, long)&#39;<br />
			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:2792: undefined 
reference to `interrupt_configure(int, int, long)&#39;<br />
			collect2: error: ld returned 1 exit status<br />
			make: *** [Makefile:36: feMotor] Error 1</div>

			<p>&nbsp;I guess the the aforementioned DAQ frontend compiles because 
its equipment definitions don&#39;t call on the function `initialize_equipment()&#39;, but I 
can&#39;t figure out why it doesn&#39;t work. Help is appreciated. P.K.</p>
			</td>
		</tr>
	</tbody>
</table>

<p>&nbsp;</p>
  1767   13 Jan 2020 Peter KunzForumfrontend issues with midas-2019-09
Thanks for explaining this, Konstantin. 
After updating the function to

INT interrupt_configure(INT cmd, INT source, POINTER_T adr)
{
  return CM_SUCCESS;
}

it compiled without errors. In the original code the "INT source" variable was missing.



> (please use the "plain" text, much easier to answer).
> 
> Hi, Peter, I think you misread the error message. There is no error about initialize_equipment(), the error 
> is about interrupt_configure(). initialize_equipment() is just one of the functions that calls it.
> 
> The cause of the error most likely is a mismatch between the declaration of interrupt_configure() in 
> mfe.h and the definition of this function in your program (in feMotor.c, I guess).
> 
> Sometimes this mismatch is hard to identify just by looking at the code.
> 
> One fool-proof method to debug this is to extract the actual function prototypes from your object files, 
> both the declaration ("U") in mfe.o and the definition ("T") in your program should be identical:
> 
> nm feMotor.o |  grep -i interrupt | c++filt
> 0000000000000f90 T interrupt_configure(int, int, long) <--- should be this
> 
> nm ~/packages/midas/lib/libmfe.a | grep -i interrupt | c++filt
> U interrupt_configure(int, int, long)
> 
> If they are different, you adjust your program until they match. One way to ensure the match is to copy 
> the declaration from mfe.h into your program.
> 
> K.O.
> 
> 
> 
> 
> <p>&nbsp;</p>
> 
> <table align="center" cellspacing="1" style="border:1px solid #486090; width:98%">
> 	<tbody>
> 		<tr>
> 			<td style="background-color:#486090">Peter Kunz wrote:</td>
> 		</tr>
> 		<tr>
> 			<td style="background-color:#FFFFB0">
> 			<p>After upgrading to the lastes MIDAS version I got the DAQ frontend of my application 
> running by changing all compiler directives from cc to g++ and using</p>
> 
> 			<p>#include &quot;mfe.h&quot;</p>
> 
> 			<p>extern HNDLE hDB</p>
> 
> 			<p>&nbsp;extern &quot;C&quot; {&nbsp;<br />
> 			&nbsp;#include &lt;CAENComm.h&gt; &nbsp;<br />
> 			&nbsp;}</p>
> 
> 			<p>With these changes everything seems to work fine.</p>
> 
> 			<p>However, I&#39;m having trouble with a slow control frontend using a tcpip driver. It 
> compiled well with the older MIDAS version. Even though all the functions in question are defined in the 
> frontend code, the following error comes up:</p>
> 
> 			<div style="background:#eee;border:1px solid #ccc;padding:5px 10px;">g++ -o feMotor 
> -DOS_LINUX -Dextname -g -O2 -fPIC -Wall -Wuninitialized &nbsp;-fpermissive -
> I/home/pkunz/packages/midas/include -I. -I/home/pkunz/packages/midas/drivers/bus &nbsp; 
> /home/pkunz/packages/midas/lib/libmidas.a feMotor.o 
> /home/pkunz/packages/midas/drivers/bus/tcpip.o cd_Galil.o 
> /home/pkunz/packages/midas/lib/libmidas.a /home/pkunz/packages/midas/lib/mfe.o -lm -lz -lutil -lnsl -
> lpthread -lrt<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
> `initialize_equipment()&#39;:<br />
> 			/home/pkunz/packages/midas/src/mfe.cxx:687: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function 
> `readout_enable(unsigned int)&#39;:<br />
> 			/home/pkunz/packages/midas/src/mfe.cxx:1236: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:1238: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `main&#39;:<br />
> 			/home/pkunz/packages/midas/src/mfe.cxx:2791: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:2792: undefined reference to 
> `interrupt_configure(int, int, long)&#39;<br />
> 			collect2: error: ld returned 1 exit status<br />
> 			make: *** [Makefile:36: feMotor] Error 1</div>
> 
> 			<p>&nbsp;I guess the the aforementioned DAQ frontend compiles because its equipment 
> definitions don&#39;t call on the function `initialize_equipment()&#39;, but I can&#39;t figure out why 
> it doesn&#39;t work. Help is appreciated. P.K.</p>
> 			</td>
> 		</tr>
> 	</tbody>
> </table>
> 
> <p>&nbsp;</p>
  1773   14 Jan 2020 Stefan RittForumfrontend issues with midas-2019-09
We updated midas/examples/experiment/frontend.cxx to correctly contain

/*-- Interrupt configuration ---------------------------------------*/

INT interrupt_configure(INT cmd, INT source, POINTER_T adr)
{
   switch (cmd) {
   case CMD_INTERRUPT_ENABLE:
      break;
   case CMD_INTERRUPT_DISABLE:
      break;
   case CMD_INTERRUPT_ATTACH:
      break;
   case CMD_INTERRUPT_DETACH:
      break;
   }
   return SUCCESS;
}

but if you upgrade from C to C++ from your own old frontend code you might be hit by that issue. 

Maybe Konstantin can update elog:1526 to contain a hint about "INT source".

Stefan
  1774   14 Jan 2020 Stefan RittForumfrontend issues with midas-2019-09
Actually now I see that 

a4) poll_event() and interrupt_configure() have "source" as "int[]" instead of "int" (why did this work before?)

mention this already, but maybe it's not completely clear that one has to change "int[] source" to "int source"

Stefan


> We updated midas/examples/experiment/frontend.cxx to correctly contain
> 
> /*-- Interrupt configuration ---------------------------------------*/
> 
> INT interrupt_configure(INT cmd, INT source, POINTER_T adr)
> {
>    switch (cmd) {
>    case CMD_INTERRUPT_ENABLE:
>       break;
>    case CMD_INTERRUPT_DISABLE:
>       break;
>    case CMD_INTERRUPT_ATTACH:
>       break;
>    case CMD_INTERRUPT_DETACH:
>       break;
>    }
>    return SUCCESS;
> }
> 
> but if you upgrade from C to C++ from your own old frontend code you might be hit by that issue. 
> 
> Maybe Konstantin can update elog:1526 to contain a hint about "INT source".
> 
> Stefan
  2704   05 Feb 2024 Pavel MuratForumforbidden equipment names ?
Dear MIDAS experts,

I have multiple daq nodes with two data receiving FPGAs on the PCIe bus each. 
The FPGAs come under the names of DTC0 and DTC1. Both FPGAs are managed by the same slow control frontend. 
To distinguish FPGAs of different nodes from each other, I included the hostname to the equipment name, 
so for node=mu2edaq09 the FPGA names are 'mu2edaq09:DTC0' and 'mu2edaq09:DTC1'. 

The history system didn't like the names, complaining that 

21:26:06.334 2024/02/05 [Logger,ERROR] [mlogger.cxx:5142:open_history,ERROR] Equipment name 'mu2edaq09:DTC1' 
contains characters ':', this may break the history system

So the question is : what are the safe equipment/driver naming rules and what characters 
are not allowed in them? - I think this is worth documenting, and the current MIDAS docs at 
 
https://daq00.triumf.ca/MidasWiki/index.php/Equipment_List_Parameters#Equipment_Name 

don't say much about it.

-- many thanks, regards, Pasha
  2709   13 Feb 2024 Konstantin OlchanskiForumforbidden equipment names ?
> equipment names are 'mu2edaq09:DTC0' and 'mu2edaq09:DTC1'

I think all names permitted for ODB keys are allowed as equipment names, any valid UTF-8,
forbidden chars are "/" (ODB path separator) and "\0" (C string terminator). Maximum length
is 31 byte (plus "\0" string terminator). (Fixed length 32-byte names with implied terminator
are no longer permitted).

The ":" character is used in history plot definitions and we are likely eventually change that,
history event names used to be pairs of "equipment_name:tag_name" but these days with per-variable
history, they are triplets "equipment_name,variable_name,tag_name". The history plot editor
and the corresponding ODB entries need to be updated for this. Then, ":" will again be a valid
equipment name.

I think if you disable the history for your equipments, MIDAS will stop complaining about ":" in the name.

K.O.
  296   19 Aug 2006 Konstantin OlchanskiBug Fixfixes for minor mhttpd problems
I commited fix for minor mhttpd problems (rev 3314):
- for a newly created experiment, the "history" button gave the error [history
panel "" does not exist] (new problem introduced in revision 3150)
- for very long history panel names (close to the 32-character limit) history
plots produce the error "Cannot find /history/display/foo/bar/variables" (broke
in revision 3190 "use strlcpy()", in previous revisions, this bug was silent
stack corruption)
- elog attachments did not work for file names containing character plus (+)
(attachement URLs should be properly encoded to escape special CGI characters)
K.O.
  297   26 Aug 2006 Konstantin OlchanskiBug Fixfixes for minor mhttpd problems
> I commited fix for minor mhttpd problems (rev 3314):
> - elog attachments did not work for file names containing character plus (+)
> (attachement URLs should be properly encoded to escape special CGI characters)

I accidentally indirectly learned that the above change produced incorrect URLs
when more than one experiment is defined. I now commited a fix to this problem.

K.O.
  200   25 Feb 2005 Konstantin OlchanskiBug Fixfixed: double free in FORMAT_MIDAS ybos.c causing lazylogger crashes
We stumbled upon and fixed a "double free" bug in src/ybos.c causing crashes in
lazylogger writing .mid files in the FORMAT_MIDAS format (why does it use
ybos.c? Pierre says- for generic file i/o). Why this code had ever worked before
remains a mystery. K.O.
  739   29 Dec 2010 Konstantin OlchanskiBug Reportfixed. odb corruption, odb race condition?
> 
> The only remaining problem when running my script is some kind of deadlock between the ODB and SYSMSG semaphores...
> 


I committed changes to odb.c and midas.c fixing a number of places that could corrupt ODB and SYSMSG data, and fixing a number of deadlocks. Without these 
changes, on my Mac, MIDAS will reliably corrupt ODB or deadlock while running my odbedit fork-bomb torture test script. These changes still need to be tested on 
Linux (but I do not expect any problems).

Because my changes do not fix the original race condition in client creation/removal/cleanup, you may still occasionally see messages like this:
13:35:14 [ODBEdit24,ERROR] [odb.c:2112:db_find_key,ERROR] hkey 169592 invalid key type 376
13:35:15 [ODBEdit28,ERROR] [odb.c:3268:db_get_value,ERROR] hkey 162072 entry "Name" is of type NULL, not STRING

For now, I am happy that we no longer corrupt ODB (nor deadlock) and I will work with Stefan on a permanent solution for this.

Special thanks go to the T2K/ND280 experiment, specifically, to Tim Nicholls and to the unnamed person who emailed me their script that executes many odbedit 
commands to setup midas history plots.


svn rev 4930
K.O.


P.S. Below is my torture test script, I usually run many of them in a sequence "./test1.perl >& xxx1; ./test1.perl >& xxx2; ... etc".

#!/usr/bin/perl -w
for (my $i=0; $i<50; $i++)
{
#my $cmd = "odbedit -c \'scl -w\' &";
#my $cmd = "odbedit -c \'ls -l /system/clients\' >& xxx$i &";
my $cmd = "odbedit -c \'ls -l /system/clients\' &";
system $cmd;
}
#end

svn rev 4930
K.O.
  741   11 Feb 2011 Konstantin OlchanskiBug Reportfixed. odb corruption, odb race condition?
> > 
> > The only remaining problem when running my script is some kind of deadlock between the ODB and SYSMSG semaphores...
> > 
> 
> For now, I am happy that we no longer corrupt ODB (nor deadlock) ...
>

Found one more deadlock between ODB and SYSMSG semaphores, this time through cm_watchdog():

If cm_watchdog somehow runs while we are holding the ODB semaphore, it will eventually try to lock SYSMSG (through bm_cleanup & co) in
violation of our semaphore locking order. If at the same time another application tries to lock stuff using the correct order (SYSMSG first, ODB last),
the two programs will deadlock (wait for each other forever). I presently have two copies of gdb attached to two copies of odbedit
waiting for each other in a deadlock through this cm_watchdog scenario...

Solution shall follow quickly, I have been hunting this deadlock for the last couple of weeks...

K.O.
  743   15 Feb 2011 Konstantin OlchanskiBug Reportfixed. odb corruption, odb race condition?
> Solution shall follow quickly, I have been hunting this deadlock for the last couple of weeks...

Over the last couple of days I made a series of commits to odb.c and midas.c to implement a buffer-based cm_msg()
and fix the latest deadlock problem, also to help with the race conditions in client creation and cleanup.

My torture test runs okey in my mac now, one remaining problem is spurious client removal caused
by semaphore starvation - I see 2-3-7-10 sec wait times for semaphores - probably caused by some
kind of unfairness in the MacOS SysV semaphore implementation (in a "fair" semaphore implementation,
the process that waited the longest would be woken up the first and one would never see semaphore wait
times measured in seconds). Probably worth investigating fairness of MacOS posix semaphores. On LInux
things are probably different and under normal running conditions one should not see any semaphore starvation.

I will be doing extensive tests of this update at TRIUMF, but I do not expect any problems. If you use this
version and see any anomalies, please report them as replies to this message or email me directly.

svn rev 4976
K.O.
  750   16 Feb 2011 Konstantin OlchanskiBug Reportfixed. odb corruption, odb race condition?
> My torture test runs okey in my mac now, one remaining problem is spurious client removal caused
> by semaphore starvation...

My torture test runs okey on Linux and I do not see any problems with spurious client removal - actually
I do not see any strange longs waits for semaphores that I was seeing on MacOS. Must be another
proof that MacOS is years behind Linux in kernel technology (but parsecs ahead in user experience)

K.O.
  921   25 Oct 2013 Konstantin OlchanskiBug Fixfixed mlogger run auto restart bug
A problem existed in midas for some time: when recording long data sets of time (or event) limited runs 
with logger run auto restart set to "yes", the runs will automatically stop and restart as expected, but 
sometimes the run will stop and never restart and beam will be lost until the experiment operator on shift 
wakes up and restarts the run manually.

I have now traced this problem to a race condition inside the mlogger - when a run is being stopped from 
the mlogger, the mlogger run transition handler (tr_stop) triggers an immediate attempt to start the next 
run, without waiting for the run-stop transition to actually complete. If the run-stop transition does not 
finish quickly enough, a safety check in start_the_run() will cause the run restart attempt to silently fail 
without any error message.

This race condition is pretty rare but somehow I managed to replicate it while debugging the 
multithreaded transitions. It is fixed by making mlogger wait until the run-stop transition completes.

https://bitbucket.org/tmidas/midas/commits/b2631fbed5f7b1ec80e8a6c8781ada0baed7702b

K.O.
  923   25 Oct 2013 Stefan RittBug Fixfixed mlogger run auto restart bug
> A problem existed in midas for some time: when recording long data sets of time (or event) limited runs 
> with logger run auto restart set to "yes", the runs will automatically stop and restart as expected, but 
> sometimes the run will stop and never restart and beam will be lost until the experiment operator on shift 
> wakes up and restarts the run manually.
> 
> I have now traced this problem to a race condition inside the mlogger - when a run is being stopped from 
> the mlogger, the mlogger run transition handler (tr_stop) triggers an immediate attempt to start the next 
> run, without waiting for the run-stop transition to actually complete. If the run-stop transition does not 
> finish quickly enough, a safety check in start_the_run() will cause the run restart attempt to silently fail 
> without any error message.
> 
> This race condition is pretty rare but somehow I managed to replicate it while debugging the 
> multithreaded transitions. It is fixed by making mlogger wait until the run-stop transition completes.
> 
> https://bitbucket.org/tmidas/midas/commits/b2631fbed5f7b1ec80e8a6c8781ada0baed7702b
> 
> K.O.

More generally I kind of consider the mlogger auto restart facility as deprecated. It works in the background and the operator does not have a clue 
what is going on. We use now the sequencer to achieve exactly the same functionality. It just requires a few lines of sequencer code:

LOOP INFINITE 
  TRANSITION start 
  WAIT events, 5000 
  TRANSITION stop 
ENDLOOP 

So the run start is only executed after the runs has been successfully stopped. You can do things in the sequencer like "stop run and sequence 
immediately" or "stop after current run has finished" which are a bit hard to do with the old method. So people should move to the sequencer.

/Stefan
  924   28 Oct 2013 Konstantin OlchanskiBug Fixfixed mlogger run auto restart bug
> 
> More generally I kind of consider the mlogger auto restart facility as deprecated. It works in the background and the operator does not have a clue 
> what is going on. We use now the sequencer to achieve exactly the same functionality.
>

Before subruns were available, most experiments at TRIUMF have used the "auto restart" function. Now, I think most of them use subruns,
with the notable exception of PIENU where the analysis framework could not handle subruns. (PIENU is now shutdown and disassembled).

>
> It just requires a few lines of sequencer code:
> 
> LOOP INFINITE 
>   TRANSITION start 
>   WAIT events, 5000 
>   TRANSITION stop 
> ENDLOOP 
> 

Mouse click "auto restart" to "yes" is a little bit simpler than setting up a sequencer file, and it survives a crash of mhttpd.

Does the sequencer survive a crash or a restart of mhttpd?

K.O.
ELOG V3.1.4-2e1708b5