Back Midas Rome Roody Rootana
  Midas DAQ System, Page 95 of 146  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
    Reply  22 May 2007, Stefan Ritt, Bug Report, analyzer_init called by odb_load 
> Thanks for the quick reply, Stefan.
> 
> Please don't change anything in the code unless you find it really important.
I guess 
> changing the analyzer_init prototype will break a lot of code out there?
> 
> In fact, I think I do understand this behavior now.
> And even without your suggested fix there is a simple workaround: I add a static 
> variable to my analyzer_init.cxx file, and do something similar to your bFirst
fix.
> 
> In conclusion, commit your fix if it does not harm others. Postpone this
commit to a 
> future new version of midas which breaks a lot of things anyway...
> 
> A last question, for me to understand: Why not call db_open_record in 
> ana_begin_of_run then?

I fully agree with you that db_open_record would better go into ana_begin_of_run
(and
analyzer_init not being called in odb_load), and I fully agree with you that
changing the
code would break many experiments. ;-)

So I guess we leave it as it is right now as you suggested.
Entry  20 Aug 2007, Konstantin Olchanski, Bug Report, how to handle end of run? 
I am having problems with handling the end-of-run situation in my midas
frontend. I have a device that continuously sends data (over USB) and I read
this data in my "read_event" function.

Everything is good until the end-of-run, at which time this happens:
0) mfe.c calls my read_event() to read the data (loop until the end-of-run
transition)
1) mfe.c calls my end_of_run()
2) here, I tell the device "please stop sending data"
3) all seems good, but wait!!!
4) there is all this data generated between step 0 and step 2 still sitting
inside the device and it has nowhere to go: the run is ended, the output file is
closed, my read_event() will never be called ever again (well, until the next run).

It seems to me mfe.c needs to have one more function, something like
"pre_end_of_run()" that works like this:
0) mfe.c calls my read_event() to read the data (loop until the end-of-run
transition)
1) mfe.c calls pre_end_of_run(), here I tell the device to stop sending data
2) mfe.c calls read_event() for the very last time, to give me the opportunity
to read and send away any data I still may have.
3) mfe.c calls the end_of_run(). The run is truely finished.

Any thoughts?

K.O.
    Reply  03 Sep 2007, Stefan Ritt, Bug Report, how to handle end of run? 
> I am having problems with handling the end-of-run situation in my midas
> frontend. I have a device that continuously sends data (over USB) and I read
> this data in my "read_event" function.
> 
> Everything is good until the end-of-run, at which time this happens:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls my end_of_run()
> 2) here, I tell the device "please stop sending data"
> 3) all seems good, but wait!!!
> 4) there is all this data generated between step 0 and step 2 still sitting
> inside the device and it has nowhere to go: the run is ended, the output file is
> closed, my read_event() will never be called ever again (well, until the next run).
> 
> It seems to me mfe.c needs to have one more function, something like
> "pre_end_of_run()" that works like this:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls pre_end_of_run(), here I tell the device to stop sending data
> 2) mfe.c calls read_event() for the very last time, to give me the opportunity
> to read and send away any data I still may have.
> 3) mfe.c calls the end_of_run(). The run is truely finished.
> 
> Any thoughts?

You can achieve the desired functionality without changing mfe.c:

0) mfe.c calls read_event
1) mfe.c calls end_of_run. Your end_of_run tells the device to stop data and flushes
the remaining data. At this point you have to re-make actually a part of the mfe.c
functionality, but basically you need a bm_compose_event() and a bm_send_event(), so
just a few lines of code. If you want to have the final event number right in your
equipment, you also need to update eq->events_sent accordingly. 

Given the fact that 99% of the experiments do not need this functionality, I propose
that we keep mfe.c and you add the few lines of code into your user part of the
specific frontend.

Stefan
Entry  08 Oct 2007, Carl Metelko, Bug Report, Error in data format- ending blocks on 32bit boundary x86_64 
Hi,
    I found that midas banks can be given an extra 32 bits of zeros when
trying to keep to 32bit boundary on my x86_64. 

This can be fixed by changing (in midas.h)
#define ALIGN8(x)  (((x)+7) & ~7)
to
#define ALIGN8(x)  (((x)+3) & ~3)

Is there any bad consequences doing this?
    Reply  08 Oct 2007, Stefan Ritt, Bug Report, Error in data format- ending blocks on 32bit boundary x86_64 
> Hi,
>     I found that midas banks can be given an extra 32 bits of zeros when
> trying to keep to 32bit boundary on my x86_64. 
> 
> This can be fixed by changing (in midas.h)
> #define ALIGN8(x)  (((x)+7) & ~7)
> to
> #define ALIGN8(x)  (((x)+3) & ~3)
> 
> Is there any bad consequences doing this?

Yes. ALIGN8 means 'align to 8-byte boundary' (64-bit), and if you change that, you
break the code at various locations. Furthermore, 8-byte aligned access is faster
on x86_64 than 4-byte aligned access, so you will get a performance penalty. If
course if you have very many small banks, the zero padding can cause some
overhead, but in that case you could combine some data into a single bank.
Entry  11 Oct 2007, Stefan Ritt, Bug Report, _syscall0 not available on gcc 4.1.1 
Dear Stephan,

I am writting on behalf of the LiBeRACE collaboration
at Berkeley/Livermore.

We are trying to use midas (2.0.0) for our acquisition system.
However we had some difficulties to compile it on LINUX Fedora
Core 6 with gcc 4.1.1
I tried to trace back the problem and I found that _syscall0 in
system.c is actually an obsolete call (since gcc 4.x apparently).
Playing with assembly language being behond my competence, I would 
like to know if you ever came across this situation recently and
if you have any suggestion(s).

With my best regards
Julien GIBELIN


------------------------------------------------------
GIBELIN Julien

Lawrence Berkeley National Laboratory
Nuclear Science Division
One Cyclotron Rd.
MS 88R0192
BERKELEY, CA 94720-8101

Tel: +1 (510) 495-2695
Fax: +1 (510) 486-7983
------------------------------------------------------
    Reply  11 Oct 2007, Stefan Ritt, Bug Report, _syscall0 not available on gcc 4.1.1 
> Dear Stephan,
> 
> I am writting on behalf of the LiBeRACE collaboration
> at Berkeley/Livermore.
> 
> We are trying to use midas (2.0.0) for our acquisition system.
> However we had some difficulties to compile it on LINUX Fedora
> Core 6 with gcc 4.1.1
> I tried to trace back the problem and I found that _syscall0 in
> system.c is actually an obsolete call (since gcc 4.x apparently).
> Playing with assembly language being behond my competence, I would 
> like to know if you ever came across this situation recently and
> if you have any suggestion(s).

The '_syscall0' function call was replaced by 'syscall' in SVN revision 3583. I
would recommend that you switch to the current SVN version (see
http://ladd00.triumf.ca/~daqweb/doc/midas/html/quickstart.html on how to obtain
the SVN version). If the problem still persists, please let us know.

- Stefan
Entry  18 Feb 2008, Konstantin Olchanski, Bug Report, mhttpd safari 3.0.4 redirect problem 
I now encountered a new problem with mhttpd - I connect using the Safari 3.0.4 browser, go to the 
"Programs" page, press the button "Start feplc" (or any other "start" button) and instead of starting this 
program, I get an error in the browser, funny entries in ODB in "/Programs", corrupted ODB and a spew 
of messages in the midas log file about the mess. ODB has to be reloaded from backup to recover.

Investigation shows that the culprit is odd bahaviour of the "redirect()" function:

    /* start command */
    if (*getparam("Start")) {
       /* for NT: close reply socket before starting subprocess */
-      redirect2("?cmd=programs");
+      redirect2("/?cmd=programs");

The version without "/" makes Safari explode - it appends the "?cmd..." stuff to the existing URL, which 
already has the "?cmd..." tags, making a mess.

Firefox accepts either version.

ODB corruption happens here:
 
       sprintf(str, "/Programs/%s/Start command", name);
-      db_get_value(hDB, 0, str, command, &size, TID_STRING, TRUE);
+      db_get_value(hDB, 0, str, command, &size, TID_STRING, FALSE);
       if (command[0]) {
          ss_system(command);

It looks like db_get_value() would corrupt ODB if given funny "str". When Safari explodes,
funny strings are generated.

The simple fix is to replace "TRUE" with "FALSE", then at least db_get_value() does not try to make bogus 
entries in ODB.

The "Stop" command has the same problem, but does not currupt ODB - there is no db_get_value() in 
that code path.

I am reporting this "fresh" as I made one of our daq systems work again.

I did not investigate the history of changes to this "redirect" command (perhaps it was broken in the 
recent reorganisation of midas urls?), what versions of Safari work or not.

K.O.
Entry  18 Feb 2008, Konstantin Olchanski, Bug Report, potential memory corruption in odb,c:extract_key() 
It looks like ODB function extract_key() will overwrite the array pointed to by "key_name" if given an odb 
path with very long names (as seems to happen when redirection explodes in the Safari web browser, via 
db_get_value(TRUE) via mhttpd "start program" button). All  callers of this function seem to provide 256 
byte strings, so the problem would not show up in normal use - only when abnormal odb paths are being 
parsed. Proposed solution is to add a "length" argument to this function. (Actually ODB path elements 
should be restricted to NAME_LENGTH (32 bytes), right?). K.O.
    Reply  18 Feb 2008, Exaos Lee, Bug Report, Great! But I failed to run it. :( 
I encountered the error message as the following:
Traceback (most recent call last):
  File "runtest.py", line 42, in <module>
    import midas
  File "/opt/MIDAS.PSI/Resources/PyMIDAS/pymidas/midas/__init__.py", line 140, in
<module>
    cmidas = ctypes.cdll.LoadLibrary('libmidas.so')
  File "/usr/lib/python2.5/site-packages/PIL/__init__.py", line 431, in LoadLibrary
    
  File "/usr/lib/python2.5/site-packages/PIL/__init__.py", line 348, in __init__
    
OSError: /opt/MIDAS.PSI/Versions/Current/lib/libmidas.so: undefined symbol:
cam16i_rq

Compiling the MIDAS library using NEED_SHLIB=1 causes the same "undefined
reference" error. But it can be fixed by adding "-shared" to CFLAGS in the
Makefile. Though the libmidas.so can be successfully created, the above error is
still there. Can anybody help me?

Environment:
Platform: Ubuntu Linux 7.10 with gcc 4.1
MIDAS version: 2.0.0, svn-4106
Python version: 2.5.1
Entry  18 Feb 2008, Jimmy Ngai, Bug Report, Analyzer cannot run as Daemon 
Hi All,

I'm testing MIDAS SVN rev-4113 on Scientific Linux 5.1 (i386) and the Analyzer 
can't start as Daemon. What I mean "can't" is that it stops running 
immediately without leaving any error messages. However, it can run offline or 
without becoming a Daemon. I have tested with ROOT 5.14e/5.16/5.18 and 
the "Experiment" example coming with MIDAS and this problem always happens. 
Any ideas?

Best Regards,
Jimmy
    Reply  21 Feb 2008, Stefan Ritt, Bug Report, mhttpd safari 3.0.4 redirect problem 
>     /* start command */
>     if (*getparam("Start")) {
>        /* for NT: close reply socket before starting subprocess */
> -      redirect2("?cmd=programs");
> +      redirect2("/?cmd=programs");

The second version won't work if mhttpd is run under an Apache proxy. Assume the proxy redirects

http://proxy.ca/midas

to

http://daq.ca:8080

If you now do a redirect to "/?cmd=programs", you will end up at

http://proxy.ca/?cmd=programs

which is now what you want. I tried to put a "./?cmd=programs", and that bings you to

http://proxy.ca/midas/./?cmd=programs

which is correctly redirected to

http://daq.ca:8080/?cmd=programs

I tried with the windows version (ughhh) of Safari and it worked for me. So give it a try, the change is committed.

> ODB corruption happens here:
>  
>        sprintf(str, "/Programs/%s/Start command", name);
> -      db_get_value(hDB, 0, str, command, &size, TID_STRING, TRUE);
> +      db_get_value(hDB, 0, str, command, &size, TID_STRING, FALSE);
>        if (command[0]) {
>           ss_system(command);
> 
> It looks like db_get_value() would corrupt ODB if given funny "str". When Safari explodes,
> funny strings are generated.

What happes is an endless redirect from xxxx -> xxxx?cmd=Programs. So in the end you have

http://url.ca?cmd=programs?cmd=programs?cmd=programs?cmd=programs....

and in the end you get a stack overflow, which busts all.

> The simple fix is to replace "TRUE" with "FALSE", then at least db_get_value() does not try to make bogus 
> entries in ODB.

I changed both butting FALSE there and adding

   if (strchr(name, '?'))
      *strchr(name, '?') = 0;

which keeps the URL short.

So for me it looks fine at the moment, but I cannot guarantee that everything works, so keep an eye open on that.
    Reply  21 Feb 2008, Konstantin Olchanski, Bug Report, potential memory corruption in odb,c:extract_key() 
> It looks like ODB function extract_key() will overwrite the array pointed to by "key_name" if given an odb 
> path with very long names (as seems to happen when redirection explodes in the Safari web browser, via 
> db_get_value(TRUE) via mhttpd "start program" button). All  callers of this function seem to provide 256 
> byte strings, so the problem would not show up in normal use - only when abnormal odb paths are being 
> parsed. Proposed solution is to add a "length" argument to this function. (Actually ODB path elements 
> should be restricted to NAME_LENGTH (32 bytes), right?). K.O.

This is fixed in svn revision 4129.

K.O.
    Reply  26 Feb 2008, Denis Bilenko, Bug Report, NEED_SHLIB=1 is broken 
I have the exact same problem with midas rev. 4129.
`make NEED_SHLIB=1` doesn't work.

To fix it apply this patch to Makefile

Index: Makefile
===================================================================
--- Makefile    (revision 4129)
+++ Makefile    (working copy)
@@ -270,7 +270,7 @@

 OBJS =  $(LIB_DIR)/midas.o $(LIB_DIR)/system.o $(LIB_DIR)/mrpc.o \
        $(LIB_DIR)/odb.o $(LIB_DIR)/ybos.o $(LIB_DIR)/ftplib.o \
-       $(LIB_DIR)/mxml.o $(LIB_DIR)/cnaf_callback.o \
+       $(LIB_DIR)/mxml.o \
        $(LIB_DIR)/history.o $(LIB_DIR)/alarm.o $(LIB_DIR)/elog.o

 ifdef NEED_STRLCPY

i.e. remove cnaf_callback.o which causes the link errors.

I propose that libmidas.so is built by default, so when something breaks it won't go unnoticed.
    Reply  27 Feb 2008, Konstantin Olchanski, Bug Report, NEED_SHLIB=1 is broken 
--- Makefile    (revision 4129)
+++ Makefile    (working copy)
-       $(LIB_DIR)/mxml.o $(LIB_DIR)/cnaf_callback.o \
+       $(LIB_DIR)/mxml.o \
> i.e. remove cnaf_callback.o which causes the link errors.


Hi, Denis - I confirm that cnaf_callback.c is only used by MIDAS frontends that implement CAMAC
functions and that it should not required for building the MIDAS library. I am now looking at removing 
it from libmidas.

> I propose that libmidas.so is built by default, so when something breaks it won't go unnoticed

We have been through this before and decided that shared libraries are bad and we do not want to use 
them. The option for building libmidas.so was preserved, though.

Not to refight old wars, on reason against using shared libraries was version skew - one could never be 
sure what version of midas is being used - depending on the PATH, LD_LIBRARY_PATH, rpath settings, 
etc. There were other reasons, perhaps practical, perhaps with the mserver.

The main problem with "just build it", is that then the rest of midas will link against it bringing back all 
the problems we solved by going away from using shared libraries.

So back to your proposal about building libmidas.so - can you look and see if you can do the Python 
bindings with a statically linked midas library?

I know it is possible with Perl bindings - perl creates it's own shared library containing perl api glue 
linked against a foreign static library libfoo.a , so in theory, the shared library is not needed.

But perhaps, Python do things differently...

K.O.
Entry  27 Feb 2008, Konstantin Olchanski, Bug Report, mhttpd: cannot attach history to elog 
From "history" pages, the "create elog" button stopped working - it takes us to the elog entry form, but 
then, the "submit" button does not create any elog entries, instead dumps us into an invalid history 
display. This is using the internal elog.

This change in mhttpd.c::show_elog_new() makes it work again:

-       ("<body><form method=\"POST\" action=\"./\" enctype=\"multipart/form-data\">\n");
+       ("<body><form method=\"POST\" action=\"/EL/\" enctype=\"multipart/form-data\">\n");

Problem and fix confirmed with Linux/firefox and MacOS/firefox and Safari.

K.O.
    Reply  28 Feb 2008, Konstantin Olchanski, Bug Report, mhttpd: cannot attach history to elog 
> From "history" pages, the "create elog" button stopped working - it takes us to the elog entry form, but 
> then, the "submit" button does not create any elog entries, instead dumps us into an invalid history 
> display. This is using the internal elog.
> 
> This change in mhttpd.c::show_elog_new() makes it work again:
> -       ("<body><form method=\"POST\" action=\"./\" enctype=\"multipart/form-data\">\n");
> +       ("<body><form method=\"POST\" action=\"/EL/\" enctype=\"multipart/form-data\">\n");

This was a problem with relative URLs and it is now fixed. Svn revision 4131, fixes: delete elog, make elog from odb, make elog from history.

K.O.
    Reply  29 Feb 2008, Denis Bilenko, Bug Report, NEED_SHLIB=1 is broken 
Having libmidas.so is absolutely necessary for pymidas to work. If there was no such
option in Makefile pymidas users would have to build it themselves.

What I proposed though is that you change Makefile so it builds libmidas.so in
addition to (not instead of) static library. So if someone prefer to build
binaries statically they
may continue to do so. On other hand when someone needs a shared library they won't 
discover that it can't be easily built.
Entry  07 Mar 2008, Randolf Pohl, Bug Report, array overflows and other bugs out.0.make
Hi,

I have just compiled MIDAS svn 4132 on a fresh SuSE 10.3 x86_64 system and gcc 
found a bunch of bugs, I guess.

> uname -a
Linux pc 2.6.22.17-0.1-default #1 SMP 2008/02/10 20:01:04 UTC x86_64 x86_64 
x86_64 GNU/Linux

> gcc --version
gcc (GCC) 4.2.1 (SUSE Linux)

I see loads of warnings during compile, most of which I know from earlier 
compiles:
* warning: dereferencing type-punned pointer will break strict-aliasing rules
* warning: pointer targets in passing argument 3 of 'getsockname' differ in
           signedness

But then there is a new one (in fact lots of this one), and brief
inspection suggests this is a true bug with the possibility of truly
nasty consequences.

(1)=========================
src/midas.c:7398: warning: array subscript is above array bounds
Inspection of midas.c:

   if (i == MAX_DEFRAG_EVENTS) {
      /* no buffer available -> no first fragment received */
7398: free(defrag_buffer[i].pevent);
      memset(&defrag_buffer[i].event_id, 0, sizeof(EVENT_DEFRAG_BUFFER));
      cm_msg(MERROR, "bm_defragement_event",
             "Received fragment without first fragment (ID %d) Ser#:%d",
             pevent->event_id & 0x0FFF, pevent->serial_number);
      return;
   }

And midas.c line 7297:
EVENT_DEFRAG_BUFFER defrag_buffer[MAX_DEFRAG_EVENTS];

So, if(i==MAX_DEFRAG_EVENTS) free(defrag_buffer[i]).
I guess this is an off-by-one bug.

(2)==========================
src/midas.c:2958: warning: array subscript is above array bounds

   for (i = 0; i < 13; i++)
2958  if (trans_name[i].transition == transition)
         break;

Holy cow, hard-coded "13" in the code! Should be a #define, shouldn't it?

Now look at midas.c lines 94ff:
struct {
   int transition;
   char name[32];
} trans_name[] = {
   {
   TR_START, "START",}, {
   TR_STOP, "STOP",}, {
   TR_PAUSE, "PAUSE",}, {
   TR_RESUME, "RESUME",}, {
   TR_DEFERRED, "DEFERRED",}, {
0, "",},};

There is no trans_name[12].

The trans_name[12] shows up in line 2894 and 2790, too.

(3)=============================
mfe.c:
src/mfe.c:412: warning: array subscript is above array bounds
src/mfe.c:311: warning: array subscript is above array bounds
src/mfe.c:340: warning: array subscript is above array bounds

412: device_drv->dd(CMD_GET_DEMAND, device_drv->dd_info, i, 
          &device_drv->mt_buffer->channel[i].array[CMD_GET_DEMAND]);


   for (cmd = CMD_GET_FIRST; cmd <= CMD_GET_LAST; cmd++) {
 (..)
311:  device_drv->mt_buffer->channel[current_channel].array[cmd] = value;

   for (cmd = CMD_GET_FIRST; cmd <= CMD_GET_LAST; cmd++) {
 (..)
340:  device_drv->mt_buffer->channel[i].array[cmd] = value;


CMD_GET_DEMAND is in include/midas.h:
#define CMD_GET_DEMAND               CMD_GET_DIRECT  // = 20

I haven't even tried to understand mfe.c, nor did I read it. 
But I suspect the thing should always be something like
....array[cmd-CMD_GET_FIRST]
so array[] is indexed from [0], not from an arbitrary number that
depends on the number of commands you insert before line 698 in
midas.h. But please could the author of this check this very carefully?


(4)=========================
src/lazylogger.c:1957: warning: array subscript is below array bounds

if ((channel < 0) && (lazyinfo[channel].hKey != 0))

That is lazyinfo[something below zero].


(5)=============================
More warnings an expert might want to have a look at:

* warning: deprecated conversion from string constant to 'char*'

* src/fal.c:106: warning: non-local variable '<anonymous struct> out_info'
                 uses anonymous type
* src/fal.c:3064: warning: non-local variable '<anonymous struct> eb' uses
                  anonymous type

I attach the full output of make.
Could someone knowledgeable please have a look at these warnings and fix them?

They make me a bit nervous when thinking about data integrity, and
there are now so many that they actually start to hide serious stuff
like the ones I presented.

Oh, I got rid of the "dereferencing type-punned pointer" thing by adding
"-fno-strict-aliasing" as a compiler flag. Was suggested on the Web. Seemed to 
have worked during data taking (the data look reasonable :-). Is that a 
possible fix/workaround?


Cheers,

Randolf
    Reply  07 Mar 2008, Stefan Ritt, Bug Report, array overflows and other bugs 
> I have just compiled MIDAS svn 4132 on a fresh SuSE 10.3 x86_64 system and gcc 
> found a bunch of bugs, I guess.

Ahh, great! gcc is getting more and more clever. Each time gcc is updated, it finds
a few new issues.

Indeed some are real bugs, and I will work down the list as time permits. I see
however no immediate thread (you are not using fragmented events, a transition 12
never occurs, etc.). Issue #4 from your list has to be checked by Pierre-Andre. 
ELOG V3.1.4-2e1708b5