Back Midas Rome Roody Rootana
  Midas DAQ System, Page 46 of 47  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
Entry  20 Nov 2003, Stefan Ritt, , Implementation of db_check_record() 
As Konstantin pointed out correctly, the db_create_record() call is pretty 
heavy since it copies whole structures around the ODB. Therefore, it 
should not used frequently. It might be that several problems are caused 
by that, for example the "phantom" records reported in elog:40 .

I have therefore implemented the function 

db_check_record(HNDLE hDB, HNDLE hKey, char *keyname, char *rec_str, 
                BOOL correct)

which takes an ASCII structure in the same way as db_create_record(), but 
only checks this ASCII structure against the ODB contents without writing 
anything to the ODB. 

If the record does not exist at all, it is created via db_create_record(). 
This is useful for example with the /Runinfo structure on a virgin ODB.

If the parameter "correct" is FALSE, the function returns 
DB_STRUCT_MISMATCH if the ODB contents is wrong (wrong order of variables, 
wrong name of variables, wrong type or array size). The calling function 
should then abort, since a subsequent db_open_record() would fail. Note 
that although abort() is useful, one should add cm_disconnect_experiment() 
just before the abort() in order to have the application "log out" from 
the ODB gracefully. If the parameter "correct" is TRUE, the function 
db_create_record() is called internally to correct a mismatching record.

I have changed most calls of db_create_record() in mhttpd.c, mfe.c, mana.c 
and mlogger.c. Pierre, could you do the same for lazylogger.c?

I also started to put assert()'s everywhere and encourage everyone to 
follow. Under Windows, the asserts() are removed automatically if 
compiling in "Release" mode.

So I committed many changes, did some quick tests, but am not 100% 
convinced that all the changes are good. So please use the new code 
cautiously, and let me know if there is any new problem. I also would like 
to get some feedback if the whole thing becomes more stable now.
    Reply  27 Nov 2003, Konstantin Olchanski, , Implementation of db_check_record() 
> I have therefore implemented the function 
> db_check_record(HNDLE hDB, HNDLE hKey, char *keyname, char *rec_str, BOOL
correct)

Stephan, something is very wrong with the new code. My
"/logger/channels/0/settings" is being destroyed on "begin run". Midas
checkout from october 31st is okey. This is a show stopper, but I am in a rush
and cannot debug it. I am falling back to the Oct 31st version... K.O.
       Reply  30 Nov 2003, Konstantin Olchanski, , Implementation of db_check_record() 
> > I have therefore implemented the function 
> > db_check_record(HNDLE hDB, HNDLE hKey, char *keyname, char *rec_str, BOOL
> correct)
> 
> Stephan, something is very wrong with the new code. My
> "/logger/channels/0/settings" is being destroyed on "begin run".

Okey. I found the problem in db_check_record(): when we decide that we have a
mismatch, we call db_create_record(...,rec_str), but by this time, rec_str no
longer points to the beginning of the ODB string because we started parsing it.

I tried this solution: save rec_str into rec_str_orig, then when we decide that
we have a mismatch, call db_create_record() with this saved rec_str_orig. It
fixes my immediate problem (destruction of "/logger/channels/0/settings"), but is
it correct?

I would like to fix it ASAP to get cvs-head working again: our mhttpd dumps core
on an assert() failure in db_create_record() and the set of db_check_record()
changes might fix it for me.

Here is the CVS diff:

RCS file: /usr/local/cvsroot/midas/src/odb.c,v
retrieving revision 1.73
diff -r1.73 odb.c
7810a7811
> char             *rec_str_orig = rec_str;
7820c7821
<     return db_create_record(hDB, hKey, keyname, rec_str);
---
>     return db_create_record(hDB, hKey, keyname, rec_str_orig);
7838c7839
<       return db_create_record(hDB, hKey, keyname, rec_str);
---
>       return db_create_record(hDB, hKey, keyname, rec_str_orig);
8023c8024
<               return db_create_record(hDB, hKey, keyname, rec_str);
---
>               return db_create_record(hDB, hKey, keyname, rec_str_orig);
8037c8038
<               return db_create_record(hDB, hKey, keyname, rec_str);
---
>               return db_create_record(hDB, hKey, keyname, rec_str_orig);

K.O.
          Reply  30 Nov 2003, Stefan Ritt, , Implementation of db_check_record() 
Fixed and committed. Can you check if it's working?
             Reply  01 Dec 2003, Konstantin Olchanski, , Implementation of db_check_record() 
> Fixed and committed. Can you check if it's working?
Yes, it is fixed. Thanks. K.O.
Entry  09 Dec 2003, Paul Knowles, , db_close_record non-local/non-return 
Hi All,

I have found a weird one:

The following code executes on the frontend machine in the
frontend_exit() routine, and connects to the odb running on
another separate machine:
...
     cm_msg(MINFO,__func__, "line %d", __LINE__);

     cm_get_experiment_database(&hdb, NULL);

     cm_msg(MINFO,__func__, "line %d", __LINE__);
     status = db_find_key(hdb, 0, "/Experiment/Run Parameters", &hkey);
     cm_msg(MINFO,__func__, "line %d, hkey=%d, status=%d",
            __LINE__, hkey, status);
     checkstat("db_find_key returned status %d", status);
     cm_msg(MINFO,__func__, "line %d", __LINE__);
     status = db_close_record(hdb, hkey);

     /* NOTREACHED!! the above call to db_close_record
        doesn't return!
      */
     cm_msg(MINFO,__func__, "line %d, status=%d", __LINE__, status);
     checkstat("db_close_record returned status %d", status);

checkstat is a macro that does the following:
#define checkstat(format, arg...)\
do{ if(status != DB_SUCCESS) {\
cm_msg(MERROR, __func__, format, ## arg);\
return FE_ERR_ODB;}}while(0)

The key exists, and the status of the search is 1
(i.e., DB_SUCCESS) and rest of the code tries to run.  What gets
really weird is that the db_close_record _doesn't_ _return_.
The code following the NOTREACHED comment just doesn't get
called.  I get the message from the __LINE__ just in front
of the call, but not the message afterwards (cm_msg and printf 
were tried).  Somehow db_close_record is causing a non-local 
exit or signal or something. No error message is printed and the 
frontend continues to exit with exit code 0.  But, since the rest
of my frontend_exit/odb closing doesn't happen, the odb is left in
a lost state requiring a cleanup.  If I comment out the calls to 
db_close_record, the rest of my frontend_exit runs normally 
and the cm_disconnect_experiment() in mfe.c eventually closes my 
open records correctly (I expect, anyway) and this is the present 
workaround i am using.  The terror i have is that several of my 
hotlinked callback routines will call the close_record routine 
when resetting illegal values.  No end of hilarity will result there...

I was using the same code in the frontend under 1.9.2 and
have only recently upgraded to 1.9.3-? tarball from PAA and 
there were no problems using the 1.9.2 code: this is a 1.9.3
issue.

I have localized the weirdness to what I think is the RPC interface.
Running the nullfrontend (no camac access) on the same machine as 
hosts the ODB I can make the problem appear and disappear in the 
following way:
(odb is local on machine ``monet'')

nullfe -h monet -e acqmonad     : db_close_record will get lost

nullfe -e acqmonad              : db_close_record works as expected.

I've tried also with the patch for the 256 byte odb string bug since
many of the open records have strings of that length, but that isn't
it. The only substancial looking change to mserver from 1.9.2 to 1.9.3
is the SIGPIPE ignore and that doesn't look like a good candidate either.
Can this be that some of the 
   #IFDEF LOCAL_ROUTINES
that got moved about in odb.c and others
are causing the remote call to get confused?

Clearly the answer is to just use stable and happy 1.9.2, but the 
people for whom I am working now really want to use ROOT for
an analyzer...


cheers,
.p.

Paul Knowles.                   phone: 41 26 300 90 64
email: Paul.Knowles@unifr.ch      Fax: 41 26 300 97 47
finger me at pexppc33.unifr.ch for more contact information
    Reply  12 Dec 2003, Stefan Ritt, , db_close_record non-local/non-return 
Hi Paul,

sorry my late reply, I had to find some time for debugging your problem. 
Thank you very much for the detailed description of the problem, I wish all 
bug reports would be such elaborate!

You were right that there was a bug in the RPC system. The function 
db_remove_open_record() got a new parameter recently, which was not changed 
in the RPC call, and caused the mserver side to crash on any 
db_close_record() call.

I fixed it and the update is under CVS (http://midas.psi.ch/cgi-
bin/cvsweb/midas/src/). Since you need to update many files, I wonder if I 
should enable anonymous CVS read access. Does anybody know how to set this 
up using "ssh" as the protocol (via CVS_RSH=ssh)?

Please note that db_close_record() is not necessary as 
cm_disconnect_experiment() takes care of this, but having it there does not 
hurt.
Entry  12 Dec 2003, Stefan Ritt, , Several small fixes and changes 
I committed several small fixes and changes:

- install.txt which mentions explicitly ROOT
- mana.c and the main Makefile which fixes all HBOOK compiler warnings
- mana.c to write an explicit warning if the experiment directoy contains 
uppercase letters in the path (HBOOK does not like this and refuses to 
read/write histos)
- mserver.c, mrpc.c, odb.c to fix a wrong parameter in 
db_remove_open_record() (see previous entry from Paul)
- added experim.h into the dependency of the hbookexpt Makefile
Entry  15 Dec 2003, Stefan Ritt, , Poll about default indent style  
Dear all,

there are continuing requests about the C indent style we use in midas. As 
you know, the current style does not comply with any standard. It is even 
a mixture of styles since code comes from different people. To fix this 
once and forever, I am considering using the "indent" program which comes 
with every linux installation. Running indent regularly on all our code 
ensures a consistent look. So I propose (actually the idea came from Paul 
Knowles) to put a new section in the midas makefile:

indent:
        find . -name "*.[hc]" -exec indent <flags> {} \;

so one can easily do a "make indent". The question is now how the <flags> 
should look like. The standard is GNU style, but this deviates from the 
original K&R style such that the opening "{" is put on a new line, which I 
use but most of you do not. The "-kr" style does the standard K&R style, 
but used tabs (which is not good), and does a 4-column indention which is 
I think too much. So I would propose following flags:

indent -kr -nut -i2 -di8 -bad <filename.c>

Please take some of your source code, and format it this way, and let me 
know if these flags are a good combination or if you would like to have 
anything changed. It should also be checked (->PAA) that this style 
complies with the DOC++ system. Once we all agree, I can put it into the 
makefile, execute it and commit the newly formatted code for the whole 
source tree.
    Reply  18 Dec 2003, Paul Knowles, , Poll about default indent style  
Hi Stefan,

> once and forever, I am considering using the "indent" program which comes 
> with every linux installation. Running indent regularly on all our code 
> ensures a consistent look.

I think this can be called a Good Thing.

> The "-kr" style does the standard K&R style, 
> but used tabs (which is not good), and does a 4-column 
> indention which is I think too much. So I would propose 
> following flags:
>        indent -kr -nut -i2 -di8 -bad <filename.c>

(some of this is a repeat from an earlier mail to SR):
You might also want a -l90 for a longer line length than 75
characters.  K&R style with indentation from 5 to 8 spaces
is a good indicator of complexity: as soon as 40 characters
of code wind up unreadably squashed to the right of the
screen, you have to refactor to have less indentation
levels.  This means you wind up rolling up the inner parts
of deeply nested conditionals or loops as separate
functions, making the whole code easier to understand.

I think that setting -i2 is ``going around the problem'' 
of deep nesting.  If you really need to keep the indentation 
tabs less than 4 (8 is ideal) because your code is falling off the 
right edge of the screen, you are indented too deeply.  Why do 
I say that?  There is the famous ``7+-1'' idea that you can hold
in you head only 7 ideas (give or take one) at any time.  I'm not 
that smart and I top out at about 5:  So for example, a conditional 
in a loop  in a conditional in a switch is about as deep a level 
of nesting as  I can easily understand (remember that I also have 
to hold the line i'm working on as well): that's 4 levels, plus one for the
function itself and we are at 40 characters away from the right edge
of the screen using -i8 and have some 40 characters available for writing code
(how often is a line of code really longer than about 40 characters?).
On top of that, the indentation is easily seen so you know immediately 
wheather you are at the upper conditional, or inner conditional.  A -i2
just doesn't make the difference big enough.  -i5 is a happy balance 
with enough visual clue as to the indentation level, but leaves you 50
to 60 characters for the code line itself.

However, if you are indenting very deeply, then the poor reader can't hold
on to the context: there are more than 6 or 7 things to keep in mind.
In those cases, roll up the inner levels as a separate function and 
call it that way. The inner complexity of the nested statements gets 
nicely abstracted and then dumb people like me can understand what 
you are doing.

So, in brief: indent is a good idea, and -in with n>=4 will be best.
I don't think -i2 will lend itself to making the code so much easier 
to read.

thanks for listening.
.p.
       Reply  18 Dec 2003, Stefan Ritt, , Poll about default indent style  
Hi Paul,

I agree with you that a nesting level of more than 4-5 is a bad thing, but I 
believe that throughout the midas code, this level is not exceeded (my poor 
mind also does not hold more than 5 things (;-) ). An indent level of 8 columns 
alone does hot force you too much in not extending the nesting level. I have 
seen code which does that, so there are nesting levels of 8 and more, which 
ends up that the code is smashed to the right side of the screen, where each 
statement is broken into many line since each line only holds 10 or 20 
characters. All the nice real estate on the left side of the scree is lost.

So having said that, I don't feel a strong need of giving up a "-i2", since the 
midas code does not contain deep nesting levels and hopefully will never have. 
In my opinion, a small indent level makes more use of your screen space, since 
you do not have a large white area at the left. A typical nesting level is 3-4, 
which causes already 32 blank charactes at the left, or 1/3 of your screen, 
just for nothing. It will lead to more lines (even with -l90), so people have 
to scroll more.

What do others think (Pierre, Konstantin, Renee) ?
          Reply  01 Jan 2004, Konstantin Olchanski, , Poll about default indent style  
> I don't feel a strong need of giving up a "-i2"...

I am comfortable with the current MIDAS styling convention and I would rather not
have yet another private religious war over the right location for the curley braces.

If we are to consider changing the MIDAS coding convention, I urge all and sundry
to read the ROOT coding convention, as written by Rene Brun and Fons Rademakers at
http://root.cern.ch/root/Conventions.html. The ROOT people did their homework, they
did read the literature and they produced a well considered and well argumented style.

Also, while there, do read the Taligent documentation- by far, one of the most
coherent manuals to C++ programming style.

K.O.
    Reply  06 Jan 2004, Stefan Ritt, , Poll about default indent style  
Ok, taking all comments so far into account, I conclude adopting the ROOT 
coding style would be best for us. So I put

indent:
	find . -name "*.[hc]" -exec indent -kr -nut -i3 {} \;

Into the makefile. Hope everybody is happy now (;-)))
Entry  14 Jan 2004, Konstantin Olchanski, , First try- midas on darwin/macosx xxx
While watching "The Wizard of Oz", the greatest movie ever made, I took a shot at building 
midas on my macosx computer. After stumbling on a few small and on a few hard problems, I 
built almost everything. However, odb does not work- some further debugging is in order.

Anyway, the easy problems are:
- a few missing header files: pty.h, sys/vfs.h, malloc.h
- a few missing features in system.c (stime(), "get tape position")
- /usr/include/string.h already has strlcpy() & co.
- dbg_malloc() has inconsistent prototypes (size_t vs unsigned int)
- for reasons unknown, PVM is #defined. This flushed a bug in mana.c

A few hard problems:
- namespace pollution by Apple- they #define ALIGN in system headers, colliding with ALIGN 
in midas.h. I was amazed that the two are almost identical, but MIDAS ALIGN aligns to 8 
bytes, while Apple does 4 bytes. ALIGN is used all over the place and I am not sure how to 
reconcile this.
- "timezone" in mhttpd.c. On linux, it's an "int", on darwin, it's a function. What gives?
- building libmidas.a requires running ranlib
- building libmidas.so requires unknown macosx specific magic.

For your enjoyment, the "cvs diff" is attached. The resulting code is known to not work.

K.O.
    Reply  14 Jan 2004, Stefan Ritt, , First try- midas on darwin/macosx 
Great, I got already questions about MacOSX support...

Once it's working, you should commit the changes. But take into account that using "//" for 
comments might cause problems for the VxWorks compiler (talk to Pierre about that!).

> A few hard problems:
> - namespace pollution by Apple- they #define ALIGN in system headers, colliding with ALIGN 
> in midas.h. I was amazed that the two are almost identical, but MIDAS ALIGN aligns to 8 
> bytes, while Apple does 4 bytes. ALIGN is used all over the place and I am not sure how to 
> reconcile this.

You can rename ALIGN to ALIGN8 all over the place.

> - "timezone" in mhttpd.c. On linux, it's an "int", on darwin, it's a function. What gives?

Wrap it into a function get_timezone(). Under linux, just return "timezone", under OSX, 
return timezone() via conditional compiling.

> - building libmidas.a requires running ranlib
> - building libmidas.so requires unknown macosx specific magic.

I guess we should foget for now about the shared libraries (Mac people anyhow have too much 
money so they can affort additional RAM (;-) ), but building the static library is mandatory. 
       Reply  16 Jan 2004, Konstantin Olchanski, , First try- midas on darwin/macosx xxx
> Great, I got already questions about MacOSX support...
> Once it's working, you should commit the changes.

With the ALIGN8() change ODB works, mhttpd works. ALIGN8 change now commited to cvs, verified that "make all" builds 
on Linux.

ROOT stuff still blows up because of more namespace pollution (/usr/include/sys/something does #define Free(x) 
free(blah...)). Arguably, it is not Apple's fault- portable programs should not include any <sys/foo.h> header files. I 
think I can fix it by moving "#include <sys/mount.h>" from midasinc.h to system.h.

Also figured out why PVM is defined- more pollution from "#include <sys/blah...>". This is only in mana.c and I will 
repace every "#ifdef PVM" with "#ifdef HAVE_PVM". Is there documentation that should be updated as well? Alternatively I 
can try to play games with header files...


> But take into account that using "//" for comments might cause problems for the VxWorks compiler (talk to Pierre 
about that!).

Yes, "// comments" stay out of midas. I used them to make the modification more visible.

> You can rename ALIGN to ALIGN8 all over the place.

Done, commited.

> > - "timezone" in mhttpd.c. On linux, it's an "int", on darwin, it's a function. What gives?
> Wrap it into a function get_timezone(). Under linux, just return "timezone", under OSX, 
> return timezone() via conditional compiling.

Right. Still on the todo list.

> > - building libmidas.a requires running ranlib

I still have to cleanup the Makefile. Not commiting it yet.

Then, a new problem- on MacOSX, pthread_t is not an "INT" and system.c:ss_thread_create() whines about it. I want to 
introduce a system dependant THREAD_T (or whatever) and make ss_thread_create() return that, rather than INT.

ROOT stuff is still not fully tested- it takes a little while to build ROOT on a 600MHz laptop.

Attached is my current CVS diff.

K.O.
          Reply  17 Jan 2004, Stefan Ritt, , First try- midas on darwin/macosx 
> With the ALIGN8() change ODB works, mhttpd works. ALIGN8 change now commited to cvs, verified that "make all" builds 
> on Linux.

Verified that "make all" still works under Windows.

> ROOT stuff still blows up because of more namespace pollution (/usr/include/sys/something does #define Free(x) 
> free(blah...)). Arguably, it is not Apple's fault- portable programs should not include any <sys/foo.h> header files. I 
> think I can fix it by moving "#include <sys/mount.h>" from midasinc.h to system.h.

I would like to keep all OS specific #includes in midasinc.h. In worst case put another section there for OSX, like

in midas.h:

#if !defined(OS_MACOSX)
#if defined ( __????__ ) <- put the proper thing here
#define OS_MACOSX
#endif
#endif

then make a new seciton in midasinc.h

#ifdef OS_MACOSX
#include <...>
#endif

> Also figured out why PVM is defined- more pollution from "#include <sys/blah...>". This is only in mana.c and I will 
> repace every "#ifdef PVM" with "#ifdef HAVE_PVM". Is there documentation that should be updated as well? Alternatively I 
> can try to play games with header files...

Right, PVM should be replaced by HAVE_PVM. This is only for the analyzer. I planned at some point to run the analyzer in 
parallel on a linux cluster, but it was never really used. Going to ROOT, that facility should be replaces by PROOF.

> Then, a new problem- on MacOSX, pthread_t is not an "INT" and system.c:ss_thread_create() whines about it. I want to 
> introduce a system dependant THREAD_T (or whatever) and make ss_thread_create() return that, rather than INT.

Good. If you have a OS_MACOSX, that should help you there.

-SR
             Reply  18 Jan 2004, Konstantin Olchanski, , First try- midas on darwin/macosx xxx
> I would like to keep all OS specific #includes in midasinc.h

No go. Here is the problem:

midasinc.h includes sys/mount.h, which #defines Free(x) to be something else
mana.c includes msystem.h, which includes midasinc.h
mana.c includes ROOT header files, which blow up because Free(x) is redefined.

I want this:

mana.c does *not* include sys/mount.h
system.c does include sys/mount.h

Simplest solution is to take sys/mount.h out of midasinc.h and include it in system.c

> Right, PVM should be replaced by HAVE_PVM.

Commited.

> > Then, a new problem- on MacOSX, pthread_t is not an "INT" and system.c:ss_thread_create() whines about it. I want to 
> > introduce a system dependant THREAD_T (or whatever) and make ss_thread_create() return that, rather than INT.
> Good. If you have a OS_MACOSX, that should help you there.

Okey. In Darwin, pthread_t is not an int. It is a pointer to a struct. In midas.c I typedef midas_pthread_t to HANDLE on Windows and to pthread_t n OS_UNIX.

This uncovered a problem with ss_getthandle(). What is it supposed to do? On Windows it returns a handle to the current thread, on OS_UNIX, it returns getpid(). 
What gives? I am leaving it alone for now.

Attached is the current diff. Most changes are in system.c: ss_timezone() and midas_pthread_t. The Makefile part is already commited. Building the shared 
library was made dependant on NEED_SHLIB. Now, building static midas applications is very simple, use "make SHLIB="

K.O.
                Reply  19 Jan 2004, Stefan Ritt, , First try- midas on darwin/macosx 
> I want this:
> 
> mana.c does *not* include sys/mount.h
> system.c does include sys/mount.h
> 
> Simplest solution is to take sys/mount.h out of midasinc.h and include it in system.c

Agree.

> This uncovered a problem with ss_getthandle(). What is it supposed to do? On Windows it returns a handle to the current thread, on OS_UNIX, it returns getpid(). 
> What gives? I am leaving it alone for now.

The Unix version of ss_getthandle() returns the pid since at the time when I wrote that function (many years ago) there were no threads under Unix. It should now 
be replaces with a function which returns the real thread id (at least under Linux).
                   Reply  19 Jan 2004, Konstantin Olchanski, , First try- midas on darwin/macosx 
> > Simplest solution is to take sys/mount.h out of midasinc.h and include it in system.c
> Agree.

Done.

With this, I commited the rest of my changes: midas_thread_t in midas.h, change ss_thread_xxx() prototypes in msystem.h
, implementation in system.c

My cvs diff is now empty.

Midas should compile on Darwin aka macosx, I tested "odbedit" and "mhttpd"- they seem to work.
 
> > This uncovered a problem with ss_getthandle(). 
> The Unix version of ss_getthandle() returns the pid since at the time when I wrote that function (many years ago) there were no threads under Unix. It should now 
> be replaces with a function which returns the real thread id (at least under Linux).

I do not want to touch this. Sorry.

K.O.
Entry  19 Jan 2004, Konstantin Olchanski, , darwin aka macosx changes 
I commited the final bits to make Midas build on Darwin aka macosx.

Here is the summary:

1) I treat Darwin as a funny linux, so OS_LINUX is always defined
2) OS_DARWIN is defined for places where the two differ
3) system dependant directory is "midas/darwin/{bin,lib}"
4) a few header files had to be moved around to dodge namespace pollution by Apple system 
header files (i.e. one of the PowerPC header files #defines PVM- collision with PVM in mana.c, 
another #defines Free(x)- collision with ROOT header files)
5) ss_thread_create() and ss_thread_kill() now use midas_thread_t. On Darwin ptherad_t is not 
an "int".
6) the Makefile has no support for building the midas shared library on macosx.
7) on my Mac OS 10.2.8 machine, "make all" works, "odbedit" and "mhttpd" run. This is the 
full extent of my testing. Status on Mac OS 10.3.x is unknown.

K.O.
Entry  30 Mar 2004, Konstantin Olchanski, , elog fixes elog-fixes.txt
I am about to commit the mhttpd Elog fixes we have been using in TWIST since
about October. The infamous Elog "last N days" problem is fixed, sundry
memory overruns are caught and assert()ed.

For the curious, the "last N days" problem was caused by uninitialized data
in the elog handling code. A non-zero-terminated string was read from a file
and passed to atoi(). Here is a simplifed illustration:

char str[256]; // uninitialized, filled with whatever happens on the stack
read(file,str,6); // read 6 bytes, non-zero terminated
// str now looks like this: "123456UUUUUUUUU....", "U" is uninitialized memory
int len = atoi(str); // if the first "U" happens to be a number, we lose.

The obvious fix is to add "str[6]=0" before the atoi() call.

Attached is the CVS diff for the proposed changes. Please comment.

K.O.
    Reply  30 Mar 2004, Stefan Ritt, , elog fixes 
Thanks for fixing these long lasting bugs. The code is much cleaner now, please
commit it.
Entry  28 Apr 2004, Konstantin Olchanski, , mhttpd "start run" input field length? 
I am setting up a new experiment and I added a "comment" field to "/
Experiment/Edit on start". When I start the run, I see this field, but I
cannot enter anything: the HTML "maxlength" is zero (or 1?). I traced this
to mhttpd.c: if (this is a string) maxlength = key.item_size. But what is
key.item_size for a string? The current length? If so, how do I enter a
string that is longer than the current one (zero in case I start from
scratch). I am stumped! K.O.
    Reply  30 Apr 2004, Stefan Ritt, , mhttpd  
> I am setting up a new experiment and I added a "comment" field to "/
> Experiment/Edit on start". When I start the run, I see this field, but I
> cannot enter anything: the HTML "maxlength" is zero (or 1?). I traced this
> to mhttpd.c: if (this is a string) maxlength = key.item_size. But what is
> key.item_size for a string? The current length? If so, how do I enter a
> string that is longer than the current one (zero in case I start from
> scratch). I am stumped! K.O.

Your problem is that you created a ODB string with zero length. If you do this
through ODBEdit, a default length of 32 is used:

[local:Test:S]Edit on start>cr string Comment
String length [32]:
[local:Test:S]Edit on start>ls -l
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
Comment                         STRING  1     32    2s   0   RWD
[local:Test:S]Edit on start>

which then results in a maxlength of 32 as well during run start. I presume
you used mhttpd itself to create the string. Trying to reporduce this, I found
that mhttpd creates strings with zero length. I will fix this soon. Until
then, use ODBEdit to create your strings.
Entry  06 Jun 2004, Konstantin Olchanski, , Makefile: set -rpath 
I commited Makefile bits to set the RPATH on dynamically linked executables
to find libmidas.so and ROOT shared libraries without setting
LD_LIBRARY_PATH , etc. K.O.
Entry  07 May 2004, Konstantin Olchanski, , min(a,b) in mana.c and mlogger.c 
When I compile current cvs-head midas, I get errors about undefined function
min(). I do not think min() is in the list of standard C functions, so
something else should be used instead, like a MIN(a,b) macro. To make life
more interesting, in a few places, there is also a variable called "min".
Here is the error:

src/mana.c: In function `INT write_event_ascii(FILE*, EVENT_HEADER*, 
   ANALYZE_REQUEST*)':
src/mana.c:2571: `min' undeclared (first use this function)
src/mana.c:2571: (Each undeclared identifier is reported only once for each 
   function it appears in.)
make: *** [linux/lib/rmana.o] Error 1

K.O.
    Reply  07 May 2004, Stefan Ritt, , min(a,b) in mana.c and mlogger.c 
> When I compile current cvs-head midas, I get errors about undefined function
> min(). I do not think min() is in the list of standard C functions, so
> something else should be used instead, like a MIN(a,b) macro. To make life
> more interesting, in a few places, there is also a variable called "min".
> Here is the error:
> 
> src/mana.c: In function `INT write_event_ascii(FILE*, EVENT_HEADER*, 
>    ANALYZE_REQUEST*)':
> src/mana.c:2571: `min' undeclared (first use this function)
> src/mana.c:2571: (Each undeclared identifier is reported only once for each 
>    function it appears in.)
> make: *** [linux/lib/rmana.o] Error 1

This is really a miracle to me. The min/max macros are defined both in midas.h
and msystem.h and worked the last ten years or so. However, I agree that macros
should follow the standard and use capital letters, so I changed that.
       Reply  21 Jun 2004, Piotr Zolnierczuk, , min(a,b) in mana.c and mlogger.c 
> > When I compile current cvs-head midas, I get errors about undefined function
> > min(). I do not think min() is in the list of standard C functions, so
> > something else should be used instead, like a MIN(a,b) macro. To make life
> > more interesting, in a few places, there is also a variable called "min".
> > Here is the error:
> > 
> > src/mana.c: In function `INT write_event_ascii(FILE*, EVENT_HEADER*, 
> >    ANALYZE_REQUEST*)':
> > src/mana.c:2571: `min' undeclared (first use this function)
> > src/mana.c:2571: (Each undeclared identifier is reported only once for each 
> >    function it appears in.)
> > make: *** [linux/lib/rmana.o] Error 1
> 
> This is really a miracle to me. The min/max macros are defined both in midas.h
> and msystem.h and worked the last ten years or so. However, I agree that macros
> should follow the standard and use capital letters, so I changed that.

The problem is that /usr/include/c++/3.*/bits/stl_algobase.h contains 
#undef min
#undef max

and in C++ with STL one should really use something like this
    std::min<INT>(a,b)


Cheers
  Piotr
Entry  22 Jun 2004, Exaos Lee, , How to compile under Darwin-gcc? (MacOS X) 
I add the following to makefile and try to treat Darwin as FreeBSD/Linux.
But I failed.
============ 
#-----------------------
# This is for MacOS X
#
ifeq ($(OSTYPE), Darwin)
CC = gcc
OS_DIR = Darwin
OSFLAGS = -DOS_DARWIN -DOS_LINUX
LIBS = -lbsd -lcompat
SPECIFIC_OS_PRG =
endif
============

I got the following errors:
=============
gcc -c -g -O2 -Wall -Iinclude -Idrivers -LDarwin/lib -DINCLUDE_FTPLIB  
-DOS_DARWIN -DOS_FREEBSD -o Darwin/lib/midas.o src/midas.c
In file included from include/midasinc.h:45,
                 from include/msystem.h:114,
                 from src/midas.c:623:
/usr/include/string.h:112: error: conflicting types for `strlcat'
include/midas.h:1701: error: previous declaration of `strlcat'
/usr/include/string.h:113: error: conflicting types for `strlcpy'
include/midas.h:1700: error: previous declaration of `strlcpy'
In file included from include/msystem.h:114,
                 from src/midas.c:623:
include/midasinc.h:161:21: sys/vfs.h: No such file or directory
include/midasinc.h:164:17: pty.h: No such file or directory
src/midas.c:780: error: conflicting types for `dbg_malloc'
include/midas.h:1478: error: previous declaration of `dbg_malloc'
src/midas.c:817: error: conflicting types for `dbg_calloc'
include/midas.h:1479: error: previous declaration of `dbg_calloc'
src/midas.c:858: error: conflicting types for `strlcpy'
/usr/include/string.h:113: error: previous declaration of `strlcpy'
src/midas.c:892: error: conflicting types for `strlcat'
/usr/include/string.h:112: error: previous declaration of `strlcat'
gmake: *** [Darwin/lib/midas.o] Error 1
==========

Could anyone give me some hints. Thanks!
    Reply  22 Jun 2004, Konstantin Olchanski, , How to compile under Darwin-gcc? (MacOS X) 
The current (cvs) version of MIDAS should build on Mac OS X right out of the
box- I fixed all the problems you report back in February(?)- see the macosx
thread in this forum. A few weeks ago I verified that it still compiles on Mac
OS 10.3.4. The Mac OS port received minimal testing- I checked that "odbedit"
and "mhttpd" run, that's about it. K.O.

> I add the following to makefile and try to treat Darwin as FreeBSD/Linux.
> But I failed.
> ============ 
> #-----------------------
> # This is for MacOS X
> #
> ifeq ($(OSTYPE), Darwin)
> CC = gcc
> OS_DIR = Darwin
> OSFLAGS = -DOS_DARWIN -DOS_LINUX
> LIBS = -lbsd -lcompat
> SPECIFIC_OS_PRG =
> endif
> ============
> 
> I got the following errors:
> =============
> gcc -c -g -O2 -Wall -Iinclude -Idrivers -LDarwin/lib -DINCLUDE_FTPLIB  
> -DOS_DARWIN -DOS_FREEBSD -o Darwin/lib/midas.o src/midas.c
> In file included from include/midasinc.h:45,
>                  from include/msystem.h:114,
>                  from src/midas.c:623:
> /usr/include/string.h:112: error: conflicting types for `strlcat'
> include/midas.h:1701: error: previous declaration of `strlcat'
> /usr/include/string.h:113: error: conflicting types for `strlcpy'
> include/midas.h:1700: error: previous declaration of `strlcpy'
> In file included from include/msystem.h:114,
>                  from src/midas.c:623:
> include/midasinc.h:161:21: sys/vfs.h: No such file or directory
> include/midasinc.h:164:17: pty.h: No such file or directory
> src/midas.c:780: error: conflicting types for `dbg_malloc'
> include/midas.h:1478: error: previous declaration of `dbg_malloc'
> src/midas.c:817: error: conflicting types for `dbg_calloc'
> include/midas.h:1479: error: previous declaration of `dbg_calloc'
> src/midas.c:858: error: conflicting types for `strlcpy'
> /usr/include/string.h:113: error: previous declaration of `strlcpy'
> src/midas.c:892: error: conflicting types for `strlcat'
> /usr/include/string.h:112: error: previous declaration of `strlcat'
> gmake: *** [Darwin/lib/midas.o] Error 1
> ==========
> 
> Could anyone give me some hints. Thanks!
       Reply  23 Jun 2004, Exaos Lee, , How to compile under Darwin-gcc? (MacOS X) 
> The current (cvs) version of MIDAS should build on Mac OS X right out of the
> box- I fixed all the problems you report back in February(?)- see the macosx
> thread in this forum. A few weeks ago I verified that it still compiles on Mac
> OS 10.3.4. The Mac OS port received minimal testing- I checked that "odbedit"
> and "mhttpd" run, that's about it. K.O.
> 

Thanks a lot. But I cannot checkout module:
------------
01:52:16: pc2075.psi.ch: Operation timed out
01:52:16: cvs [checkout aborted]: end of file from server (consult above
messages if any)
------------

Could anybody add download tar package in the WWW interface of CVS repository.
I know the original CGI script has such a feature. Thanks.

P.S.
I use these commands to checkout:
   cvs -e ssh -d :ext:cvs@midas.psi.ch:/usr/local/cvsroot checkout midas
   cvs -e ssh -d :ext:cvs@midas.psi.ch:/usr/local/cvsroot update
          Reply  23 Jun 2004, Stefan Ritt, , How to compile under Darwin-gcc? (MacOS X) 
> Thanks a lot. But I cannot checkout module:
> ------------
> 01:52:16: pc2075.psi.ch: Operation timed out
> 01:52:16: cvs [checkout aborted]: end of file from server (consult above
> messages if any)
> ------------

Should work fine, just tried from outside PSI. Please check again.

> Could anybody add download tar package in the WWW interface of CVS repository.
> I know the original CGI script has such a feature. Thanks.

The tar package is only done for a new release (which will happen in the next days
BTW), so http://midas.psi.ch/download/tar/ contains the most recent packages. Upon
request I make a midas-snapshot.tar.gz, but since there will be a 1.9.4 soon, it's
maybe not necessary right now.
             Reply  23 Jun 2004, Exaos Lee, , How to compile under Darwin-gcc? (MacOS X) 
> 
> Should work fine, just tried from outside PSI. Please check again.

Unfortunately, I still encounter the same problem. 
---
pc2075.psi.ch: Operation timed out
cvs [checkout aborted]: end of file from server (consult above messages if any)
---

I am in LNS-INFN (Italy), i.e., I am outside PSI. So ... what's the problem? I try to
ping the host, and it is reachable:
--------
[exaos@exaos cvsnew]$ ping midas.psi.ch
PING pc2075.psi.ch (129.129.228.23): 56 data bytes
64 bytes from 129.129.228.23: icmp_seq=0 ttl=50 time=67.237 ms
64 bytes from 129.129.228.23: icmp_seq=1 ttl=50 time=64.202 ms
64 bytes from 129.129.228.23: icmp_seq=2 ttl=50 time=56.278 ms
...
--------
Is it the problem of firewall? I am not sure. So strange.

> 
> The tar package is only done for a new release (which will happen in the next days
> BTW), so http://midas.psi.ch/download/tar/ contains the most recent packages. Upon
> request I make a midas-snapshot.tar.gz, but since there will be a 1.9.4 soon, it's
> maybe not necessary right now.

Waiting for the new release ...
             Reply  28 Jun 2004, Exaos Lee, , Linking Error: g++ -rpath? 
I cannot checkout from the cvs server. So I download each latest file from the WWW
interface of CVS. While compiling these files, I encountered the following problems:
-------------
...
g++ -DHAVE_ROOT -c -g -O2 -Wall -Iinclude -Idrivers -Ldarwin/lib -DINCLUDE_FTPLIB  
-DOS_LINUX -DOS_DARWIN -DHAVE_STRLCPY -fPIC -Wno-unused-function -D_REENTRANT
-I/sw/include -I/opt/root/current/include -Wl,-rpath,/opt/root/current/lib -o
darwin/lib/rmana.o src/mana.c
g++: -rpath: linker input file unused because linking not done
g++: /opt/root/current/lib: linker input file unused because linking not done
...
g++ -g -O2 -Wall -Iinclude -Idrivers -Ldarwin/lib -DINCLUDE_FTPLIB   -DOS_LINUX
-DOS_DARWIN -DHAVE_STRLCPY -fPIC -Wno-unused-function -DHAVE_ROOT -D_REENTRANT
-I/sw/include -I/opt/root/current/include -Wl,-rpath,/opt/root/current/lib -o
darwin/bin/mlogger src/mlogger.c darwin/lib/libmidas.a -L/opt/root/current/lib -u
_G__cpp_setupG__Hist -u _G__cpp_setupG__Graf1 -u _G__cpp_setupG__G3D -u
_G__cpp_setupG__GPad -u _G__cpp_setupG__Tree -u _G__cpp_setupG__Rint -u
_G__cpp_setupG__PostScript -u _G__cpp_setupG__Matrix -u _G__cpp_setupG__Quadp -u
_G__cpp_setupG__Physics -lCore -lCint -lHist -lGraf -lGraf3d -lGpad -lTree -lRint
-lPostscript -lMatrix -lQuadp -lPhysics -lpthread -lm -L/sw/lib -ldl -lpthread
ld: unknown flag: -rpath
gmake: *** [darwin/bin/mlogger] Error 1
---------------
What does '-rpath' mean? It is just a linking error. Thanks.
                Reply  28 Jun 2004, Konstantin Olchanski, , Linking Error: g++ -rpath? 
> ld: unknown flag: -rpath
> gmake: *** [darwin/bin/mlogger] Error 1

Fixed. Good catch.

> What does '-rpath' mean?

You will have to read the "ld" manual. In the nutshell, it tells the executable where to look for shared libraries. 
Aparently it is not supported by Mac OS X.

K.O.
Entry  30 Jun 2004, Piotr Zolnierczuk, , mvme167 problems 
Hi,
 I am really puzzled: I am running the very same as far as sources
are concerned (Dec 12, 2003 snapsot) midas frontend (miniexp + camacnul)
on two different machines (and the same trusted private network):

1) one is an ancient Pentium/100 MHz laptop with RedHat Linux 7.3 and 
2) another one is event more ancient MVME167 25MHz running VxWorks 5.4.2

The front end on my Linux PC works just fine, whereas on the MVME167
I get intermittent crashes (most often at the end of the run).
[Correction: the crashes happen, I think, when the frontend wants 
to update the ODB]

The crashes happen in db_set_record routines

Any ideas what might be wrong? 
Except that MVME167 is a piece of ...#@!% 

Piotr
    Reply  30 Jun 2004, Piotr Zolnierczuk, , mvme167 problems 
A followup: I traced back the problem to version 1.9.2.

Version 1.9.1 does not have this problem but 1.9.2 does. 
For now I stick with 1.9.1

Piotr
 
Entry  14 Jul 2004, Exaos Lee, , install problem of Makefile on MacOS X (Darwin 7.4.0, gcc 3.3) 
I have compiled the sources on Darwin 7.4.0 with gcc 3.3. After the compilation of source codes, I 
try to execute "gmake install". I got the following message:
-------
Nothing to be done for "install".
-------

The install target could not be executed. Then I add the following line to the Makefile:
------
.PHONY: install
------

The install target can be executed. But when it is tring to copy "dio" to the proper directory, it 
cannot find the file. Then I found that the "utils" target isn't built. 
I try to build the target: darwin/bin/dio, I got the following error:
-------
cc -g -O2 -Wall -Iinclude -Idrivers -Ldarwin/lib -DINCLUDE_FTPLIB   -DOS_LINUX -DOS_DARWIN 
-DHAVE_STRLCPY -fPIC -Wno-unused-function -o darwin/bin/dio utils/dio.c
utils/dio.c:39:20: sys/io.h: No such file or directory
utils/dio.c: In function `main':
utils/dio.c:46: warning: implicit declaration of function `iopl'
gmake: *** [darwin/bin/dio] Error 1
--------
So, the include file "sys/io.h" may be changed under Darwin. I don't know how. I will try later. I 
hope somebody can notice this. 
Best regards.
    Reply  14 Jul 2004, Exaos Lee, , install problem of Makefile on MacOS X (Darwin 7.4.0, gcc 3.3) 
There are not such a file "io.h" inside my MacOS X. In fact, I didn't find any file containing function iopl().
So what is the equivalent function of iopl() under MacOS X? The utility dio should be modified in order to be compiled under Darwin-
gcc platform. 
       Reply  14 Jul 2004, Konstantin Olchanski, , install problem of Makefile on MacOS X (Darwin 7.4.0, gcc 3.3) 
> There are not such a file "io.h" inside my MacOS X. In fact, I didn't find any file containing function iopl().
> So what is the equivalent function of iopl() under MacOS X? The utility dio should be modified in order to be compiled under Darwin-
> gcc platform. 

"dio" is not supported under MacOSX. It is used to grant user programs access to PCI and ISA cards (usually CAMAC interfaces). We have
no MacOSX hardware with PCI or ISA slots so we cannot test and support this functionality.

The MacOSX Makefile should not try to build "dio". I will accept a patch to fix this Makefile bug.

K.O.
Entry  31 Aug 2004, Konstantin Olchanski, , mlogger crash if using mserver. 
Our users keep making a simple mistake- they set MIDAS_SERVER_HOST in their
environement. Most midas programs do not mind this- they go through the
mserver, inefficient but benign- except for the mlogger, which dumps core
about 10 seconds after starting. This mightily confuses the users-
everything works perfectly, except for the mlogger, (for most users) the
most obscure and magical part of midas. Obviously they can't take data
without the mlogger and they fail to correlate this crash with editing their
.cshrc file, so we get panic calls at midnight or whenever. And every time,
while debugging midas malfunctions, changes to .cshrc is absolutely the last
place we look for. Ouch!

As it turns out, mlogger does crash if it uses the mserver-
log_system_history() calls db_lock_database(), with a prompt crash because
the mlogger is not directly connected to any ODB (it's mserver is).

Obviously, running the mlogger via the mserver makes no sense, but we should
warn about this rather than dump core.

I propose this patch to src/mlogger.c::log_system_history():

-   db_lock_database(hDB);
-   db_notify_clients(hDB, hist_log[index].hKeyVar, FALSE);
-   db_unlock_database(hDB);
-
+   if (!rpc_is_remote())
+     {
+       db_lock_database(hDB);
+       db_notify_clients(hDB, hist_log[index].hKeyVar, FALSE);
+       db_unlock_database(hDB);
+     }
+   else
+     {
+       cm_msg(MERROR, "log_system_history", "Warning: mlogger is running
remotely via the mserver. This is an unsupported configuration. Please unset
MIDAS_SERVER_HOST and restart the mlogger");
+     }

K.O.
    Reply  07 Sep 2004, Stefan Ritt, , mlogger crash if using mserver. 
I trapped myself into that problem recently so it's the right time to fix it (;-).

We have two options: 

a) Make the logger work remotely, even if it's suboptimal and 
b) Make the logger refuse to run remotely. 

I have no case where I need to run the logger remotely, so I would opt for b).
This would mean removing the "-h" command line switch and the evaluation of
MIDAS_SERVER_HOST, or just supplying an empty host string to
cm_connect_experiment().

Let me know if you agree, I can then remove the "-h" option. The patch you
suggested I would apply in addition.

- Stefan
       Reply  15 Sep 2004, Konstantin Olchanski, , mlogger crash if using mserver. 
> I trapped myself into that problem recently so it's the right time to fix it (;-).
> We have two options: 
> a) Make the logger work remotely, even if it's suboptimal and 
> b) Make the logger refuse to run remotely.

After some discussion between Stefan, Pierre and myself, it was decided to disallow
running mlogger remotely via the mserver.

K.O.
Entry  31 Aug 2004, Konstantin Olchanski, , midas odb locking 
One of our experiments is suffering from periodic ODB corruption and I
suspected that there might be a problem with ODB locking. In the last few
days, I finally had time to read the ODB locking code, to write a little
test program and to play with ODB. This is what I found:

1) ODB locking appears to be sound, my test program failed to find any
locking flaws, except for a big problem, described below. Please read on.

2) ODB locking is "unfair". A program "while (1) { lock(); do_stuff();
unlock(); /* no sleep here */ }" would lock out other users of ODB
(including odbedit) for seconds and minutes at a time. I see this as a flaw
in the semop() implementation in the Linux kernel and I cannot think of an
easy way to fix it in our code. (I tested only on RHL9 2.4.20-31.9smp on a
dual CPU machine. 2.6 kernels may work better).

3) presently, we use an infinite timeout waiting for the ODB lock. I suggest
we set the timeout to, say, 5 minutes, to protect against dead (or live)
locks that we saw a few times here at TRIUMF- every ODB client would hang
without any error messages or explanations forever waiting for the ODB lock
that is held by some rogue ODB client stuck in an infinite loop in corrupted
ODB.

4) while reading the locking code in db_{lock,unlock}_database(), I thought
that there is a race condition against the "lock_cnt" variable, until I
realized that this variable is local and there is no race condition. I would
like to comment this in the code?

5) I found a failure mode where db_close_database() erroneously deletes the
lock semaphore. Once the semaphore is deleted, ODB locking silently fails
(in db_lock_database() we do not check for success status of
mutex_wait_for()) and remaining ODB clients operate without locking protection.

This failure happens after ODB undercounts active clients after losing track
of clients removed by "idle timeouts" (and by other checks?). At some point,
db_close_database() decides that there are no more clients left, attempts to
delete the shared memory (this fails because there are still active clients
attached) and deletes the lock semaphore. Afterwards, the remaining "lost"
active clients operate without lock protection. This would tend to happen
while shutting down all clients, a time when they all rush-in to delete
themselves from "/system/clients", unsuring ODB corruption.

A quick solution I just coded would not work for mmap()-based shared memory
(I destroy the lock semaphore after the ODB shared memory is destroyed) as
this relies on "shm_nattch" counting feature of System-V shared memories,
absent in the mmap() based shared memories. Since the Windows implementation
uses mmap(), my "solution" is an obvious no-go. Alternatives would be to add
a second semaphore, just for counting active ODB clients (kluncky); or never
delete the semaphore in the first place (dirty, and how does one clear it if
it gets stuck in the locked state?).

For now, I would like to add a check to ss_mutex_waitfor() call in
db_lock_database() and crash if we can't get the mutex. Returning an error
code would be cleaner, but would not work because nobody checks the return
status of db_lock_database(). If can't get the mutex for (say) 5 minutes, I
think we should crash, too- something is very wrong and it is pointless to
continue waiting.

K.O.
    Reply  15 Sep 2004, Konstantin Olchanski, , midas odb locking 
After some discussion with Stefan-

> 1) ODB locking appears to be sound...
> 2) ODB locking is "unfair"

Stefan reminded me that "priority boosting" is the standard solution for this
problem. Since Linux does not appear to implement this, we may try doing it inside
midas, time permitting. "Fairness" behaviour of Win32, BSD and MacOSX may be worth
investigating.

> 3) presently, we use an infinite timeout waiting for the ODB lock.

I will add a timeout of 10 minutes, then shutdown the ODB client with an error message.

> 4) in db_{lock,unlock}_database(), [there is no] race condition against the
"lock_cnt" variable [because it is local].

I will document this.

> 5) I found a failure mode where db_close_database() erroneously deletes the
> lock semaphore. Once the semaphore is deleted, ODB locking silently fails
> (in db_lock_database() we do not check for success status of
> mutex_wait_for()) and remaining ODB clients operate without locking protection.

I will add a check and shutdown the ODB client with an error message if the lock
cannot be obtained (the mutex was deleted, the "lock" system call returns an error,
etc).

> [how to decide when the last ODB client disconnected from the shared memory and
when to delete the lock semaphore?]

We considered using a counting semaphore to count active ODB clients, if counting
semaphores do the right things on all supported systems (Linux, Win32, MacOSX).

K.O.
       Reply  16 Sep 2004, Stefan Ritt, , midas odb locking 
> I will add a timeout of 10 minutes, then shutdown the ODB client with an error message.

I added a timeout handling to db_lock_database. It was already present in
ss_mutex_wait_for, so it was just a matter of passing the status up the calling stack.
ODBEdit stops if it cannot obtain a lock after 5 minutes.
Entry  21 Sep 2004, Konstantin Olchanski, , ODB-EPICS gateway 
At TRIUMF, we use several different versions of code to interface MIDAS and
EPICS (http://www.aps.anl.gov/epics). Now that we more or less understand
our needs, I propose this design for a simplified "EPICS" MIDAS frontend. I
would like to keep this new front end in the MIDAS CVS repository, possibly
replacing the existing EPICS frontend in examples/epics.

The basic idea is to provide an ODB-driven bi-directional gateway between
EPICS and ODB with this functionality: periodically read EPICS data and save
it in ODB, optionally generate MIDAS events with EPICS data; for writing
data to EPICS, use hotlinks- if the user changes "write" variables in ODB,
the changes are sent to EPICS.

1) ODB structure
   /equipment/epicsgw/
     common/...
     statistics/...
     variables/
       epics[...] <--- EPICS->ODB data (double[])
       write[...] <--- ODB->EPICS data (double[])
     settings/
       num epics  <--- number of epics variables (int)
       num write  <--- number of write variables (int)
       names epics <--- human-readable names for EPICS variables (string[])
       names write <--- human-readable names for EPICS variables (string[])
       chans epics <--- EPICS channels for epics-read data (string[])
       chans write <--- EPICS channels for epics-write data (string[])
       period      <--- EPICS read period in milliseconds (int[])
       enable epics <-- enable (y/n) epics-read (bool[])
       enable write <-- enable (y/n) epics-write (bool[])
       enable events <- enable event generation (bool)

2) EPICS to ODB data path: periodically read each enabled "epics" variable
and write the data values to ODB. Other front ends can hotlink the "epics"
variables to receive updated epics data.

3) ODB to EPICS data path: monitor the hotlink to ".../variables/write". If
data changes, send the changes to EPICS. At startup, write all "write"
variables to EPICS.

4) event generation: TBD.

5) error handling: TBD.

K.O.
    Reply  21 Sep 2004, Stefan Ritt, , ODB-EPICS gateway 
The easiest way to achieve this is to write a new class driver, probably derived
from the multi.c class driver. One has just to rename all "output" with "write"
(or better "ODB2EPICS") and all "input" with "EPICS2ODB". The multi class driver
handles already a factor/offset for each channel (which could be 1/0 of course),
a threshold to update the ODB/EPICS only when a value changes significantly, to
retrieve labes from the bus driver (EPICS labes -> ODB settings), automatic
event generation and error handling. So it would be a good starting point.

What one gets from the class driver in the ODB is:

  /equipment/<name>/
     variables/
        Input[]     <--- read from the bus driver (float)
        Output[]    <--- witten to the bus driver (float)
     settings/
        Names Input[]        <--- human readable names
        Names Output[]       <--- human readable names
        Update Threshold[]
        Input Offset[]
        Input Factor[]
        Output Offset[]
        Output Factor[]
        Devices/
           Input/
              DD/   <--- parameters for Device Driver
                 ... Epics addresses, flags etc.
              BD/   <--- parameters for Bus Driver
           Output/
        
So if one uses the standard mfe.c code together with the multi.c class driver
and epics_ca.c device driver all what is left is the following:

- replace cd_gen.c by multi.c in the examples/epics directory
- break down the already existing flags into enable epics/write/events
- maybe add th EPICS read period

The last two things should be done in the epics_ca.c device driver, so one can
use the multi.c class driver without any change. Event generation and error
handling then comes for free.
Entry  02 Jul 2003, Pierre-André Amaudruz, , Midas/ROOT Analyser situation midas-root.jpg
The current and future situation of the Midas analyzer is summarized in the
attachment below.

Box explanation:
================
Front end:
---------
Midas code for accessing/gathering the hardware information into the Midas
format.

Midas SHM:
---------
Midas back end shared memory where the front end data are sent to.

mlogger:
-------
Data logger collecting the midas events and storing them on a physical
logging device (Disk, Tape)

Midas Analyzer:
--------------
Midas client for event-by-event analysis. Incoming data can be either online
or offline.

mserver:
-------
Subprocess interfacing external (remote) midas client to the centralized
data collection and database system.

PAW:
---
Standalone physics data analyzer (CERN).

ROOT:
----
Standalone Physics data analyser (CERN).


This diagram represents the data path from the Frontend to the analyzer in
online and offline mode. Each data path is annoted with a circled number
discussed below. In all cases, the data will flow from the front end
application to the midas back end data buffers which reside in a specific
share memory for a given experiment.

Path:
(1): From the shared memory, the midas analyzer can request events directly
and process them for output to divers destination.

(2): The data logger is a specific application which stores all the data to
 a storage media such as a disk or tape. This path is specific to the
creation of file.mid file format. The actual storage file in this .mid
format can be readout later on by the midas analyzer.

(3): The Midas analyzer has been developed originally for interfacing to the
PAW analyzer which uses its own shared memory segment for online display.
The analyzer can also save the data into a specific data format consistent
with PAW (HBOOK and Ntuples, extension .rz).

(4): Presently the data logger support a creation of the ROOT file format.
This file contains in the form of a Tree the midas event-by-event data. This
file is fully compatible with ROOT and therefore can be read out by the
standard ROOT application.

(5): Equivalent to the data logger, the analyzer receiving from the data
buffer or reading from a .mid file data can apply an event-by-event analysis
and on request produce a compliant ROOT file for further analysis. This
.root file can be composed of Trees as well as histograms.

(6): The possibility of ONLINE ROOT analysis has been implemented in a first
stage through the TMapFile (ROOT shared memory). While this configuration is
still in use an experiment, the intention is to deprecate it and replace it
with the data path (7).

(7): This path uses the network socket channel to transfer data out of the
analyzer to the ROOT environment. The current analyzer has a limited support
for ROOT analysis by only publishing on request the Midas analysis built in
histograms. No mean is yet implemented for Tree passing mechanism.

(8): The pass has not been yet investigated, but ROOT does provide
accessibility to external function calls which makes this option possible.
The ROOT framework will then perform dedicated event call to the main midas
data buffer using the standard midas communication scheme. The data format
translation from Midas banks to ROOT format will have to be taken care at
the user level in the ROOT environment.


Discussion:
==========
Presently the Socket communication between Midas and ROOT (7) is under
revision by Stefan Ritt and René Brun. This revision will simplify the
remote access of an object such as an histogram. For the Tree itself, the
requirement would be to implement a "ring buffer" mechanism for remote tree
request. This is currently under discussion.

The path (8) has been suggested by Triumf to address small experiment setup
where only a single analyzer is required. This path minimize the DAQ
requirements by moving all the data analysis handling to the user.
The same ROOT analysis code would be applicable to a ONLINE as well as
OFFLINE analysis.

Cons:
- Necessity of publishing raw data through the network for every instance of
the remote analyzer.
- Result sharing of the analysis cannot be done yet in real time.

Pros:
- No need of extra task for data translation (midas/root).
- Unique data unpacking code part of the user code.
- Less CPU requirement.

Other issues:
============
- The current necessity of the Midas shared memory for the midas analyzer to
run is a concern in particular for offline analysis where a priori no midas
is available. 

- The handling of the run/analyzer parameters. Possible parameter extraction
from file.odb.
Entry  15 Dec 2003, Pierre-André Amaudruz, , ROOT GUI at Triumf 
The current Triumf DAQ standard (Midas) since the second quarter of this
year (2003) has the capability to deal with ROOT histograms. The internal
midas logger can save data files in ROOT format and the analyzer can book
and fill ROOT histograms. These features triggered a new project started
during summer 2003 for building a Triumf GUI ROOT/Midas display utility.

The initial requirements for this utility are:
1) Solely based on ROOT (VirtualX, no Qt)
2) Similar overall functionality than PAW.
   - Open concurrent ROOT files.
   - Open connection to a single Midas Online experiment (requires analyzer
                                                          as server)
   - Optional Auto-update in ONLINE mode.
   - Zoning, Zooming option display.
   - Simple Historgram gaphic manipulation. (based on current ROOT
                                             implementation)
   - Tree manipulation ( use of TBrowser())
   - Simple user script invocation.
   - Optional experiment specific customization.
3) Session configuration save/restore option.

An initial version has been developed and currently is under evaluation.
Improvement and further development will based on the local experimenters
responses. 

This utility will be available for external use around the second quarter of
the 2004 at the latest.

.
Entry  05 Dec 2003, Konstantin Olchanski, , HOWTO setup MIDAS ROOT tree analysis 
> root -l
root> TFile *f = new TFile("run00064.root")
root> TTree *t = f->Get("Trigger")
root> t->StartViewer() // look at the ROOT TTree
root> t->MakeSelector() // generates Trigger.h, Trigger.C

edit run.C, the main program:
{
gROOT->Reset();
TFile f("data/run00064.root");
TTree *t = f.Get("Trigger");

TH1D* adc8 = new TH1D("adc8","ADC8",1500,0,1500-1);
TH1D* tdc2 = new TH1D("tdc2","TDC2",1500,0,1500-1);
TH2D* h12 = new TH2D("h2","ADC8 vs TDC2",100,0,1500,100,0,1500);
TH2D* h12cut = new TH2D("h2cut","ADC8 vs TDC2",50,0,1000-1,50,0,1500);

TSelector *s = TSelector::GetSelector("Trigger.C");
t->Process(s);

adc8->Draw();
tdc2->Draw();
h12->Draw();
h12cut->Draw();
}

edit Trigger.C:

Bool_t Trigger::ProcessCut(Int_t entry)
{
  fChain->GetTree()->GetEntry(entry);
  if (entry%100 == 0) printf("entry %d\r",entry);
  return kTRUE;
}

void Trigger::ProcessFill(Int_t entry)
{
  adc8->Fill(ADCS_ADCS[8]);
  tdc2->Fill(TDCS_TDCS[2]);
  h12->Fill(TDCS_TDCS[2],ADCS_ADCS[8]);
  if (ADCS_ADCS[8] > 100)
    h12cut->Fill(TDCS_TDCS[2],ADCS_ADCS[8]);
}

Run the analysis:

root -l
root> .x run.C

K.O.
    Reply  20 Jul 2004, Konstantin Olchanski, , HOWTO setup MIDAS ROOT tree analysis 
Updating the instructions to ROOT version 3.10.2. Example is from TRIUMF-KOPIO
tree analysis.

shell> root -l
root> TFile *f = new TFile("run00064.root")
root> Trigger->MakeSelector("TriggerSelector")  // "Trigger" is the tree name
inside the root file. Generates TriggerSelector.h and TriggerSelector.cpp

= edit run.C, the main program:

{
gROOT->Reset();
TSelector *s = TSelector::GetSelector("TriggerSelector.C");

TChain chain("Trigger"); // "Trigger" is the tree name inside the root files
chain.Add("run03016.root"); // can chain multiple files

TH1D* tdc2 = new TH1D("tdc2","TDC2",1500,0,1500-1);

chain.Process(s,"",500); // process 500 events
//chain.Process(s); // or process all events

tdc2->Draw();
}

= edit TriggerSelector.h:

in the TriggerSelector class members, i.e. "UInt_t TDC1_TDC1[47];" edit the
array size to be bigger than the maximum possible bank size

= edit TriggerSelector.C:

Bool_t TriggerSelector::Process(Int_t entry)
{
  fChain->GetTree()->GetEntry(entry);

  if (entry%100 == 0)
    printf("process %d, nTDC %3d, 0x%08x\n",entry,TDC1_nTDC1,TDC1_TDC1[1]);

  tdc2->Fill(TDC1_nTDC1);
  return kTRUE;
}

= Run the analysis:

shell> root -l
root> .x run.C
 
K.O.
Entry  05 Dec 2003, Konstantin Olchanski, , HOWTO setup MIDAS ROOT tree analysis 
> root -l
root> TFile *f = new TFile("run00064.root")
root> TTree *t = f->Get("Trigger")
root> t->StartViewer() // look at the ROOT TTree
root> t->MakeSelector() // generates Trigger.h, Trigger.C

edit run.C, the main program:
{
gROOT->Reset();
TFile f("data/run00064.root");
TTree *t = f.Get("Trigger");

TH1D* adc8 = new TH1D("adc8","ADC8",1500,0,1500-1);
TH1D* tdc2 = new TH1D("tdc2","TDC2",1500,0,1500-1);
TH2D* h12 = new TH2D("h2","ADC8 vs TDC2",100,0,1500,100,0,1500);
TH2D* h12cut = new TH2D("h2cut","ADC8 vs TDC2",50,0,1000-1,50,0,1500);

TSelector *s = TSelector::GetSelector("Trigger.C");
t->Process(s);

adc8->Draw();
tdc2->Draw();
h12->Draw();
h12cut->Draw();
}

edit Trigger.C:

Bool_t Trigger::ProcessCut(Int_t entry)
{
  fChain->GetTree()->GetEntry(entry);
  if (entry%100 == 0) printf("entry %d\r",entry);
  return kTRUE;
}

void Trigger::ProcessFill(Int_t entry)
{
  adc8->Fill(ADCS_ADCS[8]);
  tdc2->Fill(TDCS_TDCS[2]);
  h12->Fill(TDCS_TDCS[2],ADCS_ADCS[8]);
  if (ADCS_ADCS[8] > 100)
    h12cut->Fill(TDCS_TDCS[2],ADCS_ADCS[8]);
}

Run the analysis:

root -l
root> .x run.C

K.O.
    Reply  20 Jul 2004, Konstantin Olchanski, , HOWTO setup MIDAS ROOT tree analysis 
Updating the instructions to ROOT version 3.10.2. Example is from TRIUMF-KOPIO
tree analysis.

shell> root -l
root> TFile *f = new TFile("run00064.root")
root> Trigger->MakeSelector("TriggerSelector")  // "Trigger" is the tree name
inside the root file. Generates TriggerSelector.h and TriggerSelector.cpp

= edit run.C, the main program:

{
gROOT->Reset();
TSelector *s = TSelector::GetSelector("TriggerSelector.C");

TChain chain("Trigger"); // "Trigger" is the tree name inside the root files
chain.Add("run03016.root"); // can chain multiple files

TH1D* tdc2 = new TH1D("tdc2","TDC2",1500,0,1500-1);

chain.Process(s,"",500); // process 500 events
//chain.Process(s); // or process all events

tdc2->Draw();
}

= edit TriggerSelector.h:

in the TriggerSelector class members, i.e. "UInt_t TDC1_TDC1[47];" edit the
array size to be bigger than the maximum possible bank size

= edit TriggerSelector.C:

Bool_t TriggerSelector::Process(Int_t entry)
{
  fChain->GetTree()->GetEntry(entry);

  if (entry%100 == 0)
    printf("process %d, nTDC %3d, 0x%08x\n",entry,TDC1_nTDC1,TDC1_TDC1[1]);

  tdc2->Fill(TDC1_nTDC1);
  return kTRUE;
}

= Run the analysis:

shell> root -l
root> .x run.C
 
K.O.
ELOG V3.1.4-2e1708b5