Midas DAQ System
Entry  25 Feb 2005, Konstantin Olchanski, Bug Fix, fixed: double free in FORMAT_MIDAS ybos.c causing lazylogger crashes 
We stumbled upon and fixed a "double free" bug in src/ybos.c that caused crashes in
lazylogger when writing .mid files in the FORMAT_MIDAS format (why does it use
ybos.c? Pierre says: for generic file i/o). Why this code ever worked before
remains a mystery. K.O.
Entry  25 Jan 2005, John M O'Donnell, Bug Report, histograms not saved in replay mode 
is there a reason why histograms are not saved after a replay?

   /* save histos if requested */
   if (out_info.histo_dump && clp.online) {
                              ^^^^^^^^^^

perhaps the && should be ||?
    Reply  26 Jan 2005, Stefan Ritt, Bug Report, histograms not saved in replay mode 
> is there a reason why histograms are not saved after a replay?
> 
>    /* save histos if requested */
>    if (out_info.histo_dump && clp.online) {
>                               ^^^^^^^^^^
> 
> perhaps the && should be ||?

The original reason for that is that when running online, you want some histos for
monitoring after each run. When running offline, you specify a ROOT output file via
"-o xxx.root", which contains trees AND histos. So the histos would be there twice
if you removed the "clp.online" from above.

Having "-o xxx.root" is IMHO a cleaner way, since you might want to analyze a run
in different ways (like using different calibrations). So what you do is specify
different "-o cal00123.root", "-o final00123.root" and so on, while with the
mechanism in eor() you always get the same file name. So try using "-o xxx.root"
and see if that fits your needs.
Entry  20 Jan 2005, Konstantin Olchanski, Suggestion, HOWTO create ROOT objects in the MIDAS analyzer 
With recent changes to mana.c, creation of user ROOT objects in the MIDAS
analyser has changed. Here is the new example code for creating ROOT objects
that are visible in ROODY and are saved into the histogram file.

1) in the "global" context (outside of any function)

#include <TH1D.h>
#include <TProfile.h>

static TH1D* gMyHist1 = 0;
static TProfile* gMyHist2 = 0;

2) In the analyzer "init" or "begin run" method, create the histogram:

//extern TFolder *gManaHistosFolder; // from midas.h
gMyHist1 = new TH1D("gMyHist1",...);
gMyHist2 = new TProfile("gMyHist2",...);
gManaHistosFolder->Add(gMyHist1);
gManaHistosFolder->Add(gMyHist2);

(note: this will produce a warning about "possible memory leak")

3) In the per-event method, fill the histograms

gMyHist1->Fill(x);
gMyHist2->Fill(x,y);

4) In the Makefile, where you compile the frontend, add "-DUSE_ROOT" right after
"-I$(ROOTSYS)/include"

K.O.
    Reply  25 Jan 2005, John M O'Donnell, Suggestion, HOWTO create ROOT objects in the MIDAS analyzer book.patch
> (preliminary, untested. I will keep this updated as I get testing feedback)
> 
> With recent changes to mana.c, creation of user ROOT objects in the MIDAS
> analyser has changed. Here is the new example code for creating ROOT objects
> that are visible in ROODY and are saved into the histogram file.
> 
> 1) in the "global" context (outside of any function)
> 
> #include <TH1D.h>
> #include <TProfile.h>
> 
> static TH1D* gMyHist1 = 0;
> static TProfile* gMyHist2 = 0;
> 
> 2) In the analyzer "init" or "begin run" method, create the histogram:
> 
> //extern TFolder *gManaHistosFolder; // from midas.h
> gMyHist1 = new TH1D("gMyHist1",...);
> gMyHist2 = new TProfile("gMyHist2",...);
> gManaHistosFolder->Add(gMyHist1);
> gManaHistosFolder->Add(gMyHist2);
> 
> (note: this will produce a warning about "possible memory leak")
> 
> 3) In the per-event method, fill the histograms
> 
> gMyHist1->Fill(x);
> gMyHist2->Fill(x,y);
> 
> K.O.


The book functions provide a convenient place to check against object duplication,
memory leaks, etc., and a place to ensure that consistent subfolders are being
used.  E.g., a while back we decided that TCutGs should be in a "cuts" subfolder.

To extend the booking to TProfile is fairly easy.  Indeed, if you want to
use the simple constructor TProfile::TProfile (const char *, const char *, Int_t,
Axis_t, Axis_t), then you could just use h1_book<TProfile>.

It now seems to me that the names h1_book, h2_book, cut_book are all too long
and, even more upsetting, inconsistent.  Most of them are templates, but
some are not.  Perhaps they should all be templates, and all have the same name.
The attached patch accomplishes this (without deleting the old names).  With this
patch you can now do

gMyHist2 = book<TProfile>("gMyHist2",...);

New book templates are needed when you (1) wish to change the subfolder, or (2)
need to use a different argument list in the constructor.  If you need help with
this for the TProfile constructors which are different from TH1D constructors then
let me know.  They should be easy to do.

For TGraph a lot depends on how you want to initialise the data points.
Entry  20 Jan 2005, Konstantin Olchanski, Bug Report, Persistency problem with h1_book() & co 
The current h1_book() macros (and the previous example analyzer code) have an
odd persistency problem: for example, the user wants to change some histogram
limits, edits the h1_book() calls, rebuilds and restarts the analyzer, starts a
new run, and observes that all histograms are filled using the old limits, his
changes "did not take". The user panics, I get paged during the Holy Lunch Hour,
everybody is unhappy.

This is what I think happens:

1) analyzer starts
2) LoadRootHistograms() loads old histograms from file
3) user code calls h1_book()
4) h1_book template in midas.h does this (roughly):
      hist = (TH1X *) gManaHistosFolder->FindObjectAny(name);
      if (hist == NULL) {
         hist = new TH1X(name, title, bins, min, max);
5) since the histogram already exists (loaded from the file, with the old
limits), the TH1X constructor is not called at all, new histogram limits are
utterly ignored.

A possible solution is to unconditionally create the ROOT objects, like I do in
the example code posted at http://dasdevpc.triumf.ca:9080/Midas/191. That code
produces an annoying warning from ROOT about possible memory leaks. This could
be fixed by adding a two-liner to "find and delete" the object before it is
created, tripling the number of user code lines per histogram (find & delete,
then create). Highly ugly.

midas.h macros (h1_book & co) can be fixed by adding checks for histogram limits
and such, but I would much prefer a generic solution/convention that would work
for arbitrary ROOT objects without MIDAS-specific wrappers (think TProfile,
TGraph, etc...).

Any suggestions?

K.O.
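
The find-or-create behaviour described in steps 4 and 5 can be reproduced without ROOT. The following is a minimal sketch in plain C, assuming a hypothetical `Hist` struct and a fixed-size `registry` array as stand-ins for a ROOT histogram and gManaHistosFolder; none of these names come from midas.h. `book_find_or_create()` mimics the pattern that silently ignores new limits, while `book_replace()` mimics the "find and delete, then create" alternative.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a ROOT histogram: only a name and axis limits. */
typedef struct {
   char name[32];
   int bins;
   double min, max;
} Hist;

#define MAX_HISTS 16
static Hist *registry[MAX_HISTS];   /* hypothetical stand-in for the histogram folder */

/* Return the registry slot holding the named histogram, or NULL if absent. */
static Hist **find_slot(const char *name)
{
   for (int i = 0; i < MAX_HISTS; i++)
      if (registry[i] && strcmp(registry[i]->name, name) == 0)
         return &registry[i];
   return NULL;
}

/* Allocate a histogram with the given limits in the first free slot. */
static Hist *create(const char *name, int bins, double min, double max)
{
   for (int i = 0; i < MAX_HISTS; i++) {
      if (registry[i] == NULL) {
         Hist *h = calloc(1, sizeof(Hist));
         if (h == NULL)
            return NULL;
         strncpy(h->name, name, sizeof(h->name) - 1);
         h->bins = bins;
         h->min = min;
         h->max = max;
         registry[i] = h;
         return h;
      }
   }
   return NULL;   /* registry full */
}

/* Find-or-create: an existing object wins, so new limits are ignored. */
Hist *book_find_or_create(const char *name, int bins, double min, double max)
{
   Hist **slot = find_slot(name);
   if (slot)
      return *slot;
   return create(name, bins, min, max);
}

/* Find and delete, then create: the requested limits always take effect. */
Hist *book_replace(const char *name, int bins, double min, double max)
{
   Hist **slot = find_slot(name);
   if (slot) {
      free(*slot);
      *slot = NULL;
   }
   return create(name, bins, min, max);
}
```

Note that the replace variant also discards whatever the old object had accumulated, which may or may not be what the user wants for histograms reloaded from a file.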
    Reply  21 Jan 2005, John M O'Donnell, Bug Report, Persistency problem with h1_book() & co 
> The current h1_book() macros (and the previous example analyzer code) have an
> odd persistency problem: for example, the user wants to change some histogram
> limits, edits the h1_book() calls, rebuilds and restarts the analyzer, starts a
> new run, and observes that all histograms are filled using the old limits, his
> changes "did not take". The user panics, I get paged during the Holy Lunch Hour,
> everybody is unhappy.
> 
> This is what I think happens:
> 
> 1) analyzer starts
> 2) LoadRootHistograms() loads old histograms from file

I can't get onto cvs@midas.psi.ch right now
(cvs update
cvs@midas.psi.ch's password: 
Permission denied, please try again.)

but when I changed LoadRootHistograms a few days ago I left it as:

    } else if (obj->InheritsFrom( "TH1")) {

      // still don't know how to do TH1s

so h1_book() is creating the first and only copy of the histograms.
I am able to create new histogram limits.
I don't get the memory leak problems.

However I have seen the memory leak problems before, and they are real.
They must be dealt with either by (1) first deleting the old histogram
or (2) ensuring that histogram names are unique in the whole application
(different modules/folders cannot use the same histogram names).

I will return to this once I can do a cvs update for midas.

John.

> 3) user code calls h1_book()
> 4) h1_book template in midas.h does this (roughly):
>       hist = (TH1X *) gManaHistosFolder->FindObjectAny(name);
>       if (hist == NULL) {
>          hist = new TH1X(name, title, bins, min, max);
> 5) since the histogram already exists (loaded from the file, with the old
> limits), the TH1X constructor is not called at all, new histogram limits are
> utterly ignored.
> 
> A possible solution is to unconditionally create the ROOT objects, like I do in
> the example code posted at http://dasdevpc.triumf.ca:9080/Midas/191. That code
> produces an annoying warning from ROOT about possible memory leaks. This could
> be fixed by adding a two-liner to "find and delete" the object before it is
> created, tripling the number of user code lines per histogram (find & delete,
> then create). Highly ugly.
> 
> midas.h macros (h1_book & co) can be fixed by adding checks for histogram limits
> and such, but I would much prefer a generic solution/convention that would work
> for arbitrary ROOT objects without MIDAS-specific wrappers (think TProfile,
> TGraph, etc...).
> 
> Any suggestions?
> 
> K.O.
       Reply  21 Jan 2005, Stefan Ritt, Bug Report, Persistency problem with h1_book() & co 
> I can't get onto cvs@midas.psi.ch right now
> (cvs update
> cvs@midas.psi.ch's password: 
> Permission denied, please try again.)

I had to upgrade midas.psi.ch today to Scientific Linux 3.03. Most things are working again, but
I could not set up the anonymous CVS account. I have to wait until next week, when the experts are
there. I will let you know when it's working again.

- Stefan
          Reply  25 Jan 2005, Stefan Ritt, Bug Report, Persistency problem with h1_book() & co 
> > I can't get onto cvs@midas.psi.ch right now
> > (cvs update
> > cvs@midas.psi.ch's password: 
> > Permission denied, please try again.)

cvs@midas.psi.ch should be up and running again.
       Reply  25 Jan 2005, John M O'Donnell, Bug Report, Persistency problem with h1_book() & co 
So now that cvs is reachable again I have confirmed that
the code segment
 
     } else if (obj->InheritsFrom( "TH1")) {
 
       // still don't know how to do TH1s

is indeed still present.
If you want me to look at this some more, you need to provide some code to exhibit the problem.

John.

> > The current h1_book() macros (and the previous example analyzer code) have an
> > odd persistency problem: for example, the user wants to change some histogram
> > limits, edits the h1_book() calls, rebuilds and restarts the analyzer, starts a
> > new run, and observes that all histograms are filled using the old limits, his
> > changes "did not take". The user panics, I get paged during the Holy Lunch Hour,
> > everybody is unhappy.
> > 
> > This is what I think happens:
> > 
> > 1) analyzer starts
> > 2) LoadRootHistograms() loads old histograms from file
> 
> I can't get onto cvs@midas.psi.ch right now
> (cvs update
> cvs@midas.psi.ch's password: 
> Permission denied, please try again.)
> 
> but when I changed LoadRootHistograms a few days ago I left it as:
> 
>     } else if (obj->InheritsFrom( "TH1")) {
> 
>       // still don't know how to do TH1s
> 
> so h1_book() is creating the first and only copy of the histograms.
> I am able to create new histogram limits.
> I don't get the memory leak problems.
> 
> However I have seen the memory leak problems before, and they are real.
> They must be dealt with either by (1) first deleting the old histogram
> or (2) ensuring that histogram names are unique in the whole application
> (different modules/folders can not use the same histogram names).
> 
> I will return to this once I can do a cvs update for midas.
> 
> John.
> 
> > 3) user code calls h1_book()
> > 4) h1_book template in midas.h does this (roughly):
> >       hist = (TH1X *) gManaHistosFolder->FindObjectAny(name);
> >       if (hist == NULL) {
> >          hist = new TH1X(name, title, bins, min, max);
> > 5) since the histogram already exists (loaded from the file, with the old
> > limits), the TH1X constructor is not called at all, new histogram limits are
> > utterly ignored.
> > 
> > A possible solution is to unconditionally create the ROOT objects, like I do in
> > the example code posted at http://dasdevpc.triumf.ca:9080/Midas/191. That code
> > produces an annoying warning from ROOT about possible memory leaks. This could
> > be fixed by adding a two-liner to "find and delete" the object before it is
> > created, tripling the number of user code lines per histogram (find & delete,
> > then create). Highly ugly.
> > 
> > midas.h macros (h1_book & co) can be fixed by adding checks for histogram limits
> > and such, but I would much prefer a generic solution/convention that would work
> > for arbitrary ROOT objects without MIDAS-specific wrappers (think TProfile,
> > TGraph, etc...).
> > 
> > Any suggestions?
> > 
> > K.O.
Entry  22 Dec 2004, Stefan Ritt, Suggestion, What to do with invalid data in the history system? 
While we were dealing with NaNs in the history system this past week, a question came
up at PSI about how to deal with invalid history data.

Assume you have several devices going into one history equipment, and one device
has a problem, such that it cannot be read. In the past, the device driver
system returned zero, which was written to the history file. While this is ok in
some cases, it might not be in others, where zero may be a valid measurement.
Furthermore, it might confuse some regulation loops.

An alternative is to keep the last correctly measured value. As long as the
device has its problem, the value is kept. However, values are then written to the
history system which might look valid, although they are not. So what about
explicitly writing NaNs to the history system? For the display routine the NaNs
could be omitted, leaving blank regions where no valid measurement is available.
Or one could explicitly mark the region as invalid. Konstantin, do you know how
to write NaN explicitly to a float variable? And what do the others think about
these possibilities?

- Stefan
    Reply  23 Dec 2004, Stefan Ritt, Suggestion, What to do with invalid data in the history system? hist.gif
I have made a preliminary implementation of NaNs in the history system. It works such that if a
device driver returns a read error status, the class driver writes a NaN
(Not-a-Number) into the corresponding variable via the new function ss_nan(). The
"mhist" utility directly displays these as "nan" (Linux) or "-1.#IND00" under
Windows, indicating the error status. The history display via mhttpd just skips
these values (see elog:/1). I think this is better than showing just zero values,
because in most cases zero is a valid measurement and could confuse people.

Of course it is not enough just having "gaps" in the history display, so it's
important that the corresponding device driver issues an error message, which could
even trigger an alarm.

I have tested this under Windows, but only compiled under Linux. The only class
driver I modified so far is "multi.c". People should have a look, make some tests,
and let me know if this is a good thing, or if we should change it somehow.

- Stefan
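
As an illustration of the idea (not of the actual ss_nan() or multi.c code), here is a minimal sketch in standard C99. It assumes the `NAN` macro and `isnan()` from <math.h>; the status codes and the function names `history_value()` and `mean_valid()` are hypothetical. A failed read is stored as NaN, and a consumer skips NaN samples instead of treating them as zero.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical read status codes, not the MIDAS driver codes. */
#define READ_OK    0
#define READ_ERROR 1

/* Value to store in the history: the measurement, or NaN if the read failed. */
float history_value(int status, float measured)
{
   return status == READ_OK ? measured : NAN;
}

/* Average only the valid samples; returns NaN if there are none. */
float mean_valid(const float *v, size_t n)
{
   double sum = 0.0;
   size_t count = 0;

   for (size_t i = 0; i < n; i++) {
      if (!isnan(v[i])) {   /* skip gaps left by failed reads */
         sum += v[i];
         count++;
      }
   }
   return count ? (float) (sum / count) : NAN;
}
```

Because NaN never compares equal to anything, including itself, the validity check has to go through `isnan()` rather than `==`.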
Entry  16 Dec 2004, Jan Wouters, Forum, cm_msg 
Could someone please explain to me how cm_msg, cm_msg1, etc. all work.  The
documentation is very terse.  

I want to set up a fairly extensive set of debugging and error messages for a
new frontend.  I need to get these messages to a logging file.  I also would
like to get the error messages to the user through whatever interface Midas
normally uses for error reporting.  

Jan
    Reply  22 Dec 2004, Stefan Ritt, Forum, cm_msg 
> Could someone please explain to me how cm_msg, cm_msg1, etc. all work.  The
> documentation is very terse.  
> 
> I want to setup a fairly significant set of debugging, and error messages for a
> new frontend.  I need to get these messages to a logging file.  I also would
> like to get the error messages to the user through whatever interface Midas
> normally uses for error reporting.  

For errors, use

  cm_msg(MERROR, "routine_name", "Your error message, code=%d", i);

This produces an error message which is logged to midas.log, and distributed to all
clients which have called cm_msg_register(). For example odbedit will just print
that message. The syntax of the second half of cm_msg is the same as for printf(),
so you can add format specifiers and variable arguments as you do for printf(). The
first argument is the message type (MDEBUG for example is only distributed but not
logged). 

For a more detailed list of message types, please refer to

http://midas.triumf.ca/doc/html/AppendixE.html#midas_macro
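
To show what "same syntax as printf()" means in practice, here is a minimal self-contained sketch of a printf-style message formatter built on `vsnprintf()`. It is not the MIDAS implementation: `format_msg()`, the `MY_ERROR`/`MY_DEBUG` constants, and the "[routine,TYPE]" prefix are hypothetical and chosen only for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message types; the real MIDAS ones (MERROR, MDEBUG, ...) live in midas.h. */
enum { MY_ERROR = 1, MY_DEBUG = 2 };

/*
 * Format "[routine,TYPE] message" into buf, with the message built from a
 * printf-style format string and variable arguments.
 * Returns the number of characters written (excluding the terminating NUL),
 * or -1 if the result did not fit or formatting failed.
 */
int format_msg(char *buf, size_t size, int type, const char *routine,
               const char *fmt, ...)
{
   int n = snprintf(buf, size, "[%s,%s] ", routine,
                    type == MY_ERROR ? "ERROR" : "DEBUG");
   if (n < 0 || (size_t) n >= size)
      return -1;

   va_list ap;
   va_start(ap, fmt);
   int m = vsnprintf(buf + n, size - (size_t) n, fmt, ap);
   va_end(ap);

   if (m < 0 || (size_t) m >= size - (size_t) n)
      return -1;
   return n + m;
}
```

A call such as `format_msg(buf, sizeof(buf), MY_ERROR, "read_adc", "Your error message, code=%d", i)` mirrors the shape of the cm_msg() call above; where the formatted text then goes (log file, other clients) is a separate concern.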
Entry  14 Dec 2004, Konstantin Olchanski, Info, Commit local TWIST modifications 
I am committing MIDAS modifications accumulated during the last few months of running TWIST:
1) system.c::ss_shm_open(): fail if trying to map a file that is smaller than we expect.
2) midas.c::bm_lock_buffer(), el_submit(), el_delete_message(): do not wait for mutexes forever, use a 5
minute timeout. If we can't get the lock, cm_msg()/abort().
The above helps in dealing with complete midas freezes. I also have code to keep track of "who locked
the mutex *and* is still holding it?!?" but it is way too ugly to commit. I wish we had a "lockedByPid"
entry for all lockable objects.
K.O.
 
    Reply  14 Dec 2004, Konstantin Olchanski, Info, Commit local TWIST modifications 
> I am committing MIDAS modifications accumulated during the last few months of running TWIST:

More:
- mfe.c: in error messages "cannot find statistics record", also print
  the name of the record we are looking for.
- mlogger.c: in warning message "Write operation took N ms", report the name
  of the offending data stream.
- system.c: do not chdir("/") in ss_daemon_init()- it prevents us from ever
  getting core dumps from midas daemons. The old behaviour is trivially
  restored by "cd /" before starting the daemon; or by "limit coredumpsize 0".
- odb.c: db_validate_db() detect and break infinite looping on free list corruption.

K.O.
       Reply  14 Dec 2004, Konstantin Olchanski, Info, mhttpd: Commit local TWIST modifications 
> > I am commiting MIDAS modification accumulated...

mhttpd changes:

- Renee's improvements on http transaction logging
- Implement "minimum" and "maximum" clamping for history graphs. Unfortunately
  there is no GUI code for changing the "minimum" and "maximum" settings,
  other than directly frobbing the odb.
- When making history graphs, detect NaNs in the history data.
(- status page code for the TWIST event builder (precursor of the standard
   event builder) stays uncommitted).

K.O.
       Reply  15 Dec 2004, Stefan Ritt, Info, Commit local TWIST modifications 
> - system.c: do not chdir("/") in ss_daemon_init()- it prevents us from ever
>   getting core dumps from midas daemons. The old behaviour is trivially
>   restored by "cd /" before starting the daemon; or by "limit coredumpsize 0".

The chdir("/") is from one of the unix text books. They say you HAVE to do it. If you start a
daemon on an NFS file system, you cannot unmount that file system as long as the daemon is
running. I'm sure the same code is inside most other daemons (apache, ...). So if we go away
from that standard, we have to be aware of the consequences.
          Reply  16 Dec 2004, Konstantin Olchanski, Info, "cd /" in ss_daemon_init(), was- Commit local TWIST modifications 
> > - system.c: do not chdir("/") in ss_daemon_init()- it prevents us from ever
> >   getting core dumps from midas daemons.
> 
> The chdir("/") is from one of the unix text books. They say you HAVE to do it. If you start a
> daemon on an NFS file system, you cannot unmount that file system as long as the daemon is
> running.

Right, I remember this NFS problem from a while back.

This problem does not exist in the current crop of Linux systems (since Red Hat 7.3 at least) - they
either kill off all user programs or use "umount -f" and "umount -l".

"umount -l" works in any case to unmount a "busy" filesystem.

For systems where the NFS problem does still exist, one should do this: "mlogger -D" becomes "(cd /; mlogger -D)".

So I suspect that the "cd /" advice from the unix programming book is no longer as necessary
as it used to be. (Perhaps better advice would have been to "cd /tmp", so we could still get
core dumps from non-root daemons).

K.O.
Entry  15 Dec 2004, , Forum, Where's the definition of "H1_BOOK()" 
When I compile the experiment example of 1.9.5, the following problem occurs:

adccalib.c: In function `INT adc_calib_init()':
adccalib.c:114: `H1_BOOK' undeclared (first use this function)
adccalib.c:114: (Each undeclared identifier is reported only once for each
   function it appears in.)
make: *** [adccalib.o] Error 1

my ROOT is 4.01 and Zlib is 1.2.2
    Reply  15 Dec 2004, Pierre-Andre Amaudruz, Forum, Where's the definition of "H1_BOOK()" 
> When i compile the experiment example of 1.9.5 the problem happened:
> 
> adccalib.c: In function `INT adc_calib_init()':
> adccalib.c:114: `H1_BOOK' undeclared (first use this function)
> adccalib.c:114: (Each undeclared identifier is reported only once for each
>    function it appears in.)
> make: *** [adccalib.o] Error 1
> 
> my ROOT is 4.01 and Zlib is 1.2.2

We're in the process of fixing this problem properly. In the meantime,
please add the definition -DUSE_ROOT to the analyzer makefile at the line:
...
ROOTCFLAGS += -DHAVE_ROOT -DUSE_ROOT
Entry  14 Dec 2004, Jan Wouters, Forum, Frontend index 
What is the api call to determine the index of the frontend when specifying the
-i parameter during execution of the frontend? 
    Reply  15 Dec 2004, Stefan Ritt, Forum, Frontend index 
> What is the api call to determine the index of the frontend when specifying the
> -i parameter during execution of the frontend? 

INT get_frontend_index();

- Stefan
Entry  25 Nov 2004, chris pearson, Forum, use of assert in mhttpd 
   We've had mhttpd aborting regularly since upgrading from midas-1.9.3.  This
happens during elog queries, and is due to an elog file that was incorrectly
modified by hand.  The modification to the file occurred 6 months ago.
   el_retrieve(midas.c:15683) now has several assert statements, one of which
aborts the program on reading the bad entry.

   Why is assert used, instead of an error return from the function (if
necessary), and maybe an error message in the log file?  Assert statements are
often removed, using NDEBUG, for normal use.

Chris

   The problem elog entry had one character removed, so end-of-file came before
the end of the message.  This could probably occur without the file being
altered, if the disk containing the elog fills.
    Reply  14 Dec 2004, Konstantin Olchanski, Forum, use of assert in mhttpd 
>    We've had mhttpd aborting regularly since upgrading from midas-1.9.3.  This
> happens during elog queries, and is due to an elog file that was incorrectly
> modified by hand.

(sorry for delayed reply, for reasons unknown, I did not get an email notice when this was posted)

Yes, I agree, error handling in midas elog code is insufficient (note missing error checks for
read() and lseek() system calls). Anything but "perfect" elog files would cause funny errors and
malfunctions.

>  The modification to the file occurred 6 months ago.
>    el_retrieve(midas.c:15683) now has several assert statements, one of which
> aborts the program on reading the bad entry.

I added those to fix problems with "broken last NN days" and with infinite looping in the elog code
that we observed in TWIST.

You are welcome to replace the assert() statements with proper error handling. I used to have some code
that could report the filename of the bad elog file. Could we also report the exact file location for broken
files?

Please send me the diff, I will commit it to midas cvs.

>    Why is assert used, instead of an error return from the function (if
> necessary), and maybe an error message in the log file?  Assert statements are
> often removed, using NDEBUG, for normal use.

I use assert() in several ways:

0) I want a core dump each time X happens. (This is the only reasonable action when facing memory/stack
corruption. The problems in the elog code were stack corruption).
1) "I am too lazy to write proper error handling code" so I just crash and burn. This includes the
case where "proper error handling" would be "too invasive".
2) the error is too bad (or too deep) and there is no reasonable way to recover. Print an error message
and dump core (for later analysis). I sometimes use "cm_msg(); abort()". (assert is "printf("error"); abort()")

Please refer to literature for philosophic discussions on uses of assert() (Argh! Stefan will have my
head again!), but I will mention that "abort() early, abort() often" I find very effective. BTW, this technique
is heavily used in the Linux kernel (oops(), bug(), panic()) with some good effect, too.

>    The problem elog entry had one character removed, so end-of-file came before
> the end of the message.  This could probably occur without the file being
> altered, if the disk containing the elog fills.

Yes, I think you are right. In TWIST, we have seen disk-full conditions break both elog and history.

K.O.
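
For comparison with the assert() approach, the following is a minimal sketch of the "error return" alternative Chris asks about, applied to a truncated message like the one described. The record layout (one length byte followed by the payload), the `REC_*` codes, and `read_record()` are all hypothetical; the real elog file format and el_retrieve() are different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes for this sketch. */
#define REC_OK          0
#define REC_TRUNCATED (-1)   /* end of data came before the end of the message */
#define REC_TOO_LARGE (-2)   /* caller's buffer cannot hold the message */

/*
 * Read one record from `data` (with `avail` bytes available).
 * Hypothetical layout: 1-byte payload length, then the payload.
 * On success the payload is copied to `out` as a NUL-terminated string.
 */
int read_record(const unsigned char *data, size_t avail, char *out, size_t out_size)
{
   if (avail < 1)
      return REC_TRUNCATED;

   size_t len = data[0];
   if (len > avail - 1)
      return REC_TRUNCATED;      /* report the bad entry instead of aborting */
   if (len + 1 > out_size)
      return REC_TOO_LARGE;

   memcpy(out, data + 1, len);
   out[len] = '\0';
   return REC_OK;
}
```

The caller can then log the file name and offset and skip the broken entry, while an assert() remains an option for conditions that indicate memory corruption rather than bad input.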
Entry  24 Nov 2004, chris pearson, Info, midas on 64bit opteron 
   Midas, version 1.9.5 of 7th October, was installed, with a few changes, on a
64 bit opteron computer, running linux.  For this processor, as for the alpha
processor, long integers and addresses are 64 bits.  We added a new flag in the
Makefile,

250a251
> ARCH   = $(shell uname -m)
377a379,381
> ifeq ($(ARCH),x86_64)
> OSFLAGS := $(OSFLAGS) -DX86_64
> endif

and extended the alpha-specific definitions, of DWORD and PTYPE, in midas.h to
include this case,

549c549
< #ifdef __alpha
---
> #if defined(__alpha) || defined(X86_64)
598c598
< #ifdef __alpha
---
> #if defined(__alpha) || defined(X86_64)

apart from this, there are a large number of cases where pointers are cast to
integers, without using the PTYPE definition.  These all need to be changed by
hand, although these conversions should probably be removed anyway - in almost
all cases they are unnecessary, as just differences are being calculated.

There were also a number of warnings, which we ignored, where printf format
strings specified long integers, but the argument was not a long integer.  Casts
should probably be added in all cases where the type of the argument can vary
depending on the machine.

A midas analyser was made, which was able to successfully replay some data, but
this was all that was tested.

Chris
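
The point about pointer-to-integer casts can be illustrated with standard C99 types. This is only a sketch, assuming <stdint.h> is available on the target compiler; it does not use the MIDAS PTYPE definition, and the function names are hypothetical. When only a difference is needed, subtracting the pointers directly avoids the cast altogether; when an integer representation really is needed, `uintptr_t` is wide enough on both 32-bit and 64-bit machines.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Offset of p within the block starting at base. Both pointers must point
 * into the same object. ptrdiff_t has the right width on any architecture,
 * so no cast to int is needed.
 */
ptrdiff_t offset_in_block(const void *base, const void *p)
{
   return (const char *) p - (const char *) base;
}

/*
 * Round-trip a pointer through an integer. With uintptr_t the value
 * survives; with a 32-bit int on a 64-bit machine it would be truncated.
 */
int pointer_round_trips(const void *p)
{
   uintptr_t as_int = (uintptr_t) p;
   return (const void *) as_int == p;
}
```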
Entry  09 Nov 2004, Pierre-Andre Amaudruz, Bug Fix, New transition scheme 
Problem:
If cm_set_transition_sequence() is used for changing the sequence number, the 
command odbedit> start/stop/resume/pause -v reports the proper sequence, but the
action on the client side is actually not performed!

Fix:
Local transition table updated in midas.c (1.226)

Note:
The transition number under /system/clients/<pid>/transition...
is used internally. Changing it won't have any effect on the client action
if the sequence number is not registered.
Entry  04 Nov 2004, Jan Wouters, Forum, Frontend code and the ODB 
I would like to know whether all parameters used by the frontend code have to be in the
"Experiment/Run Parameters" section.  This section can become big and difficult to maintain,
because it is one single big section of experim.h (EXP_PARAM_DEFINED).  I have parameters the
various frontends read at the beginning of each run, which set the hardware settings of various
devices.  I would like to place these in a section all their own, organized by device.  Is this doable? 
    Reply  04 Nov 2004, Stefan Ritt, Forum, Frontend code and the ODB 
Hi Jan,

I usually keep under /Experiment/Run Parameters only those settings which are kind of "global" and thus of
interest to frontend *and* analyzer, like a run mode (data/calibration/cosmic/...). Settings more specific to a
frontend I keep under /Equipment/<name>/Settings where <name> is the equipment name the specific frontend
produces. In your case each frontend will then get its own tree (related to each fragment). Please note that
both discussed trees can contain a whole tree with subdirectories, which lets you organize your data better.

Best regards, Stefan.
Entry  02 Nov 2004, Renee Poutissou, Info, Event Builder info in mhttpd Status page 
Information about the Event Builder statistics has been removed from the 
Status page in mhttpd.  I heard from Pierre that this information might 
be redundant when using the new Event Builder format??? 
For the TWIST experiment, we are running and cannot change on the fly
to a new format Event Builder.  It is very important for us to show the users
the rates and statistics coming out of the EventBuilder.  I had  to put this
piece of code back in mhttpd.  
Can I put it back in the distribution? or do I have to put a special TWIST flag? 
or do I have to keep reinserting this every time there is an update to mhttpd.c? 
At the moment, TWIST is generating a couple of updates/week to mhttpd.c
Entry  22 Oct 2004, Konstantin Olchanski, Bug Fix, mhttpd message colouring 
I committed a fix to the mhttpd logic that decides which messages should be shown in
"red" colour- before, any message with square brackets and colons would be
highlighted in red. Now only messages matching the pattern [...:...] are
highlighted. The decision logic was moved into a function message_red(). K.O.
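
One way to read the "[...:...]" rule is "a colon must appear between a matching pair of square brackets". The sketch below implements that reading in plain C. It is not the mhttpd message_red() code, whose exact logic is not shown here; the function name `has_bracketed_colon()` is hypothetical.

```c
#include <string.h>

/*
 * Return 1 if msg contains a '[' ... ':' ... ']' group, i.e. a colon that
 * sits between an opening bracket and the next closing bracket.
 * Brackets and colons that merely occur somewhere in the message do not count.
 */
int has_bracketed_colon(const char *msg)
{
   const char *open = strchr(msg, '[');

   while (open) {
      const char *close = strchr(open + 1, ']');
      if (close == NULL)
         return 0;                       /* unterminated bracket */
      if (memchr(open + 1, ':', (size_t) (close - open - 1)))
         return 1;                       /* colon inside this bracket pair */
      open = strchr(close + 1, '[');     /* try the next bracket pair */
   }
   return 0;
}
```

Under this reading, "[mlogger.c:123] write failed" matches, while "[info] time: 12:30" does not, even though it contains both brackets and colons.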
Entry  13 Oct 2004, Konstantin Olchanski, Bug Report, TWIST upgrade bombed... 
The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
crashes during shared memory data buffer accesses. I am looking into it and I
will add information as I figure things out. K.O.
    Reply  13 Oct 2004, Pierre-Andre Amaudruz, Bug Report, TWIST upgrade bombed... 
> The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
> crashes during shared memory data buffer accesses. I am looking into it and I
> will add information as I figure things out. K.O.

Since 1.9.5 the EventBuilder has been modified. Please consult the documentation
where the new mevb scheme is explained.
The mevb has been tested successfully with up to 16 frontends (15 different CPUs).
The data rate at the EventBuilder was measured at about 50MB/s without the
logger and ~30MB/s with the logger.
       Reply  13 Oct 2004, Konstantin Olchanski, Bug Report, TWIST upgrade bombed... 
> > The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
> > crashes during shared memory data buffer accesses. I am looking into it and I
> > will add information as I figure things out. K.O.
> 
> Since 1.9.5 the EventBuilder has been modified. Please consult the documentation
> where the new mevb scheme is explained.
> Test of the mevb with up to 16 frontends (15 different CPUs) has been tested
> successfully. Data rate at the EventBuilder were measured about 50MB/s without the
> logger and ~30MB/s with the logger.

It turns out that TWIST uses a private mevb.c. We will consider upgrading to the
standard one.

K.O.
    Reply  13 Oct 2004, Konstantin Olchanski, Bug Report, TWIST upgrade bombed... 
> The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
> crashes during shared memory data buffer accesses. I am looking into it and I
> will add information as I figure things out. K.O.

I traced buffer memory corruption to a logic error in system.c::ss_shm_open(). If
a .SHM file exists, its size is used as the size of the sysv shared memory
segment, even if the requested shared memory size is bigger, but the caller of
ss_shm_open()  thinks it got all the requested memory. Eventually we try to use
the unallocated memory and crash. This is the proposed fix and I will commit it
after I retest the upgrade during the next few days.

[olchansk@send src]$ cvs diff -u system.c
olchansk@midas.psi.ch's password: 
Index: system.c
===================================================================
RCS file: /usr/local/cvsroot/midas/src/system.c,v
retrieving revision 1.83
diff -u -r1.83 system.c
--- system.c    4 Oct 2004 07:04:01 -0000       1.83
+++ system.c    14 Oct 2004 05:51:16 -0000
@@ -544,8 +544,14 @@
       } else {
          /* if file exists, retrieve its size */
          file_size = (INT) ss_file_size(file_name);
-         if (file_size > 0)
+         if (file_size > 0) {
+            if (file_size < size) {
+               cm_msg(MERROR, "ss_shm_open", "Shared memory segment \'%s\' size %d is smaller than requested size %d. Please remove it and try again",file_name,file_size,size);
+               return SS_NO_MEMORY;
+            }
+            
             size = file_size;
+         }
       }
 
       /* get the shared memory, create if not existing */

K.O.
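
The size decision made by the patch above can be sketched in isolation. This is a minimal, self-contained illustration and not the actual ss_shm_open() code: the function name sketch_resolve_shm_size() and the -1 return value are assumptions made for this example, and the real code reports the error through cm_msg() and returns SS_NO_MEMORY.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of MIDAS.
 * file_size: size of the existing .SHM file, or <= 0 if there is none.
 * requested: size the caller asked for.
 * Returns the size to attach with, or -1 if the existing segment is
 * smaller than requested and must be removed first.
 */
int sketch_resolve_shm_size(int file_size, int requested)
{
   assert(requested > 0);

   if (file_size > 0) {
      if (file_size < requested) {
         /* existing segment is too small: refuse instead of pretending
            the caller received all the memory it asked for */
         fprintf(stderr,
                 "segment size %d is smaller than requested size %d\n",
                 file_size, requested);
         return -1;
      }
      /* existing segment is at least as large as requested: adopt it */
      return file_size;
   }

   /* no existing file: the requested size is used to create the segment */
   return requested;
}
```

With this rule a caller attaching to a larger existing segment silently adopts the larger size, while a smaller existing segment is rejected up front rather than crashing later in the buffer code.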
       Reply  14 Oct 2004, Stefan Ritt, Bug Report, TWIST upgrade bombed... 
Agree.

Once you have made the modification, please check the following situation: create a fresh
ODB with increased size ("odbedit -s 2000000" for example). Then check that the
other clients "adopt" this increased size. Note that some experiments need a
bigger ODB, and I don't want to have them recompile all clients, that's why the
code in ss_shm_open() can attach to a *larger* shared memory. However, it should
not matter to the process, since the ODB (or SYSTEM) shared memory size is
stored in the pheader->key_size and pheader->data_size of each participating
process. So they should never write beyond the limits defined in that header.
The size to ss_shm_open() is only a "hint" if the shared memory does not exist,
and is nowhere later used in the code.
    Reply  14 Oct 2004, Konstantin Olchanski, Bug Report, TWIST upgrade bombed... 
> The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
> crashes during shared memory data buffer accesses. I am looking into it and I
> will add information as I figure things out. K.O.

On second try, it looks like we are in business- the first try did not work
because of two mistakes:

1) I did not delete *all* old .SHM files (.ODB.SHM, .SYSTEM.SHM, .YBUF1.SHM,
.YBUF2.SHM). I deleted ODB.SHM, so odb worked, but forgot about the data buffers
SYSTEM.SHM & co and ended up with segmentation faults and core dumps in the buffer
management code caused by a mismatch of the old-midas buffers and new-midas code.
2) while debugging these core dumps, I made an error in my test code, so even
after I deleted the old data buffers, things still did not work. Talk about
over-debugging a problem...

K.O.
Entry  13 Oct 2004, Konstantin Olchanski, Suggestion, No al_clear_alarm()? 
We have al_trigger_alarm(), but no matching al_clear_alarm(), and I need it to
clear my alarm once the alarm condition no longer exists. Any objections if I
add this function? K.O.
    Reply  13 Oct 2004, Stefan Ritt, Suggestion, No al_clear_alarm()? 
> We have al_trigger_alarm(), but no matching al_clear_alarm(), and I need it to
> clear my alarm once the alarm condition no longer exists. Any objections if I
> add this function? K.O.

The idea is that once an alarm has been triggered, it stays until the user
acknowledges it, even if the alarm condition has disappeared. Through mhttpd,
the user can press the "Reset" button, which then executes al_reset_alarm().
However, it is possible to call al_reset_alarm() directly from user code to
achieve the same thing.
       Reply  13 Oct 2004, Konstantin Olchanski, Suggestion, No al_clear_alarm()? 
> > We have al_trigger_alarm(), but no matching al_clear_alarm(), and I need it to
> > clear my alarm once the alarm condition no longer exists. Any objections if I
> > add this function? K.O.
> 
> call al_reset_alarm()

Thanks. I must be quite blind as I did not see al_reset_alarm() in midas.h. I see it
now. Thanks.

K.O.
Entry  13 Oct 2004, Konstantin Olchanski, Bug Report, silly odbedit "rename Display xxx/yyy" 
odbedit command "rename Display xxx/yyy" creates a key named "xxx/yyy" (yes,
with a slash in the name) and this key cannot be deleted or renamed...
K.O.
    Reply  13 Oct 2004, Stefan Ritt, Bug Report, silly odbedit "rename Display xxx/yyy" 
> odbedit command "rename Display xxx/yyy" creates a key named "xxx/yyy" (yes,
> with a slash in the name) and this key cannot be deleted or renamed...
> K.O.

"rename" is "rename", not "mv" under Unix. If you want this functionality, put it
in and don't complain!
Entry  13 Oct 2004, Konstantin Olchanski, Bug Report, db_paste: found string exceeding MAX_STRING_LENGTH 
I am updating TWIST to the latest MIDAS and when I load a saved .odb file, I get
these messages. Their text ought to say where and what strings it does not like.
K.O.



[twistonl@midtwist ~/online]$ odbedit
Please define environment variable 'MIDASSYS'
pointing to the midas installation directory.
[local:twist:S]/>load /twist/data_onl/current/run17548.odb
[odb.c:5600:db_paste] found string exceeding MAX_STRING_LENGTH
[odb.c:5600:db_paste] found string exceeding MAX_STRING_LENGTH
[odb.c:5600:db_paste] found string exceeding MAX_STRING_LENGTH
    Reply  13 Oct 2004, Stefan Ritt, Bug Report, db_paste: found string exceeding MAX_STRING_LENGTH 
Can you attach 

/twist/data_onl/current/run17548.odb

so I can reproduce the problem?
Entry  29 Sep 2004, Stefan Ritt, Info, Increased number of clients in midas.h, important! 
Due to some requests, several limits, such as the maximum number of clients to the ODB, have
been increased in midas.h and committed to CVS. It is important to note that clients compiled
with the old limits cannot coexist with clients compiled with the new limits. You will get
ODB corruption notifications and everything will crash, and you will wonder where this comes from.

So once you CVS update midas.h, revision 1.139, please make sure to recompile *ALL* your
midas applications with the new midas.h.

Stefan
    Reply  03 Oct 2004, Konstantin Olchanski, Info, Increased number of clients in midas.h, important! 
> It is important to note that clients compiled
> with the old limits cannot coexist with clients compiled with the new limits. You will get
> ODB corruption notifications and everything will crash, and you wonder where this comes from.
> 
> So once you CVS update midas.h, revision 1.139, please make sure to recompile *ALL* your
> midas applications with the new midas.h.

Stefan, to avoid confusion from crashes caused by incompatible ODBs would it be possible to add a "version number" to ODB, together with a check and an error message 
saying "oops... incompatible ODB, please rebuild your programs"? We tend to have different versions of midas floating around and users have old executables stashed away, 
and all this makes it rather difficult to manually keep track of what ODB is compatible with what midas.

K.O.
       Reply  03 Oct 2004, Stefan Ritt, Info, Increased number of clients in midas.h, important! 
> Stefan, to avoid confusion from crashes caused by incompatible ODBs would it be possible to add a "version number" to ODB,
> together with a check and an error message
> saying "oops... incompatible ODB, please rebuild your programs"? We tend to have different versions of midas floating around and
> users have old executables stashed away,
> and all this makes it rather difficult to manually keep track on what ODB is compatible with what midas.

I fully agree that a version number in ODB is a good thing, and I certainly will put one there, but this won't help for old
applications. If I add new code which checks in cm_connect_experiment() if the version number matches, this will only help for new
applications connecting to old ODBs. If old applications (prior to invention of the version number) connect to a new ODB, they still
will crash.

However, we are planning to make a new release 1.9.5 soon (next week), so we can tell people not to "mix" 1.9.5 with pre-1.9.5
programs.
          Reply  03 Oct 2004, Konstantin Olchanski, Info, Increased number of clients in midas.h, important! 
> However, we are planning to make a new release 1.9.5 soon (next week), so we can tell people not to "mix" 1.9.5 with pre-1.9.5 programs.

Right. We cannot fix the past, but we should fix the future. BTW, "do not mix versions" is hard to enforce and mismatches did, do and
will happen. For one thing, looking at a given midas-using executable, how do I tell what version of midas it has inside?

K.O.
             Reply  04 Oct 2004, Stefan Ritt, Info, Increased number of clients in midas.h, important! 
> Right. We cannot fix the past, but we should fix the future. BTW, "do not mix versions" is hard to enforce and mismatches did, do and
> will happen

For remote connections (through mserver), there is already a version check. If the minor version differs, you get a warning, if the major
versions differ (1.>>9<<.4), the client won't start. So at least for remote connection you get a clue.

> For one thing, looking at a given midas-using executable, how do I tell what version of midas it has inside?

There is a function cm_get_version() returning the version. As for the executable, all you can do is a

strings <executable> | grep 1.9
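
The warn-or-refuse rule described above for remote connections can be sketched as a standalone comparison of two version strings. This is only an illustration under stated assumptions, not the mserver code: the names sketch_check_version() and the SKETCH_* codes are hypothetical, and it assumes that a difference in the last number (1.9.>>4<<) only warns while a difference in either of the first two numbers refuses the connection.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch; not MIDAS constants. */
enum sketch_compat { SKETCH_OK = 0, SKETCH_WARN = 1, SKETCH_REFUSE = 2 };

/*
 * Compare two "a.b.c" version strings (hypothetical helper).
 * Assumption: a mismatch in the last number only deserves a warning,
 * a mismatch in either of the first two numbers is fatal.
 * Strings that do not parse as three numbers are refused.
 */
enum sketch_compat sketch_check_version(const char *client, const char *server)
{
   int c[3], s[3];

   assert(client != NULL && server != NULL);

   if (sscanf(client, "%d.%d.%d", &c[0], &c[1], &c[2]) != 3 ||
       sscanf(server, "%d.%d.%d", &s[0], &s[1], &s[2]) != 3)
      return SKETCH_REFUSE;

   if (c[0] != s[0] || c[1] != s[1])
      return SKETCH_REFUSE;   /* incompatible: the client should not start */

   if (c[2] != s[2])
      return SKETCH_WARN;     /* probably compatible: start with a warning */

   return SKETCH_OK;
}
```

A check like this only helps clients that contain it, which is the limitation pointed out earlier in the thread: executables built before the check was added will still connect without any warning.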
                Reply  08 Oct 2004, chris pearson, Info, Increased number of clients in midas.h, important! 
> > For one thing, looking at a given midas-using executable, how do I tell what version of midas it has inside?
> 
> There is a function cm_get_version() returning the version. As for the executable, all you can do is a
> 
> strings <executable> | grep 1.9

   A lot of programs have a commandline option, such as "--version", where they return the program version number then exit.  As well as the
program version number, the version number of the midas library it's linked with could also be returned (There can be more than one
libmidas.so on a system and this would show which one was currently being linked)

   Something I would find useful would be for the version number to identify precisely which version you have, i.e. not to have different
versions of midas given the same version number.  I've had problems earlier this year due to midas-1.9.3 changing several times between
January and July, while keeping the same number.  I think if "in-between" versions of midas are to be made available, they should contain a
revision number or date or something in the version number to identify them.

> For remote connections (through mserver), there is already a version check. If the minor version differs, you get a warning, if the major
> versions differ (1.>>9<<.4), the client won't start. So at least for remote connection you get a clue.

   Safety measures like these can sometimes get in the way, if you know what you're doing.  So unless there is absolutely no possibility of
success, I think checks such as this one should be overrideable (by a client option).

Chris
                   Reply  08 Oct 2004, Konstantin Olchanski, Info, Increased number of clients in midas.h, important! 
>    A lot of programs have a commandline option, such as "--version", where they return the program version number then exit.  As well as the
> program version number, the version number of the midas library it's linked with could also be returned (There can be more than one
> libmidas.so on a system and this would show which one was currently being linked)

This would solve the versioning problem for midas built from versioned tarballs, and I am considering a similar scheme for midas installed from
RPMs. But what do I do for midas built from CVS?!? K.O.