Back Midas Rome Roody Rootana
  Midas DAQ System, Page 36 of 45  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
Entry  15 Mar 2007, Konstantin Olchanski, Info, mhdump: a standalone MIDAS history dump utility 
While working on improvements to the MIDAS history system, I understood the data
format of the MIDAS .hst files and wrote a standalone program to extract data
from them, called mhdump.

mhdump is intended to be easier to use, compared to mhist. By default it reads
and decodes all the data in the given .hst files, with options to limit the
decoding to specified events and tags, and an option to omit the event and tag
names from the output.

mhdump is completely standalone and does not require MIDAS header files and
libraries.

The mhdump source code and a description of the .hst file format are here:
http://daq-plone.triumf.ca/SR/MIDAS/utils/mhdump/

I hope people find this program useful. If you have any feedback (patches, bug
reports, requests for improvements), please post them as replies to this forum
message.

K.O.
    Reply  15 Mar 2007, Stefan Ritt, Info, mhdump: a standalone MIDAS history dump utility 
> I hope people find this program useful. If you have any feedback (patches, bug
> reports, requests for improvements), please post them as replies to this forum
> message.

I wouldn't mind putting this into the midas distribution. Put it under utils/, add
an entry to the Makefile, and fix that warning:


mhdump.cxx: In function `int readHstFile(FILE*)':
mhdump.cxx:161: warning: comparison between signed and unsigned integer expressions
       Reply  20 Nov 2007, Konstantin Olchanski, Info, mhdump: a standalone MIDAS history dump utility 
> > I hope people find this program useful. If you have any feedback (patches, bug
> > reports, requests for improvements), please post them as replies to this forum
> > message.
> 
> I wouldn't mind putting this into the midas distribution. Put it under utils/, add
> an entry to the Makefile, and fix that warning:
> 
> 
> mhdump.cxx: In function `int readHstFile(FILE*)':
> mhdump.cxx:161: warning: comparison between signed and unsigned integer expressions

Done and done.

The program mhdump, a standalone decoder for midas history files, is now in midas svn.

K.O.
Entry  17 Oct 2007, Randolf Pohl, Forum, Adding MIDAS .root-files 
Dear MIDAS users,

I want to add several .root-files produced by the MIDAS analyzer, in a fast 
and convenient way. ROOT's hadd fails because it does not know how to treat 
TFolders. I guess this problem is not unique to me, so I hope that somebody of 
you might already have found a solution.

Why don't I just run "analyzer -r 1 10000"?
We have taken lots of runs under (rapidly) varying conditions, so it would be 
lots of "-r". And the analysis is quite involved, so rerunning all data takes 
about one hour on a fast PC making this quite painful.
Therefore, I would like to rerun all data only once, and then add the result 
files depending on different criteria.

Of course, I tried to write a script that does the adding. But somehow it is 
incredibly slow. And I am not the Master Of C++, too.

Is there any deeper reason for MIDAS using TFolders, not TDirectorys? ROOT's 
hadd can treat TDirectory. Can I simply patch "my" MIDAS? Is there general 
interest in a change like this? (Does anyone have experience with the speed of 
hadd?)

Looking forward to comments from the Forum.

Cheers,

Randolf
    Reply  17 Oct 2007, John M O'Donnell, Forum, Adding MIDAS .root-files histoAdd.cxx
The following program handles regular directories in a file, or folders (ugh).
Most histograms are added bin by bin.

For scaler events it is convenient to see the counts as a function of time (ala
sclaer history plots in mhttpd).  If the histogram looks like a scaler plot versus
time, then new bins are added on to the end (or into the middle!) of the histogram.

All different versions of cuts are kept.

TTrees are not explicitly supported, so probably don't do the right thing...

John.

> Dear MIDAS users,
> 
> I want to add several .root-files produced by the MIDAS analyzer, in a fast 
> and convenient way. ROOT's hadd fails because it does not know how to treat 
> TFolders. I guess this problem is not unique to me, so I hope that somebody of 
> you might already have found a solution.
> 
> Why don't I just run "analyzer -r 1 10000"?
> We have taken lots of runs under (rapidly) varying conditions, so it would be 
> lots of "-r". And the analysis is quite involved, so rerunning all data takes 
> about one hour on a fast PC making this quite painful.
> Therefore, I would like to rerun all data only once, and then add the result 
> files depending on different criteria.
> 
> Of course, I tried to write a script that does the adding. But somehow it is 
> incredibly slow. And I am not the Master Of C++, too.
> 
> Is there any deeper reason for MIDAS using TFolders, not TDirectorys? ROOT's 
> hadd can treat TDirectory. Can I simply patch "my" MIDAS? Is there general 
> interest in a change like this? (Does anyone have experience with the speed of 
> hadd?)
> 
> Looking forward to comments from the Forum.
> 
> Cheers,
> 
> Randolf
Entry  17 Oct 2007, Randolf Pohl, Forum, Multi-core CPUs 
Dear Forum,

I have this beautiful Intel Quadcore with fast disks, but MIDAS does obviously 
only make use of one CPU at a time. Has anyboy of you already done some work 
on making MIDAS parallel? Event-based data analysis should be the best 
candidate for this.

Has anybody done this with PVM? There is some PVM-related stuff in the MIDAS 
sources, but I got the impression this works only with HBOOK, not with ROOT. 
Or am I wrong?
But then PVM is probably also not the most efficient thing one ONE machine 
with multiple CPUs, right? And finally, with PVM we're back to 
adding .root-files efficiently (see my previous post).

Any thoughts?

Cheers,

Randolf
    Reply  17 Oct 2007, Stefan Ritt, Forum, Multi-core CPUs 
> I have this beautiful Intel Quadcore with fast disks, but MIDAS does obviously 
> only make use of one CPU at a time. Has anyboy of you already done some work 
> on making MIDAS parallel? Event-based data analysis should be the best 
> candidate for this.

There are ring buffer routines rb_xxx for distributed event analysis, but this is
currently only implemented in the front-end framework. These routines are pretty
simple, and their integration into the analyzer should not be very difficult.
Unfortunately I don't have time for that right now. We do our analysis such that we
analyze four different runs in parallel on a quadcore machine.

- Stefan
Entry  11 Oct 2007, Stefan Ritt, Bug Report, _syscall0 not available on gcc 4.1.1 
Dear Stephan,

I am writting on behalf of the LiBeRACE collaboration
at Berkeley/Livermore.

We are trying to use midas (2.0.0) for our acquisition system.
However we had some difficulties to compile it on LINUX Fedora
Core 6 with gcc 4.1.1
I tried to trace back the problem and I found that _syscall0 in
system.c is actually an obsolete call (since gcc 4.x apparently).
Playing with assembly language being behond my competence, I would 
like to know if you ever came across this situation recently and
if you have any suggestion(s).

With my best regards
Julien GIBELIN


------------------------------------------------------
GIBELIN Julien

Lawrence Berkeley National Laboratory
Nuclear Science Division
One Cyclotron Rd.
MS 88R0192
BERKELEY, CA 94720-8101

Tel: +1 (510) 495-2695
Fax: +1 (510) 486-7983
------------------------------------------------------
    Reply  11 Oct 2007, Stefan Ritt, Bug Report, _syscall0 not available on gcc 4.1.1 
> Dear Stephan,
> 
> I am writting on behalf of the LiBeRACE collaboration
> at Berkeley/Livermore.
> 
> We are trying to use midas (2.0.0) for our acquisition system.
> However we had some difficulties to compile it on LINUX Fedora
> Core 6 with gcc 4.1.1
> I tried to trace back the problem and I found that _syscall0 in
> system.c is actually an obsolete call (since gcc 4.x apparently).
> Playing with assembly language being behond my competence, I would 
> like to know if you ever came across this situation recently and
> if you have any suggestion(s).

The '_syscall0' function call was replaced by 'syscall' in SVN revision 3583. I
would recommend that you switch to the current SVN version (see
http://ladd00.triumf.ca/~daqweb/doc/midas/html/quickstart.html on how to obtain
the SVN version). If the problem still persists, please let us know.

- Stefan
Entry  08 Oct 2007, Carl Metelko, Bug Report, Error in data format- ending blocks on 32bit boundary x86_64 
Hi,
    I found that midas banks can be given an extra 32 bits of zeros when
trying to keep to 32bit boundary on my x86_64. 

This can be fixed by changing (in midas.h)
#define ALIGN8(x)  (((x)+7) & ~7)
to
#define ALIGN8(x)  (((x)+3) & ~3)

Is there any bad consequences doing this?
    Reply  08 Oct 2007, Stefan Ritt, Bug Report, Error in data format- ending blocks on 32bit boundary x86_64 
> Hi,
>     I found that midas banks can be given an extra 32 bits of zeros when
> trying to keep to 32bit boundary on my x86_64. 
> 
> This can be fixed by changing (in midas.h)
> #define ALIGN8(x)  (((x)+7) & ~7)
> to
> #define ALIGN8(x)  (((x)+3) & ~3)
> 
> Is there any bad consequences doing this?

Yes. ALIGN8 means 'align to 8-byte boundary' (64-bit), and if you change that, you
break the code at various locations. Furthermore, 8-byte aligned access is faster
on x86_64 than 4-byte aligned access, so you will get a performance penalty. If
course if you have very many small banks, the zero padding can cause some
overhead, but in that case you could combine some data into a single bank.
Entry  02 Oct 2007, Konstantin Olchanski, Info, ROODY, ROOTANA updates 
The ROODY online histogram viewer and the ROOTANA midas analyzer toolkit have been updated to work 
with ROOT version 5.16 and tested on Linux (SL4.4) and MacOS (10.4.10/PPC).

This update includes the library called "TNetDirectory" for access to remote ROOT objects. This library is 
still under development, but is complete enough for use with ROODY. To try it, please specify -P9091 in 
rootana and -Plocalhost:9091 in ROODY.

K.O.
Entry  06 Sep 2007, Stefan Ritt, Info, Introduction of MIDAS_MAX_EVENT_SIZE 
We had the problem that different experiments used different MAX_EVENT_SIZE
values (the MEG experiment actually 10 MB!). If each experiment changes the
value in midas.h and accidentally commits it, other experiments are affected.
Therefore I modified midas.h and the Makefile to accept a new environment
variable MIDAS_MAX_EVENT_SIZE. If this value is set, the Makefile passes it's
value to midas.h where it supersedes the default value which is currently at 4 MB.

PAA: Can you pleas add this to the documentation at the right spot? Thanks.
Entry  20 Aug 2007, Konstantin Olchanski, Bug Report, how to handle end of run? 
I am having problems with handling the end-of-run situation in my midas
frontend. I have a device that continuously sends data (over USB) and I read
this data in my "read_event" function.

Everything is good until the end-of-run, at which time this happens:
0) mfe.c calls my read_event() to read the data (loop until the end-of-run
transition)
1) mfe.c calls my end_of_run()
2) here, I tell the device "please stop sending data"
3) all seems good, but wait!!!
4) there is all this data generated between step 0 and step 2 still sitting
inside the device and it has nowhere to go: the run is ended, the output file is
closed, my read_event() will never be called ever again (well, until the next run).

It seems to me mfe.c needs to have one more function, something like
"pre_end_of_run()" that works like this:
0) mfe.c calls my read_event() to read the data (loop until the end-of-run
transition)
1) mfe.c calls pre_end_of_run(), here I tell the device to stop sending data
2) mfe.c calls read_event() for the very last time, to give me the opportunity
to read and send away any data I still may have.
3) mfe.c calls the end_of_run(). The run is truely finished.

Any thoughts?

K.O.
    Reply  03 Sep 2007, Stefan Ritt, Bug Report, how to handle end of run? 
> I am having problems with handling the end-of-run situation in my midas
> frontend. I have a device that continuously sends data (over USB) and I read
> this data in my "read_event" function.
> 
> Everything is good until the end-of-run, at which time this happens:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls my end_of_run()
> 2) here, I tell the device "please stop sending data"
> 3) all seems good, but wait!!!
> 4) there is all this data generated between step 0 and step 2 still sitting
> inside the device and it has nowhere to go: the run is ended, the output file is
> closed, my read_event() will never be called ever again (well, until the next run).
> 
> It seems to me mfe.c needs to have one more function, something like
> "pre_end_of_run()" that works like this:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls pre_end_of_run(), here I tell the device to stop sending data
> 2) mfe.c calls read_event() for the very last time, to give me the opportunity
> to read and send away any data I still may have.
> 3) mfe.c calls the end_of_run(). The run is truely finished.
> 
> Any thoughts?

You can achieve the desired functionality without changing mfe.c:

0) mfe.c calls read_event
1) mfe.c calls end_of_run. Your end_of_run tells the device to stop data and flushes
the remaining data. At this point you have to re-make actually a part of the mfe.c
functionality, but basically you need a bm_compose_event() and a bm_send_event(), so
just a few lines of code. If you want to have the final event number right in your
equipment, you also need to update eq->events_sent accordingly. 

Given the fact that 99% of the experiments do not need this functionality, I propose
that we keep mfe.c and you add the few lines of code into your user part of the
specific frontend.

Stefan
Entry  29 Aug 2007, Konstantin Olchanski, Info, Added data compression to mlogger 
I now commited the changes to mlogger (mlogger.c, msystem.h) implementing data
compression using zlib (svn revision 3845)

To enable compression, observe that mlogger is compiled with -DHAVE_ZLIB (see
the Makefile), in "/Logger/Channels/NNN/Settings", set "compression" to "1" and
the filename to "run%05d.mid.gz" (note the suffix ".gz").

In the Makefile, I only enabled HAVE_ZLIB for Linux, as that is the only
platform I tested. If somebody can test compression on Windows, please do and
let us know.

My ROOT analyzer (rootana) package can read compressed MIDAS files directly and
if one wants to add this capability to other MIDAS-related packages, one is
welcome to use my TMidasFile.cxx as an example
(http://ladd00.triumf.ca/viewcvs/rootana/trunk/TMidasFile.cxx?view=markup).

K.O.
Entry  08 Jun 2006, Konstantin Olchanski, Bug Fix, commit latest ccusb.c CAMAC-USB driver 
I commited the latest driver for the Wiener CCUSB USB-CAMAC driver. It
implements all functions from mcstd.h and has been tested to be plug-compatible
with at least one of our CAMAC frontends. K.O.
    Reply  23 Sep 2006, Konstantin Olchanski, Bug Fix, commit latest ccusb.c CAMAC-USB driver 
> I commited the latest driver for the Wiener CCUSB USB-CAMAC driver. It
> implements all functions from mcstd.h and has been tested to be plug-compatible
> with at least one of our CAMAC frontends. K.O.

This driver is known to not work with the latest CCUSB firmware (20x, 204, 30x, 303). I know what 
modifications are required and an updated driver will be available shortly. If there is a delay, and you need the 
driver ASAP, please drop me an email.

Also, I am thinking about dropping support for the very old CCUSB firmware revisions (before 204). (Any 
comments?)

K.O.
       Reply  22 Aug 2007, Konstantin Olchanski, Bug Fix, commit latest ccusb.c CAMAC-USB driver 
> > I commited the latest driver for the Wiener CCUSB USB-CAMAC driver. It
> > implements all functions from mcstd.h and has been tested to be plug-compatible
> > with at least one of our CAMAC frontends. K.O.

Well, it took almost a year to finish an updated driver, which has now been
commited to MIDAS SVN (see http://savannah.psi.ch/viewcvs/trunk/drivers/camac/ccusb/?root=midas).

This supports ccusb firmware release 0x402. With earlier firmware, simple CAMAC operations should work,
but to use the readout list feature one has to have the latest main firmware (0x402 as of today) and the latest CPLD
firmware.

The driver kit includes:
- the "ccusb" driver which implements the MIDAS mcstd.h CAMAC interface;
- test_ccusb to probe the interface and generally make the lights flash;
- ccusb_flash for updating the ccusb main firmware (assembled from bits and pieces found on the CCUSB driver CD);
- feccusb, an example midas frontend, which uses the ccusb readout list feature and has extensive error handling,
should be good enough for production use (unlike the Wiener libxxusb drivers, which lack basic error handling).
- analyzer.cxx, an rootana-based example on how to decode the ccusb data;
- README file with release notes.

If you use this driver, please drop me an email (even if it works perfectly for you, hah!) - the ccusb device is very
nice but can be hard to use and I would like to hear about problems other people have with it.

Today's version of the README files is attached below:

MIDAS driver for the Wiener/JTec CC-USB CAMAC-USB interface.

Date: 22-AUG-2007/KO

Note 1: The CC-USB interface comes with a CD which contains manuals,
firmware files, Windows and Linux software. The Wiener/JTec driver
is called "libxxusb". These MIDAS/musbstd drivers were written before
libxxusb bacame available and do not use libxxusb.

This driver implements the MIDAS CAMAC interafce "mcstd.h" using
the MIDAS USB interface musbstd.h.

Note 2: There exist many revisions of CCUSB firmware. Basic CAMAC
access works in all of them, but the "readout list" feature seems
to be only functional with firmware revision 0x402 or older and
with CPLD revisions CC_atmmgr_101406.jed, CC_datamgr_021905.jed,
CC_lammgr_brdcst_041906.jed or older.

To upgrade the main CCUSB firmware, follow instructions from
the CCUSB manual. On Linux, one can use the ccusb_flash
program included with these MIDAS drivers. It is a copy
of ccusb_flash from the Wiener CD, with all the pieces
assembled into one place and with a working Makefile. (I am too
lazy to add the flashing bits to the ccusb.c driver).

To upgrade the CPLD firmware, one needs a Xilinx JTag programmer
cable (we use a "parallel port to JTag" cable provided by Wiener),
and the Xilinx software (on Linux, we use Xilinx91i). For successful
upgrade, follow instructions from Xilinx and Wiener.

Note 3: Before starting to use the CCUSB interface, one should obtain
the latest version of the CCUSB manual and firmware by downloading
the latest version the CCUSB driver CD from the Wiener web
site (registration required)

Note 4: The example CCUSB frontend assumes this hardware configuration:
LeCroy 2249A 12 channel ADC in slot 20, Kinetic Systems 3615 6 channel
scaler in slot 12. NIM trigger input connected to CCUSB input "I1"
firing at 10-100 Hz. Without the external trigger CCUSB will not
generate any data and the frontend will only give "data timeout"
errors. With the trigger, the LED on the scaler should flash at 1 Hz
and the LEDs on the CCUSB should flash at the trigger rate.

Note 5: The CCUSB interface does not reliably power up in some CAMAC
crates (this has something to do with the sequence in which
different voltages start at different times with different CAMAC
power supplies). Some newer CCUSB modules may have this
problem fixed in the hardware and in the CPLD firmware. For modules
exhibiting this problem (i.e. no USB communication after power up),
try to cycle the power several time, or implement the "hardware reset
switch" (ask Wiener).

Note 6: The CCUSB firmware is very fickle and would crash if you look
at it the wrong way. This MIDAS driver tries to avoid all known crashers
and together with the example frontend, can recover from some
of them. Other crashes cannot be recovered from other than by
a hardware reset or power cycle.

//end
Entry  26 Jul 2007, Stefan Ritt, Info, Change of pointer type in mvmestd.h 
I had to change the pointer type of mvme_read and mvme_write to (void *) instead
to (mvme_locaddr_t *) to avoid warnings under 64-bit linux. Please adjust your
VME drivers if necessary.

- Stefan
    Reply  12 Aug 2007, Konstantin Olchanski, Info, Change of pointer type in mvmestd.h 
> I had to change the pointer type of mvme_read and mvme_write to (void *) instead
> to (mvme_locaddr_t *) to avoid warnings under 64-bit linux. Please adjust your
> VME drivers if necessary.

Updated: vmicvme.c (VMIVME-7750/7805) and gefvme.c (GEFANUC V7865)

K.O.
Entry  29 Jun 2007, Konstantin Olchanski, Bug Fix, mscb, musbstd fixed on Linux, MacOS 
I commited a few minor changes to musbstd and mscb code to make them work on
MacOSX (tested on 10.3.9) and Linux (tested on Fedora 6).

The basic functions work with the MSCB USB master, but I still need to
investigate some cases where the connection hangs and usb communications do not
work until the USB cable is unplugged and plugged back in. I see this problem
both on MacOS and Linux.

Important changes:
1) mscb_select_device() does not work on both Linux and MacOS and is disabled.
Please run "msc -d usb0".
2) on Linux, the Makefile should define -DOS_LINUX and -DHAVE_LIBUSB;
   on MacOS, the Makefile should define -DOS_LINUX and -DOS_DARWIN. (This is
because MacOS is treated as a funny type of Linux).
3) when doing USB communications, one has to use the correct endpoint numbers,
which seem to be system dependant and for now, I hard code them in mscb.c for
the tested systems.

There supposed to be no changes to the Windows code, but I cannot test on
Windows, so if somebody does and finds breakage, please let me know.

K.O.
    Reply  02 Jul 2007, Stefan Ritt, Bug Fix, mscb, musbstd fixed on Linux, MacOS 

KO wrote:
There supposed to be no changes to the Windows code, but I cannot test on Windows, so if somebody does and finds breakage, please let me know.


I can confirm that revision 3713 still works under Windows.
    Reply  06 Jul 2007, Konstantin Olchanski, Bug Fix, mscb, musbstd fixed on Linux, MacOS 
> I commited a few minor changes to musbstd and mscb code...
>
> The basic functions work with the MSCB USB master, but I still need to
> investigate some cases where the connection hangs and usb communications do not
> work until the USB cable is unplugged and plugged back in. I see this problem
> both on MacOS and Linux.

I think I fixed the hangs we see on linux and macos - at the end all I had to do is
issue a usb reset to make mscb communicate again.

Also tested on Linux FC6 and SL4.5.

K.O.
Entry  10 May 2007, Konstantin Olchanski, Info, RHEL5/SL5 success! 
FWIW, I am running latest 32-bit MIDAS on an AM2 dual core AMD machine under 64-bit SL5. Everything 
seems to work correctly. K.O.

P.S. For the record, the compiler produces two sets of warnings:
- warning: pointer targets in passing argument 3 of â differ in signedness
- warning: dereferencing type-punned pointer will break strict-aliasing rules
(I do not understand the meaning of the second warning. type-punned pointer, huh?)
K.O.
    Reply  03 Jul 2007, Ryu Sawada, Info, RHEL5/SL5 success! 
> P.S. For the record, the compiler produces two sets of warnings:
> - warning: dereferencing type-punned pointer will break strict-aliasing rules
> (I do not understand the meaning of the second warning. type-punned pointer, huh?)

This is because strict aliasing rule is broken in this code.

In ISO C++99 standard, it is illegal to create two pointers of different types referring to the same address.
Even a code breaks the rule, it compiles, but the result is undefined.

For example following code gives different results depends on -O2 is used or not, because -O2 includes -fstrict-aliasing option.
When -fstrict-aliasing is used, compiler can optimize the code assuming the strict aliasing rule.
#include <stdio.h>
  
int main(){
   int ii = 1;
   float* ff = (float*)&ii;
   *ff = 2;
   printf("%d\n", ii);
   return 0;
} 

GCC warns this kind of code with a message like,
warning: dereferencing type-punned pointer will break strict-aliasing rules
The behavior differs also depending on compilers. GCC3 does not warn, while GCC4 warns. (GCC3 is the default on SL4, while GCC4 is
the default on SL5)
And results are different. GCC3 gives 0, while GCC4 gives 1.
#include <stdio.h>

typedef struct xxx {int ii;} XX;
  
int main(){
   XX a;
   a.ii = 1;
   *(short*)&a.ii = 0;
   printf("%d\n", a.ii);
   return 0;
}


More dangerous thing is that compilers do not always warn about it. For example, following code compiles without warnings even
when you use -Wall (including -Wstrict-aliasing). But the result changes depending on compile flags or compiler versions.
#include <stdio.h>
#include <string.h>
#include <malloc.h>
  
int main(){
   int *ii = (int*)malloc(8);
   ii[0] = 1;
   ii[1] = 2;
   float* ff = (float*)ii;
   ff[0] = 3;
   ff[1] = 4;
   printf("%d %d\n", ii[0], ii[1]);
   return 0;
}

A safer way is disabling -fstrict-aliasing compile flag. For example, you may change compile flag for midas like "-O2 -fno-strict-
aliasing".
Disadvantage is that there is a possibility that the speed is decreased.

The best way is modifying code to be in the strict aliasing rule.

Best regards
Entry  07 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline crash.out
Hello,

I am having a problem with the root-based analyzer. It crashes when I try to 
analyze multiple runs OFFLINE using the "-i run%05d.mid -o result%05d.root -r 
1 2" feature.

I can reproduce the problem with the example experiment which comes with the 
MIDAS distribution:
Running the analyzer ONLINE works fine: One can start and stop runs one after 
the other, roody shows the histograms being reset and then filled again and 
such.

But OFFLINE, the analyzer crashes when trying to analyze the SECOND run in a 
sequence. So
./analyzer -i run%05d.mid -o result%05d.root -r 1 1   works (only run 1)
./analyzer -i run%05d.mid -o result%05d.root -r 1 3   dies on run 2
Output attached (I added printf's to the "init"-modules, but that's irrelevant 
here)


My own analyzer shows the same effect. There I got the impression the segfault 
happens on the first attempt to Fill/Reset/SetName etc. a histogram in the 2nd 
run. But with the midas example it looks like the analyzer finishes filling 
histos even for run 2, but then dies in eor.

Can you reproduce the problem?

I run MIDAS on an Intel Quadcore, 64 bit SuSE Linux 10.2.
pohl@lamb2:~/midas/examples/root> gcc --version
gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)

(maybe 4.1.2 "PRERELEASE" is the problem? See message ID 344)

I am using midas rev. 3674 (April 19, 2007), but I got the impression there 
has since not been a change relevant to this problem. Please correct me if I 
am wrong, then I would try it with Rev HEAD.
(My version includes already the fix to the x86_64 segfault problem of message 
ID 337)


Best regards,

Randolf
    Reply  08 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline 
Unfortunately I don't have time right now to debug the problem, but I could see
roughly what it could be. The analyzer crashes inside CloseRootOutputFile:

#5  <signal handler called>
#6  0x00002b5f52ad5ee5 in free () from /lib64/libc.so.6
#7  0x000000000040c89b in CloseRootOutputFile () at src/mana.c:1489

in the line 

    free(tree_struct.event_tree[i].branch);

If a "free" crashes, it might indicate that the memory beyond the allocated space
got corrupted. The branch gets allocated in book_ttree(), once for each
analyze_request[i]. The branch gets filled in write_event_ttree():

      /* fill tree both online and offline */
      if (!exclude_all)
         et->tree->Fill();

Maybe one should put printf debugging statements in these places to see what's
going on.
       Reply  09 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline 
Hello Stefan,

tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should 
presumably not be the case, since CloseRootOutputFile() frees the trees at eor().

------------------- output ---------------------------
lamb@lamb2:~/midas/root_3705> ./analyzer -e 
exa_root -i /tmp/midas/examples/root/run%05d.mid -o /tmp/midas/run%05d.root -r 1 2
Root server listening on port 9090...
Running analyzer offline. Stop with "!"
book_ttree: tree_struct.n_tree = 1
book_ttree: tree_struct.n_tree = 2
Set run number 1 in ODB
Load ODB from run 1...OK
/tmp/midas/examples/root/run00001.mid:2722  /tmp/midas/run00001.root:2720  events, 
0.21s
book_ttree: tree_struct.n_tree = 3     <<---- !!!!
book_ttree: tree_struct.n_tree = 4
Set run number 2 in ODB
Load ODB from run 2...OK
/tmp/midas/examples/root/run00002.mid:2347  /tmp/midas/run00002.root:2345  events, 
0.18s

 *** Break *** segmentation violation
----------------- \output ----------------------------

Adding this one line fixes the segfault problem for the root example expt.

----------------- code -------------------------
lamb@lamb2:/data/software/midas/midas_3705/src/src> svn diff mana.c
Index: mana.c
===================================================================
--- mana.c      (revision 3705)
+++ mana.c      (working copy)
@@ -1496,6 +1496,7 @@
    /* delete event tree */
    free(tree_struct.event_tree);
    tree_struct.event_tree = NULL;
+   tree_struct.n_tree = 0;
 
    // go to ROOT root directory
    gROOT->cd();
---------------- \code ---------------------------

Please check if this gives the intended behaviour. I am not very familiar with the 
midas internals.

Unfortunately my own analyzer's segfault problem is not solved by this patch. I 
guess I have to keep searching for a bug on my side.....  :-)


Cheers,

Randolf
          Reply  10 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline 
> tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should 
> presumably not be the case, since CloseRootOutputFile() frees the trees at eor().

Yes this indeed a bug. I applied your change and committed the new code.
    Reply  11 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline 
Hello again,

just for the record, in case somebody else runs into the same problem...

I have hunted down "my" segfault problem to the fact that I book histograms not 
in <module>_init, but in <module>_bor. I have to do so, because only in bor do I 
know which histograms to book, as this information comes from the ODB (booking 
only histograms for CAMAC modules which were set to "read" in the ODB). The core 
dump happens on the first access (->Fill, ->SetName,...) of one of these histos 
in the 2nd run analyzed offline ("./analyzer -r n m").

In mana.c:bor (line 1854) is stated that "all ROOT objects created by user module 
bor() functions go to the output file", and then does a gManaOutputFile->cd();
Consequently, the histograms vanish after the file is closed, therefore the 
segfault when trying to access them in the 2nd run. (I keep track of existing 
histograms, only booking the missing histos in bor.)

The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with 
TFolders and booking the histogram.


I do, however, not really understand the intention why histos booked in bor() go 
to only the file, whereas histos booked in init() go to memory. Could you please 
comment briefly? Maybe I missed the most important point. And what about online 
mode, should this work?


Thanks a lot in advance,

Randolf
       Reply  11 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline 
> I have hunted down "my" segfault problem to the fact that I book histograms not 
> in <module>_init, but in <module>_bor. I have to do so, because only in bor do I 
> know which histograms to book, as this information comes from the ODB (booking 
> only histograms for CAMAC modules which were set to "read" in the ODB). The core 
> dump happens on the first access (->Fill, ->SetName,...) of one of these histos 
> in the 2nd run analyzed offline ("./analyzer -r n m").
> 
> In mana.c:bor (line 1854) is stated that "all ROOT objects created by user module 
> bor() functions go to the output file", and then does a gManaOutputFile->cd();
> Consequently, the histograms vanish after the file is closed, therefore the 
> segfault when trying to access them in the 2nd run. (I keep track of existing 
> histograms, only booking the missing histos in bor.)
> 
> The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with 
> TFolders and booking the histogram.

ROOT has the strange concept of "current working directory", coming from the fact that
ROOT was written by Fortran and PAW people, being used to have directories and
subdirectories with a persistent state (not really object-oriented style). So one can
set the "current working directory" to the root (=memory) with gROOT->cd() and to a
subdirectory which will later be written into a file with gManaOutputFile->cd(). If
you do the first one, the histograms are created only in memory, while in the later
case they are also created in memory, but will later be written into the output file
in the routine CloseRootOutputFile(). So if you do a gROOT->cd() in <module>_bor,
these histograms will not be written to file. So I guess your solution is not a real
solution.

> I do, however, not really understand the intention why histos booked in bor() go 
> to only the file, whereas histos booked in init() go to memory. Could you please 
> comment briefly? Maybe I missed the most important point. And what about online 
> mode, should this work?

The root output file is opened in bor() and closed in eor(). For a histo to go to the
file, it must be booked after opening the file, that is after bor() in mana.c and
therefore after the gManaOutputFile->cd().

I agree with you that the current scheme is not satisfactory. When running online, you
want to keep the histos between the runs. When running offline, you delete and
re-create them for each run. It would be better to create all histos online and
offline under gROOT, and just copy them to gManaOutputFile before writing them. I have
to admit that this root code was never really used in a productive environment for
offline analysis, so there might be some issues here and there. Some people write
directly root files in the logger, and then do a root-only (without the midas
analyzer) analysis. Unfortunately I'm busy these days and cannot write any code right
now. But if you feel like something should be modified in mana.c, please send it to me
and I can incorporate it into the standard code.
          Reply  12 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline 
Hi

> So I guess your solution is not a real solution.

I was not precise enough on what I do. This way the histograms persist in memory, but 
they are also written to every file:

e.g. in module "trig_tdc":

  TDirectory *savedir = gDirectory;  // will restore this afterwards
  gROOT->cd();     // go to file

  // make sure we are in the right "analyzer module folder"
  TDC_Folder = (TFolder *) gROOT->FindObjectAny("trig_tdc");
  gHistoFolderStack->Add((TObject *) TDC_Folder);

  ...(loop over all TDCs, figure out which histos exist, and which need to be booked)

  open_subfolder("raw4208");
  hrTDC = h1_book(....);   // create histo in memory, but it shows up in the file, too.
  close_subfolder(); //raw4208

  // restore gHistoFolderStack (we added a folder when entering routine)
  gHistoFolderStack->Remove(gHistoFolderStack->Last());

  // restore current directory
  savedir->cd();

When deleting histos I do:

     gManaHistosFolder->RecursiveRemove(*pHisto);
    (*pHisto)->Delete();
    (*pHisto) = NULL;  // for my book-keeping of existing histos.

You don't have to clear the histos explicitly between runs. gManaHistosFolder does this 
magic to you.

> But if you feel like something should be modified in mana.c, please send it to me
> and I can incorporate it into the standard code.

No, the code is fine. I just wanted to explain my problem and a solution to it, because 
I thought that somebody might run into the same problem, too. 

Ciao,

Randolf
Entry  22 May 2007, Randolf Pohl, Bug Report, analyzer_init called by odb_load 
Hi,

I wonder why mana.c:odb_load() calls analyzer_init(). This way analyzer_init 
is called TWICE or more times:
first from mana.c:mana_init(), for each invocation of the analyzer, and 
second from mana.c:odb_load(), for each run to be analyzed

Isn't this a bug? It can mess up several things (like mallocs) if you don't 
take the necessary precautions. Other module_init functions are correctly 
called only once, before all runs are analyzed.

I have the feeling, that odb_load should NOT call analyzer_init. Or am I wrong 
(probably, but please explain to me)? Do I have to live with it and make sure 
that my beautiful global initialization in analyzer_init is only done once?
:-)

Cheers,

Randolf

And here is the annotated log using the ROOT example experiment 
(several modules changed/added to print their respective names)

:~/midas/examples/root> ./analyzer -e exa_root -i run%05d.mid -r 1 3
 
analyzer_init        <-- ok

Root server listening on port 9090...
adc_calib_init       <-- ok
adc_summing_init     <-- ok
scaler_init          <-- ok
Running analyzer offline. Stop with "!"
Set run number 1 in ODB
Load ODB from run 1...
analyzer_init        <-- not ok, or is it?

OK
run00001.mid:777  events, 0.00s
Set run number 2 in ODB
Load ODB from run 2...
analyzer_init        <-- not ok, or is it?

OK
run00002.mid:7227  events, 0.03s
Set run number 3 in ODB
Load ODB from run 3...
analyzer_init        <-- not ok, or is it?

OK
run00003.mid:13866  events, 0.06s
adc_calib_exit
adc_summing_exit
scaler_exit

analyzer_exit
    Reply  22 May 2007, Stefan Ritt, Bug Report, analyzer_init called by odb_load 
The reason to call analyzer_init in odb_load is the following:

Assume you run the analyzer offline, analyzing many files in series. Then assume
that you have /Experiment/Run Parameters, which is actively used by the analyzer
(like beam settings etc.). In this case you do a db_open_record() to map
/Experiment/Run Parameters to the exp_param C structure. For this mapping to work,
the ODB structure and the C structure have to be exactly the same. Now assume that
you changed your run parameters over time, like you added some comment later. Now
you want to analyzer several runs, some before and some after the modification.
Both sets have a different structure in /Experiment/Run Parameters, which is a
problem, since the compiled analyzer can only have a single C structure. My "poor"
solution was to call analyzer_init after each loading of the ODB from the *.mid
file. The db_create_record() call matches the C structure to the ODB structure by
modifying the ODB structure if necessary. So if you added one parameter later, this
(modified) structure gets loaded by odb_load, but then it gets adjusted in
analyzer_init().

I understand now that this case might not happen so often, and you are more
bothered by the fact that analyzer_init gets called several time. There must
however be a hook for offline analysis that the user code can correct the ODB
structure. So I propose to add a flag to analyzer_init, such as

INT analyzer_init(BOOL bFirst)
{
}

If bFirst equals TRUE, the function got called from mana_init(), if FALSE, it got
called from odb_load. Then you can put code like

INT analyzer_init(BOOL bFirst)
{
   if (bFirst) {
      p = malloc()
      ...
   }
}

If you agree, I will modify the code and commit the change.

- Stefan
       Reply  22 May 2007, Randolf Pohl, Bug Report, analyzer_init called by odb_load 
Thanks for the quick reply, Stefan.

Please don't change anything in the code unless you find it really important. I guess 
changing the analyzer_init prototype will break a lot of code out there?

In fact, I think I do understand this behavior now.
And even without your suggested fix there is a simple workaround: I add a static 
variable to my analyzer_init.cxx file, and do something similar to your bFirst fix.

In conclusion, commit your fix if it does not harm others. Postpone this commit to a 
future new version of midas which breaks a lot of things anyway...

A last question, for me to understand: Why not call db_open_record in 
ana_begin_of_run then?

Cheers,

Randolf
          Reply  22 May 2007, Stefan Ritt, Bug Report, analyzer_init called by odb_load 
> Thanks for the quick reply, Stefan.
> 
> Please don't change anything in the code unless you find it really important.
I guess 
> changing the analyzer_init prototype will break a lot of code out there?
> 
> In fact, I think I do understand this behavior now.
> And even without your suggested fix there is a simple workaround: I add a static 
> variable to my analyzer_init.cxx file, and do something similar to your bFirst
fix.
> 
> In conclusion, commit your fix if it does not harm others. Postpone this
commit to a 
> future new version of midas which breaks a lot of things anyway...
> 
> A last question, for me to understand: Why not call db_open_record in 
> ana_begin_of_run then?

I fully agree with you that db_open_record would better go into ana_begin_of_run
(and
analyzer_init not being called in odb_load), and I fully agree with you that
changing the
code would break many experiments. ;-)

So I guess we leave it as it is right now as you suggested.
Entry  21 May 2007, Konstantin Olchanski, Info, mhttpd changes to use /History/Tags data 
I am slowly commiting the changes to the history code. This installement adds
code to mhttpd to use the /History/Tags data (to be) generated by the mlogger.

In the nutshell, the logger fills /History/Tags to "remember" what events,
variables and tags exist in the history files.

This replaces the old code that attempts to guess the contents of history files
by looking at /Equipment tree.

To ease the transition to the new system, I am leaving all the old code alive
and active in the absense of "/History/Tags" entries.

As soon as one starts using the new mlogger (to be commited), the new tags based
mhttpd code will activate itself.

K.O.
Entry  09 May 2007, Carl Metelko, Forum, Splitting data transfer and control onto different networks 
Hi,
   I'm setting up a system with two networks with the intension of having
control info (odb, alarm) on the 192.168.0.x
and the frontend readout on 192.168.1.x

Is there any easy way of doing this?
I'm also trying to separate processes onto different machines, is there
any way to not have mserver,mhttpd and (mlogger,mevt) all run on the same
machine?
Thanks,
       Carl Metelko
    Reply  09 May 2007, Stefan Ritt, Forum, Splitting data transfer and control onto different networks 
Hi Carl,

so far I did not experience any problems of running odb&alarm on the same link as
the readout, since the data goes usually frontend->backend, and all other messages
from backend->frontend. So before you do something complicated, try it first the
easy way and check if you have problems at all. So far I don't know anybody who
did separate the network interfaces so I have not description for that.

You can however separate processes. The easiest is to buy a multi-core machine. If
you want to use however separate computers, note that receiving events over the
network is not very optimized. So you should run mserver connected to the frontend
, the event builder and mlogger on the same machine. mhttpd can easily live on
another machine, but there is not much CPU consumption from that (unless you don't
plot long history trends). Running mserver, the event builder and mlogger on the
same machine (dual Xenon mainboard) gave me easily 50 MB/sec (actually disk
limited), and not both CPUs were near 100%. If you put any receiving process (like
the event builder or mlogger or the analyzer) on a separate machine, you might see
a bottlened on the event receiving side of maybe 10MB/sec or so (never really
tried recently).

Best regards,

  Stefan

> Hi,
>    I'm setting up a system with two networks with the intension of having
> control info (odb, alarm) on the 192.168.0.x
> and the frontend readout on 192.168.1.x
> 
> Is there any easy way of doing this?
> I'm also trying to separate processes onto different machines, is there
> any way to not have mserver,mhttpd and (mlogger,mevt) all run on the same
> machine?
> Thanks,
>        Carl Metelko
    Reply  09 May 2007, Konstantin Olchanski, Forum, Splitting data transfer and control onto different networks 
> I'm setting up a system with two networks with the intension of having
> control info (odb, alarm) on the 192.168.0.x
> and the frontend readout on 192.168.1.x

We have some experience with this at TRIUMF - the TWIST experiment we run with the main data 
generating frontends on a private network - it is a supported configuration and it works fine.

We ran into one problem after adding some code to the frontends for stopping the run upon detecting 
some data errors - stopping runs requires sending RPC transactions to every midas client, so we had to 
add static network routes for routing packets between midas nodes on the private network and midas 
nodes on the normal network.

> I'm also trying to separate processes onto different machines, is there
> any way to not have mserver,mhttpd and (mlogger,mevt) all run on the same machine?

mserver runs on the machine with the ODB shared memory by definition (think of it as "nfs server").

mhttpd typically runs on the machine with the ODB shared memory and until recently it had no code for 
connecting to the mserver. I recently fixed some of it, and now you can run mhttpd in "history mode" 
through the mserver. This is useful for offloading the generation of history plots to another cpu or 
another machine. In our case, we run the "history mhttpd" on the machine that holds the history files.

mlogger could be made to run remotely via the mserver, but presently it will refuse to do so, as it has 
some code that requires direct access to midas shared memory. If data has to be written to a remote 
filesystem, the consensus is that it is more efficient to run mserver locally and let the OS handle remote 
filesystem access (NFS, etc).

All other midas programs should be able to run remotely via the mserver.

K.O.
    Reply  14 May 2007, Carl Metelko, Forum, Splitting data transfer and control onto different networks 
Hi,
   thanks for the advice. We do have dual core Xeons so we'll try running
most things on the server. Unless it proves to be a problem we'll run all
MIDAS signals on one network and NFS etc on the other.

I do have one more query about running systems like Konstantin.
What we would like to do is have a 'mirror' server serving multiple
online monitoring machines so that the load on the server is constant nomatter
the demands on the mirror.

Is there a way to set this up? Or would it be best to have a remote analyser
making short (1min) root files shared with the online monitoring? 
Entry  10 May 2007, Konstantin Olchanski, Bug Fix, Fix error reporting from cm_transition() 
For some time now, error reporting from cm_transition() was broken.

Typical symptom was when starting a run from mhttpd, when a transition error occurred, the run does not 
start (good) but the user is presented with a message "Success" in big letters (confusing the user).

Part of the problem was caused by user-written frontends that return an empty error string. Code in 
cm_transition() now detects this and shows the numeric value of the error status returned by the frontend.

This is fixed in revision 3681.

The error string "Success" is now returned only when cm_transition() was successful, and other error 
reporting inside this function was cleaned up.

K.O.
Entry  10 May 2007, Konstantin Olchanski, Bug Fix, mhttpd: fix broken boolean arrays in "edit on start" 
For some time now, boolean arrays did not work correctly in "/experiment/edit on start". This is now fixed 
in rev 3680. K.O.
Entry  10 Apr 2007, Dan Gastler, Forum, Interrupt code for VME? 
Hello, 
   Is there any example code for using midas for interrupt driven data
collection over VME? I am using a Struck SIS3100 PCI/VME setup to connect to my
VME crate.  Thanks,
  -Dan
ELOG V3.1.4-2e1708b5