Back Midas Rome Roody Rootana
  Midas DAQ System, Page 116 of 136  Not logged in ELOG logo
IDdown Date Author Topic Subject
  405   03 Sep 2007 Stefan RittBug Reporthow to handle end of run?
> I am having problems with handling the end-of-run situation in my midas
> frontend. I have a device that continuously sends data (over USB) and I read
> this data in my "read_event" function.
> 
> Everything is good until the end-of-run, at which time this happens:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls my end_of_run()
> 2) here, I tell the device "please stop sending data"
> 3) all seems good, but wait!!!
> 4) there is all this data generated between step 0 and step 2 still sitting
> inside the device and it has nowhere to go: the run is ended, the output file is
> closed, my read_event() will never be called ever again (well, until the next run).
> 
> It seems to me mfe.c needs to have one more function, something like
> "pre_end_of_run()" that works like this:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls pre_end_of_run(), here I tell the device to stop sending data
> 2) mfe.c calls read_event() for the very last time, to give me the opportunity
> to read and send away any data I still may have.
> 3) mfe.c calls the end_of_run(). The run is truely finished.
> 
> Any thoughts?

You can achieve the desired functionality without changing mfe.c:

0) mfe.c calls read_event
1) mfe.c calls end_of_run. Your end_of_run tells the device to stop data and flushes
the remaining data. At this point you have to re-make actually a part of the mfe.c
functionality, but basically you need a bm_compose_event() and a bm_send_event(), so
just a few lines of code. If you want to have the final event number right in your
equipment, you also need to update eq->events_sent accordingly. 

Given the fact that 99% of the experiments do not need this functionality, I propose
that we keep mfe.c and you add the few lines of code into your user part of the
specific frontend.

Stefan
  404   29 Aug 2007 Konstantin OlchanskiInfoAdded data compression to mlogger
I now commited the changes to mlogger (mlogger.c, msystem.h) implementing data
compression using zlib (svn revision 3845)

To enable compression, observe that mlogger is compiled with -DHAVE_ZLIB (see
the Makefile), in "/Logger/Channels/NNN/Settings", set "compression" to "1" and
the filename to "run%05d.mid.gz" (note the suffix ".gz").

In the Makefile, I only enabled HAVE_ZLIB for Linux, as that is the only
platform I tested. If somebody can test compression on Windows, please do and
let us know.

My ROOT analyzer (rootana) package can read compressed MIDAS files directly and
if one wants to add this capability to other MIDAS-related packages, one is
welcome to use my TMidasFile.cxx as an example
(http://ladd00.triumf.ca/viewcvs/rootana/trunk/TMidasFile.cxx?view=markup).

K.O.
  403   29 Aug 2007 Konstantin OlchanskiForumODBv3, second try - Midas on a x86_64 - incompatible with x86_32
> > > I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
> > > in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.
> > 1) midas.h: remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
> > 2) msystem.h: add 32-bit padding to CHN_STATISTICS and CHN_SETTINGS

I am now trying a different solution of to fixing the issue of CHN_STATISTICS and CHN_SETTINGS changing size.

1) midas.h: (same as before) remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
2) msystem.h: in CHN_STATISTICS and CHN_SETTINGS change type of "event_limit" and "files_written" from int to "double".

Below are the latest ODBv3 meta patches:

ladd03:midas$ svn diff
Index: include/midas.h
===================================================================
--- include/midas.h     (revision 3844)
+++ include/midas.h     (working copy)
 /* has to be changed whenever binary ODB format changes */
-#define DATABASE_VERSION 2
+#define DATABASE_VERSION 3
.........
    short int trigger_mask;       /**< trigger mask                    */
    INT sampling_type;            /**< GET_ALL, GET_SOME, GET_FARM     */
-                                 /**< dispatch function */
-   void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
 } EVENT_REQUEST;

Index: include/msystem.h
===================================================================
--- include/msystem.h   (revision 3845)
+++ include/msystem.h   (working copy)
-"Event limit = DWORD : 0",\
+"Event limit = DOUBLE : 0",\
..................
-"Files written = INT : 0",\
+"Files written = DOUBLE : 0",\
..................
-   DWORD event_limit;
+   double event_limit;
..................
-   INT files_written;
+   double files_written;

K.O.
  402   22 Aug 2007 Konstantin OlchanskiBug Fixcommit latest ccusb.c CAMAC-USB driver
> > I commited the latest driver for the Wiener CCUSB USB-CAMAC driver. It
> > implements all functions from mcstd.h and has been tested to be plug-compatible
> > with at least one of our CAMAC frontends. K.O.

Well, it took almost a year to finish an updated driver, which has now been
commited to MIDAS SVN (see http://savannah.psi.ch/viewcvs/trunk/drivers/camac/ccusb/?root=midas).

This supports ccusb firmware release 0x402. With earlier firmware, simple CAMAC operations should work,
but to use the readout list feature one has to have the latest main firmware (0x402 as of today) and the latest CPLD
firmware.

The driver kit includes:
- the "ccusb" driver which implements the MIDAS mcstd.h CAMAC interface;
- test_ccusb to probe the interface and generally make the lights flash;
- ccusb_flash for updating the ccusb main firmware (assembled from bits and pieces found on the CCUSB driver CD);
- feccusb, an example midas frontend, which uses the ccusb readout list feature and has extensive error handling,
should be good enough for production use (unlike the Wiener libxxusb drivers, which lack basic error handling).
- analyzer.cxx, an rootana-based example on how to decode the ccusb data;
- README file with release notes.

If you use this driver, please drop me an email (even if it works perfectly for you, hah!) - the ccusb device is very
nice but can be hard to use and I would like to hear about problems other people have with it.

Today's version of the README files is attached below:

MIDAS driver for the Wiener/JTec CC-USB CAMAC-USB interface.

Date: 22-AUG-2007/KO

Note 1: The CC-USB interface comes with a CD which contains manuals,
firmware files, Windows and Linux software. The Wiener/JTec driver
is called "libxxusb". These MIDAS/musbstd drivers were written before
libxxusb bacame available and do not use libxxusb.

This driver implements the MIDAS CAMAC interafce "mcstd.h" using
the MIDAS USB interface musbstd.h.

Note 2: There exist many revisions of CCUSB firmware. Basic CAMAC
access works in all of them, but the "readout list" feature seems
to be only functional with firmware revision 0x402 or older and
with CPLD revisions CC_atmmgr_101406.jed, CC_datamgr_021905.jed,
CC_lammgr_brdcst_041906.jed or older.

To upgrade the main CCUSB firmware, follow instructions from
the CCUSB manual. On Linux, one can use the ccusb_flash
program included with these MIDAS drivers. It is a copy
of ccusb_flash from the Wiener CD, with all the pieces
assembled into one place and with a working Makefile. (I am too
lazy to add the flashing bits to the ccusb.c driver).

To upgrade the CPLD firmware, one needs a Xilinx JTag programmer
cable (we use a "parallel port to JTag" cable provided by Wiener),
and the Xilinx software (on Linux, we use Xilinx91i). For successful
upgrade, follow instructions from Xilinx and Wiener.

Note 3: Before starting to use the CCUSB interface, one should obtain
the latest version of the CCUSB manual and firmware by downloading
the latest version the CCUSB driver CD from the Wiener web
site (registration required)

Note 4: The example CCUSB frontend assumes this hardware configuration:
LeCroy 2249A 12 channel ADC in slot 20, Kinetic Systems 3615 6 channel
scaler in slot 12. NIM trigger input connected to CCUSB input "I1"
firing at 10-100 Hz. Without the external trigger CCUSB will not
generate any data and the frontend will only give "data timeout"
errors. With the trigger, the LED on the scaler should flash at 1 Hz
and the LEDs on the CCUSB should flash at the trigger rate.

Note 5: The CCUSB interface does not reliably power up in some CAMAC
crates (this has something to do with the sequence in which
different voltages start at different times with different CAMAC
power supplies). Some newer CCUSB modules may have this
problem fixed in the hardware and in the CPLD firmware. For modules
exhibiting this problem (i.e. no USB communication after power up),
try to cycle the power several time, or implement the "hardware reset
switch" (ask Wiener).

Note 6: The CCUSB firmware is very fickle and would crash if you look
at it the wrong way. This MIDAS driver tries to avoid all known crashers
and together with the example frontend, can recover from some
of them. Other crashes cannot be recovered from other than by
a hardware reset or power cycle.

//end
  401   20 Aug 2007 Konstantin OlchanskiBug Reporthow to handle end of run?
I am having problems with handling the end-of-run situation in my midas
frontend. I have a device that continuously sends data (over USB) and I read
this data in my "read_event" function.

Everything is good until the end-of-run, at which time this happens:
0) mfe.c calls my read_event() to read the data (loop until the end-of-run
transition)
1) mfe.c calls my end_of_run()
2) here, I tell the device "please stop sending data"
3) all seems good, but wait!!!
4) there is all this data generated between step 0 and step 2 still sitting
inside the device and it has nowhere to go: the run is ended, the output file is
closed, my read_event() will never be called ever again (well, until the next run).

It seems to me mfe.c needs to have one more function, something like
"pre_end_of_run()" that works like this:
0) mfe.c calls my read_event() to read the data (loop until the end-of-run
transition)
1) mfe.c calls pre_end_of_run(), here I tell the device to stop sending data
2) mfe.c calls read_event() for the very last time, to give me the opportunity
to read and send away any data I still may have.
3) mfe.c calls the end_of_run(). The run is truely finished.

Any thoughts?

K.O.
  400   20 Aug 2007 Konstantin OlchanskiForumMidas on a x86_64 - incompatible with x86_32
> > I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
> > in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.
> 
> I now have the patches to implement this. Changes turned out to be minimal:
> 
> 1) midas.h: remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
> 2) msystem.h: add 32-bit padding to CHN_STATISTICS and CHN_SETTINGS

The padding of CHN_STATISTICS and CHN_SETTINGS is not working right - somehow mhttpd and mlogger keep recreating the
data in ODB and erasing the padding fields. I am looking into this.

K.O.
  399   12 Aug 2007 Konstantin OlchanskiForumMidas on a x86_64 - incompatible with x86_32
> I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
> in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.

I now have the patches to implement this. Changes turned out to be minimal:

1) midas.h: remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
2) msystem.h: add 32-bit padding to CHN_STATISTICS and CHN_SETTINGS

(Pedantic note: the C/C++ languages permit compilers to arbitrary pad data members inside structures and one is
not supposed to rely on the specific layout of "struct"s, they could changing from day to day depending on
compiler vendor, version, 32/64 bit, optimization level, etc. This is quite silly, but I guess it was the only way
"they" could agree on a standard)

In practice, compilers are will behaved and one can follow simple rules and stay out of trouble.
1) if all data members are of the same size -> no padding
2) do not use "double" (64-bit) and "short" (16-bit), make all char[] arrays divisible by 4 -> size of everything
is 32-bit, see rule 1
3) if you have to use "short", they have to come in pairs to keep everything else aligned to 32-bit
4) if you have to use "double" (or uint64_t), keep them aligned to 64-bit, i.e. struct { int a,b,c; double x;} is
*bad* (4-byte padding may be added between c and x). struct { int a,b,c,d; double x; } is good.

Below are is "svn diff include/midas.h include/msystem.h". These changes have been tested on SL4 32-bit and
64-bit, SL5 32/64, F7 32/64 and SL4/ICC (Intel compiler) 32 bit and 64 bit.

The testing was done by adding checks on sizes of all struct's kept on ODB, i.e.
   assert(sizeof(CHN_SETTINGS        ) ==    640); // ODB v3 with padding
   assert(sizeof(CHN_STATISTICS      ) ==     32); // ODB v3 with padding
   ... etc ...

K.O.

ladd03:midas$ svn diff include/midas.h include/msystem.h
Index: include/midas.h
===================================================================
--- include/midas.h     (revision 3798)
+++ include/midas.h     (working copy)
@@ -38,7 +38,7 @@
  *  @{  */

 /* has to be changed whenever binary ODB format changes */
-#define DATABASE_VERSION 2
+#define DATABASE_VERSION 3

 /* MIDAS version number which will be incremented for every release */
 #define MIDAS_VERSION "2.0.0"
@@ -810,8 +810,6 @@
    short int event_id;           /**< event ID                        */
    short int trigger_mask;       /**< trigger mask                    */
    INT sampling_type;            /**< GET_ALL, GET_SOME, GET_FARM     */
-                                 /**< dispatch function */
-   void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
 } EVENT_REQUEST;

 typedef struct {
Index: include/msystem.h
===================================================================
--- include/msystem.h   (revision 3798)
+++ include/msystem.h   (working copy)
@@ -454,6 +454,7 @@
    INT event_id;
    INT trigger_mask;
    DWORD event_limit;
+   INT pad; // FIXME 64-bit "double" should be 64-bit aligned
    double byte_limit;
    double tape_capacity;
    char subdir_format[32];
@@ -465,6 +466,7 @@
    double bytes_written;
    double bytes_written_total;
    INT files_written;
+   INT pad; // FIXME pad data structure to be 64-bit aligned
 } CHN_STATISTICS;

 typedef struct {
ladd03:midas$
  398   12 Aug 2007 Konstantin OlchanskiInfoChange of pointer type in mvmestd.h
> I had to change the pointer type of mvme_read and mvme_write to (void *) instead
> to (mvme_locaddr_t *) to avoid warnings under 64-bit linux. Please adjust your
> VME drivers if necessary.

Updated: vmicvme.c (VMIVME-7750/7805) and gefvme.c (GEFANUC V7865)

K.O.
  397   26 Jul 2007 Stefan RittInfoChange of pointer type in mvmestd.h
I had to change the pointer type of mvme_read and mvme_write to (void *) instead
to (mvme_locaddr_t *) to avoid warnings under 64-bit linux. Please adjust your
VME drivers if necessary.

- Stefan
  396   13 Jul 2007 Stefan RittForumMidas on a x86_64 - incompatible with x86_32
> The biggest problem here is that making 32-bit ODB and 64-bit ODB compatible requires breaking one or
> the other (My proposed changes break the 64-bit version. Alternatively, one could add explicit padding
> to these data structures and break the 32-bit ODB).
> 
> I think it is important to make 32-bit and 64-bit code compatible: at TRIUMF we have to use a mixed
> environment because out latest host computers all run 64-bit Linux while all our VME processors and all
> older machines can only run 32-bit code; this incompatibility causes us weekly headaches.
> 
> Any thoughts?

I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.
This ensures to keep the native 64-bit packing, which probably will be somehow optimized for 64-bit
architectures and therefore might be a bit faster in the long run, when most systems are 64-bit. After this
has been implemented and well tested, I would go with an official announcement of the 32-bit break in the ODB,
and release a new version, so people can update from a TAR file if necessary. Existing ODB's can be converted
to the new format by exporting them in XML form and importing them again after the upgrade.
  395   12 Jul 2007 Konstantin OlchanskiForumMidas on a x86_64 - incompatible with x86_32
> We run 64-bit MIDAS on RHEL4 with 64-bit ROOT and everything generally works,
> except for compatibility problems with 32-bit MIDAS.
> 
> The big problem is that 64-bit and 32-bit ODB turned out to be incompatible ...

I have now identified 3 data structures that change size when compiled with "-m64":

EVENT_REQUEST: stores a pointer to a function. Pointer size is 4 bytes with -m32 and 8 bytes with -m64.
This structure is part of an array inside BUFFER_HEADER, resulting in a sizable size mismatch between 32
bit and 64 bit shared memory data buffers.

The fix is simple: the function pointer is not used anywhere. Replace is with a "DWORD unused_filler"
makes -m32 and -m64 data buffers compatible. (But breaks compatibility with previous -m64 compiled midas).

CHN_SETTINGS and CHN_STATISTICS: apparently, -m32 and -m64 GCC has different packing rules and in -m64
mode, 4 bytes of padding are added to these data structures. Size size mismatch appears to be benign,
but will result in "size mismatch" complaints from ODB.

The fix is simple: adding "__attribute__ ((__packed__))" to the definition of the data structure makes
-m64 identical to -m32.

The "svn diff" of changes involved is attached below.

The biggest problem here is that making 32-bit ODB and 64-bit ODB compatible requires breaking one or
the other (My proposed changes break the 64-bit version. Alternatively, one could add explicit padding
to these data structures and break the 32-bit ODB).

I think it is important to make 32-bit and 64-bit code compatible: at TRIUMF we have to use a mixed
environment because out latest host computers all run 64-bit Linux while all our VME processors and all
older machines can only run 32-bit code; this incompatibility causes us weekly headaches.

Any thoughts?

K.O.

(this output of svn diff is doctored for clarity)

ladd00:midas$ svn diff
Index: include/midas.h
===================================================================
--- include/midas.h     (revision 3744)
+++ include/midas.h     (working copy)
-   void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
+   INT unused; // was void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
 } EVENT_REQUEST;
 
--- include/msystem.h   (revision 3744)
+++ include/msystem.h   (working copy)

+#define PACKED __attribute__ ((__packed__))  <--- this goes into midas.h inside the #ifdef "we use GCC"
 
-typedef struct {
+typedef struct PACKED { ... CHN_SETTINGS
 
-typedef struct {
+typedef struct PACKED { ... CHN_STATISTICS
  394   06 Jul 2007 Konstantin OlchanskiBug Fixmscb, musbstd fixed on Linux, MacOS
> I commited a few minor changes to musbstd and mscb code...
>
> The basic functions work with the MSCB USB master, but I still need to
> investigate some cases where the connection hangs and usb communications do not
> work until the USB cable is unplugged and plugged back in. I see this problem
> both on MacOS and Linux.

I think I fixed the hangs we see on linux and macos - at the end all I had to do is
issue a usb reset to make mscb communicate again.

Also tested on Linux FC6 and SL4.5.

K.O.
  393   03 Jul 2007 Ryu SawadaInfoRHEL5/SL5 success!
> P.S. For the record, the compiler produces two sets of warnings:
> - warning: dereferencing type-punned pointer will break strict-aliasing rules
> (I do not understand the meaning of the second warning. type-punned pointer, huh?)

This is because strict aliasing rule is broken in this code.

In ISO C++99 standard, it is illegal to create two pointers of different types referring to the same address.
Even a code breaks the rule, it compiles, but the result is undefined.

For example following code gives different results depends on -O2 is used or not, because -O2 includes -fstrict-aliasing option.
When -fstrict-aliasing is used, compiler can optimize the code assuming the strict aliasing rule.
#include <stdio.h>
  
int main(){
   int ii = 1;
   float* ff = (float*)&ii;
   *ff = 2;
   printf("%d\n", ii);
   return 0;
} 

GCC warns this kind of code with a message like,
warning: dereferencing type-punned pointer will break strict-aliasing rules
The behavior differs also depending on compilers. GCC3 does not warn, while GCC4 warns. (GCC3 is the default on SL4, while GCC4 is
the default on SL5)
And results are different. GCC3 gives 0, while GCC4 gives 1.
#include <stdio.h>

typedef struct xxx {int ii;} XX;
  
int main(){
   XX a;
   a.ii = 1;
   *(short*)&a.ii = 0;
   printf("%d\n", a.ii);
   return 0;
}


More dangerous thing is that compilers do not always warn about it. For example, following code compiles without warnings even
when you use -Wall (including -Wstrict-aliasing). But the result changes depending on compile flags or compiler versions.
#include <stdio.h>
#include <string.h>
#include <malloc.h>
  
int main(){
   int *ii = (int*)malloc(8);
   ii[0] = 1;
   ii[1] = 2;
   float* ff = (float*)ii;
   ff[0] = 3;
   ff[1] = 4;
   printf("%d %d\n", ii[0], ii[1]);
   return 0;
}

A safer way is disabling -fstrict-aliasing compile flag. For example, you may change compile flag for midas like "-O2 -fno-strict-
aliasing".
Disadvantage is that there is a possibility that the speed is decreased.

The best way is modifying code to be in the strict aliasing rule.

Best regards
  392   02 Jul 2007 Stefan RittBug Fixmscb, musbstd fixed on Linux, MacOS

KO wrote:
There supposed to be no changes to the Windows code, but I cannot test on Windows, so if somebody does and finds breakage, please let me know.


I can confirm that revision 3713 still works under Windows.
  391   29 Jun 2007 Konstantin OlchanskiBug Fixmscb, musbstd fixed on Linux, MacOS
I commited a few minor changes to musbstd and mscb code to make them work on
MacOSX (tested on 10.3.9) and Linux (tested on Fedora 6).

The basic functions work with the MSCB USB master, but I still need to
investigate some cases where the connection hangs and usb communications do not
work until the USB cable is unplugged and plugged back in. I see this problem
both on MacOS and Linux.

Important changes:
1) mscb_select_device() does not work on both Linux and MacOS and is disabled.
Please run "msc -d usb0".
2) on Linux, the Makefile should define -DOS_LINUX and -DHAVE_LIBUSB;
   on MacOS, the Makefile should define -DOS_LINUX and -DOS_DARWIN. (This is
because MacOS is treated as a funny type of Linux).
3) when doing USB communications, one has to use the correct endpoint numbers,
which seem to be system dependant and for now, I hard code them in mscb.c for
the tested systems.

There supposed to be no changes to the Windows code, but I cannot test on
Windows, so if somebody does and finds breakage, please let me know.

K.O.
  390   12 Jun 2007 Randolf PohlForumcrash when analyzing multiple runs offline
Hi

> So I guess your solution is not a real solution.

I was not precise enough on what I do. This way the histograms persist in memory, but 
they are also written to every file:

e.g. in module "trig_tdc":

  TDirectory *savedir = gDirectory;  // will restore this afterwards
  gROOT->cd();     // go to file

  // make sure we are in the right "analyzer module folder"
  TDC_Folder = (TFolder *) gROOT->FindObjectAny("trig_tdc");
  gHistoFolderStack->Add((TObject *) TDC_Folder);

  ...(loop over all TDCs, figure out which histos exist, and which need to be booked)

  open_subfolder("raw4208");
  hrTDC = h1_book(....);   // create histo in memory, but it shows up in the file, too.
  close_subfolder(); //raw4208

  // restore gHistoFolderStack (we added a folder when entering routine)
  gHistoFolderStack->Remove(gHistoFolderStack->Last());

  // restore current directory
  savedir->cd();

When deleting histos I do:

     gManaHistosFolder->RecursiveRemove(*pHisto);
    (*pHisto)->Delete();
    (*pHisto) = NULL;  // for my book-keeping of existing histos.

You don't have to clear the histos explicitly between runs. gManaHistosFolder does this 
magic to you.

> But if you feel like something should be modified in mana.c, please send it to me
> and I can incorporate it into the standard code.

No, the code is fine. I just wanted to explain my problem and a solution to it, because 
I thought that somebody might run into the same problem, too. 

Ciao,

Randolf
  389   11 Jun 2007 Stefan RittForumcrash when analyzing multiple runs offline
> I have hunted down "my" segfault problem to the fact that I book histograms not 
> in <module>_init, but in <module>_bor. I have to do so, because only in bor do I 
> know which histograms to book, as this information comes from the ODB (booking 
> only histograms for CAMAC modules which were set to "read" in the ODB). The core 
> dump happens on the first access (->Fill, ->SetName,...) of one of these histos 
> in the 2nd run analyzed offline ("./analyzer -r n m").
> 
> In mana.c:bor (line 1854) is stated that "all ROOT objects created by user module 
> bor() functions go to the output file", and then does a gManaOutputFile->cd();
> Consequently, the histograms vanish after the file is closed, therefore the 
> segfault when trying to access them in the 2nd run. (I keep track of existing 
> histograms, only booking the missing histos in bor.)
> 
> The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with 
> TFolders and booking the histogram.

ROOT has the strange concept of "current working directory", coming from the fact that
ROOT was written by Fortran and PAW people, being used to have directories and
subdirectories with a persistent state (not really object-oriented style). So one can
set the "current working directory" to the root (=memory) with gROOT->cd() and to a
subdirectory which will later be written into a file with gManaOutputFile->cd(). If
you do the first one, the histograms are created only in memory, while in the later
case they are also created in memory, but will later be written into the output file
in the routine CloseRootOutputFile(). So if you do a gROOT->cd() in <module>_bor,
these histograms will not be written to file. So I guess your solution is not a real
solution.

> I do, however, not really understand the intention why histos booked in bor() go 
> to only the file, whereas histos booked in init() go to memory. Could you please 
> comment briefly? Maybe I missed the most important point. And what about online 
> mode, should this work?

The root output file is opened in bor() and closed in eor(). For a histo to go to the
file, it must be booked after opening the file, that is after bor() in mana.c and
therefore after the gManaOutputFile->cd().

I agree with you that the current scheme is not satisfactory. When running online, you
want to keep the histos between the runs. When running offline, you delete and
re-create them for each run. It would be better to create all histos online and
offline under gROOT, and just copy them to gManaOutputFile before writing them. I have
to admit that this root code was never really used in a productive environment for
offline analysis, so there might be some issues here and there. Some people write
directly root files in the logger, and then do a root-only (without the midas
analyzer) analysis. Unfortunately I'm busy these days and cannot write any code right
now. But if you feel like something should be modified in mana.c, please send it to me
and I can incorporate it into the standard code.
  388   11 Jun 2007 Randolf PohlForumcrash when analyzing multiple runs offline
Hello again,

just for the record, in case somebody else runs into the same problem...

I have hunted down "my" segfault problem to the fact that I book histograms not 
in <module>_init, but in <module>_bor. I have to do so, because only in bor do I 
know which histograms to book, as this information comes from the ODB (booking 
only histograms for CAMAC modules which were set to "read" in the ODB). The core 
dump happens on the first access (->Fill, ->SetName,...) of one of these histos 
in the 2nd run analyzed offline ("./analyzer -r n m").

In mana.c:bor (line 1854) is stated that "all ROOT objects created by user module 
bor() functions go to the output file", and then does a gManaOutputFile->cd();
Consequently, the histograms vanish after the file is closed, therefore the 
segfault when trying to access them in the 2nd run. (I keep track of existing 
histograms, only booking the missing histos in bor.)

The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with 
TFolders and booking the histogram.


I do, however, not really understand the intention why histos booked in bor() go 
to only the file, whereas histos booked in init() go to memory. Could you please 
comment briefly? Maybe I missed the most important point. And what about online 
mode, should this work?


Thanks a lot in advance,

Randolf
  387   10 Jun 2007 Stefan RittForumcrash when analyzing multiple runs offline
> tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should 
> presumably not be the case, since CloseRootOutputFile() frees the trees at eor().

Yes this indeed a bug. I applied your change and committed the new code.
  386   09 Jun 2007 Randolf PohlForumcrash when analyzing multiple runs offline
Hello Stefan,

tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should 
presumably not be the case, since CloseRootOutputFile() frees the trees at eor().

------------------- output ---------------------------
lamb@lamb2:~/midas/root_3705> ./analyzer -e 
exa_root -i /tmp/midas/examples/root/run%05d.mid -o /tmp/midas/run%05d.root -r 1 2
Root server listening on port 9090...
Running analyzer offline. Stop with "!"
book_ttree: tree_struct.n_tree = 1
book_ttree: tree_struct.n_tree = 2
Set run number 1 in ODB
Load ODB from run 1...OK
/tmp/midas/examples/root/run00001.mid:2722  /tmp/midas/run00001.root:2720  events, 
0.21s
book_ttree: tree_struct.n_tree = 3     <<---- !!!!
book_ttree: tree_struct.n_tree = 4
Set run number 2 in ODB
Load ODB from run 2...OK
/tmp/midas/examples/root/run00002.mid:2347  /tmp/midas/run00002.root:2345  events, 
0.18s

 *** Break *** segmentation violation
----------------- \output ----------------------------

Adding this one line fixes the segfault problem for the root example expt.

----------------- code -------------------------
lamb@lamb2:/data/software/midas/midas_3705/src/src> svn diff mana.c
Index: mana.c
===================================================================
--- mana.c      (revision 3705)
+++ mana.c      (working copy)
@@ -1496,6 +1496,7 @@
    /* delete event tree */
    free(tree_struct.event_tree);
    tree_struct.event_tree = NULL;
+   tree_struct.n_tree = 0;
 
    // go to ROOT root directory
    gROOT->cd();
---------------- \code ---------------------------

Please check if this gives the intended behaviour. I am not very familiar with the 
midas internals.

Unfortunately my own analyzer's segfault problem is not solved by this patch. I 
guess I have to keep searching for a bug on my side.....  :-)


Cheers,

Randolf
ELOG V3.1.4-2e1708b5