ID |
Date |
Author |
Topic |
Subject |
371
|
09 May 2007 |
Konstantin Olchanski | Forum | Splitting data transfer and control onto different networks | > I'm setting up a system with two networks with the intention of having
> control info (odb, alarm) on the 192.168.0.x
> and the frontend readout on 192.168.1.x
We have some experience with this at TRIUMF: the TWIST experiment runs with its main
data-generating frontends on a private network. It is a supported configuration and it works fine.
We ran into one problem after adding some code to the frontends for stopping the run upon detecting
some data errors - stopping runs requires sending RPC transactions to every midas client, so we had to
add static network routes for routing packets between midas nodes on the private network and midas
nodes on the normal network.
> I'm also trying to separate processes onto different machines, is there
> any way to not have mserver,mhttpd and (mlogger,mevt) all run on the same machine?
mserver runs on the machine with the ODB shared memory by definition (think of it as "nfs server").
mhttpd typically runs on the machine with the ODB shared memory and until recently it had no code for
connecting to the mserver. I recently fixed some of it, and now you can run mhttpd in "history mode"
through the mserver. This is useful for offloading the generation of history plots to another cpu or
another machine. In our case, we run the "history mhttpd" on the machine that holds the history files.
mlogger could be made to run remotely via the mserver, but presently it will refuse to do so, as it has
some code that requires direct access to midas shared memory. If data has to be written to a remote
filesystem, the consensus is that it is more efficient to run mlogger locally and let the OS handle remote
filesystem access (NFS, etc).
All other midas programs should be able to run remotely via the mserver.
K.O. |
375
|
14 May 2007 |
Carl Metelko | Forum | Splitting data transfer and control onto different networks | Hi,
thanks for the advice. We do have dual core Xeons so we'll try running
most things on the server. Unless it proves to be a problem we'll run all
MIDAS signals on one network and NFS etc on the other.
I do have one more query about running systems like Konstantin.
What we would like to do is have a 'mirror' server serving multiple
online monitoring machines so that the load on the server is constant no matter
the demands on the mirror.
Is there a way to set this up? Or would it be best to have a remote analyser
making short (1min) root files shared with the online monitoring? |
382
|
07 Jun 2007 |
Randolf Pohl | Forum | crash when analyzing multiple runs offline | Hello,
I am having a problem with the root-based analyzer. It crashes when I try to
analyze multiple runs OFFLINE using the "-i run%05d.mid -o result%05d.root -r
1 2" feature.
I can reproduce the problem with the example experiment which comes with the
MIDAS distribution:
Running the analyzer ONLINE works fine: One can start and stop runs one after
the other, roody shows the histograms being reset and then filled again and
such.
But OFFLINE, the analyzer crashes when trying to analyze the SECOND run in a
sequence. So
./analyzer -i run%05d.mid -o result%05d.root -r 1 1 works (only run 1)
./analyzer -i run%05d.mid -o result%05d.root -r 1 3 dies on run 2
Output attached (I added printf's to the "init"-modules, but that's irrelevant
here)
My own analyzer shows the same effect. There I got the impression the segfault
happens on the first attempt to Fill/Reset/SetName etc. a histogram in the 2nd
run. But with the midas example it looks like the analyzer finishes filling
histos even for run 2, but then dies in eor.
Can you reproduce the problem?
I run MIDAS on an Intel Quadcore, 64 bit SuSE Linux 10.2.
pohl@lamb2:~/midas/examples/root> gcc --version
gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)
(maybe 4.1.2 "PRERELEASE" is the problem? See message ID 344)
I am using midas rev. 3674 (April 19, 2007), but I got the impression there
has since not been a change relevant to this problem. Please correct me if I
am wrong, then I would try it with Rev HEAD.
(My version includes already the fix to the x86_64 segfault problem of message
ID 337)
Best regards,
Randolf |
Attachment 1: crash.out
|
pohl@lamb:~/midas/examples/root> ./analyzer -e exa_root -i run%05d.mid -o /tmp/pohl/test%05d.root -r 1 3
analyzer_init
Root server listening on port 9090...
adc_calib_init
adc_summing_init
scaler_init
Running analyzer offline. Stop with "!"
Set run number 1 in ODB
Load ODB from run 1...
analyzer_init
OK
run00001.mid:777 /tmp/pohl/test00001.root:775 events, 0.00s
Set run number 2 in ODB
Load ODB from run 2...
analyzer_init
OK
run00002.mid:7227 /tmp/pohl/test00002.root:7225 events, 0.04s
*** Break *** segmentation violation
Using host libthread_db library "/lib64/libthread_db.so.1".
Attaching to program: /proc/10558/exe, process 10558
[Thread debugging using libthread_db enabled]
[New Thread 47688414288800 (LWP 10558)]
[New Thread 1082132800 (LWP 10559)]
0x00002b5f52afae1f in waitpid () from /lib64/libc.so.6
Thread 2 (Thread 1082132800 (LWP 10559)):
#0 0x00002b5f5234d0ab in __accept_nocancel () from /lib64/libpthread.so.0
#1 0x00002b5f4e4cc510 in TUnixSystem::AcceptConnection ()
from /usr/local/root/lib/root/libCore.so.5.14
#2 0x00002b5f4e4c1592 in TServerSocket::Accept () from /usr/local/root/lib/root/libCore.so.5.14
#3 0x000000000041075f in root_socket_server (arg=<value optimized out>) at src/mana.c:5453
#4 0x00002b5f5188428a in TThread::Function () from /usr/local/root/lib/root/libThread.so.5.14
#5 0x00002b5f5234609e in start_thread () from /lib64/libpthread.so.0
#6 0x00002b5f52b294cd in clone () from /lib64/libc.so.6
#7 0x0000000000000000 in ?? ()
Thread 1 (Thread 47688414288800 (LWP 10558)):
#0 0x00002b5f52afae1f in waitpid () from /lib64/libc.so.6
#1 0x00002b5f52aa3491 in do_system () from /lib64/libc.so.6
#2 0x00002b5f52aa3817 in system () from /lib64/libc.so.6
#3 0x00002b5f4e4d0851 in TUnixSystem::StackTrace ()
from /usr/local/root/lib/root/libCore.so.5.14
#4 0x00002b5f4e4cfa4a in TUnixSystem::DispatchSignals ()
from /usr/local/root/lib/root/libCore.so.5.14
#5 <signal handler called>
#6 0x00002b5f52ad5ee5 in free () from /lib64/libc.so.6
#7 0x000000000040c89b in CloseRootOutputFile () at src/mana.c:1489
#8 0x0000000000410b45 in eor (run_number=<value optimized out>, error=<value optimized out>)
at src/mana.c:1981
#9 0x0000000000412d9b in analyze_run (run_number=2,
input_file_name=0x7fff5cafd020 "run00002.mid", output_file_name=<value optimized out>)
at src/mana.c:4471
#10 0x00000000004130b4 in loop_runs_offline () at src/mana.c:4518
#11 0x0000000000413e05 in main (argc=<value optimized out>, argv=<value optimized out>)
at src/mana.c:5757
#0 0x00002b5f52afae1f in waitpid () from /lib64/libc.so.6
[midas.c:1592:] cm_disconnect_experiment not called at end of program
|
385
|
08 Jun 2007 |
Stefan Ritt | Forum | crash when analyzing multiple runs offline | Unfortunately I don't have time right now to debug the problem, but I could see
roughly what it could be. The analyzer crashes inside CloseRootOutputFile:
#5 <signal handler called>
#6 0x00002b5f52ad5ee5 in free () from /lib64/libc.so.6
#7 0x000000000040c89b in CloseRootOutputFile () at src/mana.c:1489
in the line
free(tree_struct.event_tree[i].branch);
If a "free" crashes, it might indicate that the memory beyond the allocated space
got corrupted. The branch gets allocated in book_ttree(), once for each
analyze_request[i]. The branch gets filled in write_event_ttree():
/* fill tree both online and offline */
if (!exclude_all)
et->tree->Fill();
Maybe one should put printf debugging statements in these places to see what's
going on. |
386
|
09 Jun 2007 |
Randolf Pohl | Forum | crash when analyzing multiple runs offline | Hello Stefan,
tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should
presumably not be the case, since CloseRootOutputFile() frees the trees at eor().
------------------- output ---------------------------
lamb@lamb2:~/midas/root_3705> ./analyzer -e
exa_root -i /tmp/midas/examples/root/run%05d.mid -o /tmp/midas/run%05d.root -r 1 2
Root server listening on port 9090...
Running analyzer offline. Stop with "!"
book_ttree: tree_struct.n_tree = 1
book_ttree: tree_struct.n_tree = 2
Set run number 1 in ODB
Load ODB from run 1...OK
/tmp/midas/examples/root/run00001.mid:2722 /tmp/midas/run00001.root:2720 events,
0.21s
book_ttree: tree_struct.n_tree = 3 <<---- !!!!
book_ttree: tree_struct.n_tree = 4
Set run number 2 in ODB
Load ODB from run 2...OK
/tmp/midas/examples/root/run00002.mid:2347 /tmp/midas/run00002.root:2345 events,
0.18s
*** Break *** segmentation violation
----------------- \output ----------------------------
Adding this one line fixes the segfault problem for the root example expt.
----------------- code -------------------------
lamb@lamb2:/data/software/midas/midas_3705/src/src> svn diff mana.c
Index: mana.c
===================================================================
--- mana.c (revision 3705)
+++ mana.c (working copy)
@@ -1496,6 +1496,7 @@
/* delete event tree */
free(tree_struct.event_tree);
tree_struct.event_tree = NULL;
+ tree_struct.n_tree = 0;
// go to ROOT root directory
gROOT->cd();
---------------- \code ---------------------------
Please check if this gives the intended behaviour. I am not very familiar with the
midas internals.
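For anyone wondering why the stale counter matters, here is a minimal sketch of the pattern. It is a simplified stand-in written for illustration, not the actual mana.c code: the names TreeTable, EventTree, book_tree, close_trees and trees_in_last_run are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical, simplified stand-ins for the tree bookkeeping in mana.c.
struct EventTree {
    int *branch;                 // per-tree buffer, allocated at booking time
};

struct TreeTable {
    int n_tree = 0;              // number of valid entries in event_tree
    EventTree *event_tree = nullptr;
};

// Append one entry. Only slot n_tree is initialised here.
inline void book_tree(TreeTable &t) {
    void *p = std::realloc(t.event_tree, sizeof(EventTree) * (t.n_tree + 1));
    if (p == nullptr) std::abort();          // out of memory: give up in this sketch
    t.event_tree = static_cast<EventTree *>(p);
    t.event_tree[t.n_tree].branch = static_cast<int *>(std::malloc(sizeof(int)));
    t.n_tree++;
}

// Free everything at end of run.
inline void close_trees(TreeTable &t) {
    for (int i = 0; i < t.n_tree; i++)
        std::free(t.event_tree[i].branch);
    std::free(t.event_tree);
    t.event_tree = nullptr;
    t.n_tree = 0;   // the one-line fix: if the counter stays at 2, the next run
                    // books into slots 2 and 3, and this loop later calls
                    // free() on the uninitialised pointers in slots 0 and 1
}

// Book two trees per run for n_runs runs; return n_tree seen in the last run.
inline int trees_in_last_run(int n_runs) {
    TreeTable t;
    int seen = 0;
    for (int run = 0; run < n_runs; run++) {
        book_tree(t);
        book_tree(t);
        seen = t.n_tree;
        close_trees(t);
    }
    assert(t.event_tree == nullptr);
    return seen;
}
```

With the reset in place, trees_in_last_run(3) reports 2 trees for every run, which matches the intended "book_ttree: tree_struct.n_tree = 1, 2" output at the start of each run.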
Unfortunately my own analyzer's segfault problem is not solved by this patch. I
guess I have to keep searching for a bug on my side..... :-)
Cheers,
Randolf |
387
|
10 Jun 2007 |
Stefan Ritt | Forum | crash when analyzing multiple runs offline | > tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should
> presumably not be the case, since CloseRootOutputFile() frees the trees at eor().
Yes, this is indeed a bug. I applied your change and committed the new code. |
388
|
11 Jun 2007 |
Randolf Pohl | Forum | crash when analyzing multiple runs offline | Hello again,
just for the record, in case somebody else runs into the same problem...
I have hunted down "my" segfault problem to the fact that I book histograms not
in <module>_init, but in <module>_bor. I have to do so, because only in bor do I
know which histograms to book, as this information comes from the ODB (booking
only histograms for CAMAC modules which were set to "read" in the ODB). The core
dump happens on the first access (->Fill, ->SetName,...) of one of these histos
in the 2nd run analyzed offline ("./analyzer -r n m").
mana.c:bor (line 1854) states that "all ROOT objects created by user module
bor() functions go to the output file", and then calls gManaOutputFile->cd();
Consequently, the histograms vanish after the file is closed, therefore the
segfault when trying to access them in the 2nd run. (I keep track of existing
histograms, only booking the missing histos in bor.)
The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with
TFolders and booking the histogram.
I do, however, not really understand the intention why histos booked in bor() go
to only the file, whereas histos booked in init() go to memory. Could you please
comment briefly? Maybe I missed the most important point. And what about online
mode, should this work?
Thanks a lot in advance,
Randolf |
389
|
11 Jun 2007 |
Stefan Ritt | Forum | crash when analyzing multiple runs offline | > I have hunted down "my" segfault problem to the fact that I book histograms not
> in <module>_init, but in <module>_bor. I have to do so, because only in bor do I
> know which histograms to book, as this information comes from the ODB (booking
> only histograms for CAMAC modules which were set to "read" in the ODB). The core
> dump happens on the first access (->Fill, ->SetName,...) of one of these histos
> in the 2nd run analyzed offline ("./analyzer -r n m").
>
> mana.c:bor (line 1854) states that "all ROOT objects created by user module
> bor() functions go to the output file", and then calls gManaOutputFile->cd();
> Consequently, the histograms vanish after the file is closed, therefore the
> segfault when trying to access them in the 2nd run. (I keep track of existing
> histograms, only booking the missing histos in bor.)
>
> The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with
> TFolders and booking the histogram.
ROOT has the strange concept of a "current working directory", which comes from the fact that
ROOT was written by Fortran and PAW people, who were used to directories and
subdirectories with a persistent state (not really object-oriented style). So one can
set the "current working directory" to the root (=memory) with gROOT->cd(), or to a
subdirectory which will later be written into a file with gManaOutputFile->cd(). In
the first case the histograms are created only in memory; in the latter
case they are also created in memory, but will later be written into the output file
in the routine CloseRootOutputFile(). So if you do a gROOT->cd() in <module>_bor,
these histograms will not be written to file. So I guess your solution is not a real
solution.
> I do, however, not really understand the intention why histos booked in bor() go
> to only the file, whereas histos booked in init() go to memory. Could you please
> comment briefly? Maybe I missed the most important point. And what about online
> mode, should this work?
The root output file is opened in bor() and closed in eor(). For a histo to go to the
file, it must be booked after opening the file, that is after bor() in mana.c and
therefore after the gManaOutputFile->cd().
I agree with you that the current scheme is not satisfactory. When running online, you
want to keep the histos between the runs. When running offline, you delete and
re-create them for each run. It would be better to create all histos online and
offline under gROOT, and just copy them to gManaOutputFile before writing them. I have
to admit that this root code was never really used in a production environment for
offline analysis, so there might be some issues here and there. Some people write
root files directly in the logger, and then do a root-only (without the midas
analyzer) analysis. Unfortunately I'm busy these days and cannot write any code right
now. But if you feel like something should be modified in mana.c, please send it to me
and I can incorporate it into the standard code. |
390
|
12 Jun 2007 |
Randolf Pohl | Forum | crash when analyzing multiple runs offline | Hi
> So I guess your solution is not a real solution.
I was not precise enough on what I do. This way the histograms persist in memory, but
they are also written to every file:
e.g. in module "trig_tdc":
TDirectory *savedir = gDirectory; // will restore this afterwards
gROOT->cd(); // go to ROOT's in-memory top directory (not the output file)
// make sure we are in the right "analyzer module folder"
TDC_Folder = (TFolder *) gROOT->FindObjectAny("trig_tdc");
gHistoFolderStack->Add((TObject *) TDC_Folder);
...(loop over all TDCs, figure out which histos exist, and which need to be booked)
open_subfolder("raw4208");
hrTDC = h1_book(....); // create histo in memory, but it shows up in the file, too.
close_subfolder(); //raw4208
// restore gHistoFolderStack (we added a folder when entering routine)
gHistoFolderStack->Remove(gHistoFolderStack->Last());
// restore current directory
savedir->cd();
When deleting histos I do:
gManaHistosFolder->RecursiveRemove(*pHisto);
(*pHisto)->Delete();
(*pHisto) = NULL; // for my book-keeping of existing histos.
You don't have to clear the histos explicitly between runs. gManaHistosFolder does this
magic for you.
> But if you feel like something should be modified in mana.c, please send it to me
> and I can incorporate it into the standard code.
No, the code is fine. I just wanted to explain my problem and a solution to it, because
I thought that somebody might run into the same problem, too.
Ciao,
Randolf |
395
|
12 Jul 2007 |
Konstantin Olchanski | Forum | Midas on a x86_64 - incompatible with x86_32 | > We run 64-bit MIDAS on RHEL4 with 64-bit ROOT and everything generally works,
> except for compatibility problems with 32-bit MIDAS.
>
> The big problem is that 64-bit and 32-bit ODB turned out to be incompatible ...
I have now identified 3 data structures that change size when compiled with "-m64":
EVENT_REQUEST: stores a pointer to a function. Pointer size is 4 bytes with -m32 and 8 bytes with -m64.
This structure is part of an array inside BUFFER_HEADER, resulting in a sizable size mismatch between 32
bit and 64 bit shared memory data buffers.
The fix is simple: the function pointer is not used anywhere. Replacing it with a "DWORD unused_filler"
makes -m32 and -m64 data buffers compatible. (But breaks compatibility with previous -m64 compiled midas).
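A minimal sketch of the effect, assuming a cut-down request structure: only the three fields visible in the diff below are kept, so RequestWithPointer and RequestWithFiller are hypothetical names and not the real EVENT_REQUEST layout.

```cpp
#include <cstdint>

// Cut-down, hypothetical version of a request record with a function pointer.
struct RequestWithPointer {
    std::int16_t event_id;
    std::int16_t trigger_mask;
    std::int32_t sampling_type;
    void (*dispatch)(void *);     // 4 bytes with -m32, 8 bytes with -m64
};

// Same record with the pointer replaced by a fixed-width filler.
struct RequestWithFiller {
    std::int16_t event_id;
    std::int16_t trigger_mask;
    std::int32_t sampling_type;
    std::uint32_t unused_filler;  // always 4 bytes
};

// The filler version has one size on both targets.
static_assert(sizeof(RequestWithFiller) == 12, "fixed-width members only");

// The pointer version follows the pointer size of the target.
static_assert(sizeof(RequestWithPointer) == 8 + sizeof(void (*)(void *)),
              "size depends on pointer width");
```

Compiling this once with -m32 and once with -m64 shows the pointer version changing size while the filler version stays put. Since the record sits in an array inside the shared-memory buffer header, every element after the first ends up at a different offset.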
CHN_SETTINGS and CHN_STATISTICS: apparently, -m32 and -m64 GCC have different packing rules and in -m64
mode, 4 bytes of padding are added to these data structures. This size mismatch appears to be benign,
but will result in "size mismatch" complaints from ODB.
The fix is simple: adding "__attribute__ ((__packed__))" to the definition of the data structure makes
-m64 identical to -m32.
The "svn diff" of changes involved is attached below.
The biggest problem here is that making 32-bit ODB and 64-bit ODB compatible requires breaking one or
the other (My proposed changes break the 64-bit version. Alternatively, one could add explicit padding
to these data structures and break the 32-bit ODB).
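To make the two options concrete, here is a rough sketch using a made-up record of two doubles and one 32-bit integer. StatsNatural, StatsPacked and StatsPadded are hypothetical names, not the real CHN_STATISTICS, and the packed attribute is the GCC/Clang spelling.

```cpp
#include <cstdint>

// Natural layout: the compiler decides about tail padding.
// GCC gives 24 bytes with -m64 and 20 bytes with -m32 on x86.
struct StatsNatural {
    double bytes_written;
    double bytes_written_total;
    std::int32_t files_written;
};

// Option 1: packed. No padding anywhere, so -m64 matches -m32.
struct __attribute__((__packed__)) StatsPacked {
    double bytes_written;
    double bytes_written_total;
    std::int32_t files_written;
};

// Option 2: explicit padding. The struct is a multiple of 8 bytes on
// both targets, so -m32 matches the native -m64 layout.
struct StatsPadded {
    double bytes_written;
    double bytes_written_total;
    std::int32_t files_written;
    std::int32_t pad;
};

static_assert(sizeof(StatsPacked) == 20, "packed: 8 + 8 + 4");
static_assert(sizeof(StatsPadded) == 24, "padded: 8 + 8 + 4 + 4");
```

One trade-off to keep in mind: packed structures can place members at unaligned addresses, which is slower on some CPUs and not allowed on others, whereas explicit padding keeps the natural alignment.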
I think it is important to make 32-bit and 64-bit code compatible: at TRIUMF we have to use a mixed
environment because our latest host computers all run 64-bit Linux while all our VME processors and all
older machines can only run 32-bit code; this incompatibility causes us weekly headaches.
Any thoughts?
K.O.
(this output of svn diff is doctored for clarity)
ladd00:midas$ svn diff
Index: include/midas.h
===================================================================
--- include/midas.h (revision 3744)
+++ include/midas.h (working copy)
- void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
+ INT unused; // was void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
} EVENT_REQUEST;
--- include/msystem.h (revision 3744)
+++ include/msystem.h (working copy)
+#define PACKED __attribute__ ((__packed__)) <--- this goes into midas.h inside the #ifdef "we use GCC"
-typedef struct {
+typedef struct PACKED { ... CHN_SETTINGS
-typedef struct {
+typedef struct PACKED { ... CHN_STATISTICS |
396
|
13 Jul 2007 |
Stefan Ritt | Forum | Midas on a x86_64 - incompatible with x86_32 | > The biggest problem here is that making 32-bit ODB and 64-bit ODB compatible requires breaking one or
> the other (My proposed changes break the 64-bit version. Alternatively, one could add explicit padding
> to these data structures and break the 32-bit ODB).
>
> I think it is important to make 32-bit and 64-bit code compatible: at TRIUMF we have to use a mixed
> environment because our latest host computers all run 64-bit Linux while all our VME processors and all
> older machines can only run 32-bit code; this incompatibility causes us weekly headaches.
>
> Any thoughts?
I agree that 32-bit and 64-bit should be made compatible. In the long run, everything will be 64-bit, so I would suggest
breaking the 32-bit ODB and adding padding where needed, probably with some conditional compilation.
This keeps the native 64-bit packing, which is probably somewhat optimized for 64-bit
architectures and therefore might be a bit faster in the long run, when most systems are 64-bit. After this
has been implemented and well tested, I would make an official announcement of the 32-bit break in the ODB
and release a new version, so people can update from a TAR file if necessary. Existing ODBs can be converted
to the new format by exporting them in XML form and importing them again after the upgrade. |
399
|
12 Aug 2007 |
Konstantin Olchanski | Forum | Midas on a x86_64 - incompatible with x86_32 | > I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
> in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.
I now have the patches to implement this. Changes turned out to be minimal:
1) midas.h: remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
2) msystem.h: add 32-bit padding to CHN_STATISTICS and CHN_SETTINGS
(Pedantic note: the C/C++ languages permit compilers to arbitrarily pad data members inside structures and one is
not supposed to rely on the specific layout of "struct"s, which could change from day to day depending on
compiler vendor, version, 32/64 bit, optimization level, etc. This is quite silly, but I guess it was the only way
"they" could agree on a standard)
In practice, compilers are well behaved and one can follow simple rules and stay out of trouble.
1) if all data members are of the same size -> no padding
2) do not use "double" (64-bit) and "short" (16-bit), make all char[] arrays divisible by 4 -> size of everything
is 32-bit, see rule 1
3) if you have to use "short", they have to come in pairs to keep everything else aligned to 32-bit
4) if you have to use "double" (or uint64_t), keep them aligned to 64-bit, i.e. struct { int a,b,c; double x;} is
*bad* (4-byte padding may be added between c and x). struct { int a,b,c,d; double x; } is good.
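Rule 4 can be checked directly with offsetof. The following is a small sketch, assuming 4-byte int and 8-byte double; Bad, Good and the two helper functions are names made up for this example.

```cpp
#include <cstddef>

struct Bad  { int a, b, c;    double x; };   // x may need padding in front of it
struct Good { int a, b, c, d; double x; };   // x already sits on an 8-byte boundary

// Bytes of padding the compiler inserted in front of member x.
constexpr std::size_t pad_before_x_bad() {
    return offsetof(Bad, x) - 3 * sizeof(int);
}
constexpr std::size_t pad_before_x_good() {
    return offsetof(Good, x) - 4 * sizeof(int);
}

// Good has the same layout on 32-bit and 64-bit targets.
static_assert(pad_before_x_good() == 0, "no padding before x");
static_assert(sizeof(Good) == 4 * sizeof(int) + sizeof(double), "no tail padding");

// Bad depends on the target: 0 bytes where double is 4-byte aligned
// inside structs (GCC -m32 on x86), 4 bytes where it is 8-byte aligned.
static_assert(pad_before_x_bad() == 0 || pad_before_x_bad() == 4,
              "padding is target dependent");
```

Checks of this kind cost nothing at run time, so they can sit next to the struct definitions and fail the build as soon as a layout drifts.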
Below is "svn diff include/midas.h include/msystem.h". These changes have been tested on SL4 32-bit and
64-bit, SL5 32/64, F7 32/64 and SL4/ICC (Intel compiler) 32 bit and 64 bit.
The testing was done by adding checks on sizes of all struct's kept on ODB, i.e.
assert(sizeof(CHN_SETTINGS ) == 640); // ODB v3 with padding
assert(sizeof(CHN_STATISTICS ) == 32); // ODB v3 with padding
... etc ...
K.O.
ladd03:midas$ svn diff include/midas.h include/msystem.h
Index: include/midas.h
===================================================================
--- include/midas.h (revision 3798)
+++ include/midas.h (working copy)
@@ -38,7 +38,7 @@
* @{ */
/* has to be changed whenever binary ODB format changes */
-#define DATABASE_VERSION 2
+#define DATABASE_VERSION 3
/* MIDAS version number which will be incremented for every release */
#define MIDAS_VERSION "2.0.0"
@@ -810,8 +810,6 @@
short int event_id; /**< event ID */
short int trigger_mask; /**< trigger mask */
INT sampling_type; /**< GET_ALL, GET_SOME, GET_FARM */
- /**< dispatch function */
- void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
} EVENT_REQUEST;
typedef struct {
Index: include/msystem.h
===================================================================
--- include/msystem.h (revision 3798)
+++ include/msystem.h (working copy)
@@ -454,6 +454,7 @@
INT event_id;
INT trigger_mask;
DWORD event_limit;
+ INT pad; // FIXME 64-bit "double" should be 64-bit aligned
double byte_limit;
double tape_capacity;
char subdir_format[32];
@@ -465,6 +466,7 @@
double bytes_written;
double bytes_written_total;
INT files_written;
+ INT pad; // FIXME pad data structure to be 64-bit aligned
} CHN_STATISTICS;
typedef struct {
ladd03:midas$ |
400
|
20 Aug 2007 |
Konstantin Olchanski | Forum | Midas on a x86_64 - incompatible with x86_32 | > > I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
> > in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.
>
> I now have the patches to implement this. Changes turned out to be minimal:
>
> 1) midas.h: remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
> 2) msystem.h: add 32-bit padding to CHN_STATISTICS and CHN_SETTINGS
The padding of CHN_STATISTICS and CHN_SETTINGS is not working right - somehow mhttpd and mlogger keep recreating the
data in ODB and erasing the padding fields. I am looking into this.
K.O. |
403
|
29 Aug 2007 |
Konstantin Olchanski | Forum | ODBv3, second try - Midas on a x86_64 - incompatible with x86_32 | > > > I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
> > > in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.
> > 1) midas.h: remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
> > 2) msystem.h: add 32-bit padding to CHN_STATISTICS and CHN_SETTINGS
I am now trying a different solution to the issue of CHN_STATISTICS and CHN_SETTINGS changing size.
1) midas.h: (same as before) remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
2) msystem.h: in CHN_STATISTICS and CHN_SETTINGS change type of "event_limit" and "files_written" from int to "double".
Below are the latest ODBv3 meta patches:
ladd03:midas$ svn diff
Index: include/midas.h
===================================================================
--- include/midas.h (revision 3844)
+++ include/midas.h (working copy)
/* has to be changed whenever binary ODB format changes */
-#define DATABASE_VERSION 2
+#define DATABASE_VERSION 3
.........
short int trigger_mask; /**< trigger mask */
INT sampling_type; /**< GET_ALL, GET_SOME, GET_FARM */
- /**< dispatch function */
- void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
} EVENT_REQUEST;
Index: include/msystem.h
===================================================================
--- include/msystem.h (revision 3845)
+++ include/msystem.h (working copy)
-"Event limit = DWORD : 0",\
+"Event limit = DOUBLE : 0",\
..................
-"Files written = INT : 0",\
+"Files written = DOUBLE : 0",\
..................
- DWORD event_limit;
+ double event_limit;
..................
- INT files_written;
+ double files_written;
K.O. |
412
|
17 Oct 2007 |
Randolf Pohl | Forum | Adding MIDAS .root-files | Dear MIDAS users,
I want to add several .root-files produced by the MIDAS analyzer, in a fast
and convenient way. ROOT's hadd fails because it does not know how to treat
TFolders. I guess this problem is not unique to me, so I hope that somebody of
you might already have found a solution.
Why don't I just run "analyzer -r 1 10000"?
We have taken lots of runs under (rapidly) varying conditions, so it would be
lots of "-r". And the analysis is quite involved, so rerunning all data takes
about one hour on a fast PC making this quite painful.
Therefore, I would like to rerun all data only once, and then add the result
files depending on different criteria.
Of course, I tried to write a script that does the adding. But somehow it is
incredibly slow. And I am not the Master Of C++, either.
Is there any deeper reason for MIDAS using TFolders, not TDirectorys? ROOT's
hadd can treat TDirectory. Can I simply patch "my" MIDAS? Is there general
interest in a change like this? (Does anyone have experience with the speed of
hadd?)
Looking forward to comments from the Forum.
Cheers,
Randolf |
413
|
17 Oct 2007 |
Randolf Pohl | Forum | Multi-core CPUs | Dear Forum,
I have this beautiful Intel Quadcore with fast disks, but MIDAS does obviously
only make use of one CPU at a time. Has anybody of you already done some work
on making MIDAS parallel? Event-based data analysis should be the best
candidate for this.
Has anybody done this with PVM? There is some PVM-related stuff in the MIDAS
sources, but I got the impression this works only with HBOOK, not with ROOT.
Or am I wrong?
But then PVM is probably also not the most efficient thing on ONE machine
with multiple CPUs, right? And finally, with PVM we're back to
adding .root-files efficiently (see my previous post).
Any thoughts?
Cheers,
Randolf |
414
|
17 Oct 2007 |
Stefan Ritt | Forum | Multi-core CPUs | > I have this beautiful Intel Quadcore with fast disks, but MIDAS does obviously
> only make use of one CPU at a time. Has anybody of you already done some work
> on making MIDAS parallel? Event-based data analysis should be the best
> candidate for this.
There are ring buffer routines rb_xxx for distributed event analysis, but this is
currently only implemented in the front-end framework. These routines are pretty
simple, and their integration into the analyzer should not be very difficult.
Unfortunately I don't have time for that right now. We do our analysis such that we
analyze four different runs in parallel on a quadcore machine.
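A rough sketch of how a run range could be divided among several analyzer processes, one per core. The helper names split_runs and analyzer_command are made up for this example; the command line is the one quoted earlier in this forum ("-i run%05d.mid -o result%05d.root -r first last"), and actually launching the processes is left to a shell or job scheduler.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Split the inclusive run range [first, last] into at most n_jobs
// contiguous chunks whose lengths differ by at most one run.
std::vector<std::pair<int, int>> split_runs(int first, int last, int n_jobs) {
    std::vector<std::pair<int, int>> chunks;
    if (n_jobs <= 0 || last < first) return chunks;
    const int total = last - first + 1;
    const int base = total / n_jobs;     // minimum runs per job
    const int extra = total % n_jobs;    // the first 'extra' jobs get one more
    int start = first;
    for (int j = 0; j < n_jobs; j++) {
        const int len = base + (j < extra ? 1 : 0);
        if (len == 0) break;             // fewer runs than jobs
        chunks.emplace_back(start, start + len - 1);
        start += len;
    }
    return chunks;
}

// Build the offline analyzer command line for one chunk of runs.
std::string analyzer_command(const std::pair<int, int> &chunk) {
    return "./analyzer -i run%05d.mid -o result%05d.root -r "
           + std::to_string(chunk.first) + " " + std::to_string(chunk.second);
}
```

For runs 1 to 10 on four cores, split_runs(1, 10, 4) gives the chunks 1-3, 4-6, 7-8 and 9-10, and analyzer_command turns each chunk into one command line. The per-run output files still have to be combined afterwards.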
- Stefan |
415
|
17 Oct 2007 |
John M O'Donnell | Forum | Adding MIDAS .root-files | The following program handles regular directories in a file, or folders (ugh).
Most histograms are added bin by bin.
For scaler events it is convenient to see the counts as a function of time (a la
scaler history plots in mhttpd). If the histogram looks like a scaler plot versus
time, then new bins are added on to the end (or into the middle!) of the histogram.
All different versions of cuts are kept.
TTrees are not explicitly supported, so probably don't do the right thing...
John.
> Dear MIDAS users,
>
> I want to add several .root-files produced by the MIDAS analyzer, in a fast
> and convenient way. ROOT's hadd fails because it does not know how to treat
> TFolders. I guess this problem is not unique to me, so I hope that somebody of
> you might already have found a solution.
>
> Why don't I just run "analyzer -r 1 10000"?
> We have taken lots of runs under (rapidly) varying conditions, so it would be
> lots of "-r". And the analysis is quite involved, so rerunning all data takes
> about one hour on a fast PC making this quite painful.
> Therefore, I would like to rerun all data only once, and then add the result
> files depending on different criteria.
>
> Of course, I tried to write a script that does the adding. But somehow it is
> incredibly slow. And I am not the Master Of C++, too.
>
> Is there any deeper reason for MIDAS using TFolders, not TDirectorys? ROOT's
> hadd can treat TDirectory. Can I simply patch "my" MIDAS? Is there general
> interest in a change like this? (Does anyone have experience with the speed of
> hadd?)
>
> Looking forward to comments from the Forum.
>
> Cheers,
>
> Randolf |
Attachment 1: histoAdd.cxx
|
#include <iostream>
#include <vector>
#include <iterator>
#include <cstring>
using namespace std;
#include "TROOT.h"
#include "TFile.h"
#include "TString.h"
#include "TDirectory.h"
#include "TObject.h"
#include "TClass.h"
#include "TKey.h"
#include "TH1.h"
#include "TAxis.h"
#include "TMath.h"
#include "TCutG.h"
#include "TFolder.h"
bool verbose (false);
TFolder *histosFolder (0);
//==============================================================================
void addObject (TObject *o, const TString &prefix, TFolder *opFolder=0);
/** called for each file to do the addition.
* loops over each object in the file, and
* uses addObject to dispatch the actual addition.
*/
void add (TFile *file, const TString &prefix) {
//------------------------------------------------------------------------------
TString dirName (file->GetName());
if (verbose) cout << " scanning TFile" << endl;
TString newPrefix (prefix + dirName + "/");
TIter i (file->GetListOfKeys());
while (TKey *k = static_cast<TKey *>( i())) {
TObject *o (file->Get( k->GetName()));
addObject( o, newPrefix);
}
return;
}
//==============================================================================
/** Most histograms are added bin by bin, but if simpleAdd == false,
* the xaxis values are assumed different, in which case we look up
* appropriate bin numbers, and use a new extended xaxis if needed.
*
* Use simpleAdd=false to accumulate scaler rate histograms.
*/
void add (const TH1 *newh, TH1 *&hsum, TFolder *opFolder) {
//------------------------------------------------------------------------------
const bool simpleAdd (newh->GetXaxis()->GetTimeDisplay() ? false : true);
if (!hsum) {
hsum = (TH1 *)newh->Clone();
TString title = "histoAdd: ";
title += hsum->GetTitle();
hsum->SetTitle( title); // apply the prefixed title, which was otherwise built but never used
if (opFolder) hsum->SetDirectory( 0);
}
else if (simpleAdd) hsum->Add( newh);
else { // extend axis - for 1D histos with equal sized bins
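// Int_t matches the return type of GetNbinsX() and avoids signed/unsigned comparisons below
const Int_t nBinsSum = hsum->GetNbinsX();
const Int_t nBinsNew = newh->GetNbinsX();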
vector<Double_t>bin_contents;
vector<Double_t>histo_edges;
Int_t holder_bins;
/* foundBin is either the overflow bin (if the 2 histograms don't overlap)
* or it is the bin number in histoA that has the same low edge as the
* lowest bin edge in histoB. The only time that histograms can overlap
* is when the older scaler.cxx is used to create the histograms with
* fixed bin sizes.
*/
Int_t foundBin = hsum->FindBin( newh->GetBinLowEdge(1) );
histo_edges.resize(foundBin);
bin_contents.resize(foundBin);
for( int i = 1; i <= foundBin; i++ ) {
histo_edges[i-1] = hsum->GetBinLowEdge(i);
bin_contents[i-1] = hsum->GetBinContent(i);
}
if(foundBin < nBinsSum) {
//the histos overlap or we have already made holder bins
holder_bins = 0;
}
else {
//create a "place holder" histo
Int_t width = 10;
holder_bins = (int)((newh->GetXaxis()->GetXmin()
- hsum->GetXaxis()->GetXmax())/width);
if( holder_bins < width ) holder_bins = width;
TH1F *bin_holder = new TH1F("bin_holder", "bin_holder", holder_bins,
hsum->GetXaxis()->GetXmax(),
newh->GetXaxis()->GetXmin() );
histo_edges.resize( foundBin + holder_bins );
bin_contents.resize( foundBin + holder_bins );
for( int i = 0; i < holder_bins; i++ ) {
histo_edges[foundBin+i] = bin_holder->GetBinLowEdge(i+2);
bin_contents[foundBin+i] = 0;
}
delete bin_holder;
} //end else
histo_edges.resize( foundBin + holder_bins + nBinsNew+1 );
bin_contents.resize( foundBin + holder_bins + nBinsNew+1 );
for( int i = 0; i <= nBinsNew; i++ ) {
histo_edges[i+foundBin+holder_bins] = newh->GetBinLowEdge(i+1);
bin_contents[i+foundBin+holder_bins] = newh->GetBinContent(i+1);
}
hsum->SetBins( histo_edges.size()-1, &histo_edges[0] );
const Int_t nEdges = static_cast<Int_t>( histo_edges.size());
for ( Int_t i=1; i<nEdges; ++i) {
hsum->SetBinContent( i, bin_contents[i-1]);
}
if (opFolder) {
//opFolder->Remove( hsum);
//opFolder->Add( exth);
hsum->SetDirectory( 0);
}
}
if (verbose) {
if (simpleAdd) cout << " adding counts";
else cout << " tagging on bins";
cout << endl;
}
return;
}
//==============================================================================
/** Most cuts are written out just once, but if a cut is different from
* the most recently written version of a cut with the same name, then
* another copy of the cut is written out. Thus if a cut changes during
* a series of runs, all versions of the cut will be present in the
* summed file.
*/
void add (const TCutG *o, TCutG *&oOut, TFolder *opFolder) {
//------------------------------------------------------------------------------
const char *name (o->GetName());
bool write (false);
if (!oOut) write = true;
else {
Int_t n (o->GetN());
if (n != oOut->GetN()) write = true;
else {
double x1, x2, y1, y2;
for (Int_t i=0; i<n; ++i) {
o ->GetPoint( i, x1, y1);
oOut->GetPoint( i, x2, y2);
if ((x1 != x2) || (y1 != y2)) write = true;
}
}
}
if (write) {
if (verbose) {
if (oOut) cout << " changed";
cout << " TCutG" << endl;
}
if (!opFolder) {
oOut = const_cast<TCutG *>( o);
o->Write( name, TObject::kSingleKey);
} else {
TCutG *clone = static_cast<TCutG *>( o->Clone());
if (oOut) opFolder->Add( clone);
oOut = clone;
}
} else if (verbose) cout << endl;
return;
}
//==============================================================================
/** for most objects, we keep just the first version.
*/
void add (const TObject *o, TObject *&oSum, TFolder *opFolder) {
//------------------------------------------------------------------------------
const char *name (o->GetName());
if (!oSum) {
if (verbose) cout << " saving TObject" << endl;
if (!opFolder) o->Write( name, TObject::kSingleKey);
} else {
if (verbose) cout << endl;
}
return;
}
//==============================================================================
/** create the new directory and then start adding its contents
*/
void add (TDirectory *dir, TDirectory *&sumDir,
TFolder *opFolder, const TString &prefix) {
//------------------------------------------------------------------------------
TDirectory *currentDir (gDirectory);
TString dirName (dir->GetName());
if (verbose) cout << " scanning TDirectory" << endl;
TString newPrefix (prefix + dirName + "/");
if (!sumDir) sumDir = gDirectory->mkdir( dirName);
sumDir->cd();
TIter i (dir->GetListOfKeys());
while (TKey *k = static_cast<TKey *>( i())) {
TObject *o (dir->Get( k->GetName()));
addObject( o, newPrefix, opFolder);
}
currentDir->cd();
return;
}
//==============================================================================
/** create a new folder and then start adding its contents
*/
void add (const TFolder *folder, TFolder *&sumFolder,
TFolder *parentFolder, const TString &prefix) {
//------------------------------------------------------------------------------
if (verbose) cout << " scanning TFolder" << endl;
const char *name (folder->GetName());
TString newPrefix (prefix + name + "/");
if (!sumFolder) sumFolder = new TFolder (name, name);
if (!histosFolder) histosFolder = sumFolder;
TIter i (folder->GetListOfFolders());
while (TObject *o = i()) addObject( o, newPrefix, sumFolder);
return;
}
//==============================================================================
int main (int argc, char **argv) {
//------------------------------------------------------------------------------
if (argc < 3) {
cerr << argv[0] << ": out_root_file in_root_file1 in_root_file2 ..." << endl;
return 1;
}
TROOT root ("histoadd", "histoadd");
root.SetBatch();
TString opFileName (argv[1]);
TFile *opFile (TFile::Open( opFileName, "RECREATE"));
if (!opFile) {
cerr << argv[0] << ": unable to open file: " << argv[1] << endl;
return 1;
}
--argc;
++argv;
... 95 more lines ...
|
417
|
21 Nov 2007 |
Konstantin Olchanski | Forum | ODBv3, second try - Midas on a x86_64 - incompatible with x86_32 |
These changes to make 32-bit and 64-bit ODB binary compatible with each other are now committed to midas svn, revision 4080.
Starting with this revision, ODB version changes from 2 to 3, breaking binary compatibility with previous releases.
Before upgrading to this revision, save your ODB as an XML file, *and* try to reload it, to catch any potential problems with parsing of the XML file.
Part of this commit are checks for sizes of important midas data structures stored in ODB shared memory - if the compiled size does not match the expected
value, binary compatibility is broken and the program will abort - to avoid further corruption of ODB shared memory. This feature is only enabled on Linux and
it is expected to trigger only on compiler malfunctions (generates wrong data size) and on accidental or intentional changes to important data structures in
midas, to warn the user that they broke ODB binary compatibility.
K.O.
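The kind of size check described above can be sketched as follows. This is only an illustration, not the code committed to midas: the structure DemoStatistics, the constant kDemoStatisticsSize and the function check_struct_size are hypothetical names made up for this example, and the real midas structures have different members. The sketch assumes that a structure built only from doubles and fixed-width integers, with explicit padding, has the same size on x86_32 and x86_64.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for a structure kept in shared memory.
// This is NOT the real midas CHN_STATISTICS; the members are invented.
// Doubles come first, followed by two 32-bit words, so every member sits
// at the same offset whether doubles are aligned to 4 or to 8 bytes.
struct DemoStatistics {
   double events_written;
   double bytes_written;
   double files_written;   // a counter stored as a double, as in the patch
   std::uint32_t flags;
   std::uint32_t reserved; // explicit padding, no implicit tail padding needed
};

// Expected size in bytes, written down once (an assumption of this sketch).
constexpr std::size_t kDemoStatisticsSize = 32;

// Compile-time checks: the build fails if the layout changes.
static_assert(sizeof(DemoStatistics) == kDemoStatisticsSize,
              "DemoStatistics changed size, binary compatibility is broken");
static_assert(offsetof(DemoStatistics, flags) == 24,
              "DemoStatistics member offsets changed");

// Run-time check: returns true if the sizes agree, otherwise reports the
// mismatch on stderr and returns false so that the caller can abort.
bool check_struct_size(const char *name, std::size_t actual, std::size_t expected)
{
   if (actual == expected)
      return true;
   std::fprintf(stderr, "%s: compiled size %zu does not match expected size %zu\n",
                name, actual, expected);
   return false;
}
```

A program would call check_struct_size("DemoStatistics", sizeof(DemoStatistics), kDemoStatisticsSize) at startup and abort on a false return, before touching the shared memory.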
> > > > I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
> > > > breaking the 32-bit ODB and adding some padding where needed, probably with some conditional compiling.
> > > 1) midas.h: remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
> > > 2) msystem.h: add 32-bit padding to CHN_STATISTICS and CHN_SETTINGS
>
> I am now trying a different solution for fixing the issue of CHN_STATISTICS and CHN_SETTINGS changing size.
>
> 1) midas.h: (same as before) remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
> 2) msystem.h: in CHN_STATISTICS and CHN_SETTINGS change type of "event_limit" and "files_written" from int to "double".
>
> Below are the latest ODBv3 meta patches:
>
> ladd03:midas$ svn diff
> Index: include/midas.h
> ===================================================================
> --- include/midas.h (revision 3844)
> +++ include/midas.h (working copy)
> /* has to be changed whenever binary ODB format changes */
> -#define DATABASE_VERSION 2
> +#define DATABASE_VERSION 3
> .........
> short int trigger_mask; /**< trigger mask */
> INT sampling_type; /**< GET_ALL, GET_SOME, GET_FARM */
> - /**< dispatch function */
> - void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
> } EVENT_REQUEST;
>
> Index: include/msystem.h
> ===================================================================
> --- include/msystem.h (revision 3845)
> +++ include/msystem.h (working copy)
> -"Event limit = DWORD : 0",\
> +"Event limit = DOUBLE : 0",\
> ..................
> -"Files written = INT : 0",\
> +"Files written = DOUBLE : 0",\
> ..................
> - DWORD event_limit;
> + double event_limit;
> ..................
> - INT files_written;
> + double files_written;
>
> K.O. |
420
|
04 Feb 2008 |
Robert Pattie | Forum | analyzer crashes at high rates | I'm using midas to read data from a waveform digitizer at event rates of
10-30 kHz. To accomplish this, the digitizer is read via block transfers and the
raw data are put into a single MIDAS event. Thus a MIDAS event can contain up to
250 physics events and at most 350 kBytes. In the analyzer modules I had
been analyzing only the first physics event contained in each MIDAS event with no
problem. Recently I tried to analyze all the physics events. At low rates,
100 Hz-1 kHz, where there are 1-5 physics events per MIDAS event, this was no problem. At
higher rates, 10-20 kHz, where there are about 40 physics events per MIDAS event,
the analyzer keeps up for a few seconds and then seg faults with "'shared object
read from target memory' has disappeared; keeping its symbols". Any suggestions as
to why the analyzer is crashing would be very helpful.
Thanks,
Robert |
|