Message ID: 382     Entry time: 07 Jun 2007     Reply to this: 385   388
Author: Randolf Pohl 
Topic: Forum 
Subject: crash when analyzing multiple runs offline 
Hello,

I am having a problem with the ROOT-based analyzer: it crashes when I try to
analyze multiple runs OFFLINE using the "-i run%05d.mid -o result%05d.root -r
1 2" feature.

I can reproduce the problem with the example experiment that comes with the
MIDAS distribution.
Running the analyzer ONLINE works fine: one can start and stop runs one after
the other, and roody shows the histograms being reset and then filled again.

But OFFLINE, the analyzer crashes when trying to analyze the SECOND run in a
sequence. So:
./analyzer -i run%05d.mid -o result%05d.root -r 1 1   works (only run 1)
./analyzer -i run%05d.mid -o result%05d.root -r 1 3   dies on run 2
Output is attached. (I added printf's to the "init" modules, but that is
irrelevant here.)


My own analyzer shows the same effect. There, I got the impression that the
segfault happens on the first attempt to Fill/Reset/SetName (etc.) a histogram
in the 2nd run. With the MIDAS example, however, the analyzer seems to finish
filling histograms even for run 2, but then dies in eor.
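
To illustrate, my modules touch histograms roughly like this (a minimal sketch
with made-up names, not the actual example-experiment code):

   #include "midas.h"   /* for INT and SUCCESS */
   #include <TH1D.h>

   static TH1D *h_adc = NULL;      /* hypothetical module histogram */

   /* begin-of-run routine of an analyzer module */
   INT my_module_bor(INT run_number)
   {
      if (h_adc == NULL)
         h_adc = new TH1D("h_adc", "ADC spectrum", 4096, 0, 4096);
      h_adc->Reset();   /* in the 2nd run, the first access like this is
                           roughly where my own analyzer segfaults */
      return SUCCESS;
   }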

Can you reproduce the problem?

I run MIDAS on an Intel quad-core machine, under 64-bit SuSE Linux 10.2.
pohl@lamb2:~/midas/examples/root> gcc --version
gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)

(Maybe the 4.1.2 "prerelease" is the problem? See message ID 344.)

I am using MIDAS rev. 3674 (April 19, 2007), but I got the impression that no
change relevant to this problem has been made since then. Please correct me if
I am wrong; then I would try it with rev. HEAD.
(My version already includes the fix for the x86_64 segfault problem of
message ID 337.)


Best regards,

Randolf
Attachment 1: crash.out  2 kB
pohl@lamb:~/midas/examples/root> ./analyzer -e exa_root -i run%05d.mid -o /tmp/pohl/test%05d.root -r 1 3

analyzer_init

Root server listening on port 9090...
adc_calib_init
adc_summing_init
scaler_init
Running analyzer offline. Stop with "!"
Set run number 1 in ODB
Load ODB from run 1...
analyzer_init

OK
run00001.mid:777  /tmp/pohl/test00001.root:775  events, 0.00s
Set run number 2 in ODB
Load ODB from run 2...
analyzer_init

OK
run00002.mid:7227  /tmp/pohl/test00002.root:7225  events, 0.04s

 *** Break *** segmentation violation
Using host libthread_db library "/lib64/libthread_db.so.1".
Attaching to program: /proc/10558/exe, process 10558
[Thread debugging using libthread_db enabled]
[New Thread 47688414288800 (LWP 10558)]
[New Thread 1082132800 (LWP 10559)]
0x00002b5f52afae1f in waitpid () from /lib64/libc.so.6
Thread 2 (Thread 1082132800 (LWP 10559)):
#0  0x00002b5f5234d0ab in __accept_nocancel () from /lib64/libpthread.so.0
#1  0x00002b5f4e4cc510 in TUnixSystem::AcceptConnection ()
   from /usr/local/root/lib/root/libCore.so.5.14
#2  0x00002b5f4e4c1592 in TServerSocket::Accept () from /usr/local/root/lib/root/libCore.so.5.14
#3  0x000000000041075f in root_socket_server (arg=<value optimized out>) at src/mana.c:5453
#4  0x00002b5f5188428a in TThread::Function () from /usr/local/root/lib/root/libThread.so.5.14
#5  0x00002b5f5234609e in start_thread () from /lib64/libpthread.so.0
#6  0x00002b5f52b294cd in clone () from /lib64/libc.so.6
#7  0x0000000000000000 in ?? ()

Thread 1 (Thread 47688414288800 (LWP 10558)):
#0  0x00002b5f52afae1f in waitpid () from /lib64/libc.so.6
#1  0x00002b5f52aa3491 in do_system () from /lib64/libc.so.6
#2  0x00002b5f52aa3817 in system () from /lib64/libc.so.6
#3  0x00002b5f4e4d0851 in TUnixSystem::StackTrace ()
   from /usr/local/root/lib/root/libCore.so.5.14
#4  0x00002b5f4e4cfa4a in TUnixSystem::DispatchSignals ()
   from /usr/local/root/lib/root/libCore.so.5.14
#5  <signal handler called>
#6  0x00002b5f52ad5ee5 in free () from /lib64/libc.so.6
#7  0x000000000040c89b in CloseRootOutputFile () at src/mana.c:1489
#8  0x0000000000410b45 in eor (run_number=<value optimized out>, error=<value optimized out>)
    at src/mana.c:1981
#9  0x0000000000412d9b in analyze_run (run_number=2, 
    input_file_name=0x7fff5cafd020 "run00002.mid", output_file_name=<value optimized out>)
    at src/mana.c:4471
#10 0x00000000004130b4 in loop_runs_offline () at src/mana.c:4518
#11 0x0000000000413e05 in main (argc=<value optimized out>, argv=<value optimized out>)
    at src/mana.c:5757
#0  0x00002b5f52afae1f in waitpid () from /lib64/libc.so.6
[midas.c:1592:] cm_disconnect_experiment not called at end of program