Midas DAQ System
    Reply  23 Jan 2007, Stefan Ritt, Bug Report, buffer bugs 

Denis Bilenko wrote:
1 & 3 - thanks for the fix and the explanation. As for 2 - I've tried consume and produce
and still have a problem


Acknowledged. I could reproduce it with the information you supplied, thank you very much. Also the data rate is slower than what I expect. I will investigate and fix this, but it could take some time.
    Reply  24 Jan 2007, Stefan Ritt, Bug Report, buffer bugs 
I tried again and could not reproduce the problem. Last time I was probably confused by some old mserver.exe executable I had lying around. I updated to the most recent version (3516) and did a C:\midas> nmake -f makefile.nt. Last time I was also confused about the low rate, but that was caused by a mserver.exe executable which was not compiled with optimization. For small event sizes (such as 10 bytes) there is a big difference between optimized and non-optimized code. So I got:


First Console wrote:
ID of event to produce: 1
Host to connect: localhost
Event size: 10
Level:   0.0 %, Rate: 0.46 MB/sec
flush
Level:   0.0 %, Rate: 0.43 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
Level:   0.0 %, Rate: 0.42 MB/sec
Level:   0.0 %, Rate: 0.42 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
Level:   0.0 %, Rate: 0.44 MB/sec
Level:   0.0 %, Rate: 0.42 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
flush
Level:   0.0 %, Rate: 0.44 MB/sec
Level:   0.0 %, Rate: 0.44 MB/sec
Level:   0.0 %, Rate: 0.40 MB/sec
Level:   0.0 %, Rate: 0.42 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
Level:   0.0 %, Rate: 0.44 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
Level:   0.0 %, Rate: 0.43 MB/sec
flush


and


Second Console wrote:
C:\midas\NT\bin>.\consume
ID of event to request: 1
Host to connect:
Get all events (0/1): 1
Receive via callback ([y]/n):
[consume.c:73:process_event] Serial number mismatch: Ser: 1169666, OldSer: 0, ID: 1, size: 10
Level:   0.0 %, Rate: 0.00 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.42 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.42 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   2.4 %, Rate: 0.35 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.50 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level:   0.0 %, Rate: 0.40 MB/sec, ser mismatches: 1
Received break. Aborting...


Actually, sending remotely and receiving locally is very common. Most experiments do that: they have a remote frontend, and the logger and analyzer work locally. If that did not work, all these experiments would have a problem. So I can only encourage you to try again and make sure to update and recompile the executables. Maybe delete any old *.SHM file. Maybe try on another PC or under Linux.
Entry  30 Jan 2007, Stefan Ritt, Bug Report, Large files under Windows XP 
Hello,

We have problems analyzing large files under Windows XP. For small file sizes,
everything is ok. We have events of 2.8 MB each, and we can read ~30 events per
second. But if the file gets larger than typically 600-800 MB, then access
becomes very slow, about 1 event per second. This is not the case under Linux,
where it stays at 30 Hz (~90 MB/sec). 

Looking at the low level file access, it is obvious that this has nothing to do
with midas: the problem can be reproduced with a simple program reading chunks
of 3 MB from a 1 GB file. The Windows XP file system is NTFS, default formatting.
Has anyone else observed a similar problem, or does anyone maybe even have some
suggestions? Unfortunately many people here want to analyze midas data under
Windows...
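
The test program itself is not shown in the post. A minimal sketch of such a chunked sequential reader could look like the following; the function name read_in_chunks, the use of plain stdio and the coarse one-second timer are all assumptions made for illustration, not the code that was actually used.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: read `path` sequentially in chunks of `chunk_size`
   bytes. Returns the total number of bytes read, or -1 on error. If
   `seconds` is not NULL it receives the elapsed wall-clock time, with the
   one-second resolution of time(). */
static long long read_in_chunks(const char *path, size_t chunk_size, double *seconds)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    char *buf = malloc(chunk_size);
    if (buf == NULL) {
        fclose(f);
        return -1;
    }

    long long total = 0;
    size_t n;
    time_t start = time(NULL);

    /* fread returns 0 at end of file or on error */
    while ((n = fread(buf, 1, chunk_size, f)) > 0)
        total += (long long) n;

    time_t stop = time(NULL);
    int failed = ferror(f);

    free(buf);
    fclose(f);

    if (seconds != NULL)
        *seconds = difftime(stop, start);
    return failed ? -1 : total;
}
```

Calling read_in_chunks("run.mid", 3 * 1024 * 1024, &secs) on files of increasing size (the file name is only an example) and dividing the returned byte count by secs would show whether the throughput drops once the file grows past a few hundred MB.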

Stefan Ritt
Entry  02 Feb 2007, Exaos Lee, Bug Report, Compiling failed with SVN3562 under Ubuntu 6.10 
The error log is as follows:
cc -c -g -O2 -Wall -Wuninitialized -Iinclude -Idrivers -I../mxml -Llinux/lib -DINCLUDE_FTPLIB   -D_LARGEFILE64_SOURCE -DHAVE_MYSQL -DHAVE_ROOT -pthread -I/opt/root/current/include -DOS_LINUX -fPIC -Wno-unused-function -o linux/lib/system.o src/system.c
src/system.c:958: error: expected declaration specifiers or ‘...’ before ‘gettid’
src/system.c:958: warning: data definition has no type or storage class
src/system.c:958: warning: type defaults to ‘int’ in declaration of ‘_syscall0’
src/system.c: In function ‘ss_gettid’:
src/system.c:1005: warning: implicit declaration of function ‘gettid’
src/system.c: In function ‘ss_suspend_init_ipc’:
src/system.c:2948: warning: pointer targets in passing argument 3 of ‘getsockname’ differ in signedness
src/system.c: In function ‘ss_suspend’:
src/system.c:3414: warning: pointer targets in passing argument 6 of ‘recvfrom’ differ in signedness
src/system.c:3441: warning: pointer targets in passing argument 6 of ‘recvfrom’ differ in signedness
make: *** [linux/lib/system.o] Error 1

The error might be here:
void ss_force_single_thread()
{
   _single_thread = TRUE;
}

#if defined(OS_DARWIN)
// blank
#elif defined(OS_LINUX)
_syscall0(pid_t,gettid);
#endif

INT ss_gettid(void)

I have no idea about the usage of _syscall0(...).
Entry  02 Feb 2007, Exaos Lee, Bug Report, Compiling failed with SVN3562 under Ubuntu 6.10 err.log
I tried to solve the problem by adding a ";". That was wrong. In fact, the macro "_syscall0(..)" doesn't need the ";".
I searched and found that somebody said "the overall _syscall$magicnumber will disappear". I don't mind whether "_syscall" disappears or not; I just want to compile the code and do my job. I deleted the additional ";" and recompiled. The error output is in the attachment [elog:335/1].
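
For readers hitting the same error: on Linux systems where the _syscall0 macro is no longer provided by the headers, the thread ID can usually be obtained through the generic syscall() wrapper instead. The sketch below is only an illustration of that approach, not necessarily the change that went into system.c; the function name example_gettid is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_gettid */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall() */

/* Hypothetical stand-in for what ss_gettid() needs on Linux:
   ask the kernel for the ID of the calling thread. */
static pid_t example_gettid(void)
{
    return (pid_t) syscall(SYS_gettid);
}
```

This avoids the _syscall0(pid_t,gettid) line entirely, so nothing depends on the macro being defined.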
Entry  05 Feb 2007, Fedor Ignatov, Bug Report, segmentation violation of analyzer on a x86_64 
Hello,

When I connect to the analyzer on an x86_64 processor (with Roody),
the analyzer crashes with a segmentation violation in the root_server_thread function.
The same code works fine on a 32-bit processor.
As far as I can tell, the problem is in the exchange of pointers between analyzer and client.
Before a pointer is sent, it is stored in an int (size 4 instead of 8) at
this place:
Index: src/mana.c
===================================================================
--- src/mana.c  (revision 3498)
+++ src/mana.c  (working copy)
@@ -5386,7 +5386,7 @@

             //write pointer
             message->Reset(kMESS_ANY);
-            int p = (POINTER_T) obj;
+            POINTER_T p = (POINTER_T) obj;
             *message << p;
             sock->Send(*message);


Sincerely Yours,
Fedor Ignatov 
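
The effect described in this report can be illustrated outside of MIDAS. The sketch below assumes that POINTER_T is a pointer-sized integer type, as the patch implies; it uses the standard uintptr_t in its place and uint32_t in place of the 4-byte int, and both helper names are hypothetical.

```c
#include <stdint.h>

/* What survives when a 64-bit address value is stored in a 32-bit slot:
   only the low 32 bits. */
static uint64_t through_32bit_slot(uint64_t addr)
{
    uint32_t p = (uint32_t) addr;   /* the upper half is discarded here */
    return (uint64_t) p;
}

/* Round trip through uintptr_t, an integer type wide enough to hold a
   pointer. Returns 1 if the pointer comes back unchanged. */
static int survives_uintptr(const void *obj)
{
    uintptr_t p = (uintptr_t) obj;
    return (const void *) p == obj;
}
```

An address such as 0x00007f1234567890 comes back from the 32-bit slot as 0x34567890, which points somewhere else entirely. Addresses that fit into 32 bits come back intact, which is consistent with the old code working on 32-bit machines.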
Entry  05 Feb 2007, Konstantin Olchanski, Bug Report, wrong version in include/midas.h? 
The present .../include/midas.h contains
[alpha@laddvme06 ~/online]$ grep 1.9.5 /home/alpha/packages/midas/include/*
/home/alpha/packages/midas/include/midas.h:#define MIDAS_VERSION "1.9.5"

All MIDAS utilities (odbedit ver) presently report version 1.9.5, even for svn
trunk, and this may confuse people as to what version of midas they are using,
and may complicate reporting of bugs.

Perhaps the trunk version should say something like "svn-22233344" (the svn
revision number)? The present "1.9.5" is wrong...

K.O.
    Reply  06 Feb 2007, Stefan Ritt, Bug Report, wrong version in include/midas.h? 
> The present .../include/midas.h contains
> [alpha@laddvme06 ~/online]$ grep 1.9.5 /home/alpha/packages/midas/include/*
> /home/alpha/packages/midas/include/midas.h:#define MIDAS_VERSION "1.9.5"
> 
> All MIDAS utilities (odbedit ver) presently report version 1.9.5, even for svn
> trunk, and this may confuse people as to what version of midas they are using,
> and may complicate reporting of bugs.
> 
> Perhaps the trunk version should say something like "svn-22233344" (the svn
> revision number)? The present "1.9.5" is wrong...

Fully agree. I added a svn_revision string into midas.h, which gets reported now
by "odbedit ver". Unfortunately this reflects only changes in midas.c. If one
changes odb.c for example, the svn revision in midas.c does not get modified by
the SVN system. In addition I changed the present version 1.9.5 to 2.0.0. I made
the tar and zip files. After some internal testing, it will be announced
officially in a few days.
    Reply  06 Feb 2007, Stefan Ritt, Bug Report, segmentation violation of analyzer on a x86_64 
> Hello,
> 
> When I connect to the analyzer on an x86_64 processor (with Roody),
> the analyzer crashes with a segmentation violation in the root_server_thread function.
> The same code works fine on a 32-bit processor.
> As far as I can tell, the problem is in the exchange of pointers between analyzer and client.
> Before a pointer is sent, it is stored in an int (size 4 instead of 8) at
> this place:
> Index: src/mana.c
> ===================================================================
> --- src/mana.c  (revision 3498)
> +++ src/mana.c  (working copy)
> @@ -5386,7 +5386,7 @@
> 
>              //write pointer
>              message->Reset(kMESS_ANY);
> -            int p = (POINTER_T) obj;
> +            POINTER_T p = (POINTER_T) obj;
>              *message << p;
>              sock->Send(*message);
> 
> 
> Sincerely Yours,
> Fedor Ignatov 

Do I understand you right? With your patch it works even on 64 bit, right? Or do you
mean there is still a segmentation violation? Anyhow I committed your patch since the
"int" is clearly incorrect.

- Stefan
    Reply  06 Feb 2007, Fedor Ignatov, Bug Report, segmentation violation of analyzer on a x86_64 
Yes, right. The segmentation violation is solved with this patch. Now it works
fine on x86_64.

Fedor 

> Do I understand you right? With your patch it works even on 64 bit, right? Or do you
> mean there is still a segmentation violation? Anyhow I committed your patch since the
> "int" is clearly incorrect.
> 
> - Stefan
    Reply  17 Feb 2007, Konstantin Olchanski, Bug Report, segmentation violation of analyzer on a x86_64 
> Yes, right. The segmentation violation is solved with this patch. Now it works
> fine on x86_64.

Right. I confirm this. I have this exact same fix in my stand-alone copy of the midas
histogram server, and should commit it to MIDAS CVS as well.

K.O.
Entry  22 May 2007, Randolf Pohl, Bug Report, analyzer_init called by odb_load 
Hi,

I wonder why mana.c:odb_load() calls analyzer_init(). This way analyzer_init 
is called TWICE or more times:
first from mana.c:mana_init(), for each invocation of the analyzer, and 
second from mana.c:odb_load(), for each run to be analyzed

Isn't this a bug? It can mess up several things (like mallocs) if you don't 
take the necessary precautions. Other module_init functions are correctly 
called only once, before all runs are analyzed.

I have the feeling that odb_load should NOT call analyzer_init. Or am I wrong 
(probably, but please explain to me)? Do I have to live with it and make sure 
that my beautiful global initialization in analyzer_init is only done once?
:-)

Cheers,

Randolf

And here is the annotated log using the ROOT example experiment 
(several modules changed/added to print their respective names)

:~/midas/examples/root> ./analyzer -e exa_root -i run%05d.mid -r 1 3
 
analyzer_init        <-- ok

Root server listening on port 9090...
adc_calib_init       <-- ok
adc_summing_init     <-- ok
scaler_init          <-- ok
Running analyzer offline. Stop with "!"
Set run number 1 in ODB
Load ODB from run 1...
analyzer_init        <-- not ok, or is it?

OK
run00001.mid:777  events, 0.00s
Set run number 2 in ODB
Load ODB from run 2...
analyzer_init        <-- not ok, or is it?

OK
run00002.mid:7227  events, 0.03s
Set run number 3 in ODB
Load ODB from run 3...
analyzer_init        <-- not ok, or is it?

OK
run00003.mid:13866  events, 0.06s
adc_calib_exit
adc_summing_exit
scaler_exit

analyzer_exit
    Reply  22 May 2007, Stefan Ritt, Bug Report, analyzer_init called by odb_load 
The reason to call analyzer_init in odb_load is the following:

Assume you run the analyzer offline, analyzing many files in series. Then assume
that you have /Experiment/Run Parameters, which is actively used by the analyzer
(like beam settings etc.). In this case you do a db_open_record() to map
/Experiment/Run Parameters to the exp_param C structure. For this mapping to work,
the ODB structure and the C structure have to be exactly the same. Now assume that
you changed your run parameters over time, like you added some comment later. Now
you want to analyze several runs, some before and some after the modification.
Both sets have a different structure in /Experiment/Run Parameters, which is a
problem, since the compiled analyzer can only have a single C structure. My "poor"
solution was to call analyzer_init after each loading of the ODB from the *.mid
file. The db_create_record() call matches the C structure to the ODB structure by
modifying the ODB structure if necessary. So if you added one parameter later, this
(modified) structure gets loaded by odb_load, but then it gets adjusted in
analyzer_init().

I understand now that this case might not happen so often, and you are more
bothered by the fact that analyzer_init gets called several times. There must
however be a hook for offline analysis so that the user code can correct the ODB
structure. So I propose to add a flag to analyzer_init, such as

INT analyzer_init(BOOL bFirst)
{
}

If bFirst equals TRUE, the function got called from mana_init(), if FALSE, it got
called from odb_load. Then you can put code like

INT analyzer_init(BOOL bFirst)
{
   if (bFirst) {
      /* one-time setup, only when called from mana_init() */
      p = malloc();
      ...
   }
   return SUCCESS;
}

If you agree, I will modify the code and commit the change.

- Stefan
    Reply  22 May 2007, Randolf Pohl, Bug Report, analyzer_init called by odb_load 
Thanks for the quick reply, Stefan.

Please don't change anything in the code unless you find it really important. I guess 
changing the analyzer_init prototype will break a lot of code out there?

In fact, I think I do understand this behavior now.
And even without your suggested fix there is a simple workaround: I add a static 
variable to my analyzer_init.cxx file, and do something similar to your bFirst fix.
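
A minimal sketch of that static-variable workaround, assuming the existing analyzer_init() prototype without arguments. The INT and SUCCESS definitions are stand-ins for the ones in midas.h so that the snippet compiles on its own, and calib_table and the two counters are hypothetical and only illustrate the pattern.

```c
#include <stdlib.h>

typedef int INT;      /* stand-in for the MIDAS INT type */
#define SUCCESS 1     /* stand-in for the MIDAS status code */

static double *calib_table = NULL;   /* hypothetical one-time allocation */
static int global_inits = 0;         /* how often the one-time part ran */
static int per_run_inits = 0;        /* how often analyzer_init was called */

INT analyzer_init(void)
{
    static int first_call = 1;

    if (first_call) {
        /* global initialization: runs only on the call from mana_init() */
        first_call = 0;
        calib_table = calloc(1024, sizeof *calib_table);
        if (calib_table == NULL)
            return 0;
        global_inits++;
    }

    /* everything below runs on every call, including those from odb_load(),
       e.g. re-creating ODB records after a run's ODB has been loaded */
    per_run_inits++;

    return SUCCESS;
}
```

With three runs analyzed after the initial call, the one-time branch still executes only once while the per-run part executes on every call.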

In conclusion, commit your fix if it does not harm others. Postpone this commit to a 
future new version of midas which breaks a lot of things anyway...

A last question, for me to understand: Why not call db_open_record in 
ana_begin_of_run then?

Cheers,

Randolf
    Reply  22 May 2007, Stefan Ritt, Bug Report, analyzer_init called by odb_load 
> Thanks for the quick reply, Stefan.
> 
> Please don't change anything in the code unless you find it really important. I guess 
> changing the analyzer_init prototype will break a lot of code out there?
> 
> In fact, I think I do understand this behavior now.
> And even without your suggested fix there is a simple workaround: I add a static 
> variable to my analyzer_init.cxx file, and do something similar to your bFirst fix.
> 
> In conclusion, commit your fix if it does not harm others. Postpone this commit to a 
> future new version of midas which breaks a lot of things anyway...
> 
> A last question, for me to understand: Why not call db_open_record in 
> ana_begin_of_run then?

I fully agree with you that db_open_record would better go into ana_begin_of_run
(and analyzer_init not being called in odb_load), and I fully agree with you that
changing the code would break many experiments. ;-)

So I guess we leave it as it is right now as you suggested.
Entry  20 Aug 2007, Konstantin Olchanski, Bug Report, how to handle end of run? 
I am having problems with handling the end-of-run situation in my midas
frontend. I have a device that continuously sends data (over USB) and I read
this data in my "read_event" function.

Everything is good until the end-of-run, at which time this happens:
0) mfe.c calls my read_event() to read the data (loop until the end-of-run
transition)
1) mfe.c calls my end_of_run()
2) here, I tell the device "please stop sending data"
3) all seems good, but wait!!!
4) there is all this data generated between step 0 and step 2 still sitting
inside the device and it has nowhere to go: the run is ended, the output file is
closed, my read_event() will never be called ever again (well, until the next run).

It seems to me mfe.c needs to have one more function, something like
"pre_end_of_run()" that works like this:
0) mfe.c calls my read_event() to read the data (loop until the end-of-run
transition)
1) mfe.c calls pre_end_of_run(), here I tell the device to stop sending data
2) mfe.c calls read_event() for the very last time, to give me the opportunity
to read and send away any data I still may have.
3) mfe.c calls end_of_run(). The run is truly finished.

Any thoughts?

K.O.
    Reply  03 Sep 2007, Stefan Ritt, Bug Report, how to handle end of run? 
> I am having problems with handling the end-of-run situation in my midas
> frontend. I have a device that continuously sends data (over USB) and I read
> this data in my "read_event" function.
> 
> Everything is good until the end-of-run, at which time this happens:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls my end_of_run()
> 2) here, I tell the device "please stop sending data"
> 3) all seems good, but wait!!!
> 4) there is all this data generated between step 0 and step 2 still sitting
> inside the device and it has nowhere to go: the run is ended, the output file is
> closed, my read_event() will never be called ever again (well, until the next run).
> 
> It seems to me mfe.c needs to have one more function, something like
> "pre_end_of_run()" that works like this:
> 0) mfe.c calls my read_event() to read the data (loop until the end-of-run
> transition)
> 1) mfe.c calls pre_end_of_run(), here I tell the device to stop sending data
> 2) mfe.c calls read_event() for the very last time, to give me the opportunity
> to read and send away any data I still may have.
> 3) mfe.c calls end_of_run(). The run is truly finished.
> 
> Any thoughts?

You can achieve the desired functionality without changing mfe.c:

0) mfe.c calls read_event
1) mfe.c calls end_of_run. Your end_of_run tells the device to stop sending data and
flushes the remaining data. At this point you actually have to reimplement a part of the
mfe.c functionality, but basically you need a bm_compose_event() and a bm_send_event(),
so just a few lines of code. If you want to have the final event number right in your
equipment, you also need to update eq->events_sent accordingly.
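
The control flow of such an end_of_run can be sketched without any MIDAS calls. Everything in the snippet below is a hypothetical stand-in: Device models the hardware, send_event() marks the place where a real frontend would compose and send the event, and the events_sent counter plays the role of eq->events_sent.

```c
/* All names here are illustrative stand-ins, not the MIDAS or mfe.c API. */
typedef struct {
    int running;    /* 1 while the device is still producing data */
    int buffered;   /* events still sitting inside the device */
} Device;

static int events_sent = 0;   /* plays the role of eq->events_sent */

static void device_stop(Device *d)
{
    d->running = 0;
}

/* Pop one buffered event; returns 1 if an event was read, 0 if empty. */
static int device_read(Device *d, int *event)
{
    if (d->buffered == 0)
        return 0;
    d->buffered--;
    *event = d->buffered;
    return 1;
}

/* Placeholder for composing and sending one event to the buffer. */
static void send_event(int event)
{
    (void) event;
    events_sent++;
}

/* Stop the device first, then drain whatever it still holds. */
static void end_of_run(Device *d)
{
    int event;

    device_stop(d);
    while (device_read(d, &event))
        send_event(event);
}
```

The order matters: the device is stopped before the drain loop starts, so the loop is guaranteed to terminate once the internal buffer is empty.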

Given the fact that 99% of the experiments do not need this functionality, I propose
that we keep mfe.c as it is and you add the few lines of code to the user part of your
specific frontend.

Stefan
Entry  08 Oct 2007, Carl Metelko, Bug Report, Error in data format- ending blocks on 32bit boundary x86_64 
Hi,
    I found that midas banks can be given an extra 32 bits of zeros when
trying to keep to a 32-bit boundary on my x86_64. 

This can be fixed by changing (in midas.h)
#define ALIGN8(x)  (((x)+7) & ~7)
to
#define ALIGN8(x)  (((x)+3) & ~3)

Are there any bad consequences of doing this?
    Reply  08 Oct 2007, Stefan Ritt, Bug Report, Error in data format- ending blocks on 32bit boundary x86_64 
> Hi,
>     I found that midas banks can be given an extra 32 bits of zeros when
> trying to keep to a 32-bit boundary on my x86_64. 
> 
> This can be fixed by changing (in midas.h)
> #define ALIGN8(x)  (((x)+7) & ~7)
> to
> #define ALIGN8(x)  (((x)+3) & ~3)
> 
> Are there any bad consequences of doing this?

Yes. ALIGN8 means 'align to 8-byte boundary' (64-bit), and if you change that, you
break the code at various locations. Furthermore, 8-byte aligned access is faster
on x86_64 than 4-byte aligned access, so you will get a performance penalty. Of
course, if you have very many small banks, the zero padding can cause some
overhead, but in that case you could combine some data into a single bank.
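
To see where the extra 32 bits come from, the two rounding expressions can be compared directly. ALIGN8 below is the macro as quoted from midas.h; ALIGN4 is a hypothetical name for the proposed 4-byte variant, and the two padding helpers exist only for this illustration.

```c
#define ALIGN8(x)  (((x)+7) & ~7)   /* as quoted from midas.h */
#define ALIGN4(x)  (((x)+3) & ~3)   /* hypothetical 4-byte variant */

/* Number of padding bytes added when rounding `size` up to 8 or 4 bytes. */
static int padding8(int size) { return ALIGN8(size) - size; }
static int padding4(int size) { return ALIGN4(size) - size; }
```

A 12-byte payload is rounded up to 16 bytes by ALIGN8, which is exactly the 4 zero bytes observed, while the 4-byte variant would leave it at 12. Sizes that are already a multiple of 8 get no padding in either case.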
Entry  11 Oct 2007, Stefan Ritt, Bug Report, _syscall0 not available on gcc 4.1.1 
Dear Stephan,

I am writing on behalf of the LiBeRACE collaboration
at Berkeley/Livermore.

We are trying to use midas (2.0.0) for our acquisition system.
However, we had some difficulties compiling it on Linux Fedora
Core 6 with gcc 4.1.1.
I tried to trace back the problem and I found that _syscall0 in
system.c is actually an obsolete call (since gcc 4.x apparently).
Playing with assembly language being beyond my competence, I would 
like to know if you ever came across this situation recently and
if you have any suggestion(s).

With my best regards
Julien GIBELIN


------------------------------------------------------
GIBELIN Julien

Lawrence Berkeley National Laboratory
Nuclear Science Division
One Cyclotron Rd.
MS 88R0192
BERKELEY, CA 94720-8101

Tel: +1 (510) 495-2695
Fax: +1 (510) 486-7983
------------------------------------------------------