Back Midas Rome Roody Rootana
  Midas DAQ System, Page 134 of 142  Not logged in ELOG logo
ID Datedown Author Topic Subject
  172   04 Nov 2004 Jan WoutersForumFrontend code and the ODB
I would like to know whether all parameters used by the frontend code have to be in the "Experiment/
Run Parameters" section.  This section can become big and difficult to maintain, because it is one single 
big section of experim.h (EXP_PARAM_DEFINED).  I have parameters the various frontends read at the 
beginning of each run, which set the hardware settings of various devices.  I would like to place these in 
a section all their own, organized by device.  Is this doable? 
  171   02 Nov 2004 Renee PoutissouInfoEvent Builder info in mhttpd Status page
Information about the Event Builder statistics has been removed from the 
Status page in mhttpd.  I heard from Pierre that this information might 
be redundant when using the new Event Builder format??? 
For the TWIST experiment, we are running and cannot change on the fly
to a new format Event Builder.  It is very important for us to show the users
the rates and statistics coming out of the EventBuilder.  I had  to put this
piece of code back in mhttpd.  
Can I put it back in the distribution? or do I have to put a special TWIST flag? 
or do I have to keep reinserting this every time there is an update to mhttpd.c? 
At the moment, TWIST is generating a couple of updates/week to mhttpd.c
  170   22 Oct 2004 Konstantin OlchanskiBug Fixmhttpd message colouring
I commited a fix to mhttpd logic that decides which messages should be shown in
"red" colour- before, any message with square brackets and colons would be
highlighted in red. Now only messages matching the pattern [...:...] are
highlighted. The decision logic was moved into a function message_red(). K.O.
  169   14 Oct 2004 Konstantin OlchanskiBug ReportTWIST upgrade bombed...
> The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
> crashes during shared memory data buffer accesses. I am looking into it and I
> will add information as I figure things out. K.O.

On second try, it looks like we are in business- the first try did not work
because of two mistakes:

1) I did not delete *all* old .SHM files (.ODB.SHM, .SYSTEM.SHM, .YBUF1.SHM,
.YBUF2.SHM). I deleted ODB.SHM, so odb worked, but forgot about the data buffers
SYSTEM.SHM & co and ended up with segmentation faults and core dumps in the buffer
management code caused by a mismatch of the old-midas buffers and new-midas code.
2) while debugging these core dumps, I made an error in my test code, so even
after I deleted the old data buffers, things still did not work. Talk about
over-debugging a problem...

K.O.
  168   14 Oct 2004 Konstantin OlchanskiBug Reportlazylogger complains about zero-size files
With latest midas, I see this:

Thu Oct 14 19:31:17 2004 [Lazy_Tape] [lazylogger.c:1717:Lazy] lazy_file_exists
file run17567.ybs doesn't exists
Thu Oct 14 19:31:27 2004 [Lazy_Tape] [lazylogger.c:1717:Lazy] lazy_file_exists
file run17567.ybs doesn't exists

The file run17567.ybs has size zero:

-rw-r--r--    1 twistonl users      950272 Oct 13 19:29
/twist/data_onl/current/run17565.ybs
-rw-r--r--    1 twistonl users      950272 Oct 13 19:45
/twist/data_onl/current/run17566.ybs
-rw-r--r--    1 twistonl users           0 Oct 13 20:00
/twist/data_onl/current/run17567.ybs
-rw-r--r--    1 twistonl users      983040 Oct 13 20:03
/twist/data_onl/current/run17568.ybs
-rw-r--r--    1 twistonl users      950272 Oct 13 20:26
/twist/data_onl/current/run17569.ybs

I am not sure how to fix this lazylogger logic. Please help.

K.O.
  167   14 Oct 2004 Stefan RittBug ReportTWIST upgrade bombed...
Agree.

Once you did the modification, please check following situation: Create a fresh
ODB withe increased size ("odbedit -s 2000000" for example). Then check that the
other clients "adopt" this increased size. Note that some experiments need a
bigger ODB, and I don't want to have them recompile all clients, that's why the
code in ss_shm_open() can attach to a *larger* shared memory. However, it should
not matter to the process, since the ODB (or SYSTEM) shared memory size is
stored in the pheader->key_size and pheader->data_size of each participating
process. So they should never write beyond the limits defined in that header.
The size to ss_shm_open() is only a "hint" if the shared memory does not exist,
and is nowhere later used in the code.
  166   13 Oct 2004 Konstantin OlchanskiBug ReportTWIST upgrade bombed...
> > The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
> > crashes during shared memory data buffer accesses. I am looking into it and I
> > will add information as I figure things out. K.O.
> 
> Since 1.9.5 the EventBuilder has been modified. Please consult the documentation
> where the new mevb scheme is explained.
> Test of the mevb with up to 16 frontends (15 different CPUs) has been tested
> successfully. Data rate at the EventBuilder were measured about 50MB/s without the
> logger and ~30MB/s with the logger.

It turns out that TWIST uses a private mevb.c. We will consider upgrading to the
standard one.

K.O.
  165   13 Oct 2004 Konstantin OlchanskiBug ReportTWIST upgrade bombed...
> The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
> crashes during shared memory data buffer accesses. I am looking into it and I
> will add information as I figure things out. K.O.

I traced buffer memory corruption to a logic error in system.c::ss_shm_open(). If
a .SHM file exists, it's size is used as the size of the sysv shared memory
segment, even if the requested shared memory size is bigger, but the caller of
ss_shm_open()  thinks it got all the requested memory. Eventually we try to use
the unallocated memory and crash. This is the proposed fix and I will commit it
after I retest the upgrade during the next few days.

[olchansk@send src]$ cvs diff -u system.c
olchansk@midas.psi.ch's password: 
Index: system.c
===================================================================
RCS file: /usr/local/cvsroot/midas/src/system.c,v
retrieving revision 1.83
diff -u -r1.83 system.c
--- system.c    4 Oct 2004 07:04:01 -0000       1.83
+++ system.c    14 Oct 2004 05:51:16 -0000
@@ -544,8 +544,14 @@
       } else {
          /* if file exists, retrieve its size */
          file_size = (INT) ss_file_size(file_name);
-         if (file_size > 0)
+         if (file_size > 0) {
+            if (file_size < size) {
+               cm_msg(MERROR, "ss_shm_open", "Shared memory segment \'%s\' size
%d is smaller than requested size %d. Please remove it and try
again",file_name,file_size,size);
+               return SS_NO_MEMORY;
+            }
+            
             size = file_size;
+         }
       }
 
       /* get the shared memory, create if not existing */

K.O.
  164   13 Oct 2004 Pierre-Andre AmaudruzBug ReportTWIST upgrade bombed...
> The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
> crashes during shared memory data buffer accesses. I am looking into it and I
> will add information as I figure things out. K.O.

Since 1.9.5 the EventBuilder has been modified. Please consult the documentation
where the new mevb scheme is explained.
Test of the mevb with up to 16 frontends (15 different CPUs) has been tested
successfully. Data rate at the EventBuilder were measured about 50MB/s without the
logger and ~30MB/s with the logger.
  163   13 Oct 2004 Konstantin OlchanskiBug ReportTWIST upgrade bombed...
The upgrade of TWIST to the latest midas has bombed- we see mevb and mlogger
crashes during shared memory data buffer accesses. I am looking into it and I
will add information as I figure things out. K.O.
  162   13 Oct 2004 Konstantin OlchanskiSuggestionNo al_clear_alarm()?
> > We have al_trigger_alarm(), but no matching al_clear_alarm(), and I need it to
> > clear my alarm once the alarm condition no longer exists. Any objections if I
> > add this function? K.O.
> 
> call al_reset_alarm()

Thanks. I must be quite blind as I did not see al_reset_alarm() in midas.h. I se eit
now. Thanks.

K.O.
  161   13 Oct 2004 Stefan RittSuggestionNo al_clear_alarm()?
> We have al_trigger_alarm(), but no matching al_clear_alarm(), and I need it to
> clear my alarm once the alarm condition no longer exists. Any objections if I
> add this function? K.O.

The idea is that once an alarm got triggered, it stays until the user
acknowledged, even if the alarm condition has been disappeared. Through mhttpd,
the user can press the "Reset" button, which then executes al_reset_alarm().
However, it is possible to call al_reset_alarm() directly from user code to
achieve the same thing.
  160   13 Oct 2004 Konstantin OlchanskiSuggestionNo al_clear_alarm()?
We have al_trigger_alarm(), but no matching al_clear_alarm(), and I need it to
clear my alarm once the alarm condition no longer exists. Any objections if I
add this function? K.O.
  159   13 Oct 2004 Stefan RittBug Reportsilly odbedit "rename Display xxx/yyy"
> odbedit command "rename Display xxx/yyy" creates a key named "xxx/yyy" (yes,
> with a slash in the name) and this key cannot be deleted or renamed...
> K.O.

"rename" is "rename", not "mv" under Unix. If you want this functionality, put it
in and don't complain!
  158   13 Oct 2004 Konstantin OlchanskiBug Reportsilly odbedit "rename Display xxx/yyy"
odbedit command "rename Display xxx/yyy" creates a key named "xxx/yyy" (yes,
with a slash in the name) and this key cannot be deleted or renamed...
K.O.
  157   13 Oct 2004 Stefan RittBug Reportdb_paste: found string exceeding MAX_STRING_LENGTH
Can you attach 

/twist/data_onl/current/run17548.odb

so I can reproduce the problem?
  156   13 Oct 2004 Konstantin OlchanskiBug Reportdb_paste: found string exceeding MAX_STRING_LENGTH
I am updating TWIST to the latest MIDAS and when I load a saved .odb file, I get
these messages. Their text ought to say where and what strings it does not like.
K.O.



[twistonl@midtwist ~/online]$ odbedit
Please define environment variable 'MIDASSYS'
pointing to the midas installation directory.
[local:twist:S]/>load /twist/data_onl/current/run17548.odb
[odb.c:5600:db_paste] found string exceeding MAX_STRING_LENGTH
[odb.c:5600:db_paste] found string exceeding MAX_STRING_LENGTH
[odb.c:5600:db_paste] found string exceeding MAX_STRING_LENGTH
  155   08 Oct 2004 Konstantin OlchanskiInfoIncreased number of clients in midas.h, important!
>    A lot of programs have a commandline option, such as "--version", where they return the program version number then exit.  As well as the
> program version number, the version number of the midas library it's linked with could also be returned (There can be more than one
> libmidas.so on a system and this would show which one was currently being linked)

This would solve the versioning problem for midas built from versionned tarballs, and I am considering a similar scheme for midas installed from
RPMs. But what do I do for midas built from CVS?!? K.O.
  154   08 Oct 2004 chris pearsonInfoIncreased number of clients in midas.h, important!
> > For one thing, looking at a given midas-using executable, how do I tell what version of midas it has inside?
> 
> Ther is a function cm_get_version() returning the version. As for the executable, all you can do is a
> 
> strings <executable> | grep 1.9

   A lot of programs have a commandline option, such as "--version", where they return the program version number then exit.  As well as the
program version number, the version number of the midas library it's linked with could also be returned (There can be more than one
libmidas.so on a system and this would show which one was currently being linked)

   Something I would find useful would be for the version number to identify precisely which version you have, i.e. not to have different
versions of midas given the same version number.  I've had problems earlier this year due to midas-1.9.3 changing several times between
January and July, while keeping the same number.  I think if "in-between" versions of midas are to be made available, they should contain a
revision number or date or something in the version number to identify them.

> For remote connections (through mserver), there is already a version check. If the minor version differs, you get a warning, if the major
> versions differ (1.>>9<<.4), the client won't start. So at least for remote connection you get a clue.

   Safety measures like these can sometimes get in the way, if you know what you're doing.  So unless there is absolutely no possibility of
success, I think checks such as this one should be overrideable (by a client option).

Chris
  153   04 Oct 2004 Stefan RittInfoIncreased number of clients in midas.h, important!
> Right. We cannot fix the past, but we should fix the future. BTW, "do not mix versions" is hard to enforce and mismatches did, do and
> will happen

For remote connections (through mserver), there is already a version check. If the minor version differs, you get a warning, if the major
versions differ (1.>>9<<.4), the client won't start. So at least for remote connection you get a clue.

> For one thing, looking at a given midas-using executable, how do I tell what version of midas it has inside?

Ther is a function cm_get_version() returning the version. As for the executable, all you can do is a

strings <executable> | grep 1.9
ELOG V3.1.4-2e1708b5