Back Midas Rome Roody Rootana
  Midas DAQ System, Page 125 of 137  Not logged in ELOG logo
ID Date Author Topicdown Subject
  667   02 Nov 2009 Exaos LeeBug FixBuild error due to missing header
I encountered a build error as "sort undefined...". It is caused by missing C++ header <algorithm> in which "sort" is defined. It can be fixed as the attachment.

Environment:
G++: 4.3.4
Platform: Debian Linux testing

> I committed an updated lazylogger with updated documentation. The new version supports subruns and
> can save to external storage arbitrary files (i.e. odb dump files). It also moves most book keeping out of
> odb to permit handling more files on bigger storage disks.
>
> Example lazylogger scripts for castor (CERN) and dcache (TRIUMF) are in the directory "utils".
>
> The lazylogger documentation was updated to remove obsolete information and to describe the new
> functions. As usual "make dox; cd doxfiles/html; firefox index.html" or see my copy at:
>
> http://ladd00.triumf.ca/~olchansk/midas/Utilities.html#lazylogger_task
>
> svn rev 4615, 4616.
> K.O.
Attachment 1: lazylogger.diff
diff --git a/src/lazylogger.c b/src/lazylogger.c
index 34be17a..a7ba4a3 100644
--- a/src/lazylogger.c
+++ b/src/lazylogger.c
@@ -17,6 +17,7 @@ $Id$
 
 #include <vector>
 #include <string>
+#include <algorithm>
 
 #define NOTHING_TODO  0
 #define FORCE_EXIT    1
  671   20 Nov 2009 Konstantin OlchanskiBug Fixfix odb corruption from too long client names
odb.c rev 4622 fixes ODB corruption by db_connect_database() if client_name is
too long. Also fixed is potential ODB corruption by too long key names in
db_create_key(). Problem kindly reported by Tim Nichols of T2K/ND280 experiment.
K.O.
  672   20 Nov 2009 Konstantin OlchanskiBug Fixdisallow client names with slash '/' characters
> odb.c rev 4622 fixes ODB corruption by db_connect_database() if client_name is
> too long. Also fixed is potential ODB corruption by too long key names in
> db_create_key(). Problem kindly reported by Tim Nichols of T2K/ND280 experiment.


Related bug fix - db_connect_database() should not permit client names that contain
the slash (/) character. Names like "aaa/bbb" create entries /Programs/aaa/bbb (aaa
is a subdirectory) and names like "../aaa" create entries in the ODB root directory.

svn rev 4623.
K.O.
  675   25 Nov 2009 Konstantin OlchanskiBug Fixsubrun file size
Please be aware of mlogger.c update rev 4566 on Sept 23rd 2009, when Stefan
fixed a buglet in the subrun file size computations. Before this fix, the first
subrun could be of a short length. If you use subruns, please update your
mlogger to at least rev 4566 (or newer, Stefan added the run and subrun time
limits just recently).
K.O.
  677   26 Nov 2009 Konstantin OlchanskiBug Fixmdump max number of banks and dump of 32-bit banks
By request from Renee, I increased the MIDAS BANKLIST_MAX from 64 to 1024 and
after fixing a few buglets where YB_BANKLIST_MAX is used instead of (now bigger)
BANKLIST_MAX, I can do a full dump of ND280 FGD events (96 banks).

I also noticed that "mdump -b BANK" did not work, it turns out that it could not
handle 32bit-banks at all. This is now fixed, too.

svn rev 4624
K.O.
  679   26 Nov 2009 Konstantin OlchanskiBug Fixmserver network routing fix
mserver update svn rev 4625 fixes an anomaly in the MIDAS RPC network code where
in some network configurations MIDAS mserver connections work, but some RPC
transactions, such as starting and stopping runs, do not (use the wrong network
names or are routed over the wrong network).

The problem is a possible discrepancy between network addresses used to
establish the mserver connection and the value of "/System/Clients/xxx/Host"
which is ultimately set to the value of "hostname" of the remote client. This
ODB setting is then used to establish additional network connections, for
example to start or stop runs.

Use the client "hostname" setting works well for standard configurations, when
there is only one network interface in the machine, with only one IP address,
and with "hostname" set to the value that this IP address resolves to using DNS.

However, if there are private networks, multiple network interfaces, or multiple
network routes between machines, "/System/Clients/xxx/Host" may become set to an
undesirable value resulting in asymmetrical network routing or complete failure
to establish RPC connections.

Svn rev 4625 updates mserver.c to automatically set "/System/clients/xxx/Host"
to the same network name as was used to establish the original mserver connection.

As always with networking, any fix always breaks something somewhere for
somebody, in which case the old behavior can be restored by "setenv
MIDAS_MSERVER_DO_NOT_USE_CALLBACK_ADDR 1" before starting mserver.

The specific problem fixed by this change is when the MIDAS client and server
are on machines connected by 2 separate networks ("client.triumf.ca" and
"client.daq"; "server.triumf.ca" and "server.daq"). The ".triumf.ca" network
carries the normal SSH, NFS, etc traffic, and the ".daq" network carries MIDAS
data traffic.

The client would use the "server.daq" name to connect to the server and this
traffic would go over the data network (good).

However, previously, the client "/System/Clients/xxx/Host" would be set to
"client.triumf.ca" and any reverse connections (i.e. RPC to start/stop runs)
would go over the normal ".triumf.ca" network (bad).

With this modification, mserver will set "/System/Clients/xxx/Host" to
"client.daq" (the IP address of the interface on the ".daq" network) and all
reverse connections would also go over the ".daq" network (good).

P.S. This modification definitely works only for the default "mserver -m" mode,
but I do not think this is a problem as using "-s" and "-t" modes is not
recommended, and the "-s" mode is definitely broken (see my previous message).

svn rev 4625
K.O.
  744   15 Feb 2011 Konstantin OlchanskiBug Fixmlogger stop run on disk full!
The mlogger has a function for detecting when the output disk becomes full - when this condition is 
detected, the run should be stopped. But this did not work if disk is already full and the user tries to start 
a run - the "disk full?" check happened too early and the attempt to stop the run was not succeeding 
because the original start-run transition is still running. Now if "disk full" condition is detected, mlogger 
tries to stop the run every 10 seconds until the run is finally stopped (or dies because disk is full).

mlogger.c svn rev 4976
K.O.
  775   10 Jul 2011 Konstantin OlchanskiBug Fixmidas shared memory changes
> > 2) the shared memory type used by an experiment is recorded in the file .SHM_TYPE.TXT.
> > 3) the hostname of the computer where the ODB shared memory is meant to reside is now
> > recorded in the file .SHM_HOST.TXT.

Due to a typo in src/system.c svn rev 5125, ss_shm_delete() did not work at all. This broke "odbedit -R", "odbedit -s 5000000" (to change ODB size), etc. 
Fixed in src/system.c svn rev 5134. (It is safe to update just tis one file to fix this problem).

Sorry for the inconvenience,
K.O.
  776   11 Jul 2011 Konstantin OlchanskiBug Fixmidas shared memory changes
> > > 2) the shared memory type used by an experiment is recorded in the file .SHM_TYPE.TXT.
> > > 3) the hostname of the computer where the ODB shared memory is meant to reside is now
> > > recorded in the file .SHM_HOST.TXT.


Because the mserver did not setup correct experiment name and path, POSIX shared memory did not work at all when used with the mserver. Fixed in mserver.c rev 5135


Sorry for the inconvenience,
K.O.
  828   16 Aug 2012 Cheng-Ju LinBug Fixlaunching roody kills the analyzer
OK, I've found the solution in the roody forum.  The solution for 64bit machine is to replace
   uint32_t p =0;
   with
   uintptr_t p =0;

in the roody header file roody/include/DataSourceTNetFolder.h

Cheng-Ju



> Hi All,
> 
> I've installed midas (Rev:5294) on SLC6.3 (64bit), along with recent trunk versions of rootana and roody. 
> All the packages compiled OK. The example code in $MIDASSYS/examples/experiment also runs OK 
> provided that I don't launch roody. If I try to launch roody, then it immediately crashes the analyzer with 
> the following trace:
> 
> #6 root_server_thread (arg=ox7f54fc001150) at src/mana.c:5154
> #7 0x0000003219a1e13a in TThread::Function(void*) () from /usr/lib64/root/libThread.so.5.28
> #8 0x0000003dd1207851 in start_thread () from /lib64/libpthread.so.0
> #9 0x0000003dd0ee76dd in clone () from /lib64/libc.so.6
> 
> The line src/mana.c:5154 points to the following:
> 
> TObject *obj;
>             if (strncmp(request + 10, "Any", 3) == 0)
>                obj = folder->FindObjectAny(request + 14);
>             else
>                obj = folder->FindObject(request + 11);    // LINE 5154
> 
> 
> Any suggestions on what may be going on here?  Thanks.
> 
> 
> Cheng-Ju
  838   27 Sep 2012 Randolf PohlBug Fix[PATCH] mana.c compile fix, gz files
Hi,

I had to apply the attached patch to convince SuSE Linux 12.2 to compile mana.c
gcc version is "(SUSE Linux) 4.6.2"

Problem is that gz{write,close, etc.} expect a 1st argument of type gzFile (see
zlib.h), whereas out_file is FILE*. In fact, out_file is a cast to FILE*, even
in the case when we work on a gzfile (HAVE_ZLIB).

Could you please confirm that the patch is correct, and possibly apply it to trunk?

I haven't checked if mana works as advertised now.

Cheers,


Randolf
Attachment 1: diff.mana
Index: src/mana.c
===================================================================
--- src/mana.c	(revision 5334)
+++ src/mana.c	(working copy)
@@ -1987,7 +1987,7 @@
       } else {
 #ifdef HAVE_ZLIB
          if (out_gzip)
-            gzclose(out_file);
+            gzclose((gzFile)out_file);
          else
 #endif
             fclose(out_file);
@@ -2311,7 +2311,7 @@
    /* write record to device */
 #ifdef HAVE_ZLIB
    if (out_gzip)
-      status = gzwrite(file, buffer, size) == size ? SS_SUCCESS : SS_FILE_ERROR;
+      status = gzwrite((gzFile)file, buffer, size) == size ? SS_SUCCESS : SS_FILE_ERROR;
    else
 #endif
       status =
@@ -2430,7 +2430,7 @@
    /* write record to device */
 #ifdef HAVE_ZLIB
    if (out_gzip)
-      status = gzwrite(file, pevent_copy, size) == size ? SUCCESS : SS_FILE_ERROR;
+      status = gzwrite((gzFile)file, pevent_copy, size) == size ? SUCCESS : SS_FILE_ERROR;
    else
 #endif
       status =
@@ -4119,7 +4119,7 @@
             size = pevent->data_size + sizeof(EVENT_HEADER);
 #ifdef HAVE_ZLIB
             if (out_gzip)
-               status = gzwrite(out_file, pevent, size) == size ? SUCCESS : SS_FILE_ERROR;
+               status = gzwrite((gzFile)out_file, pevent, size) == size ? SUCCESS : SS_FILE_ERROR;
             else
 #endif
                status =
@@ -4390,7 +4390,7 @@
       } else {
 #ifdef HAVE_ZLIB
          if (out_gzip)
-            gzclose(out_file);
+            gzclose((gzFile)out_file);
          else
 #endif
             fclose(out_file);
  839   09 Oct 2012 Stefan RittBug Fix[PATCH] mana.c compile fix, gz files
> Hi,
> 
> I had to apply the attached patch to convince SuSE Linux 12.2 to compile mana.c
> gcc version is "(SUSE Linux) 4.6.2"
> 
> Problem is that gz{write,close, etc.} expect a 1st argument of type gzFile (see
> zlib.h), whereas out_file is FILE*. In fact, out_file is a cast to FILE*, even
> in the case when we work on a gzfile (HAVE_ZLIB).
> 
> Could you please confirm that the patch is correct, and possibly apply it to trunk?
> 
> I haven't checked if mana works as advertised now.
> 
> Cheers,
> 
> 
> Randolf

I applied your patch to the trunk.

Best,
Stefan
  885   10 May 2013 Konstantin OlchanskiBug FixFixed: crash if alarm "write elog message" is enabled
If the MIDAS Alarm property "write elog message" is enabled, an uninitialized variable "tag" is passed to 
el_submit() and depending on your luck, cause a crash. "tag" is supposed to be and is now a NUL-
terminated string. The only other use of el_submit() is in mhttpd.cxx and mserver.c, where it is called 
correctly.

alarm.c svn rev 5361
K.O.
  897   02 Aug 2013 Konstantin OlchanskiBug Fixmultithreaded run transitions work!
As of commit
https://bitbucket.org/tmidas/midas/commits/dfa5fb1a93cae11a2960d441044c7fd277e1f0ec
(we are now liberated from the tyranny of SVN IDs),
multithreaded run transitions seem to work reliably and are now the default in mhttpd.

In odbedit and mtransition, the default is the old sequential transitions. "-m" and "-a" flags activate 
the new multithread run transitions. mhttpd now uses the equivalent of "mtransition -a" 
(mutithreaded asynchronous).

This is one of the new features implemented by Stefan while at TRIUMF.

K.O.

(We hope to write up all the recent changes soon).
  900   26 Aug 2013 Konstantin OlchanskiBug FixEnable cross-site requests in mhttpd
Javascript "AJAX" functions (and their MIDAS wrappers - ODBGet/ODBSet) are subject to something called 
"same origin policy" intended to prevent something called "cross-site scripting attacks", i.e. see
http://en.wikipedia.org/wiki/Same-origin_policy

In practice it means that if you load the MIDAS custom web page from test.foo.com and try to access 
mhttpd at midas.foo.com, ODBSet/ODBGet will not work.

I always thought that this meant that the requests are blocked by the browser and are a form of 
protection of the web server - only scripts loaded from mhttpd can do AJAX (ODBGet/ODBSet) to mhttpd.

It turns out that I was wrong. This is what actually happens: the "cross-site" requests are still sent to the 
server (mhttpd), the response it received, parsed and discarded if "same origin" conditions are not met.

This means that the "same origin" policy does not protect mhttpd at all - any script from any page 
anywhere can issue AJAX requests into any mhttpd, these requests will be successfully sent, received
and processed by mhttpd, including requests for writing into ODB ("jset" command using the HTTP GET 
method).

So for the case of MIDAS, "same origin" does not prevent malicious (or buggy) scripts from writing into the 
wrong mhttpd of the wrong experiment.

All it does is prevent desired and intentional access to mhttpd (ODBGet) from scripts that happen to have 
been loaded outside of mhttpd (i.e. from a developer own test page).

Then it turns out that there is an "official" way to disable this unwanted protection policy, called CORS, see
http://www.w3.org/TR/cors/

I have now implemented this in mhttpd and added an mhttpd.js function ODBSetURL() to explicitly set the 
URL of mhttpd that we want to talk to.

This work is on the feature/ajax branch, to be merged soon. For the impatient, here is what you need to 
do in mhttpd:

diff --git a/src/mhttpd.cxx b/src/mhttpd.cxx
index 1d9d1cc..0460cec 100755
--- a/src/mhttpd.cxx
+++ b/src/mhttpd.cxx
@@ -1070,6 +1070,7 @@ void show_text_header()
 {
    rsprintf("HTTP/1.0 200 Document follows\r\n");
    rsprintf("Server: MIDAS HTTP %d\r\n", mhttpd_revision());
+   rsprintf("Access-Control-Allow-Origin: *\r\n");
    rsprintf("Pragma: no-cache\r\n");
    rsprintf("Expires: Fri, 01 Jan 1983 00:00:00 GMT\r\n");
    rsprintf("Content-Type: text/plain; charset=iso-8859-1\r\n\r\n");

K.O.
  921   25 Oct 2013 Konstantin OlchanskiBug Fixfixed mlogger run auto restart bug
A problem existed in midas for some time: when recording long data sets of time (or event) limited runs 
with logger run auto restart set to "yes", the runs will automatically stop and restart as expected, but 
sometimes the run will stop and never restart and beam will be lost until the experiment operator on shift 
wakes up and restarts the run manually.

I have now traced this problem to a race condition inside the mlogger - when a run is being stopped from 
the mlogger, the mlogger run transition handler (tr_stop) triggers an immediate attempt to start the next 
run, without waiting for the run-stop transition to actually complete. If the run-stop transition does not 
finish quickly enough, a safety check in start_the_run() will cause the run restart attempt to silently fail 
without any error message.

This race condition is pretty rare but somehow I managed to replicate it while debugging the 
multithreaded transitions. It is fixed by making mlogger wait until the run-stop transition completes.

https://bitbucket.org/tmidas/midas/commits/b2631fbed5f7b1ec80e8a6c8781ada0baed7702b

K.O.
  923   25 Oct 2013 Stefan RittBug Fixfixed mlogger run auto restart bug
> A problem existed in midas for some time: when recording long data sets of time (or event) limited runs 
> with logger run auto restart set to "yes", the runs will automatically stop and restart as expected, but 
> sometimes the run will stop and never restart and beam will be lost until the experiment operator on shift 
> wakes up and restarts the run manually.
> 
> I have now traced this problem to a race condition inside the mlogger - when a run is being stopped from 
> the mlogger, the mlogger run transition handler (tr_stop) triggers an immediate attempt to start the next 
> run, without waiting for the run-stop transition to actually complete. If the run-stop transition does not 
> finish quickly enough, a safety check in start_the_run() will cause the run restart attempt to silently fail 
> without any error message.
> 
> This race condition is pretty rare but somehow I managed to replicate it while debugging the 
> multithreaded transitions. It is fixed by making mlogger wait until the run-stop transition completes.
> 
> https://bitbucket.org/tmidas/midas/commits/b2631fbed5f7b1ec80e8a6c8781ada0baed7702b
> 
> K.O.

More generally I kind of consider the mlogger auto restart facility as deprecated. It works in the background and the operator does not have a clue 
what is going on. We use now the sequencer to achieve exactly the same functionality. It just requires a few lines of sequencer code:

LOOP INFINITE 
  TRANSITION start 
  WAIT events, 5000 
  TRANSITION stop 
ENDLOOP 

So the run start is only executed after the runs has been successfully stopped. You can do things in the sequencer like "stop run and sequence 
immediately" or "stop after current run has finished" which are a bit hard to do with the old method. So people should move to the sequencer.

/Stefan
  924   28 Oct 2013 Konstantin OlchanskiBug Fixfixed mlogger run auto restart bug
> 
> More generally I kind of consider the mlogger auto restart facility as deprecated. It works in the background and the operator does not have a clue 
> what is going on. We use now the sequencer to achieve exactly the same functionality.
>

Before subruns were available, most experiments at TRIUMF have used the "auto restart" function. Now, I think most of them use subruns,
with the notable exception of PIENU where the analysis framework could not handle subruns. (PIENU is now shutdown and disassembled).

>
> It just requires a few lines of sequencer code:
> 
> LOOP INFINITE 
>   TRANSITION start 
>   WAIT events, 5000 
>   TRANSITION stop 
> ENDLOOP 
> 

Mouse click "auto restart" to "yes" is a little bit simpler than setting up a sequencer file, and it survives a crash of mhttpd.

Does the sequencer survive a crash or a restart of mhttpd?

K.O.
  925   28 Oct 2013 Stefan RittBug Fixfixed mlogger run auto restart bug
> Does the sequencer survive a crash or a restart of mhttpd?

Yes. Of course runs will not be started/stopped when mhttpd is not running, but when you restart it gracefully continues where it stopped, since all variables such as event count or current line number of 
the sequence are store in the ODB.

/Stefan
  943   16 Dec 2013 Konstantin OlchanskiBug FixAbolished SYNC and ASYNC defines
A few months ago, definitions of SYNC and ASYNC in midas.h have been changed away from "0" and "1", 
and this caused problems with some event buffer management functions bm_xxx().

For example, when event buffers are getting full, bm_send_event(SYNC) unexpectedly started returning 
BM_ASYNC_RETURN instead of waiting for free space, causing unexpected crashes of frontend programs.

Part of the problem was confusion between SYNC/ASYNC used by buffer management (bm_xxx) and by run 
transition (cm_transition()) functions. Adding to confusion, documentation of bm_send_event() & co used 
FALSE/TRUE while most actual calls used SYNC/ASYNC.

To sort this out, an executive decision was made to abolish the SYNC/ASYNC defines:

For buffer management calls bm_send_event(), bm_receive_event(), etc, please use:
SYNC -> BM_WAIT
ASYNC -> BM_NO_WAIT

For run transitions, please use:
SYNC -> TR_SYNC
ASYNC -> TR_ASYNC
MTHREAD -> TR_MTHREAD
DETACH -> TR_DETACH

K.O.
ELOG V3.1.4-2e1708b5