Back Midas Rome Roody Rootana
  Midas DAQ System, Page 111 of 138  Not logged in ELOG logo
ID Date Author Topic Subject
  547   01 Jan 2009 Konstantin OlchanskiInfoCustom page which executes custom function
> How can I add a button at the top of the "Status" webpage which will show a 
> page similar to the "CNAF" one after I click on it? and how can I make a 
> custom page similar to "CNAF" which allow me to call some custom funtions? I 
> want to make a page which is particularly for doing calibration.


I was going to say that you can do this by using the MIDAS "hot-link" function.

In your equipment program, you create a string /eq/xxx/Settings/Command, and hot-link
it to the function you want to be called. (See midas function db_open_record() for details
and examples). (To test it, you put a call to printf("Hello world!\n") into your handler function,
then change the value of "command" using odbedit or the mhttpd odb editor
and observe that your function gets called and that it receives the correct value of "command").

Then on your custom web page you create 2 buttons "aaa" and "bbb" attached to javascript
ODBset("/eq/xxx/Settings/Command","aaa") and "bbb" respectively. When you push the button,
the specified string is written into ODB, and your hot-link handler function is called with the contents
of "command", which you can then look at to find out which web button was pushed.


But after looking at the hot-link data paths (see https://ladd00.triumf.ca/elog/Midas/546), I see 2 
problems that make the above scheme unreliable and maybe unusable in some applications:

1) the data path contains one UDP communication and it is well known that UDP datagrams can be (and 
are) lost with low or high probability, depending on not-well-understood external factors.

The effect is that the hot-link fails to "fire": odb contents is changed but your function is not called.

2) there is a timing problem with multiple odb writes: the odb lock is dropped before the "hot-link" gets 
to see the new contents of odb: db_data_set()->lock odb->change data->send notification->unlock 
odb->xxx->notification received by client->read the data->call user function. If something else is 
written into odb during "xxx" above, the client may never see the data written by the first odb write. For 
local clients, the delay between "send notification" and "notification is received by client" is not bounded in 
time (can be arbitrary long, depending on the system load, etc). For remote clients, there is an additional 
delay as the udp datagram is received by the local mserver and is forwarded to the remote client through 
a tcp rpc connection (another source of unbounded delay).

The effect is that if buttons "aaa" and "bbb" are pushed quickly one right after the other, while your 
function will be called 2 times (if neither udp packet is dropped), you may never see the value of "aaa"
as is it will be overwritten by "bbb" by the time you receive the first notification.

Probability of malfunction increases with code written like this: { ODBset("command", "open door"); 
ODBset("command", "walk through doorway"); }. You may see the "open door" command sometimes 
mysteriously disappear...

The net effect is that sometimes you will push the button but nothing will happen. This may be okey,
depending on your application and depending on how often it happens in practice on your specific system 
If you are lucky, you may never see either of the 2 problems listed above ad hot-links will work for you 
perfectly. At TRIUMF, in the past, we have seen hot-links misbehave in the TWIST experiment, and now I 
think I understand why (because of the 2 problems described above).

K.O.
  546   01 Jan 2009 Konstantin OlchanskiInfoodb "hot link" magic explored
Here are my notes on the MIDAS ODB "hot link" function. Perhaps others can find them useful.

Using db_open_record(key,function), the user can tell MIDAS to call the specified user function when 
the specified ODB key is modified by any other MIDAS program. This function works both locally 
(shared memory odb access) and remotely (odb access through mserver tcp rpc). For example, the 
MIDAS "history" mechanism is implemented in the mlogger by "hot-linking" ODB 
"/equipment/xxx/Variables".

First, the relevant data structures defined in midas.h and msystem.h (ODB database headers, etc)

(in midas.h)
#define NAME_LENGTH            32            /**< length of names, mult.of 8! */
#define MAX_CLIENTS            64            /**< client processes per buf/db */
#define MAX_OPEN_RECORDS       256           /**< number of open DB records   */

(in msystem.h)
DATABASE buf <--- local, private to each client)
  DATABASE_HEADER* database_header <--- odb in shared memory
    char name[NAME_LENGTH]
    DATABASE_CLIENT client[MAX_CLIENTS]
      char name[NAME_LENGTH]
      OPEN_RECORD open_record[MAX_OPEN_RECORDS]
        handle
        access_mode
        flags

(the above means that each midas client has access to the list of all open records through
buf->database_header.client[i].open_record[j])

Second, the data path through db_set_data & co: (other odb "write" functions work the same way)

db_set_data(key)
  lock db
  update odb <--- memcpy(), really
  db_notify_clients(key)
  unlock db
  return

db_notify_clients(key)
  loop: <--- data for this key changed and so data for all keys containing it
             also changed, and we need to notify anybody who has an open record
             on the parents of this key. need to loop over parents of this key (follow "..")
  if (key->notify_count)
    foreach client
      foreach open_record
        if (open_record.handle == key)
          ss_resume(client->port, "O hDB hKey")
  key = key.parent
  goto loop;

ss_resume(port, message)
  idx = ss_suspend_get_index()  <--- magic here
  send udp message ("O hDB hKey") to localhost:port <-- notifications sent only to local host!

note 1: I do not completely understand the ss_suspend_xxx() stuff. The best I can tell
is it creates a number of udp sockets bound to the local host and at least one udp rpc
receive socket ultimately connected to the cm_dispatch_rpc() function.

note 2: More magic here: database_header->client[i].port appears to be the udp rpc server
port of the mserver, while ODB /Clients/xxx/Port is the tcp rpc server port
of the client itself, on the remote host

note 3: the following is for remote odb clients connected through the mserver. For local
clients, cm_dispatch_rpc() calls the local db_update_record() as shown at the very end.

note 4: this uses udp rpc. If the udp datagram is lost inside the os kernel (it looks like these udp/rpc 
datagrams never go out to the network), "hot-link" silently fails: code below is not executed. Some 
OSes (namely, Linux) are known to lose udp datagrams with high probability under certain
not very well understood conditions.

local mserver receives the udp datagram
  ...
  cm_dispatch_ipc()
    if (message=="O hDB hKey")
      decode message (hDB, hKey)
      db_update_record(hDB, hKey)
        send tcp rpc with args(MSG_ODB, hDB, hKey)

(note- unlike udp rpc, tcp rpc are never "lost")

remote client receives tcp rpc:
rpc_client_dispatch()
  recv_tcp(net_buffer)
  if (net_buffer.routine_id == MSG_ODB)
    db_update_record(hDB, hKey)

db_update_record(hDB, hKey)
  if remote delivery, see cm_dispatch_ipc() above
  <--- local delivery
  foreach (_recordlist)
    if (recordlist.handle == hKey)
      if (!recordlist.access_mode&MODE_WRITE)
        db_get_record(hDB,hKey,recordlist.data,recordlist.size)
        recordlist.dispatcher(hDB,hKey,recordlist.info); <-- user-supplied handler

Note: the dispatcher() above is the function supplied by the user in db_open_record().

K.O.
  545   22 Dec 2008 Stefan RittBug ReportOverflow on "cm_msg" command generates segfault
> The following error has been reported to me by T2K colleagues:
> 
> When using  "odbedit -c "msg my_message", the following behavior 
> has been observed depending on the length "n" of the message. 
> 
> 1)  n < 100        All is well
> 2)  100 <= n < 245 Log not written but exit code = 0
> 3)  245 <= n < 280 Error: "Experiment not defined" and exit code = 1
> 4)  280 <= n       Error: "Cannot connect to remote host" and exit code = 1
> 
> Also, when logging from compiled C code - when messages reach some magic length
> the MIDAS client sending them segfaults.
> 
> Please fix

Uhhh, who wants this long messages? You should consider to split this into several 
smaller messages. Anyhow, having the above behavior is not good, so I fixed it in 
SVN revision 4422. I increased the maximum length to 1000 characters. Above that, 
the message gets truncated. If you need even more, we can make it a #define.

The second problem you describe (logging from compiled C code) I could not 
reproduce, so maybe it was related to the first one. Please try again and report 
if it persists.
  544   21 Dec 2008 Konstantin OlchanskiBug Fixmhttpd minor bug fixes and improvements
Committed minor bug fixes and improvements to mhttpd:
1) when generating history plots, use type "double" instead of "float" because "float" does not have enough 
significant digits to plot values of large integer numbers. For example, serial numbers of T2K FGD FEB 
cards are large integers, i.e. 99000001, 99000002, etc, but when we plot them with offset "-99000000", 
the plots show "0" for all cards because when these numbers are converted to "float", they are truncated to 
about 5 digits and the least significant digit (the only one of interest, the "1", "2", etc) is lost. Switching to 
type "double" makes the plots come out with correct values.
2) fixed breakage of "/History/URL" ODB setting used to offload generation of history plots to a separate 
mhttpd process, greatly improving responsiveness of the main mhttpd.
3) fixed memory leak in processing the new javascript requests (jset, jget & co).
svn revisions 4415-4417
K.O.
  543   17 Dec 2008 Renee PoutissouBug ReportOverflow on "cm_msg" command generates segfault
The following error has been reported to me by T2K colleagues:

When using  "odbedit -c "msg my_message", the following behavior 
has been observed depending on the length "n" of the message. 

1)  n < 100        All is well
2)  100 <= n < 245 Log not written but exit code = 0
3)  245 <= n < 280 Error: "Experiment not defined" and exit code = 1
4)  280 <= n       Error: "Cannot connect to remote host" and exit code = 1

Also, when logging from compiled C code - when messages reach some magic length
the MIDAS client sending them segfaults.

Please fix
  542   14 Dec 2008 Stefan RittInfoCustom page which executes custom function
> How can I add a button at the top of the "Status" webpage which will show a 
> page similar to the "CNAF" one after I click on it? and how can I make a 
> custom page similar to "CNAF" which allow me to call some custom funtions? I 
> want to make a page which is particularly for doing calibration.

The CNAF page calls directly functions through the RPC layer of midas, which is 
not possible from custom pages. All you can do is to execute a scrip on the 
server side, which then causes some action. For details please consult the 
documentation.
  541   12 Dec 2008 Jimmy NgaiInfoCustom page which executes custom function
Dear All,

How can I add a button at the top of the "Status" webpage which will show a 
page similar to the "CNAF" one after I click on it? and how can I make a 
custom page similar to "CNAF" which allow me to call some custom funtions? I 
want to make a page which is particularly for doing calibration.

Thank you for your attention!

Best Regards,
Jimmy Ngai
  540   02 Dec 2008 Stefan RittBug FixFix ss_file_size() on 32-bit Linux

K.O. wrote:
This does not work (observe the typoe in the #ifdef).


Sorry for that, I fixed and committed it.


K.O. wrote:
But you cannot know this because you already deleted the test program I wrote and committed to svn exactly to detect and prevent this kind of breakage (+ plus to give the Solaris, BSD and other wierdo users some way to check that ss_file_size() works on their systems)..


Well, you figured it out even without the test program in the distribution! But I'm sure no other user would have known how to use your test program to diagnose this problem. So 99% of the users would scratch their head about this undocumented program and get confused. I believe we two are responsible that the midas kernel functions work correctly and the average user should not have to bother with it. I agree that it's handy for you to have this little test program in the distribution, so you can run it everywhere you install midas. But for me it would be handy to have files with, let's say, nature's constants, particle decay life times, list of ASCII codes, and so on. But it would clutter up the distribution and the disadvantage of annoying users would be bigger than my personal benefit, so I don't do it.

If you absolutely want to keep a certain test functionality, you can add it into a "central" test program, write some help and documentation for it, educate users how to use it and how to report any errors back to you. Maybe some printout like "all tests ok" and some specific comment if a test fails would be helpful for the normal user. This test program could then also contain other tests like C structure alignment (which sometimes is a problem), some mutex tests and whatever we collected along the road. An alternative would be to add this into a "test" command inside odbedit.
  539   02 Dec 2008 Konstantin OlchanskiBug FixFix ss_file_size() on 32-bit Linux
> > I now fixed this problem by using the stat64() system call for "#ifdef OS_LINUX".
> That does not work if _LARGEFILE64_SOURCE is not defined.
> #ifdef _LARGEFILE64_SOURE
>    struct stat64 stat_buf;

This does not work (observe the typoe in the #ifdef). But you cannot know this because
you already deleted the test program I wrote and committed to svn exactly to detect and
prevent this kind of breakage (+ plus to give the Solaris, BSD and other wierdo users
some way to check that ss_file_size() works on their systems).

K.O.
  538   02 Dec 2008 Stefan RittBug FixFix ss_file_size() on 32-bit Linux
> I now fixed this problem by using the stat64() system call for "#ifdef OS_LINUX".

That does not work if _LARGEFILE64_SOURCE is not defined. In that case, the compiler 
complains that stat64 is undefined. Since many Makefiles for front-ends out there do 
not have _LARGEFILE64_SOURCE defined, I changed system.c so that stat64 is only used 
if that flag is defined:

#ifdef _LARGEFILE64_SOURE
   struct stat64 stat_buf;
   int status;

   /* allocate buffer with file size */
   status = stat64(path, &stat_buf);
   if (status != 0)
      return -1;
   return (double) stat_buf.st_size;
#else
   ...
  537   01 Dec 2008 Randolf PohlBug Reportgcc warning in melog.c for midas 4401
Hi all,

I have just compiled midas 4401 using SuSE 11.0.
gcc is some odd SuSE version:
gcc version 4.3.1 20080507 (prerelease) [gcc-4_3-branch revision 135036] (SUSE
Linux) 

Anyway, gcc stumbled over melog.c. I don't see the reason myself, but my
experience is that gcc is usually right when complaining about "array subscript
is above array bounds". So, just in case somebody knowlegeable wants to have a
look at this....

Cheers,

Randolf

The gcc output:

[...]
cc -g -O3 -Wall -Wuninitialized -Iinclude -Idrivers -I../mxml -Llinux/lib
-DINCLUDE_FTPLIB   -D_LARGEFILE64_SOURCE -DHAVE_MYSQL -I/usr/include/mysql
-DHAVE_ROOT -pthread -m64 -I/usr/local/root/root_v5.20.00/include/root
-DHAVE_ZLIB -DOS_LINUX -fPIC -Wno-unused-function -o linux/bin/melog
utils/melog.c linux/lib/libmidas.a -lutil -lpthread -lz
utils/melog.c: In function 'submit_elog':
utils/melog.c:224: warning: array subscript is above array bounds
utils/melog.c:224: warning: array subscript is above array bounds
utils/melog.c:224: warning: array subscript is above array bounds
utils/melog.c:224: warning: array subscript is above array bounds
utils/melog.c:224: warning: array subscript is above array bounds
utils/melog.c:224: warning: array subscript is above array bounds
utils/melog.c:224: warning: array subscript is above array bounds
utils/melog.c:224: warning: array subscript is above array bounds
cc -g -O3 -Wall -Wuninitialized -Iinclude -Idrivers -I../mxml -Llinux/lib
-DINCLUDE_FTPLIB   -D_LARGEFILE64_SOURCE -DHAVE_MYSQL -I/usr/include/mysql
-DHAVE_ROOT -pthread -m64 -I/usr/local/root/root_v5.20.00/include/root
-DHAVE_ZLIB -DOS_LINUX -fPIC -Wno-unused-function -o linux/bin/mlxspeaker
utils/mlxspeaker.c linux/lib/libmidas.a -lutil -lpthread -lz
  536   01 Dec 2008 Stefan RittBug FixFix ss_file_size() on 32-bit Linux
> I also changed ss_file_size(), ss_disk_size() and ss_disk_free() to return -1 if
> the system call returns an error. I also added a test program
> utils/test_ss_file_size.c.

The test program gave under 64-bit SL5:

For [(null)], file size: -1, disk size: -0.001, disk free -0.001
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `/bin/ls -ld (null)'
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `/bin/df -k (null)'

Anyhow I guess that this test program just accidentally slipped into the repository.
Test programs for the developers should not be in the repository since they are of
not much use for the average user. If I would have added every test I made as an
individual test program, we would by now have tons of test programs making the whole
distribution pretty bulky, which nobody would know how to use now. So I removed the
test program again. If people do not agree, I suggest to make a central "main" test
program which combines all tests. I know there are also some C structure alignment
tests etc., which then could all be combined into a single, well documented, test
program.
  535   27 Nov 2008 Konstantin OlchanskiInfoFixed mlogger crash, was Per-variable history implementation in the mlogger
> revision 4142+4143 are minor fixes, refactoring (switch the code to use helper
> functions) and implementation of history for structured banks

The implementation of "history for structured banks" had a bug - tags inside
structured banks were counted incorrectly, leading to memory overwrites and mlogger
crash in open_history().

This is problem is now fixed (plus added assert() checks to crash-out if overwrite of
tags[] array is detected).

svn revision 4398.
K.O.
  534   27 Nov 2008 Konstantin OlchanskiInfolazylogger updated
lazylogger was updated to improve handling of the list of runs still on disk
(odb /Lazy/xxx/List).

Previously, each and every run was listed in the List arrays. With modern
Terabyte-sized data disks, many many days worth of runs tend to remain on disk
and these List arrays were getting too big, inflating the size of ODB dumps
written by mlogger into the output data file and slowing down starting and
stopping of runs considerably.

Now, the runs are listed as ranges of "first run" - "last run", (see example below).

This significantly reduces the size of the "List" arrays and makes lazylogger
usable for the ALPHA experiment at CERN and for T2K/ND280 prototype DAQ at
TRIUMF (writing to Castor and Dcache respectively, using the newly added
"Script" method).

The new List format is fully compatible with the old format and you can update
and run the new lazylogger without changing anything in ODB. New runs will be
added to the List arrays in the new format and data in the old format will
eventually go away as old runs are removed from disk.

svn revision 4394.
K.O.

Example: this reads like this:
range from 7100 to 7154
range from 7157 to 7161 (7155-7156 are missing)
range from 7163 to 7168 (7162 is missing)
runs 7170, 7173, 7176
range from 7179 to 7182
and so forth.

ODB /Lazy/Dcache/List
007100
[0] 7100 (0x1BBC)
[1] -7154 (0xFFFFE40E)
[2] 7157 (0x1BF5)
[3] -7161 (0xFFFFE407)
[4] 7163 (0x1BFB)
[5] -7168 (0xFFFFE400)
[6] 7170 (0x1C02)
[7] 7173 (0x1C05)
[8] 7176 (0x1C08)
[9] 7179 (0x1C0B)
[10] -7182 (0xFFFFE3F2)
[11] 7184 (0x1C10)
[12] 7188 (0x1C14)
[13] -7199 (0xFFFFE3E1)
007200
[0] 7200 (0x1C20)
[1] -7225 (0xFFFFE3C7)
  533   27 Nov 2008 Konstantin OlchanskiBug Reportlazylogger complains about zero-size files
I now have a better understanding of this: lazylogger uses ss_file_size() to find
out if a file exists or not. This function used to return 0 (probably) for
non-existant files (there was no check for error status from stat() system call,
so the return value for non-existant files was never well defined).

With ss_file_size() returning 0 for nonexistant files, 0-size files clearly cause
problems to lazylogger.

Now, since svn revision 4397, ss_file_size() returns -1 for non-existant files,
but lazylogger still needs to be tought about this.

The problem "lazylogger does not like 0-size files" remains for now.

K.O.


> With latest midas, I see this:
> 
> Thu Oct 14 19:31:17 2004 [Lazy_Tape] [lazylogger.c:1717:Lazy] lazy_file_exists
> file run17567.ybs doesn't exists
> Thu Oct 14 19:31:27 2004 [Lazy_Tape] [lazylogger.c:1717:Lazy] lazy_file_exists
> file run17567.ybs doesn't exists
> 
> The file run17567.ybs has size zero:
> 
> -rw-r--r--    1 twistonl users      950272 Oct 13 19:29
> /twist/data_onl/current/run17565.ybs
> -rw-r--r--    1 twistonl users      950272 Oct 13 19:45
> /twist/data_onl/current/run17566.ybs
> -rw-r--r--    1 twistonl users           0 Oct 13 20:00
> /twist/data_onl/current/run17567.ybs
> -rw-r--r--    1 twistonl users      983040 Oct 13 20:03
> /twist/data_onl/current/run17568.ybs
> -rw-r--r--    1 twistonl users      950272 Oct 13 20:26
> /twist/data_onl/current/run17569.ybs
> 
> I am not sure how to fix this lazylogger logic. Please help.
> 
> K.O.
  532   27 Nov 2008 Konstantin OlchanskiBug FixFix ss_file_size() on 32-bit Linux
It turns out that on 32-bit Linux, ss_file_size() returns the wrong answer for
files bigger than 2 GB (4GB?). The Linux stat() system call returns an error
(which is ignored) and bogus file size data (returned to the caller).

On 64-bit Linux (compiled with -m64), stat() appears to return correct data.

Related functions ss_disk_size() and ss_disk_free() return correct answers on
both 32-bit and 64-bit Linux (biggest disk I tried was 5.5 TB).

I now fixed this problem by using the stat64() system call for "#ifdef OS_LINUX".

I also changed ss_file_size(), ss_disk_size() and ss_disk_free() to return -1 if
the system call returns an error. I also added a test program
utils/test_ss_file_size.c.

svn revision 4397.
K.O.
  531   26 Nov 2008 Stefan RittInfoSend email alert in alarm system
> We have a temperature/humidity sensor in MIDAS now and will add a liquid level 
> sensor to MIDAS soon. We want the operators to get alerted ASAP when the 
> laboratory environment or the liquid level reached some critical levels. Can 
> MIDAS send email alerts or SMS alerts to cell phones when the alarms are 
> triggered? If yes, how can I config it?

Sure that's possible, that's why MIDAS contains an alarm system. To use it, define 
an ODB alarm on your liquid level, like

/Alarms/Alarms/Liquid Level
Active	                 y
Triggered	         0 (0x0)
Type	                 3 (0x3)
Check interval	        60 (0x3C)
Checked last	1227690148 (0x492D10A4)
Time triggered first	(empty)
Time triggered last	(empty)
Condition	        /Equipment/Environment/Variables/Input[0] < 10
Alarm Class	        Level Alarm
Alarm Message	        Liquid Level is only %s

The Condition if course might be different in your case, just select the correct 
variable from your equipment. In this case, the alarm triggers an alarm of class 
"Level Alarm". Now you define this alarm class:

/Alarms/Classes/Level Alarm
Write system message	y
Write Elog message	n
System message interval	600 (0x258)
System message last	0 (0x0)
Execute command	        /home/midas/level_alarm '%s'
Execute interval	1800 (0x708)
Execute last	        0 (0x0)
Stop run	        n
Display BGColor	        red
Display FGColor	        black

The key here is to call a script "level_alarm", which can send emails. Use 
something like:

#/bin/csh
echo $1 | mail -s \"Level Alarm\" your.name@domain.edu
odbedit -c 'msg 2 level_alarm \"Alarm was sent to your.name@domain.edu\"'

The second command just generates a midas system message for confirmation. Most 
cell phones (depends on the provider) have an email address. If you send an email 
there, it gets translated into a SMS message.

The script file above can of course be more complicated. We use a perl script 
which parses an address list, so everyone can register by adding his/her email 
address to that list. The script collects also some other slow control variables 
(like pressure, temperature) and combines this into the SMS message.

For very sensitive systems, having an alarm via SMS is not everything, since the 
alarm system could be down (computer crash or whatever). In this case we use 
'negative alarms' or however you might call it. The system sends every 30 minutes 
an SMS with the current levels etc. If the SMS is missing for some time, it might 
be an indication that something in the midas system is wrong and one can go there 
and investigate.
  530   26 Nov 2008 Jimmy NgaiInfoSend email alert in alarm system
Dear All,

We have a temperature/humidity sensor in MIDAS now and will add a liquid level 
sensor to MIDAS soon. We want the operators to get alerted ASAP when the 
laboratory environment or the liquid level reached some critical levels. Can 
MIDAS send email alerts or SMS alerts to cell phones when the alarms are 
triggered? If yes, how can I config it?

Many thanks!

Best Regards,
Jimmy
  529   20 Nov 2008 Stefan RittInfoRecommended platform for running MIDAS
> Dear All,
> 
> Is there any recommended platforms for running MIDAS? Have anyone encountered 
> problems when running MIDAS on Scientific Linux?
> 
> Thanks.
> 
> Jimmy

I run MIDAS on scientific Linux 5.1 without any problem.
  528   20 Nov 2008 Jimmy NgaiInfoRecommended platform for running MIDAS
Dear All,

Is there any recommended platforms for running MIDAS? Have anyone encountered 
problems when running MIDAS on Scientific Linux?

Thanks.

Jimmy
ELOG V3.1.4-2e1708b5