Midas DAQ System
    Reply  18 Sep 2024, Marius Koeppel, Bug Report, Crash using ODB watch 
I created a PR to fix this issue https://bitbucket.org/tmidas/midas/pull-requests/42.
The crash occurred because, after the change in commit 3ad98c5, the ODB was always fetched via XML.
However, the creation from XML should only be used when a user wants to read fast (and when we are on a remote machine) so I added the flag use_from_xml to explicitly specify this.


> > {
> > odb new_settings("/Equipment/Test FE/Settings");
> > new_settings.watch(watch); // <-- here I am getting a segmentation fault
> > }
> 
> this code has a bug. "watch" is attached to object "new_settings" that is deleted
> after the closing curly bracket.

> I would say Stefan's odb API should not allow you to write code like this. an API defect.

As pointed out in the thread this feature is explicitly supported by odbxx.cxx:

void odb::watch(std::function<void(midas::odb &)> f) {
      if (m_hKey == 0 || m_hKey == -1)
         mthrow("watch() called for ODB key \"" + m_name +
                "\" which is not connected to ODB");

      // create a deep copy of current object in case it
      // goes out of scope
      midas::odb* ow = new midas::odb(*this);

      ow->m_watch_callback = f;
      db_watch(s_hDB, m_hKey, midas::odb::watch_callback, ow);

      // put object into watchlist
      g_watchlist.push_back(ow);
}

Also in the old way (see for example https://bitbucket.org/tmidas/midas/src/191d13f98626fae533cbca17b00df7ee361edf16/examples/crfe/crfe.cxx#lines-126) it was possible to create a watch in a scope without the user taking care that the "object" does not go out of scope.
I think this feature should be supported by the framework.
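
The ownership pattern behind this can be sketched without MIDAS. The following is only an illustration under stated assumptions: `Node`, `watch()`, `fire_all()` and `demo_watch()` are hypothetical stand-ins, not part of odbxx; only the idea of registering a deep copy in a global watch list is taken from the code above.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for midas::odb; only the ownership pattern is modelled.
struct Node {
   std::string name;
   int value = 0;
   std::function<void(Node &)> callback;
};

// Registry that owns the deep copies, playing the role of g_watchlist.
static std::vector<std::unique_ptr<Node>> g_watchlist;

// Register a callback on a copy of `n`, so `n` itself may go out of scope.
void watch(const Node &n, std::function<void(Node &)> f) {
   auto copy = std::make_unique<Node>(n);
   copy->callback = std::move(f);
   g_watchlist.push_back(std::move(copy));
}

// Simulate a change notification for every watched node.
void fire_all(int new_value) {
   for (auto &w : g_watchlist) {
      w->value = new_value;
      w->callback(*w);
   }
}

// Release all copies, analogous in spirit to odb::unwatch_all().
void unwatch_all() { g_watchlist.clear(); }

// Registers a watch inside a nested scope and returns the value the callback saw.
int demo_watch() {
   int seen = -1;
   {
      Node settings;
      settings.name = "/Equipment/Test FE/Settings";
      watch(settings, [&seen](Node &n) { seen = n.value; });
   } // `settings` is destroyed here; the registered copy lives on
   fire_all(42);
   unwatch_all();
   return seen;
}
```

In this sketch the callback still fires after the original object has left its scope, because the registry holds its own copy.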

Best,
Marius
    Reply  20 Sep 2024, Stefan Ritt, Bug Report, Crash using ODB watch 
The problem has been fixed in the current version. Here is my analysis:

- the midas::odb object *can* go out of scope in the function, since the odb::watch() function creates a deep copy of the object. 
This does not cause a memory leak if one calls odb::unwatch_all() at the end of a program.

- The creation from XML had a flaw where the ODB key handle ("hKey") is not initialized since it is not passed by the db_copy_xml() function.
I added code to db_copy_xml() to also fetch the key handle in the XML file, which now fixes the issue. Please note that you have to
update both the server and client side of midas to get this functionality if you are using it from a remote client.

- I saw the flag MK added on his pull request to the constructor of odb::odb(). This is a way to fight the symptoms (by creating an
object the "old" way if not otherwise needed), but now we have the cause cured. Nevertheless I added that parameter, but set it to true by default:

   odb::odb(const std::string &str, bool init_via_xml = true);

since this should be fully working now and should always be faster than the old method. I only keep it for debugging should we observe
another flaw in odb_from_xml(). 

Best regards,
Stefan
Entry  04 Jul 2012, Konstantin Olchanski, Bug Report, Crash after recursive use of rpc_execute() 
I am looking at a MIDAS kaboom when running out of space on the data disk - everything was freezing 
up, even the VME frontend crashed sometimes.

The freeze was traced to ROOT use in mlogger - it turns out that ROOT intercepts many signal handlers, 
including SIGSEGV - but instead of crashing the program as God intended, ROOT SEGV handler just hangs, 
and the rest of MIDAS hangs with it. One solution is to always build mlogger without ROOT support - 
does anybody use this feature anymore? Or reset the signal handlers back to the default setting somehow.

Freeze fixed, now I see a crash (seg fault) inside mlogger, in the newly introduced memmove() function 
inside the MIDAS RPC code rpc_execute(). memmove() replaced memcpy() in the same place and I am 
surprised we did not see this crash with memcpy().

The crash is caused by crazy arguments passed to memmove() - looks like corrupted RPC arguments 
data.

Then I realized that I see a recursive call to rpc_execute(): rpc_execute() calls tr_stop() calls cm_yield() calls 
ss_suspend() calls rpc_execute(). The second rpc_execute() completes successfully, but leaves corrupted 
data for the original rpc_execute(), which happily crashes. At the moment of the crash, the recursive call to 
rpc_execute() is no longer visible.

Note that rpc_execute() cannot be called recursively - it is not re-entrant as it uses a global buffer for RPC 
argument processing. (global tls_buffer structure).

Here is the mlogger stack trace:

#0  0x00000032a8032885 in raise () from /lib64/libc.so.6
#1  0x00000032a8034065 in abort () from /lib64/libc.so.6
#2  0x00000032a802b9fe in __assert_fail_base () from /lib64/libc.so.6
#3  0x00000032a802bac0 in __assert_fail () from /lib64/libc.so.6
#4  0x000000000041d3e6 in rpc_execute (sock=14, buffer=0x7ffff73fc010 "\340.", convert_flags=0) at src/midas.c:11478
#5  0x0000000000429e41 in rpc_server_receive (idx=1, sock=<value optimized out>, check=<value optimized out>) at src/midas.c:12955
#6  0x0000000000433fcd in ss_suspend (millisec=0, msg=0) at src/system.c:3927
#7  0x0000000000429b12 in cm_yield (millisec=100) at src/midas.c:4268
#8  0x00000000004137c0 in close_channels (run_number=118, p_tape_flag=0x7fffffffcd34) at src/mlogger.cxx:3705
#9  0x000000000041390e in tr_stop (run_number=118, error=<value optimized out>) at src/mlogger.cxx:4148
#10 0x000000000041cd42 in rpc_execute (sock=12, buffer=0x7ffff73fc010 "\340.", convert_flags=0) at src/midas.c:11626
#11 0x0000000000429e41 in rpc_server_receive (idx=0, sock=<value optimized out>, check=<value optimized out>) at src/midas.c:12955
#12 0x0000000000433fcd in ss_suspend (millisec=0, msg=0) at src/system.c:3927
#13 0x0000000000429b12 in cm_yield (millisec=1000) at src/midas.c:4268
#14 0x0000000000416c50 in main (argc=<value optimized out>, argv=<value optimized out>) at src/mlogger.cxx:4431


K.O.
    Reply  04 Jul 2012, Konstantin Olchanski, Bug Report, Crash after recursive use of rpc_execute() 
>  ... I see a recursive call to rpc_execute(): rpc_execute() calls tr_stop() calls cm_yield() calls 
> ss_suspend() calls rpc_execute()
> ... rpc_execute() cannot be called recursively - it is not re-entrant as it uses a global buffer

It turns out that rpc_server_receive() also needs protection against recursive calls - it also uses
a global buffer to receive network data.

My solution is to protect rpc_server_receive() against recursive calls by detecting recursion and returning SS_SUCCESS (to ss_suspend()).

I was worried that this would cause a tight loop inside ss_suspend() but in practice, it looks like ss_suspend() tries to call
us about once per second. I am happy with this solution. Here is the diff:


@@ -12813,7 +12815,7 @@
 
 
 /********************************************************************/
-INT rpc_server_receive(INT idx, int sock, BOOL check)
+INT rpc_server_receive1(INT idx, int sock, BOOL check)
 /********************************************************************\
 
   Routine: rpc_server_receive
@@ -13047,7 +13049,28 @@
    return status;
 }
 
+/********************************************************************/
+INT rpc_server_receive(INT idx, int sock, BOOL check)
+{
+  static int level = 0;
+  int status;
 
+  // Provide protection against recursive calls to rpc_server_receive() and rpc_execute()
+  // via rpc_execute() calls tr_stop() calls cm_yield() calls ss_suspend() calls rpc_execute()
+
+  if (level != 0) {
+    //printf("*** enter rpc_server_receive level %d, idx %d sock %d %d -- protection against recursive use!\n", level, idx, sock, check);
+    return SS_SUCCESS;
+  }
+
+  level++;
+  //printf(">>> enter rpc_server_receive level %d, idx %d sock %d %d\n", level, idx, sock, check);
+  status = rpc_server_receive1(idx, sock, check);
+  //printf("<<< exit rpc_server_receive level %d, idx %d sock %d %d, status %d\n", level, idx, sock, check, status);
+  level--;
+  return status;
+}
+
 /********************************************************************/
 INT rpc_server_shutdown(void)
 /********************************************************************\
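
For readers working in C++, the same recursion guard as in the diff above can be sketched with a small RAII helper, so that the nesting counter is restored on every exit path. This is a standalone illustration, not MIDAS code: `receive()`, `receive_impl()`, `DepthGuard` and the status constants are hypothetical names, and like the static counter in the diff it is not thread-safe.

```cpp
// Hypothetical status codes; the real values live in the MIDAS headers.
constexpr int kSuccess = 1;
constexpr int kWorkDone = 2;

static int g_depth = 0;        // nesting level of receive()
static int g_inner_status = 0; // status returned to the nested caller

// RAII helper: increments on entry, decrements on every exit path.
struct DepthGuard {
   explicit DepthGuard(int &d) : depth(d) { ++depth; }
   ~DepthGuard() { --depth; }
   DepthGuard(const DepthGuard &) = delete;
   DepthGuard &operator=(const DepthGuard &) = delete;
   int &depth;
};

int receive(bool trigger_nested);

// Stand-in for the real work, which may call back into receive().
int receive_impl(bool trigger_nested) {
   if (trigger_nested)
      g_inner_status = receive(false); // simulates tr_stop() -> cm_yield() -> ss_suspend() -> ...
   return kWorkDone;
}

// Guarded entry point: a nested call returns immediately without doing work.
int receive(bool trigger_nested) {
   if (g_depth != 0)
      return kSuccess; // nested call: leave the shared buffer untouched
   DepthGuard guard(g_depth);
   return receive_impl(trigger_nested);
}
```

Calling `receive(true)` performs the outer work once, while the simulated nested call gets the success status back and the counter ends at zero.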


ladd02:trinat~/packages/midas>svn info src/midas.c
Path: src/midas.c
Name: midas.c
URL: svn+ssh://svn@savannah.psi.ch/repos/meg/midas/trunk/src/midas.c
Repository Root: svn+ssh://svn@savannah.psi.ch/repos/meg/midas
Repository UUID: 050218f5-8902-0410-8d0e-8a15d521e4f2
Revision: 5297
Node Kind: file
Schedule: normal
Last Changed Author: olchanski
Last Changed Rev: 5294
Last Changed Date: 2012-06-15 10:45:35 -0700 (Fri, 15 Jun 2012)
Text Last Updated: 2012-06-29 17:05:14 -0700 (Fri, 29 Jun 2012)
Checksum: 8d7907bd60723e401a3fceba7cd2ba29

K.O.
    Reply  13 Jul 2012, Stefan Ritt, Bug Report, Crash after recursive use of rpc_execute() 
> Then I realized that I see a recursive call to rpc_execute(): rpc_execute() calls tr_stop() calls cm_yield() calls 
> ss_suspend() calls rpc_execute(). The second rpc_execute() completes successfully, but leaves corrupted 
> data for the original rpc_execute(), which happily crashes. At the moment of the crash, the recursive call to 
> rpc_execute() is no longer visible.

This is really strange. I did not protect rpc_execute against recursive calls since this should not happen. rpc_server_receive() is linked to rpc_call() on the client side. So there cannot be 
several rpc_call() since there I do the recursive checking (also multi-thread checking) via a mutex. See line 10142 in midas.c. So there CANNOT be recursive calls to rpc_execute() because 
there cannot be recursive calls to rpc_server_receive(). But apparently there are, according to your stack trace.

So even if your patch works fine, I would like to know where the recursive calls to rpc_server_receive() come from. Since we have one subprocess of mserver for each client, there should only 
be one client connected to each mserver process, and the client is protected via the mutex in rpc_call(). Can you please debug this? I would like to understand what is going on there. Maybe 
there is a deeper underlying problem, which we better solve, otherwise it might fall back on us in the future.

For debugging, you have to see what commands rpc_call() sends and what rpc_server_receive() gets, maybe by writing this into a common file together with a time stamp.
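
A minimal sketch of such a trace helper, assuming both sides can append to the same file. The function name `log_rpc_event()`, the line layout and the file name are hypothetical choices for illustration; nothing here is existing MIDAS API.

```cpp
#include <chrono>
#include <cstdio>
#include <fstream>
#include <string>

// Append one line "<epoch seconds>.<milliseconds> <side> <text>" to `path`.
// Returns false if the file could not be opened.
bool log_rpc_event(const std::string &path, const std::string &side,
                   const std::string &text) {
   using namespace std::chrono;
   const auto now = system_clock::now();
   const long long ms =
       duration_cast<milliseconds>(now.time_since_epoch()).count();

   std::ofstream out(path, std::ios::app); // append, never truncate
   if (!out)
      return false;

   char stamp[32];
   std::snprintf(stamp, sizeof(stamp), "%lld.%03lld", ms / 1000, ms % 1000);
   out << stamp << ' ' << side << ' ' << text << '\n';
   return true;
}
```

One would call it with side "client" next to rpc_call() and with side "server" inside rpc_server_receive(), then sort or read the file by time stamp to see which command arrived while another was still being processed.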

SR
Entry  18 Aug 2009, Denis Calvet, Suggestion, Could not create strings other than 32 characters with odbedit -c "..." command 
Hi,
I am writing shell scripts to create some tree structure in an ODB. When 
creating an array of strings, the default length of each string element is 32 
characters. If odbedit is used interactively to create the array of strings, 
the user is prompted to enter a different length if desired. But if the 
command odbedit is called from a shell script, I did not succeed in passing 
the argument to get a different length.
I tried:
odbedit -c "create STRING Test[8][40]"
Or:
odbedit -c "create STRING Test[8] 40"
Or:
odbedit -c "create STRING Test[8] \n 40"
etc. all produce an array of 8 strings with 32 characters each.
I haven't tried all possible syntaxes, but I suspect the length argument is 
dropped. If it has not been fixed in a later release than the one I am using, 
could this problem be looked at?
Thanks,
Denis.
  
    Reply  03 Sep 2009, Stefan Ritt, Suggestion, Could not create strings other than 32 characters with odbedit -c "..." command 
> Hi,
> I am writing shell scripts to create some tree structure in an ODB. When 
> creating an array of strings, the default length of each string element is 32 
> characters. If odbedit is used interactively to create the array of strings, 
> the user is prompted to enter a different length if desired. But if the 
> command odbedit is called from a shell script, I did not succeed in passing 
> the argument to get a different length.
> I tried:
> odbedit -c "create STRING Test[8][40]"
> Or:
> odbedit -c "create STRING Test[8] 40"
> Or:
> odbedit -c "create STRING Test[8] \n 40"
> etc. all produce an array of 8 strings with 32 characters each.
> I haven't tried all possible syntaxes, but I suspect the length argument is 
> dropped. If it has not been fixed in a later release than the one I am using, 
> could this problem be looked at?

Ok, I added a command

odbedit -c "create STRING Test[8][40]"

which works now. Please update to SVN revision 4555 of odbedit.c

- Stefan
    Reply  06 Sep 2009, Exaos Lee, Suggestion, Could not create strings other than 32 characters with odbedit -c "..." command 
> Ok, I added a command
> 
> odbedit -c "create STRING Test[8][40]"
> 
> which works now. Please update to SVN revision 4555 of odbedit.c
> 
> - Stefan

If I want to create only one string, should I write like this:

  odbedit -c "create STRING Test[] [256]"

OK. I need it. I will try the new odbedit.
    Reply  06 Sep 2009, Exaos Lee, Suggestion, Could not create strings other than 32 characters with odbedit -c "..." command 
> > Ok, I added a command
> > 
> > odbedit -c "create STRING Test[8][40]"
> > 
> > which works now. Please update to SVN revision 4555 of odbedit.c
> > 
> > - Stefan
> 
> If I want to create only one string, should I write like this:
> 
>   odbedit -c "create STRING Test[] [256]"
> 
> OK. I need it. I will try the new odbedit.

"create STRING test[1][256]" works.
Entry  21 Apr 2005, Konstantin Olchanski, Suggestion, Correct MIDASSYS setting? 
Current MIDAS versions nag me about setting the env.variable MIDASSYS to the
"midas installation directory", but I do not have one, so what should I set
MIDASSYS to? I checkout MIDAS from cvs into /home/olchansk/daq/midas, build it
there, run it from there. I never do "make install" (I am not "root" on every
machine; I am not the only MIDAS user on every machine). What should I set
MIDASSYS to? K.O.
    Reply  22 Apr 2005, Stefan Ritt, Suggestion, Correct MIDASSYS setting? 
> Current MIDAS versions nag me about setting the env.variable MIDASSYS to the
> "midas installation directory", but I do not have one, so what should I set
> MIDASSYS to? I checkout MIDAS from cvs into /home/olchansk/daq/midas, build it
> there, run it from there. I never do "make install" (I am not "root" on every
> machine; I am not the only MIDAS user on every machine). What should I set
> MIDASSYS to? K.O.

Then set it to /home/olchansk/daq/midas. The reason for MIDASSYS is the same as
for ROOTSYS. Having it allows other packages like ROME to access the Midas source
code, include files and libraries.
Entry  19 Nov 2025, Stefan Mathis, Forum, Control external process from inside MIDAS 
Dear all,

I want to control (start / stop / monitor its stdout and stderr) an external process (systemd / EPICS IOC shell script) from within MIDAS.

In order to make this as convenient as possible for the user, I want the process to behave just like any other MIDAS client:
- I can start it from the ODB as a program
- The process gets regularly polled from MIDAS to see whether it is still running
- I can stop the process from the ODB like any other program
- Optional, but highly appreciated: Its stdout and stderr should be a MIDAS message.

Did anyone already solve a similar problem?

Best regards
Stefan
    Reply  19 Nov 2025, Nick Hastings, Forum, Control external process from inside MIDAS 
Hi,

what you describe is exactly how I normally run mhttpd, mlogger, mserver and some other
custom frontend programs. Eg:

[local:T2KGSC:Running]/>ls /programs/Logger/
Required                        y
Watchdog timeout                100000
Check interval                  180000
Start command                   systemctl --user start mlogger
Auto start                      n
Auto stop                       n
Auto restart                    n
Alarm class                     AlarmNotify
First failed                    0

The only exception is your last point about stdout and stderr
being midas messages. I use journalctl to see these.

Cheers,

Nick.

> I want to control (start / stop / monitor its stdout and stderr) an external process (systemd / EPICS IOC shell script) from within MIDAS.
> 
> In order to make this as convenient as possible for the user, I want the process to behave just like any other MIDAS client:
> - I can start it from the ODB as a program
> - The process gets regularly polled from MIDAS to see whether it is still running
> - I can stop the process from the ODB like any other program
> - Optional, but highly appreciated: Its stdout and stderr should be a MIDAS message.
> 
> Did anyone already solve a similar problem?
> 
> Best regards
> Stefan
    Reply  20 Nov 2025, Stefan Mathis, Forum, Control external process from inside MIDAS 
Thanks a lot,

Nick. Regarding the messages: Zaher showed me that it is possible to simply place a custom log file generated by the systemd next to midas.log - then it shows up next to the "midas" tab in "Messages".

One follow-up question: Is it possible to use the systemctl status for the "Running on host" column? Or does this even happen automatically?

Best regards
Stefan

> Hi,
> 
> what you describe is exactly how I normally run mhttpd, mlogger, mserver and some other
> custom frontend programs. Eg:
> 
> [local:T2KGSC:Running]/>ls /programs/Logger/
> Required                        y
> Watchdog timeout                100000
> Check interval                  180000
> Start command                   systemctl --user start mlogger
> Auto start                      n
> Auto stop                       n
> Auto restart                    n
> Alarm class                     AlarmNotify
> First failed                    0
> 
> The only exception is your last point about stdout and stderr
> being midas messages. I use journalctl to see these.
> 
> Cheers,
> 
> Nick.
> 
> > I want to control (start / stop / monitor its stdout and stderr) an external process (systemd / EPICS IOC shell script) from within MIDAS.
> > 
> > In order to make this as convenient as possible for the user, I want the process to behave just like any other MIDAS client:
> > - I can start it from the ODB as a program
> > - The process gets regularly polled from MIDAS to see whether it is still running
> > - I can stop the process from the ODB like any other program
> > - Optional, but highly appreciated: Its stdout and stderr should be a MIDAS message.
> > 
> > Did anyone already solve a similar problem?
> > 
> > Best regards
> > Stefan
    Reply  20 Nov 2025, Nick Hastings, Forum, Control external process from inside MIDAS 
Hi,
 
> Nick. Regarding the messages: Zaher showed me that it is possible to simply place
> a custom log file generated by the systemd next to midas.log - then it shows up
> next to the "midas" tab in "Messages".

Interesting. I'm not familiar with that feature. Do you have a link to the documentation?

> One follow-up question: Is it possible to use the systemctl status for the
> "Running on host" column? Or does this even happen automatically?

On the programs page that column is populated by the odb key /System/Clients/<PID>/Host
so no. However, there is nothing stopping you from writing your own version of
programs.html to show whatever you want. For example I have a custom programs
page that includes columns to enable/disable and to reset watchdog alarms.

Cheers,

Nick.
    Reply  20 Nov 2025, Stefan Mathis, Forum, Control external process from inside MIDAS 
Hi,

unfortunately I don't have a documentation link to the feature, I just know that it works on my machine ;-) The general idea is that you place a custom whatever.log file in Logger/Data Dir (where midas.log is stored). Then, in the Messages page, there will be a "midas" tab and a "whatever" tab - the latter showing the content of whatever.log. One problem here is that timestamping does not work automatically - you have to prepend every line with the same Hours:Minutes:Seconds.Milliseconds Year/Month/Day format that midas.log is using.
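
A small sketch of how such a line prefix could be produced. The zero-padded field widths are my assumption based on the description above and should be compared against a line from an existing midas.log; `format_log_line()` is a hypothetical helper, not a MIDAS function.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Build "HH:MM:SS.mmm YYYY/MM/DD <message>" from a broken-down time.
std::string format_log_line(const std::tm &t, int millisec,
                            const std::string &message) {
   char prefix[64];
   std::snprintf(prefix, sizeof(prefix), "%02d:%02d:%02d.%03d %04d/%02d/%02d",
                 t.tm_hour, t.tm_min, t.tm_sec, millisec,
                 t.tm_year + 1900, // tm_year counts from 1900
                 t.tm_mon + 1,     // tm_mon is zero-based
                 t.tm_mday);
   return std::string(prefix) + " " + message;
}
```

The process writing whatever.log would fill the std::tm from the current local time and pass every output line through this function before appending it to the file.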

So you have a custom Programs page which does systemctl status on your systemd? Does the status then transfer over automatically to the Status page? Is there an example how to write such a custom page?

Best regards
Stefan

> Hi,
>  
> > Nick. Regarding the messages: Zaher showed me that it is possible to simply place
> > a custom log file generated by the systemd next to midas.log - then it shows up
> > next to the "midas" tab in "Messages".
> 
> Interesting. I'm not familiar with that feature. Do you have a link to the documentation?
> 
> > One follow-up question: Is it possible to use the systemctl status for the
> > "Running on host" column? Or does this even happen automatically?
> 
> On the programs page that column is populated by the odb key /System/Clients/<PID>/Host
> so no. However, there is nothing stopping you from writing your own version of
> programs.html to show whatever you want. For example I have a custom programs
> page that includes columns to enable/disable and to reset watchdog alarms.
> 
> Cheers,
> 
> Nick.
    Reply  24 Nov 2025, Stefan Ritt, Forum, Control external process from inside MIDAS 

Dear all,

Stefan wants to run an external EPICS driver process as a detached process and somehow "glue" it to midas to control it. Actually a similar requirement led to the development of MIDAS in the '90s. We had too many configuration files lying around, too many processes to control and to interact with each other, and so on. With the development of MIDAS I wanted to integrate all that. There is one ODB to control and parametrize everything, one central process handling to see if processes are alive, raise an alarm if they die, automatically restart them if necessary and so on. Doing this now externally again is orthogonal to the original design concept of MIDAS and will cause many problems. I therefore strongly recommend not to juggle around with systemctl and syslog, but to make everything a MIDAS process. It's simply a "cm_connect_experiment()" and "cm_disconnect_experiment()" in the end. Then you set

/programs/required = y

and

/programs/start command = <cmd>

You can set the "alarm class" to raise an alarm if the program crashes, and you will see all messages if you use "cm_msg()" inside the program rather than "printf()". Injecting a separate .log file into the system will show things on the message page, but these messages do not go through the SYSMSG buffer, and cannot be received by other programs. Maybe you noticed that mhttpd on the status page always shows the last message it received, which can be very helpful. To see if a program is running, you only need a cm_exist() call, which also exists for custom web pages.

Rather than investing time to re-invent the wheel here, better try to modify your EPICS driver process to become a midas process.

If you have an external process which you absolutely cannot modify, I would rather write a wrapper midas program to start the external process, intercept its output via a pipe, and put its output properly into the midas message system with cm_msg(). In the main loop of your wrapper program you check the external process via whatever you want, and if it dies trigger an alarm or restart it from your wrapper program. You can then set an alarm on your wrapper program to make sure this one is always running.
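
The pipe part of such a wrapper could look roughly like the sketch below. It uses POSIX popen() and hands every output line to a caller-supplied `sink`; in a real wrapper the sink would call cm_msg(), and the program would connect to the experiment first. `run_and_forward()` is a hypothetical name, and supervision (alarms, restart) is left out.

```cpp
#include <stdio.h>

#include <functional>
#include <string>

// Run `command` through the shell, pass each output line to `sink`, and
// return the raw status from pclose() (-1 if the process could not be started).
// Append "2>&1" to the command if stderr should be captured as well.
int run_and_forward(const std::string &command,
                    const std::function<void(const std::string &)> &sink) {
   FILE *pipe = popen(command.c_str(), "r");
   if (pipe == nullptr)
      return -1;

   char buf[512];
   std::string line;
   while (fgets(buf, sizeof(buf), pipe) != nullptr) {
      line += buf; // long lines may arrive in several chunks
      if (!line.empty() && line.back() == '\n') {
         line.pop_back();
         sink(line); // a real wrapper would forward this to cm_msg()
         line.clear();
      }
   }
   if (!line.empty())
      sink(line); // last line without a trailing newline

   return pclose(pipe);
}
```

The call blocks until the external process closes its output, so a wrapper that also has to service MIDAS would run it in a separate thread or use non-blocking reads instead.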

Best regards,
StefanR

    Reply  27 Nov 2025, Konstantin Olchanski, Forum, Control external process from inside MIDAS 
> Rather than investing time to re-invent the wheel here, better try to modify your EPICS driver process to become a midas process.

I am with Stefan on this. Quite a bit of work went into the tmfe c++ framework to make it easy/easier to do 
this - take an existing standalone c/c++ program and midas-ize it: in main(), "just add" calls to connect to 
midas and to start the midas threads - rpc handler, watchdog, etc.

Alternatively, one can write a midas "stdout+stderr bridge", and start your standalone program
from the programs page like this:

myprogram |& cm_msg_bridge --name "myprogram" (redirect both stdout and stderr to cm_msg_bridge stdin)

cm_msg_bridge would read lines from stdin and pass them to cm_msg(). It would connect to midas using the name "myprogram" 
to make it show "green" on the status page and it would be stoppable from the programs page.

care will need to be taken for myprogram to die cleanly when stdout and stderr are closed after cm_msg_bridge 
exits.

K.O.
Entry  07 May 2020, Estelle, Bug Report, Conflict between Rootana and midas about the redefinition of TID_xxx data types  
Dear Midas and Rootana people,

We have tried to update our midas DAQ with the new TID definitions described in https://midas.triumf.ca/elog/Midas/1871 

And we have noticed an incompatibility of these new definitions with Rootana when reading an XmlOdb in our offline analyzer. 

The problem comes from the function FindArrayPath in XmlOdb.cxx and the comparison of bank types as strings.
Ex: comparing the strings "DWORD" and "UINT32"

A naive solution would be to print the number associated with the type (ex: '6' for DWORD/UINT32), but that would mean changing Rootana and Midas source code. Moreover, it would decrease the readability of the XmlOdb file. 
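
Another option, sketched here only as an illustration, is to normalise the type names before comparing them, so that old and new spellings compare equal while the XML stays readable. Only the DWORD/UINT32 pair comes from this report; the other pairs in the table are my assumption about the renaming and should be checked against midas.h. Both function names are hypothetical and not part of Rootana.

```cpp
#include <string>
#include <unordered_map>

// Map legacy TID names onto the newer spelling; unknown names pass through.
std::string canonical_type_name(const std::string &name) {
   // Assumed old -> new pairs; verify against midas.h before relying on them.
   static const std::unordered_map<std::string, std::string> legacy = {
      {"BYTE", "UINT8"},  {"SBYTE", "INT8"},   {"WORD", "UINT16"},
      {"SHORT", "INT16"}, {"DWORD", "UINT32"}, {"INT", "INT32"},
   };
   const auto it = legacy.find(name);
   return it == legacy.end() ? name : it->second;
}

// Two type names match if they denote the same type under either naming.
bool same_type_name(const std::string &a, const std::string &b) {
   return canonical_type_name(a) == canonical_type_name(b);
}
```

With this, a comparison of "DWORD" against "UINT32" succeeds in both directions, while "DWORD" against "INT32" still fails.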


Thanks for your time.
Estelle
    Reply  20 May 2020, Konstantin Olchanski, Bug Report, Conflict between Rootana and midas about the redefinition of TID_xxx data types  
> Dear Midas and Rootana people,
> 
> We have tried to update our midas DAQ with the new TID definitions described in https://midas.triumf.ca/elog/Midas/1871 
> 
> And we have noticed an incompatibility of these new definitions with Rootana when reading an XmlOdb in our offline analyzer. 
> 
> The problem comes from the function FindArrayPath in XmlOdb.cxx and the comparison of bank types as strings.
> Ex: comparing the strings "DWORD" and "UINT32"
> 
> A naive solution would be to print the number associated with the type (ex: '6' for DWORD/UINT32), but that would mean changing Rootana and Midas source code. Moreover, it would decrease the readability of the XmlOdb file. 
> 

Hi, it is unfortunate that a change was made in MIDAS that is incompatible with existing analysis software. I shall update the ROOTANA package to deal with this ASAP.

K.O.