Back Midas Rome Roody Rootana
  Midas DAQ System, Page 119 of 131  Not logged in ELOG logo
ID Date Authorup Topic Subject
  2358   22 Mar 2022 Stefan RittInfoNew midas sequencer version
After several days of testing in various experiments, the new sequencer has
been merged into the develop branch. One more feature was added. The path to
the ODB can now contain variables which are substituted with their values.
Instead writing

ODBSET /Equipment/XYZ/Setting/1/Switch, 1
ODBSET /Equipment/XYZ/Setting/2/Switch, 1
ODBSET /Equipment/XYZ/Setting/3/Switch, 1

one can now write

LOOP i, 3
   ODBSET /Equipment/XYZ/Setting/$i/Switch, 1

Of course it is not possible for me to test any possible script. So if you 
have issues with the new sequencer, please don't hesitate to report them 
back to me.

  2360   22 Mar 2022 Stefan RittBug Fixfix for event buffer corruption in bm_flush_cache()
Thanks Konstantin for your detailed description. 

I wonder why we never saw this problem at PSI. Here is the reason: In multil-threaded environments, we never call bm_send_event() directly 
from all threads (since in the old days nothing was thread safe in midas). Instead, we use a collector thread which gets all events via the 
rb_xxx functions from the individual readout threads. This is well integrated into the mfe.cxx framework. Look at examples/mtfe/mfte.cxx. 
Each thread does (simplified):

while (true) {
  do {
     status = rb_get_wp(&pevent);
  } while (status == DB_TIMEOUT)

  bm_compose_event_threadsafe(pevent, ..., &serial_number);
  ... fill event ...

  rb_increment_wp(sizeof(EVENT_HEADER) + pevent->data_size);

The framework now collects all these events in receive_trigger_event() which runs in the main thread:

  for (i=0 ; i<n_thread ; i++) {
     rb_get_rp(i, pevent);
     if (pevent->serial_number == prev_serial+1)
  prev_serial = pevent->serial_number;
  rb_increment_rp(sizeof(EVENT_HEADER) + pevent->data_size);

This code ensures that all events are in the right sequence (before the serial numbers where mixed up) and that all events are sent only 
from a single thread, so the write buffer can be used effectively without complicated multi-thread locks.

This solution works nicely at PSI since many years, maybe you should put some thought to use it in your tmfe framework in Alpha-g as well 
instead of struggling with all your locks.

  2366   24 Mar 2022 Stefan RittBug Fixmhttpd bug fixed
> 1st wget stops (by ctrl-C), socket is closed, mongoose frees it's mg_connection object
> (corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)
> 2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
> but malloc() gives it back the same memory we just freed(), and the 1st wget's worker thread
> nc pointer is no longer stale, but points to 2nd wget's connection.

Why don't we CLEAR the memory (memset(object,0,sizeof(object)) before the free(), this way it cannot be 
mistakenly re-used by the next thread.

  2367   24 Mar 2022 Stefan RittBug Fixfix for event buffer corruption in bm_flush_cache()
> > ... instead of struggling with all your locks.
> it is better to have midas fully thread safe. ODB has been so for a long time,
> event buffer partially (except for this bug), now fully.
> without that the problem still exists, because in many frontends,
> bm_flush_buffer() is called from the main thread, and will race
> against the "bm_send_event() thread". Of course you can do
> everything on the main thread, but this opens you to RPC timeouts
> during run transitions (if you sleep in bm_wait_for_free_space()).

Just for the record: in the mfe.cxx framework both bm_send_event() and 
bm_flush_buffer() are called from the main thread, as can be seen in the 
midas/examples/mtfe/mtfe.cxx example.

But I agree that having all buffer operations thread safe is a clear benefit.

  2371   24 Mar 2022 Stefan RittBug Fixmhttpd bug fixed
I see, now I understand.

As for the browser cache problem: This Chrome extension is your friend:

I use it all the time I change the CSS or a JS file. Having the "Developer Tools" open in Chrome helps as well 
(cache is then turned off). Firefox has similar extensions.

  2372   24 Mar 2022 Stefan RittBug Reportdata missing in runXXXXXX.mid
One idea: we should have a look at mlogger::close_channels(). There the SYSTEM buffer is emptied through the cm_yield() call. Instrumenting this with some debugging code will enlighten us. 

Another possible problem: If the frontend requested to be notified for a run stop AFTER the logger, then the problem might happen: Logger closes file, and THEN the frontend flushes events ending up in the SYSTEM buffer and being logged at the beginning of the next run. The mfe.cxx framework takes care of this by calling 

cm_register_transition(TR_SOP, 500);

while the mlogger does 

cm_register_transition(TR_STOP, tr_stop, 800);

and since 800 > 500 the logger will be called AFTER the frontend. If one use a framework different from mfe.cxx, this could however be different.

  2380   31 Mar 2022 Stefan RittSuggestionMaximum ODB size
Anybody some idea what the maximum ODB size can be? In the old days, the linux 
kernels had a severe limit on shared memory of usually 8MB, but in the age of 
64GB RAM being a standard, we should be able to grow bigger. Tried

odbinit -s 1024MB --cleanup

which went through without complain, even put that value in to .ODB_SIZE.TXT, but 
when I started odbedit doing "mem", I only see a size of 1MB. Probably somewhere 
deep inside we have a limit which prevents the user to create very large ODBs, 
but this should be mentioned more prominently in odbinit. Like "size too large, 
maximum allowd is xxx MB".

  2383   13 Apr 2022 Stefan RittInfoODB JSON support
> Per xkcd, there is a new json standard "json5". In addition to other things, numeric
> values NaN, +Infinity and -Infinity are encoded as literals NaN, Infinity and -Infinity (without quotes):

Just for curiosity: Is this implemented by the midas json library now?
  2385   15 Apr 2022 Stefan RittInfoNew midas sequencer version
I prepared some slides about the new features of the sequencer and post it here so 
people can have a quick look at get some inspiration.

Attachment 1: sequencer.pdf
sequencer.pdf sequencer.pdf sequencer.pdf sequencer.pdf sequencer.pdf sequencer.pdf sequencer.pdf sequencer.pdf
  2394   02 May 2022 Stefan RittInfoadded web page for "mdump"
Here are some of my thoughts:

- I volunteer to write the JavaScript midas bank decoder. Just a couple of pure javascript functions, no 
midasio.cxx library needed.

- If different javascript connections "steal" events from each other, I would not be concerned. Actually I 
would rather like that all connections see the SAME event. So mhttpd keeps one event, serves it to all 
links, so displays are consistent. If a browser wants to see the "next" event, it send the old serial 
number and days "please send next event AFTER serial number". If the serial number is larger than the 
event in the buffer, mhttpd fetches a new event and puts it into its buffer.

- Since javascript connections are connectionless, I would rather pass event_id and trigger_mask with each 
request. Then mhttpd can retrieve events until event_id and trigger_mask match, then serve that event. 
Since reading events from a midas buffer is fast (many 10'000s of events per second), the won't be much of 
a delay.

- GET_ALL does not make sense for browsers, you don't want to slow down any frontend. If someone wants to 
do histogramming in the browser, then GET_SOME (which is kind of GET_OLD) would make sense, but most of 
the cases we have some single event display, and there a GET_RECENT is most appropriate.
  2395   04 May 2022 Stefan RittInfoadded web pages for "show odb clients" and "show open records"
Concerning the "scl" page, we are currently having a discussion. At the moment, one can 
see midas clients in three different places:

1) the main status page at the bottom, only names and hosts are there

2) the programs page, where one can also start/stop program

3) now the new page "Show ODB clients" in the ODB editor page, which shows also the 
alive status, PID and timeout

I'm thinking that three locations are two too much, so we are considering to merge the 
tree pages into one. That would mean that 1) goes away, and the "Programs" page will 
show more information. We have some rare cases that programs are removed from 
/System/Clients in the ODB but still attached to the ODB. For those "zombies" we would 
add a "hard kill" function.

I would like to hear feedback from the midas community before we proceed with the 
plans. Anybody desperately in need of the programs shown on the status page?

  2397   06 May 2022 Stefan RittInfoIncreased timeout for program shut down
We had the problem in our lab that a frontend took about 6 seconds to gracefully 
shut down, mainly it needed to park some motors. I found that the shutdown command 
had a hard-coded timeout of 5 seconds, after which the frontend gets killed, and 
cannot finish the park operation. I change the code so that the client timeout 
stored in the ODB is taken instead of the hard-coded 5 seconds. This allows each 
client to fine-tune its timeout, to allow graceful shutdown, but also not let the 
user wait too long if the client gets stuck and needs a hard kill.

The default timeout for mfe.cxx based frontends has been changed to 10 seconds 
now, but in the frontend_init function this can be changed by the user code 

I hope this char does not trigger any bad side effects, but if it does, please 
report here.

  2398   08 May 2022 Stefan RittInfoRO_STOPPED with triggered events
We had issues in one of our experiment that people used RO_STOPPED in the 
equipment list together with triggered events (EQ_USER). If events are sent when 
a run is stopped, this leads to many unexpected results, so I added a check in 
the mfe.cxx code which prevents RO_STOPPED (or RO_ALWAYS which includes 
type of events.

I got now complaints that some old front-end are not running any more since they 
do use RO_ALWAYS together with triggered events. Can the author of these frontend 
please tell me the rationale why this is needed, then I can maybe add a better 
fix for that.

  2406   17 May 2022 Stefan RittInfoRO_STOPPED with triggered events
> > > some old front-end are not running any more since they do use RO_ALWAYS together with 
> > triggered events.
> > 
> > I confirm, if you have mfe.c frontends that have RO_ALWAYS, after you update MIDAS, 
> > some of these frontends will fail to start.
> >
> > 
> > tmfe c++ frontends do not have this restriction but by default only read data when run 
> > is active (per-equipment fEqConfReadOnlyWhenRunning default is true).
> As of commit 
> mfe.c 
> frontends will no longer fail to start. an error will still be issued "Equipment \"%s\" 
> contains RO_STOPPED or RO_ALWAYS. This can lead to undesired side-effect and should be 
> removed."
> BTW 1:
> Some of our old frontends use EQ_MULTITHREAD to implement multithreaded periodic equipments. 
> They do not generate any events when there is no run (some of them do not generate any 
> events at all). Now they will start printing this error message, for no reason. (no we will 
> not be rewriting them justy to get rid of this message. life is too short).
> BTW 2:
> the c++ tmfe frontend does not have any protections against these "undersired side-effects".
> What are these undesired side effects and should we add protection against them?
> K.O.

The undesired side-effects are the following: The logger tries to collect all events at the end of 
the run by emptying the SYSTEM buffer. If events keep coming after the run is stopped, this loop in 
the logger might be an endless loop, crashing the whole experiment in the end. 

Another issue (and actually the reason for this change) is the funciton receive_trigger_event() in 
mfe.cxx which will get confused if events are still coming in after a run has been stopped and 
actually enters an infinite loop.

Combining EQ_MULTITHREAD with EQ_PERIODIC or EQ_SLOW is a wrong parameter combination as written in 
the documentation. If one wants to have multi-threaded slow control events, one has to use the 
DF_MULTITHREAD flag in the DEVICE_DRIVER structure.

Having triggered events being sent to the system after a run has been stopped I would consider 
simply wrong. Why should we ever use a run start/stop if events are always flowing? Adding 
protections in all places for this case is certainly much more work than just changing one flag for 
frontends which produce this error message now for a wrong parameter combination.
  2413   20 Jun 2022 Stefan RittForumAlarm on variable not updating
There are two functions to do that, one check the last write access, the other the last write access if the run is running. The alarm condition looks like:

access(/Equipment/.../Variables/Input[10]) > 60

which will cause an alarm if the Input[10] is not written for more than 60 seconds. The other function which checks the run status as well is like:

access_running(...odb key...) > 60

You can actually see an example on the MEG alarm page.

Rather than having an alarm for that I would however recommend that you program you frontend such that it realizes if it looses connections, then tries automatically to reconnect or trigger an alarm itself (so-called "internal" alarm). This is also how the MSCB system is working and is much more robust.

  2417   05 Aug 2022 Stefan RittInfoInformation for midas updates though git
Several submodules of midas have been re-organized, so if you want to pull the 
newest version, you need a 

git pull --recurse-submodules
git submodule update --init --recursive

before you can build again. To do this automatically the next time, you can do

git config submodule.recurse true

which needs git 2.14 or later. I hope this works for everybody. If there is a 
better way to do that (I'm not a big expert on git) please reply here.

  2418   06 Aug 2022 Stefan RittInfoImprovement of odbxx API
While the odbxx API has been successfully used since the last months, a potential 
problem with large ODBs surfaced. If you have lots of data in the ODB and load it 
into an object like

midas::odb o("/Equipment");

this might take quite long, since each ODB value is fetched separately, which is 
very quick on a local machine but can take long over a client-server connection. 
For large experiments this can take up to minutes (!).

To get rid of this problem, the underlying object model has been modified. When an 
object is instantiated like above, then the whole ODB tree is fetched in an XML 
buffer in a single transfer, which even for large ODBs usually takes much less 
than a second. Then the XML buffer is decomposed on the client side and converted 
into the proper midas::odb objects. In one case this gave an improvement from 35 
seconds to 0.5 seconds which is significant. To enable the new method, the object 
can be created with a flag like

midas::odb o("/Equipment", true);

which then switches to the new method. One has to take care not to fool oneself 
(like I did) by printing the object like

midas::odb o("/Equipment", true);
std::cout << o << std::endl;

because each read access to any sub-object of o causes a separate read request to 
the server which again can take long. Therefore, one has to switch off the auto 
refresh via

midas::odb o("/Equipment", true);
std::cout << o << std::endl;

Accessing any sub-object of o then does not cause a client-server request, which 
is not necessary if all objects just have been pulled from the server before. If 
one keeps the object however for a long time in memory, one has to be aware that 
it only contains "old" values from the time if instantiation. If one needs more 
current ODB values, the auto read refresh has to be turned on again.

  2419   08 Aug 2022 Stefan RittInfoImprovement of odbxx API
After some thought, I changed the API again and removed the flag in the constructor,
so the system now automatically choses the best algorithm depending if the client
is connected to a local or a remote API. So in all cases you use again the old syntax:

midas::odb o("/Equipment");

  2423   08 Aug 2022 Stefan RittInfoInformation for midas updates though git
> after I set "submodule.recurse true", I still have to type "git submodule update --
> init --recursive", without --recursive, mscb/mxml is empty and the build bombs.

Indeed, doesn't work for me either. If some git guru has some more insight, please post 

> P.S. the underlying issue is that the mxml submodule is now included twice 
> (midas/mxml and midas/mscb/mxml) and there is nothing to enforce that both copies are 
> the same. (No idea what happens if the two mxml's are different).

The version of each mxml is defined by last commit of the parent repository, which contains 
the hash of the submodule version. If we have to update mxml for some reason, we have to 
commit also mscb with the new version, and then midas with the same version of mxml. If one 
checks out midas then with 

git clone --recursive

one gets the same versions for mxml.
  2425   15 Aug 2022 Stefan RittBug Reportfirefox hangs due to mhistory
> Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back in control I have the following message in the console
> 17:21:28.821 Script terminated by timeout at:
> MhistoryGraph.prototype.drawTAxis@
> MhistoryGraph.prototype.draw@
> mhistory.js:2828:7
> Any ideas how to resolve this??

I have to reproduce the problem. Can you send me the full URL from your browser when you see that problem? Probably you have some "special" axis limits, so we don't see that 
problem anywhere else.

ELOG V3.1.4-2e1708b5