Back Midas Rome Roody Rootana
  Midas DAQ System, Page 1 of 145  Not logged in ELOG logo
IDdown Date Author Topic Subject
  2925   19 Dec 2024 Stefan RittSuggestionNew alarm count added
Another modification has been done to the alarm system. 

We have often cases where an alarm is triggered on some readout noise. Like an analog voltage just over the alarm threshold for a very short period of time, triggered sometimes from environmental 
electromagnetic effects like turning on the light. 

To mitigate this problem, an "alarm trigger count" has been implemented. Every alarm has now a variable "Trigger count required". If this value is zero (the default), the alarm system works as before. If this 
value is nowever set to a non-zero value N, the alarm limit has to be met on N consecutive periods in order to trigger the alarm. Each alarm has a "Check interval" which determines the period of the alarm 
checking. If one has for example:

Check interval = 10
Trigger count required = 3

then the alarm condition has to be met for 3 consecutive periods of 10 seconds each, or for 30 seconds in total. 

The modification has been merged into the develop branch, and people have to be aware that the alarm structures in the ODB changed. The current code tries to fix this automatically, but it is important 
that ALL MIDAS CLIENTS GET RE-COMPILED after the new code is applied. Otherwise we could have "new" clients expecting the new ODB structure, and some "old" clients expecting the old structure, 
then both types of clients would fight against each other and change the ODB structure every few seconds, leading to tons of error messages. So if you pull the current develop branch, make sure to re-
compile ALL midas clients.

/Stefan
  2924   19 Dec 2024 Stefan RittBug Report[History plots] "Jump to current time" resets x range to 7d
I had put in a check which limits the range to 7d into the past if you press the "play" button, but now I'm not sure why this was needed. I removed it again and 
things seem to be fine. Change is committed to develop.

Stefan
  2923   17 Dec 2024 Lukas GerritzenBug Report[History plots] "Jump to current time" resets x range to 7d
To reproduce:
- Open a history plot, click [-] a few times until the x axis shows more than 7 days.
- Scroll to the past (left)
- Click "Jump to current time" (the triangle)

Expected result:
The upper limit of the x axis is at the current time and the lower range is now - whatever range you had before 
(>7d)

Actual result:
The upper limit is the current time, the lower limit is now - 7d

(The interval seems unchanged if the range was < 7d before clicking "Jump to current time")
  2922   13 Dec 2024 Marius KoeppelInfoNew Feature: Message Search
Dear all,

a new feature was implemented which allows to search the log messages in MIDAS. Attached one can find a more detailed explanation of how to use the feature.

If you see any issues / bugs feel don't hesitate to report them. For now the code was tested on Linux / Mac OS using Chrome, Firefox and Safari.

Best,
Marius 
  2921   12 Dec 2024 Stefan RittSuggestionNew alarm sound flag to be tested
We had the case in MEG that some alarms were actually just warnings, not very severe. This happens for example if we calibrate our detector 
once every other day and modify the hardware which actually triggers the alarm for about an hour or so.

The problem with this is now that the alarm sounds every few minutes, and people get annoyed during that hour. They turn down the volume 
of their speakers, or even disable the alarm sound. If the detector gets back into the default mode again, they might forget to re-enable the 
alarm, which causes some risk. 

Turning down the volume is also not good, since during that hour we could have a "real" alarm on which people have to react quickly in order 
not destroy the detector.

The art is now to configure the alarm system in a way that "normal" changes do not annoy people or cover up really severe alarms. After long 
discussions we came to following conclusion: We need a special class of alarm (let's call it 'warning') which does not annoy people. The 
warning should be visible on the screen, but not ring the alarm bell. 

While we have different alarm classes in midas, which let us customize the frequency of alarms and the screen colors, all alarms or warnings 
ring the alarm sound right now. This can be changed in the browser under "Config/Alarm sound" but that switch affects ALL alarms, which is 
not what we want.

The idea we came up with was to add a flag "Alarm sound" to the alarm classes. For the 'warning' we can then turn off the alarm sound, so 
only the banner is shown on top of the screen, and the spoken message is generated every 15 mins to remind people, but not to annoy them.

I added this "Alarm sound" flag in the branch feature/alarm_sound so everybody can test it. The downside is that all "Alarm/Classs/xxx" need 
to be modified to add this flag. While the new code will add this flag automatically (with a default value of 'true'), the size of the alarm class 
record changes by four bytes (one bool). Therefore, all running midas programs will complain about the changed size, until they get 
recompiled. 

Therefore, to test the new feature, you have to checkout the branch and  re-compile all midas programs you use, otherwise you will get errors 
like 

  Fixing ODB "/Alarms/Classes/Alarm" struct size mismatch (expected 352, odb size 348)

I will keep the branch for a few days for people to try it out and report any issue, and later merge it to develop.

Stefan
  2920   10 Dec 2024 Stefan RittSuggestionComma-separated indices in alarm conditions
These kind of alarm conditions have been implemented and committed. The documentation at 

  https://daq00.triumf.ca/MidasWiki/index.php/Alarm_System

has been updated.

/Stefan
  2919   06 Dec 2024 Konstantin OlchanskiBug ReportTMFE::Sleep() errors
> Commit [develop 06735d29] improve TMFE::Sleep()

Report of test_sleep on Ubuntu-22 (Intel E-2236) and MacOS 14.6.1 (M1 MAX Mac Studio).

It is easy to see Ubuntu-22 (kernel 6.2.x) sleep granularity is ~60 usec, MacOS sleep granularity is <2 usec. (Sub 1 usec sleep likely measures the syscall() speed, 500 ns on Intel and 200 
ns on ARM M1 MAX). (NOTE: long sleep is interrupted by an alarm roughly 10 seconds into the sleep, see progs/test_sleep.cxx)

daq00:midas$ uname -a
Linux daq00.triumf.ca 6.2.0-39-generic #40~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 16 10:53:04 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
daq00:midas$ ./bin/test_sleep 
test short sleep:
sleep      10 loops,   100000.000 usec per loop,  1.000000000 sec total,  1.002568007 sec actual total,   100256.801 usec actual per loop, oversleep  256.801 usec,    0.3%
sleep     100 loops,    10000.000 usec per loop,  1.000000000 sec total,  1.025897980 sec actual total,    10258.980 usec actual per loop, oversleep  258.980 usec,    2.6%
sleep    1000 loops,     1000.000 usec per loop,  1.000000000 sec total,  1.169670105 sec actual total,     1169.670 usec actual per loop, oversleep  169.670 usec,   17.0%
sleep   10000 loops,      100.000 usec per loop,  1.000000000 sec total,  1.573357105 sec actual total,      157.336 usec actual per loop, oversleep   57.336 usec,   57.3%
sleep   99999 loops,       10.000 usec per loop,  0.999990000 sec total,  6.963442087 sec actual total,       69.635 usec actual per loop, oversleep   59.635 usec,  596.4%
sleep 1000000 loops,        1.000 usec per loop,  1.000000000 sec total, 60.939687967 sec actual total,       60.940 usec actual per loop, oversleep   59.940 usec, 5994.0%
sleep 1000000 loops,        0.100 usec per loop,  0.100000000 sec total,  0.613572121 sec actual total,        0.614 usec actual per loop, oversleep    0.514 usec,  513.6%
sleep 1000000 loops,        0.010 usec per loop,  0.010000000 sec total,  0.576359987 sec actual total,        0.576 usec actual per loop, oversleep    0.566 usec, 5663.6%
test long sleep: requested 126.000000000 sec ... sleeping ...
alarm!
test long sleep: requested 126.000000000 sec, actual 126.000875950 sec

bash-3.2$ uname -a
Darwin send.triumf.ca 23.6.0 Darwin Kernel Version 23.6.0: Mon Jul 29 21:14:30 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T6000 arm64
bash-3.2$ ./bin/test_sleep 
test short sleep:
sleep      10 loops,   100000.000 usec per loop,  1.000000000 sec total,  1.032556057 sec actual total,   103255.606 usec actual per loop, oversleep 3255.606 usec,    3.3%
sleep     100 loops,    10000.000 usec per loop,  1.000000000 sec total,  1.245460033 sec actual total,    12454.600 usec actual per loop, oversleep 2454.600 usec,   24.5%
sleep    1000 loops,     1000.000 usec per loop,  1.000000000 sec total,  1.331466913 sec actual total,     1331.467 usec actual per loop, oversleep  331.467 usec,   33.1%
sleep   10000 loops,      100.000 usec per loop,  1.000000000 sec total,  1.281141996 sec actual total,      128.114 usec actual per loop, oversleep   28.114 usec,   28.1%
sleep   99999 loops,       10.000 usec per loop,  0.999990000 sec total,  1.410759926 sec actual total,       14.108 usec actual per loop, oversleep    4.108 usec,   41.1%
sleep 1000000 loops,        1.000 usec per loop,  1.000000000 sec total,  2.400593996 sec actual total,        2.401 usec actual per loop, oversleep    1.401 usec,  140.1%
sleep 1000000 loops,        0.100 usec per loop,  0.100000000 sec total,  0.188431025 sec actual total,        0.188 usec actual per loop, oversleep    0.088 usec,   88.4%
sleep 1000000 loops,        0.010 usec per loop,  0.010000000 sec total,  0.188102007 sec actual total,        0.188 usec actual per loop, oversleep    0.178 usec, 1781.0%
test long sleep: requested 126.000000000 sec ... sleeping ...
alarm!
test long sleep: requested 126.000000000 sec, actual 126.001244068 sec

K.O.
  2918   06 Dec 2024 Konstantin OlchanskiBug ReportTMFE::Sleep() errors
> >       status = select(1, &fdset, NULL, NULL, &timeout);
> >       On error, -1 is returned, ... timeout becomes undefined.

I confirm Linux and MacOS man pages and select() with EINTR work as I remember, Linux updates "timeout" to account for the 
time already slept, MacOS does not ("timeout" is unchanged).

So the original code is roughly correct, but long sleeps will not work right if SIGALRM fires during sleeping.

Note that MIDAS no longer uses SIGALRM to fire cm_watchdog() (it was moved to a thread) and MIDAS does not use signals,
so handling of EINTR is now moot.

(Please correct me if I missed something).

The original bug report was about EINVAL, and best I can tell, it was caused by calls to TMFE::Sleep()
with strange sleep times that caused invalid values to be computed into the select() timeout.

To improve on this, I make these changes:

1) TMFE::Sleep(0) will report an error and will not sleep
2) TMFE::Sleep(negative number) will report an error and will not sleep

(please check the sleep time before calling TMFE::Sleep())

3) TMFE::Sleep(1 sec or less) will sleep using select(). (I also looked into using poll(), ppoll() and pselect()).
4) TMFE::Sleep(more than 1 second) will use a loop to sleep in increments of 1 second and will use one additional syscall to 
read the current time to decide how much more to sleep.

5) if select() returns EINVAL, the error message will reporting the sleep time and the values in "timeout".

A side effect of this is that on both Linux and MacOS long sleeps work correctly if interrupted by SIGALRM,
because SIGALRM granularity is 1 sec and our sleep time is also 1 sec.

Commit [develop 06735d29] improve TMFE::Sleep()

K.O.
  2917   06 Dec 2024 Stefan RittInfoNew slow control framework "mdev"
A new slow control mini-framework has been developed for MIDAS and been successfully tested in the Mu3e experiment. It 
might be suited for other experiments as well.

Background

Since the late 90’s we have the three-tier bases slow control framework in MIDAS with class drivers, device drivers and bus 
drivers. While it was used successfully since many years, it is complicated to understand and limited in its flexibility. If we 
have a HV device with a demand value, a measured voltage and a current it’s fine, but if we want to control more things like 
trip voltage, temperature and status readout etc. it soon hits its limits. With the development of the new odbxx API 
(https://daq00.triumf.ca/MidasWiki/index.php/Odbxx) there is now an opportunity to make everything much simpler.

Design principles

Instead of a three-tier system, the new “mdev” framework (“m”idas “dev”ices) uses a simple base class which is attached to 
a certain MIDAS equipment. It implements five simple functions:

- odb_setup() to setup /Equipment/<name>/Settings and /Equipment/<name>/Variables to its desired structure

- init() to initialize the slow control device

- exit() to close the connection to the device

- loop() which is called periodically to read the device

- read_event() which returns a MIDAS event going to the data stream

A device driver inherits from this base class and implements the functions. A simple example can be found in 

  midas/drivers/mdev/mdev_mscb.[h,cxx]

for the MSCB field bus system used at TRIUMF and PSI. It basically boils down to two calls:

Init:
   m_variables.connect(“/Equipment/<name>/Variables”);
   m_variables[“Output”].watch(midas::odb &o) {
      m_mscb[“HV”] = o[0]; // transfer value from ODB to MSCB device
   }

Reading a value in the loop function:
   m_variables[“Input”][0] = m_mscb[“HVMeas”];

The member variable m_variables is a midas::odb variable attached to the “Input” and “Output” variables in the ODB. The 
watch() functions executes the lambda function whenever the “Output” in the ODB changes. It then simply transfers the new 
value to the device. The reading of measured values just work in the other direction from the device to the ODB.

If you look at the mdev_mscb.cxx code, you see of course some more things like connecting to the MSCB device with proper 
error handling, looping over several devices and variables, setting up the “Setting” directory in the ODB to define labels for 
all variables. In addition we have a mirror for output variables, so that new values are only sent to the device if they differ 
from the previous variable (needed to reduce some communication traffic). 

The midas/drivers/mdev directory contains also an example frontend in the mfe.cxx framework, but this is no a requirement. 
The mdev framework can also be used in the tmfe framework and others as well. Please note how compact the frontend 
code now looks.

User interface

Since the beginning, MIDAS allows access to the the slow control devices through the “equipment” page (on the main status 
page, click on one equipment). A few more options can control now the behavior of this page, allowing quite some flexibility 
without having to write a dedicated custom page (which of course can still be done). Attached is an example from Mu3e where 
the details of the equipment display are controlled through some options in the setting subdirectory as described in 
https://daq00.triumf.ca/MidasWiki/index.php//Equipment_ODB_tree (especially the “grid display”, “Editable” and “Format” 
flags).

Conclusions

The new “mdev” framework offers a compact and effective way to communicate from MIDAS to slow control devices. Since 
all interface code is now not “hidden” any more in system class and device drivers, the user has much higher flexibility in 
controlling different devices. If a device has a new parameter, the user can add a single line of code to connect this 
parameter to an ODB entry.

The framework is very simple and misses some features of the old system. Ramping of HV voltages and current trips are not 
available in the framework (like with the old HV class driver), but modern devices usually implement this in hardware which 
is much better. The new framework is not multi-threaded, but modern devices are these day much faster than in the ‘90s. 
Since the ODB is thread save, nothing prevents us from putting a device readout into its own thread in the frontend.

We will use the new system for all devices in Mu3e, with probably some new features being added soon, so stay tuned.

/Stefan
  2916   05 Dec 2024 Konstantin OlchanskiBug ReportODB key picker does not close when creating link / Edit-on-run string box too large
> > Another more minor visual problem is the edit-on-start dialog. There seems to be no upper bound to the 
> > size of the text box. In the attached screenshot, ShortString has a maximum length of 32 characters, 
> > LongString has 255. Both are empty at the time of the screenshot. Maybe, the size should be limited to a 
> > reasonable width.
> 
> I limited the input size now to (arbitrarily) 100 chars. The string can still be longer than 100 chars, and you start then scrolling inside the input box. Let me know if 
> that's ok this way.

I am moving the dragon experiment to the new midas and we see this problem on the begin-of-run page.

Old midas: no horizontal scroll bar, edit-on-start names, values and comments are all squeezed in into the visible frame.

New midas: page is very wide, values entry fields are very long and there is a horizontal scroll bar.

So something got broken in the htlm formatting or sizing. I should be able to spot the change by doing
a diff between old resources/start.html and the new one.

K.O.
  2915   04 Dec 2024 Konstantin OlchanskiBug ReportODB key picker does not close when creating link / Edit-on-run string box too large
> > Actual result:
> > The key picker does not close.
> 
> Thanks for reporting that bug. It has been fixed in the current commit (installed already on megon02)
>

Stefan, thank you for fixing both problems, I have seen them, too, but no time to deal with them.

K.O.
  2914   02 Dec 2024 Stefan RittForum"Safe" abort of sequencer scripts
The atexit() function has been implemented in the current develop branch of midas, see

  https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#ATEXIT_subroutine


Stefan
  2913   02 Dec 2024 Stefan RittBug ReportODB key picker does not close when creating link / Edit-on-run string box too large
> Another more minor visual problem is the edit-on-start dialog. There seems to be no upper bound to the 
> size of the text box. In the attached screenshot, ShortString has a maximum length of 32 characters, 
> LongString has 255. Both are empty at the time of the screenshot. Maybe, the size should be limited to a 
> reasonable width.

I limited the input size now to (arbitrarily) 100 chars. The string can still be longer than 100 chars, and you start then scrolling inside the input box. Let me know if 
that's ok this way.

Stefan
  2912   02 Dec 2024 Stefan RittBug ReportODB key picker does not close when creating link / Edit-on-run string box too large
> Actual result:
> The key picker does not close.

Thanks for reporting that bug. It has been fixed in the current commit (installed already on megon02)

Stefan
  2911   01 Dec 2024 Pavel MuratBug ReportEQ_PERIODIC-only equipment ?
> There is no requirement that you pair an EQ_PERIODIC with an EQ_TRIGGER. Take for exmaple
> 
>   midas/examples/experiment/frontend.cxx
> 
> and remove there the triggered event. The frontend runs happily with the periodic event only (I just tried that myself). You have probably some problem in 
> your event definition. Start with the running example frontend, and add your code line by line until you see the problem.

Hi Stefan, thank you very much! 

As the pointer to the readout function and pointers to device drivers are all defined in the same structure (EQUIPMENT), 
I was naively assuming that the readout function should be set during the class driver initialization.
Now it is clear that the equipment responding to EQ_PERIODIC events doesn't have to have drivers, 
and specifying the readout function is the responsibility of the user.

I got around exactly this way yesterday, but was thinking that I was hacking the system :)
 
-- regards, Pasha
  2910   01 Dec 2024 Stefan RittBug ReportEQ_PERIODIC-only equipment ?
There is no requirement that you pair an EQ_PERIODIC with an EQ_TRIGGER. Take for exmaple

  midas/examples/experiment/frontend.cxx

and remove there the triggered event. The frontend runs happily with the periodic event only (I just tried that myself). You have probably some problem in 
your event definition. Start with the running example frontend, and add your code line by line until you see the problem.

Stefan
  2909   30 Nov 2024 Pavel MuratBug ReportEQ_PERIODIC-only equipment ?
Dear Midas experts, 

I'm running into something which looks like an initialization problem. 
I have a mfe.cxx-style frontend which introduces an equipment of the EQ_PERIODIC type (EQ_PERIODIC-only!). 
When Midas enters the running state, I see the frontend crashing. 
Stepping through the code shows that the frontend is crashing because its equipment has been ignored 
by the initialize_equipment@mfe.cxx - see
 
https://bitbucket.org/tmidas/midas/src/5d0dae001712164ae43137dced2fbbb594f0201e/src/mfe.cxx#lines-630

Is there an assumption that the initialization of the EQ_PERIODIC-only equipment is the user responsibility? 
Or EQ_PERIODIC should always come paired with some other type?

-- many thanks, regards, Pasha
  2908   27 Nov 2024 Konstantin OlchanskiBug ReportTMFE::Sleep() errors
>       status = select(1, &fdset, NULL, NULL, &timeout);
>
>       On error, -1 is returned, ... timeout becomes undefined.

I have been reading "man select" for 30 years and I do not remember seeing this text.

I believe on BSD UNIX (MacOS) it says timeout is unchanged and on Linux is says timeout is updated to time actually slept.

I will have to investigate, but I suspect the man page was posix-ized, by sweeping BSD/MacOS and Linux implementations
under the same "instead of saying what it actually does, we will just say 'undefined'".

In any case, EINTR is not an error, it's an artefact of UNIX signal handling. Linux implementations always try
very hard to handle signals without causing EINTR to select(), read() and write(). This is most painful
when reading and writing to TCP sockets, because one most handle partial reads and EINTR.

K.O.
  2907   27 Nov 2024 Konstantin OlchanskiBug ReportTMFE::Sleep() errors
> 
> I wonder also if now that midas requires stricter/newer c++ standards if there maybe
> some more straightforward method to sleep that is sufficiently robust and portable.
> 

I believe POSIX defined clock_nanosleep() & co, so on most recent machines that is the most portable way to sleep.

Historically, select() was the only way to sleep for less than 1 sec, but it was never portable
because of differences between BSD UNIX and Linux implementations. (MacOS is BSD UNIX via FreeBSD).

On difference is the update of struct timeval is select() is interrupted.

In this elog entry, I compare sleep using select() with sleep using clock_nanosleep() and see that there is no difference:
https://daq00.triumf.ca/elog-midas/Midas/2115

As you can see tmfe.cxx has both implementations, select() and clock_nanosleep(), and anybody can try which one works better on 
their computer.

K.O.
  2906   27 Nov 2024 Konstantin OlchanskiBug ReportTMFE::Sleep() errors
> [tmfe.cxx:1033:TMFE::Sleep,ERROR] select() returned -1, errno 22 (Invalid argument).

The very original copy of this function had an error and was spewing out this error quite often,
this was a missing handler for EINTR.

Now it looks like we are missing a handler for EINVAL.

Most likely sleep is called with a funny sleep time value that fills struct timeval with
values select() does not like.

I see Ben added a check for negative sleep times, and this is good.

I think I will do these changes:

a) add an error message for negative sleep time, I think user should never call ::Sleep with negative or zero sleep times and 
if they do it is a bug and they should fix it, the error message will inform them so.

b) add a handler for EINVAL, which will report the requested sleep time and the values in struct timeval

K.O.
ELOG V3.1.4-2e1708b5