ID |
Date |
Author |
Topic |
Subject |
2925
|
19 Dec 2024 |
Stefan Ritt | Suggestion | New alarm count added | Another modification has been done to the alarm system.
We have often cases where an alarm is triggered on some readout noise. Like an analog voltage just over the alarm threshold for a very short period of time, triggered sometimes from environmental
electromagnetic effects like turning on the light.
To mitigate this problem, an "alarm trigger count" has been implemented. Every alarm has now a variable "Trigger count required". If this value is zero (the default), the alarm system works as before. If this
value is nowever set to a non-zero value N, the alarm limit has to be met on N consecutive periods in order to trigger the alarm. Each alarm has a "Check interval" which determines the period of the alarm
checking. If one has for example:
Check interval = 10
Trigger count required = 3
then the alarm condition has to be met for 3 consecutive periods of 10 seconds each, or for 30 seconds in total.
The modification has been merged into the develop branch, and people have to be aware that the alarm structures in the ODB changed. The current code tries to fix this automatically, but it is important
that ALL MIDAS CLIENTS GET RE-COMPILED after the new code is applied. Otherwise we could have "new" clients expecting the new ODB structure, and some "old" clients expecting the old structure,
then both types of clients would fight against each other and change the ODB structure every few seconds, leading to tons of error messages. So if you pull the current develop branch, make sure to re-
compile ALL midas clients.
/Stefan |
2924
|
19 Dec 2024 |
Stefan Ritt | Bug Report | [History plots] "Jump to current time" resets x range to 7d | I had put in a check which limits the range to 7d into the past if you press the "play" button, but now I'm not sure why this was needed. I removed it again and
things seem to be fine. Change is committed to develop.
Stefan |
2923
|
17 Dec 2024 |
Lukas Gerritzen | Bug Report | [History plots] "Jump to current time" resets x range to 7d | To reproduce:
- Open a history plot, click [-] a few times until the x axis shows more than 7 days.
- Scroll to the past (left)
- Click "Jump to current time" (the triangle)
Expected result:
The upper limit of the x axis is at the current time and the lower range is now - whatever range you had before
(>7d)
Actual result:
The upper limit is the current time, the lower limit is now - 7d
(The interval seems unchanged if the range was < 7d before clicking "Jump to current time") |
2922
|
13 Dec 2024 |
Marius Koeppel | Info | New Feature: Message Search | Dear all,
a new feature was implemented which allows to search the log messages in MIDAS. Attached one can find a more detailed explanation of how to use the feature.
If you see any issues / bugs feel don't hesitate to report them. For now the code was tested on Linux / Mac OS using Chrome, Firefox and Safari.
Best,
Marius |
Attachment 1: filters.pdf
|
|
2921
|
12 Dec 2024 |
Stefan Ritt | Suggestion | New alarm sound flag to be tested | We had the case in MEG that some alarms were actually just warnings, not very severe. This happens for example if we calibrate our detector
once every other day and modify the hardware which actually triggers the alarm for about an hour or so.
The problem with this is now that the alarm sounds every few minutes, and people get annoyed during that hour. They turn down the volume
of their speakers, or even disable the alarm sound. If the detector gets back into the default mode again, they might forget to re-enable the
alarm, which causes some risk.
Turning down the volume is also not good, since during that hour we could have a "real" alarm on which people have to react quickly in order
not destroy the detector.
The art is now to configure the alarm system in a way that "normal" changes do not annoy people or cover up really severe alarms. After long
discussions we came to following conclusion: We need a special class of alarm (let's call it 'warning') which does not annoy people. The
warning should be visible on the screen, but not ring the alarm bell.
While we have different alarm classes in midas, which let us customize the frequency of alarms and the screen colors, all alarms or warnings
ring the alarm sound right now. This can be changed in the browser under "Config/Alarm sound" but that switch affects ALL alarms, which is
not what we want.
The idea we came up with was to add a flag "Alarm sound" to the alarm classes. For the 'warning' we can then turn off the alarm sound, so
only the banner is shown on top of the screen, and the spoken message is generated every 15 mins to remind people, but not to annoy them.
I added this "Alarm sound" flag in the branch feature/alarm_sound so everybody can test it. The downside is that all "Alarm/Classs/xxx" need
to be modified to add this flag. While the new code will add this flag automatically (with a default value of 'true'), the size of the alarm class
record changes by four bytes (one bool). Therefore, all running midas programs will complain about the changed size, until they get
recompiled.
Therefore, to test the new feature, you have to checkout the branch and re-compile all midas programs you use, otherwise you will get errors
like
Fixing ODB "/Alarms/Classes/Alarm" struct size mismatch (expected 352, odb size 348)
I will keep the branch for a few days for people to try it out and report any issue, and later merge it to develop.
Stefan |
2920
|
10 Dec 2024 |
Stefan Ritt | Suggestion | Comma-separated indices in alarm conditions | These kind of alarm conditions have been implemented and committed. The documentation at
https://daq00.triumf.ca/MidasWiki/index.php/Alarm_System
has been updated.
/Stefan |
2919
|
06 Dec 2024 |
Konstantin Olchanski | Bug Report | TMFE::Sleep() errors | > Commit [develop 06735d29] improve TMFE::Sleep()
Report of test_sleep on Ubuntu-22 (Intel E-2236) and MacOS 14.6.1 (M1 MAX Mac Studio).
It is easy to see Ubuntu-22 (kernel 6.2.x) sleep granularity is ~60 usec, MacOS sleep granularity is <2 usec. (Sub 1 usec sleep likely measures the syscall() speed, 500 ns on Intel and 200
ns on ARM M1 MAX). (NOTE: long sleep is interrupted by an alarm roughly 10 seconds into the sleep, see progs/test_sleep.cxx)
daq00:midas$ uname -a
Linux daq00.triumf.ca 6.2.0-39-generic #40~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 16 10:53:04 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
daq00:midas$ ./bin/test_sleep
test short sleep:
sleep 10 loops, 100000.000 usec per loop, 1.000000000 sec total, 1.002568007 sec actual total, 100256.801 usec actual per loop, oversleep 256.801 usec, 0.3%
sleep 100 loops, 10000.000 usec per loop, 1.000000000 sec total, 1.025897980 sec actual total, 10258.980 usec actual per loop, oversleep 258.980 usec, 2.6%
sleep 1000 loops, 1000.000 usec per loop, 1.000000000 sec total, 1.169670105 sec actual total, 1169.670 usec actual per loop, oversleep 169.670 usec, 17.0%
sleep 10000 loops, 100.000 usec per loop, 1.000000000 sec total, 1.573357105 sec actual total, 157.336 usec actual per loop, oversleep 57.336 usec, 57.3%
sleep 99999 loops, 10.000 usec per loop, 0.999990000 sec total, 6.963442087 sec actual total, 69.635 usec actual per loop, oversleep 59.635 usec, 596.4%
sleep 1000000 loops, 1.000 usec per loop, 1.000000000 sec total, 60.939687967 sec actual total, 60.940 usec actual per loop, oversleep 59.940 usec, 5994.0%
sleep 1000000 loops, 0.100 usec per loop, 0.100000000 sec total, 0.613572121 sec actual total, 0.614 usec actual per loop, oversleep 0.514 usec, 513.6%
sleep 1000000 loops, 0.010 usec per loop, 0.010000000 sec total, 0.576359987 sec actual total, 0.576 usec actual per loop, oversleep 0.566 usec, 5663.6%
test long sleep: requested 126.000000000 sec ... sleeping ...
alarm!
test long sleep: requested 126.000000000 sec, actual 126.000875950 sec
bash-3.2$ uname -a
Darwin send.triumf.ca 23.6.0 Darwin Kernel Version 23.6.0: Mon Jul 29 21:14:30 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T6000 arm64
bash-3.2$ ./bin/test_sleep
test short sleep:
sleep 10 loops, 100000.000 usec per loop, 1.000000000 sec total, 1.032556057 sec actual total, 103255.606 usec actual per loop, oversleep 3255.606 usec, 3.3%
sleep 100 loops, 10000.000 usec per loop, 1.000000000 sec total, 1.245460033 sec actual total, 12454.600 usec actual per loop, oversleep 2454.600 usec, 24.5%
sleep 1000 loops, 1000.000 usec per loop, 1.000000000 sec total, 1.331466913 sec actual total, 1331.467 usec actual per loop, oversleep 331.467 usec, 33.1%
sleep 10000 loops, 100.000 usec per loop, 1.000000000 sec total, 1.281141996 sec actual total, 128.114 usec actual per loop, oversleep 28.114 usec, 28.1%
sleep 99999 loops, 10.000 usec per loop, 0.999990000 sec total, 1.410759926 sec actual total, 14.108 usec actual per loop, oversleep 4.108 usec, 41.1%
sleep 1000000 loops, 1.000 usec per loop, 1.000000000 sec total, 2.400593996 sec actual total, 2.401 usec actual per loop, oversleep 1.401 usec, 140.1%
sleep 1000000 loops, 0.100 usec per loop, 0.100000000 sec total, 0.188431025 sec actual total, 0.188 usec actual per loop, oversleep 0.088 usec, 88.4%
sleep 1000000 loops, 0.010 usec per loop, 0.010000000 sec total, 0.188102007 sec actual total, 0.188 usec actual per loop, oversleep 0.178 usec, 1781.0%
test long sleep: requested 126.000000000 sec ... sleeping ...
alarm!
test long sleep: requested 126.000000000 sec, actual 126.001244068 sec
K.O. |
2918
|
06 Dec 2024 |
Konstantin Olchanski | Bug Report | TMFE::Sleep() errors | > > status = select(1, &fdset, NULL, NULL, &timeout);
> > On error, -1 is returned, ... timeout becomes undefined.
I confirm Linux and MacOS man pages and select() with EINTR work as I remember, Linux updates "timeout" to account for the
time already slept, MacOS does not ("timeout" is unchanged).
So the original code is roughly correct, but long sleeps will not work right if SIGALRM fires during sleeping.
Note that MIDAS no longer uses SIGALRM to fire cm_watchdog() (it was moved to a thread) and MIDAS does not use signals,
so handling of EINTR is now moot.
(Please correct me if I missed something).
The original bug report was about EINVAL, and best I can tell, it was caused by calls to TMFE::Sleep()
with strange sleep times that caused invalid values to be computed into the select() timeout.
To improve on this, I make these changes:
1) TMFE::Sleep(0) will report an error and will not sleep
2) TMFE::Sleep(negative number) will report an error and will not sleep
(please check the sleep time before calling TMFE::Sleep())
3) TMFE::Sleep(1 sec or less) will sleep using select(). (I also looked into using poll(), ppoll() and pselect()).
4) TMFE::Sleep(more than 1 second) will use a loop to sleep in increments of 1 second and will use one additional syscall to
read the current time to decide how much more to sleep.
5) if select() returns EINVAL, the error message will reporting the sleep time and the values in "timeout".
A side effect of this is that on both Linux and MacOS long sleeps work correctly if interrupted by SIGALRM,
because SIGALRM granularity is 1 sec and our sleep time is also 1 sec.
Commit [develop 06735d29] improve TMFE::Sleep()
K.O. |
2917
|
06 Dec 2024 |
Stefan Ritt | Info | New slow control framework "mdev" | A new slow control mini-framework has been developed for MIDAS and been successfully tested in the Mu3e experiment. It
might be suited for other experiments as well.
Background
Since the late 90’s we have the three-tier bases slow control framework in MIDAS with class drivers, device drivers and bus
drivers. While it was used successfully since many years, it is complicated to understand and limited in its flexibility. If we
have a HV device with a demand value, a measured voltage and a current it’s fine, but if we want to control more things like
trip voltage, temperature and status readout etc. it soon hits its limits. With the development of the new odbxx API
(https://daq00.triumf.ca/MidasWiki/index.php/Odbxx) there is now an opportunity to make everything much simpler.
Design principles
Instead of a three-tier system, the new “mdev” framework (“m”idas “dev”ices) uses a simple base class which is attached to
a certain MIDAS equipment. It implements five simple functions:
- odb_setup() to setup /Equipment/<name>/Settings and /Equipment/<name>/Variables to its desired structure
- init() to initialize the slow control device
- exit() to close the connection to the device
- loop() which is called periodically to read the device
- read_event() which returns a MIDAS event going to the data stream
A device driver inherits from this base class and implements the functions. A simple example can be found in
midas/drivers/mdev/mdev_mscb.[h,cxx]
for the MSCB field bus system used at TRIUMF and PSI. It basically boils down to two calls:
Init:
m_variables.connect(“/Equipment/<name>/Variables”);
m_variables[“Output”].watch(midas::odb &o) {
m_mscb[“HV”] = o[0]; // transfer value from ODB to MSCB device
}
Reading a value in the loop function:
m_variables[“Input”][0] = m_mscb[“HVMeas”];
The member variable m_variables is a midas::odb variable attached to the “Input” and “Output” variables in the ODB. The
watch() functions executes the lambda function whenever the “Output” in the ODB changes. It then simply transfers the new
value to the device. The reading of measured values just work in the other direction from the device to the ODB.
If you look at the mdev_mscb.cxx code, you see of course some more things like connecting to the MSCB device with proper
error handling, looping over several devices and variables, setting up the “Setting” directory in the ODB to define labels for
all variables. In addition we have a mirror for output variables, so that new values are only sent to the device if they differ
from the previous variable (needed to reduce some communication traffic).
The midas/drivers/mdev directory contains also an example frontend in the mfe.cxx framework, but this is no a requirement.
The mdev framework can also be used in the tmfe framework and others as well. Please note how compact the frontend
code now looks.
User interface
Since the beginning, MIDAS allows access to the the slow control devices through the “equipment” page (on the main status
page, click on one equipment). A few more options can control now the behavior of this page, allowing quite some flexibility
without having to write a dedicated custom page (which of course can still be done). Attached is an example from Mu3e where
the details of the equipment display are controlled through some options in the setting subdirectory as described in
https://daq00.triumf.ca/MidasWiki/index.php//Equipment_ODB_tree (especially the “grid display”, “Editable” and “Format”
flags).
Conclusions
The new “mdev” framework offers a compact and effective way to communicate from MIDAS to slow control devices. Since
all interface code is now not “hidden” any more in system class and device drivers, the user has much higher flexibility in
controlling different devices. If a device has a new parameter, the user can add a single line of code to connect this
parameter to an ODB entry.
The framework is very simple and misses some features of the old system. Ramping of HV voltages and current trips are not
available in the framework (like with the old HV class driver), but modern devices usually implement this in hardware which
is much better. The new framework is not multi-threaded, but modern devices are these day much faster than in the ‘90s.
Since the ODB is thread save, nothing prevents us from putting a device readout into its own thread in the frontend.
We will use the new system for all devices in Mu3e, with probably some new features being added soon, so stay tuned.
/Stefan |
Attachment 1: PDCC.png
|
|
2916
|
05 Dec 2024 |
Konstantin Olchanski | Bug Report | ODB key picker does not close when creating link / Edit-on-run string box too large | > > Another more minor visual problem is the edit-on-start dialog. There seems to be no upper bound to the
> > size of the text box. In the attached screenshot, ShortString has a maximum length of 32 characters,
> > LongString has 255. Both are empty at the time of the screenshot. Maybe, the size should be limited to a
> > reasonable width.
>
> I limited the input size now to (arbitrarily) 100 chars. The string can still be longer than 100 chars, and you start then scrolling inside the input box. Let me know if
> that's ok this way.
I am moving the dragon experiment to the new midas and we see this problem on the begin-of-run page.
Old midas: no horizontal scroll bar, edit-on-start names, values and comments are all squeezed in into the visible frame.
New midas: page is very wide, values entry fields are very long and there is a horizontal scroll bar.
So something got broken in the htlm formatting or sizing. I should be able to spot the change by doing
a diff between old resources/start.html and the new one.
K.O. |
2915
|
04 Dec 2024 |
Konstantin Olchanski | Bug Report | ODB key picker does not close when creating link / Edit-on-run string box too large | > > Actual result:
> > The key picker does not close.
>
> Thanks for reporting that bug. It has been fixed in the current commit (installed already on megon02)
>
Stefan, thank you for fixing both problems, I have seen them, too, but no time to deal with them.
K.O. |
2914
|
02 Dec 2024 |
Stefan Ritt | Forum | "Safe" abort of sequencer scripts | The atexit() function has been implemented in the current develop branch of midas, see
https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#ATEXIT_subroutine
Stefan |
2913
|
02 Dec 2024 |
Stefan Ritt | Bug Report | ODB key picker does not close when creating link / Edit-on-run string box too large | > Another more minor visual problem is the edit-on-start dialog. There seems to be no upper bound to the
> size of the text box. In the attached screenshot, ShortString has a maximum length of 32 characters,
> LongString has 255. Both are empty at the time of the screenshot. Maybe, the size should be limited to a
> reasonable width.
I limited the input size now to (arbitrarily) 100 chars. The string can still be longer than 100 chars, and you start then scrolling inside the input box. Let me know if
that's ok this way.
Stefan |
2912
|
02 Dec 2024 |
Stefan Ritt | Bug Report | ODB key picker does not close when creating link / Edit-on-run string box too large | > Actual result:
> The key picker does not close.
Thanks for reporting that bug. It has been fixed in the current commit (installed already on megon02)
Stefan |
2911
|
01 Dec 2024 |
Pavel Murat | Bug Report | EQ_PERIODIC-only equipment ? | > There is no requirement that you pair an EQ_PERIODIC with an EQ_TRIGGER. Take for exmaple
>
> midas/examples/experiment/frontend.cxx
>
> and remove there the triggered event. The frontend runs happily with the periodic event only (I just tried that myself). You have probably some problem in
> your event definition. Start with the running example frontend, and add your code line by line until you see the problem.
Hi Stefan, thank you very much!
As the pointer to the readout function and pointers to device drivers are all defined in the same structure (EQUIPMENT),
I was naively assuming that the readout function should be set during the class driver initialization.
Now it is clear that the equipment responding to EQ_PERIODIC events doesn't have to have drivers,
and specifying the readout function is the responsibility of the user.
I got around exactly this way yesterday, but was thinking that I was hacking the system :)
-- regards, Pasha |
2910
|
01 Dec 2024 |
Stefan Ritt | Bug Report | EQ_PERIODIC-only equipment ? | There is no requirement that you pair an EQ_PERIODIC with an EQ_TRIGGER. Take for exmaple
midas/examples/experiment/frontend.cxx
and remove there the triggered event. The frontend runs happily with the periodic event only (I just tried that myself). You have probably some problem in
your event definition. Start with the running example frontend, and add your code line by line until you see the problem.
Stefan |
2909
|
30 Nov 2024 |
Pavel Murat | Bug Report | EQ_PERIODIC-only equipment ? | Dear Midas experts,
I'm running into something which looks like an initialization problem.
I have a mfe.cxx-style frontend which introduces an equipment of the EQ_PERIODIC type (EQ_PERIODIC-only!).
When Midas enters the running state, I see the frontend crashing.
Stepping through the code shows that the frontend is crashing because its equipment has been ignored
by the initialize_equipment@mfe.cxx - see
https://bitbucket.org/tmidas/midas/src/5d0dae001712164ae43137dced2fbbb594f0201e/src/mfe.cxx#lines-630
Is there an assumption that the initialization of the EQ_PERIODIC-only equipment is the user responsibility?
Or EQ_PERIODIC should always come paired with some other type?
-- many thanks, regards, Pasha |
2908
|
27 Nov 2024 |
Konstantin Olchanski | Bug Report | TMFE::Sleep() errors | > status = select(1, &fdset, NULL, NULL, &timeout);
>
> On error, -1 is returned, ... timeout becomes undefined.
I have been reading "man select" for 30 years and I do not remember seeing this text.
I believe on BSD UNIX (MacOS) it says timeout is unchanged and on Linux is says timeout is updated to time actually slept.
I will have to investigate, but I suspect the man page was posix-ized, by sweeping BSD/MacOS and Linux implementations
under the same "instead of saying what it actually does, we will just say 'undefined'".
In any case, EINTR is not an error, it's an artefact of UNIX signal handling. Linux implementations always try
very hard to handle signals without causing EINTR to select(), read() and write(). This is most painful
when reading and writing to TCP sockets, because one most handle partial reads and EINTR.
K.O. |
2907
|
27 Nov 2024 |
Konstantin Olchanski | Bug Report | TMFE::Sleep() errors | >
> I wonder also if now that midas requires stricter/newer c++ standards if there maybe
> some more straightforward method to sleep that is sufficiently robust and portable.
>
I believe POSIX defined clock_nanosleep() & co, so on most recent machines that is the most portable way to sleep.
Historically, select() was the only way to sleep for less than 1 sec, but it was never portable
because of differences between BSD UNIX and Linux implementations. (MacOS is BSD UNIX via FreeBSD).
On difference is the update of struct timeval is select() is interrupted.
In this elog entry, I compare sleep using select() with sleep using clock_nanosleep() and see that there is no difference:
https://daq00.triumf.ca/elog-midas/Midas/2115
As you can see tmfe.cxx has both implementations, select() and clock_nanosleep(), and anybody can try which one works better on
their computer.
K.O. |
2906
|
27 Nov 2024 |
Konstantin Olchanski | Bug Report | TMFE::Sleep() errors | > [tmfe.cxx:1033:TMFE::Sleep,ERROR] select() returned -1, errno 22 (Invalid argument).
The very original copy of this function had an error and was spewing out this error quite often,
this was a missing handler for EINTR.
Now it looks like we are missing a handler for EINVAL.
Most likely sleep is called with a funny sleep time value that fills struct timeval with
values select() does not like.
I see Ben added a check for negative sleep times, and this is good.
I think I will do these changes:
a) add an error message for negative sleep time, I think user should never call ::Sleep with negative or zero sleep times and
if they do it is a bug and they should fix it, the error message will inform them so.
b) add a handler for EINVAL, which will report the requested sleep time and the values in struct timeval
K.O. |
|