ID |
Date |
Author |
Topic |
Subject |
2916
|
05 Dec 2024 |
Konstantin Olchanski | Bug Report | ODB key picker does not close when creating link / Edit-on-run string box too large | > > Another more minor visual problem is the edit-on-start dialog. There seems to be no upper bound to the
> > size of the text box. In the attached screenshot, ShortString has a maximum length of 32 characters,
> > LongString has 255. Both are empty at the time of the screenshot. Maybe, the size should be limited to a
> > reasonable width.
>
> I limited the input size now to (arbitrarily) 100 chars. The string can still be longer than 100 chars, and you start then scrolling inside the input box. Let me know if
> that's ok this way.
I am moving the dragon experiment to the new midas and we see this problem on the begin-of-run page.
Old midas: no horizontal scroll bar, edit-on-start names, values and comments are all squeezed in into the visible frame.
New midas: page is very wide, values entry fields are very long and there is a horizontal scroll bar.
So something got broken in the htlm formatting or sizing. I should be able to spot the change by doing
a diff between old resources/start.html and the new one.
K.O. |
2915
|
04 Dec 2024 |
Konstantin Olchanski | Bug Report | ODB key picker does not close when creating link / Edit-on-run string box too large | > > Actual result:
> > The key picker does not close.
>
> Thanks for reporting that bug. It has been fixed in the current commit (installed already on megon02)
>
Stefan, thank you for fixing both problems, I have seen them, too, but no time to deal with them.
K.O. |
2914
|
02 Dec 2024 |
Stefan Ritt | Forum | "Safe" abort of sequencer scripts | The atexit() function has been implemented in the current develop branch of midas, see
https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#ATEXIT_subroutine
Stefan |
2913
|
02 Dec 2024 |
Stefan Ritt | Bug Report | ODB key picker does not close when creating link / Edit-on-run string box too large | > Another more minor visual problem is the edit-on-start dialog. There seems to be no upper bound to the
> size of the text box. In the attached screenshot, ShortString has a maximum length of 32 characters,
> LongString has 255. Both are empty at the time of the screenshot. Maybe, the size should be limited to a
> reasonable width.
I limited the input size now to (arbitrarily) 100 chars. The string can still be longer than 100 chars, and you start then scrolling inside the input box. Let me know if
that's ok this way.
Stefan |
2912
|
02 Dec 2024 |
Stefan Ritt | Bug Report | ODB key picker does not close when creating link / Edit-on-run string box too large | > Actual result:
> The key picker does not close.
Thanks for reporting that bug. It has been fixed in the current commit (installed already on megon02)
Stefan |
2911
|
01 Dec 2024 |
Pavel Murat | Bug Report | EQ_PERIODIC-only equipment ? | > There is no requirement that you pair an EQ_PERIODIC with an EQ_TRIGGER. Take for exmaple
>
> midas/examples/experiment/frontend.cxx
>
> and remove there the triggered event. The frontend runs happily with the periodic event only (I just tried that myself). You have probably some problem in
> your event definition. Start with the running example frontend, and add your code line by line until you see the problem.
Hi Stefan, thank you very much!
As the pointer to the readout function and pointers to device drivers are all defined in the same structure (EQUIPMENT),
I was naively assuming that the readout function should be set during the class driver initialization.
Now it is clear that the equipment responding to EQ_PERIODIC events doesn't have to have drivers,
and specifying the readout function is the responsibility of the user.
I got around exactly this way yesterday, but was thinking that I was hacking the system :)
-- regards, Pasha |
2910
|
01 Dec 2024 |
Stefan Ritt | Bug Report | EQ_PERIODIC-only equipment ? | There is no requirement that you pair an EQ_PERIODIC with an EQ_TRIGGER. Take for exmaple
midas/examples/experiment/frontend.cxx
and remove there the triggered event. The frontend runs happily with the periodic event only (I just tried that myself). You have probably some problem in
your event definition. Start with the running example frontend, and add your code line by line until you see the problem.
Stefan |
2909
|
30 Nov 2024 |
Pavel Murat | Bug Report | EQ_PERIODIC-only equipment ? | Dear Midas experts,
I'm running into something which looks like an initialization problem.
I have a mfe.cxx-style frontend which introduces an equipment of the EQ_PERIODIC type (EQ_PERIODIC-only!).
When Midas enters the running state, I see the frontend crashing.
Stepping through the code shows that the frontend is crashing because its equipment has been ignored
by the initialize_equipment@mfe.cxx - see
https://bitbucket.org/tmidas/midas/src/5d0dae001712164ae43137dced2fbbb594f0201e/src/mfe.cxx#lines-630
Is there an assumption that the initialization of the EQ_PERIODIC-only equipment is the user responsibility?
Or EQ_PERIODIC should always come paired with some other type?
-- many thanks, regards, Pasha |
2908
|
27 Nov 2024 |
Konstantin Olchanski | Bug Report | TMFE::Sleep() errors | > status = select(1, &fdset, NULL, NULL, &timeout);
>
> On error, -1 is returned, ... timeout becomes undefined.
I have been reading "man select" for 30 years and I do not remember seeing this text.
I believe on BSD UNIX (MacOS) it says timeout is unchanged and on Linux is says timeout is updated to time actually slept.
I will have to investigate, but I suspect the man page was posix-ized, by sweeping BSD/MacOS and Linux implementations
under the same "instead of saying what it actually does, we will just say 'undefined'".
In any case, EINTR is not an error, it's an artefact of UNIX signal handling. Linux implementations always try
very hard to handle signals without causing EINTR to select(), read() and write(). This is most painful
when reading and writing to TCP sockets, because one most handle partial reads and EINTR.
K.O. |
2907
|
27 Nov 2024 |
Konstantin Olchanski | Bug Report | TMFE::Sleep() errors | >
> I wonder also if now that midas requires stricter/newer c++ standards if there maybe
> some more straightforward method to sleep that is sufficiently robust and portable.
>
I believe POSIX defined clock_nanosleep() & co, so on most recent machines that is the most portable way to sleep.
Historically, select() was the only way to sleep for less than 1 sec, but it was never portable
because of differences between BSD UNIX and Linux implementations. (MacOS is BSD UNIX via FreeBSD).
On difference is the update of struct timeval is select() is interrupted.
In this elog entry, I compare sleep using select() with sleep using clock_nanosleep() and see that there is no difference:
https://daq00.triumf.ca/elog-midas/Midas/2115
As you can see tmfe.cxx has both implementations, select() and clock_nanosleep(), and anybody can try which one works better on
their computer.
K.O. |
2906
|
27 Nov 2024 |
Konstantin Olchanski | Bug Report | TMFE::Sleep() errors | > [tmfe.cxx:1033:TMFE::Sleep,ERROR] select() returned -1, errno 22 (Invalid argument).
The very original copy of this function had an error and was spewing out this error quite often,
this was a missing handler for EINTR.
Now it looks like we are missing a handler for EINVAL.
Most likely sleep is called with a funny sleep time value that fills struct timeval with
values select() does not like.
I see Ben added a check for negative sleep times, and this is good.
I think I will do these changes:
a) add an error message for negative sleep time, I think user should never call ::Sleep with negative or zero sleep times and
if they do it is a bug and they should fix it, the error message will inform them so.
b) add a handler for EINVAL, which will report the requested sleep time and the values in struct timeval
K.O. |
2905
|
26 Nov 2024 |
Maia Henriksson-Ward | Bug Report | TMFE::Sleep() errors | > Hello,
>
> I've noticed that SC FEs that use the TMFE class with midas-2022-05-c often report errors when calling TMFE:Sleep().
> The error is :
>
> [tmfe.cxx:1033:TMFE::Sleep,ERROR] select() returned -1, errno 22 (Invalid argument).
>
> This seems to happen in two different ways:
>
> 1. Error being reported repeatedly
> 2. Occasional single errors being reported
>
> When the first of these presents, we typically restart the FE to "solve" the problem.
> Case 2. is typically ignored.
>
> The code in question is:
>
> void TMFE::Sleep(double time)
> {
> int status;
> fd_set fdset;
> struct timeval timeout;
>
> FD_ZERO(&fdset);
>
> timeout.tv_sec = time;
> timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
>
> while (1) {
> status = select(1, &fdset, NULL, NULL, &timeout);
> #ifdef EINTR
> if (status < 0 && errno == EINTR) {
> continue;
> }
> #endif
> break;
> }
>
> if (status < 0) {
> TMFE::Instance()->Msg(MERROR, "TMFE::Sleep", "select() returned %d, errno %d (%s)", status, errno, strerror(errno));
> }
> }
>
> So it looks like either file descriptor of the timeval struct must have a problem.
> From some reading it seems that invalid timeval structs are often caused by one or both
> of tv_sec or tv_usec not being set. In the code above we can see that both appear to be
> correctly set initially.
>
> From the select() man page I see:
>
> RETURN VALUE
> On success, select() and pselect() return the number of file descriptors contained in
> the three returned descriptor sets (that is, the total number of bits that are set in
> readfds, writefds, exceptfds). The return value may be zero if the timeout expired
> before any file descriptors became ready.
>
> On error, -1 is returned, and errno is set to indicate the error; the file descriptor
> sets are unmodified, and timeout becomes undefined.
>
> The second paragraph quoted from the man page above would indicate to me that perhaps the
> timeout needs to be reset inside the if block. eg:
>
> if (status < 0 && errno == EINTR) {
> timeout.tv_sec = time;
> timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
> continue;
> }
>
> Please note that I've only just briefly looked at this and was hoping someone more
> familiar with using select() as a way to sleep() might be better able to understand
> what is happening.
>
> I wonder also if now that midas requires stricter/newer c++ standards if there maybe
> some more straightforward method to sleep that is sufficiently robust and portable.
>
> Thanks,
>
> Nick.
I had the same error a few months ago, though I wasn't using a tagged release. It happened because I was calling TMFE::Sleep()
with a negative time. If your issues were caused by the same reason, TMFE::Sleep() can handle negative times since commit
591f78f (https://bitbucket.org/tmidas/midas/commits/591f78f52893d5ffd64bf4e52a1daac537ebd672).
Early in my debugging, I did come to the same conclusions you did, and actually tried a similar solution the one you suggested.
This was a few months ago and I didn't write down what happened, but I believe it didn't work because in my case the errno was
something other than EINTR, and/or the timeval was still an invalid argument for sleep because the timeout was still negative. I
never followed it up because I was able to fix my problem by fixing my frontend. |
2904
|
26 Nov 2024 |
Nick Hastings | Bug Report | TMFE::Sleep() errors | Hello,
I've noticed that SC FEs that use the TMFE class with midas-2022-05-c often report errors when calling TMFE:Sleep().
The error is :
[tmfe.cxx:1033:TMFE::Sleep,ERROR] select() returned -1, errno 22 (Invalid argument).
This seems to happen in two different ways:
1. Error being reported repeatedly
2. Occasional single errors being reported
When the first of these presents, we typically restart the FE to "solve" the problem.
Case 2. is typically ignored.
The code in question is:
void TMFE::Sleep(double time)
{
int status;
fd_set fdset;
struct timeval timeout;
FD_ZERO(&fdset);
timeout.tv_sec = time;
timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
while (1) {
status = select(1, &fdset, NULL, NULL, &timeout);
#ifdef EINTR
if (status < 0 && errno == EINTR) {
continue;
}
#endif
break;
}
if (status < 0) {
TMFE::Instance()->Msg(MERROR, "TMFE::Sleep", "select() returned %d, errno %d (%s)", status, errno, strerror(errno));
}
}
So it looks like either file descriptor of the timeval struct must have a problem.
From some reading it seems that invalid timeval structs are often caused by one or both
of tv_sec or tv_usec not being set. In the code above we can see that both appear to be
correctly set initially.
From the select() man page I see:
RETURN VALUE
On success, select() and pselect() return the number of file descriptors contained in
the three returned descriptor sets (that is, the total number of bits that are set in
readfds, writefds, exceptfds). The return value may be zero if the timeout expired
before any file descriptors became ready.
On error, -1 is returned, and errno is set to indicate the error; the file descriptor
sets are unmodified, and timeout becomes undefined.
The second paragraph quoted from the man page above would indicate to me that perhaps the
timeout needs to be reset inside the if block. eg:
if (status < 0 && errno == EINTR) {
timeout.tv_sec = time;
timeout.tv_usec = (time-timeout.tv_sec)*1000000.0;
continue;
}
Please note that I've only just briefly looked at this and was hoping someone more
familiar with using select() as a way to sleep() might be better able to understand
what is happening.
I wonder also if now that midas requires stricter/newer c++ standards if there maybe
some more straightforward method to sleep that is sufficiently robust and portable.
Thanks,
Nick. |
2903
|
24 Nov 2024 |
Pavel Murat | Bug Report | ODB lock timeout, Difficulty running MIDAS on Rocky 9.4 | there is a really good software tool developed by the Fermilab DAQ group, called TRACE -
https://github.com/art-daq/trace ,
It could be useful for debugging cases like this one. In short, TRACE instruments the code
with the printouts which could be selectively turned on and off without recompiling the executable.
TRACE output could go to /dev/stdout (slow output) and/or to a circular buffer implemented via a shared
memory segment (fast output). Sending unlimited output to the shared memory segment is extremely useful.
TRACE also allows to trigger on certain conditions, again, w/o recompiling the executable.
For debugging cases like the one in question, that could turn out even more useful,
however I didn't try the triggering functionality myself.
-- regards, Pasha |
2902
|
22 Nov 2024 |
Konstantin Olchanski | Bug Report | ODB lock timeout, Difficulty running MIDAS on Rocky 9.4 | > try to replicate the issue ...
I see ODB lock timeout (and abort() of everything) in the dsvslice test station. We have
about 15-20 MIDAS clients connected.
I am pretty sure we have not seen this problem until recently (and I have not seen it
personally for a very long time). There were no changes to the MIDAS ODB locking code in a
long time.
I suspect a recent change in the linux kernel. But I am likely to be wrong.
I have 1000 core dumps from this crash of dsvslice, and among them should be the 1 thread
that has ODB locked. Wish me luck finding it. Worst case is to discover that ODB is locked
but nobody is holding a lock ("missing unlock bug"). This is hard to debug, I would have add
tracking of "who was the last one to lock it, who forgot to unlock it".
K.O. |
2901
|
21 Nov 2024 |
Stefan Ritt | Info | What do the status numbers mean and where can I find more information about them? |
> [RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR]
> cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309
>
>
>
> I just need more information on what the error message means. Which data type
> refers to tid 24 and what does status 309 indicate??
A tid (type identification) of 24 does actually not exist. See midas.h:327, so this tells
me that your bank header got corrupted. Somewhere you write over your data.
Stefan |
2900
|
21 Nov 2024 |
Mann Gandhi | Info | What do the status numbers mean and where can I find more information about them? | Hello,
This is the error message I got:
[RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR]
cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309
read_periodic_event: No data in ring buffer or error occurred
[RP Streaming Frontend,ERROR] [odb.cxx:3373:db_create_key,ERROR] invalid key type
24 to create 'DATA' in '/Equipment/Periodic/Variables'
[RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR]
cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309
I just need more information on what the error message means. Which data type
refers to tid 24 and what does status 309 indicate??
There is definitely data in the ring buffer but I keep on getting this error.
Thank you!
M.G |
2899
|
18 Nov 2024 |
Lukas Gerritzen | Suggestion | Comma-separated indices in alarm conditions | I have the following use case: I would like to check if two elements of an array exceed a certain threshold.
However, they are not consecutive. Currently, I have to write two alarms, one checking Array[8] and one
checking Array [10].
It would be nice if we could enter conditions such as "/Path/To/Array[8,10] > 0.5".
I looked into the code of al_evaluate_condition() and it seems very C-style. I know that you have been
refactoring a lot of code to work with STL strings and their functions. If you find the time to refactor
alarm.cxx, I ask that you consider adding comma-separated lists as a new feature.
Cheers
Lukas |
2898
|
18 Nov 2024 |
Stefan Ritt | Info | New sequencer command ODBLOOKUP | > "value not found" sets "index" to ?
It sets it actually to "not found". Since all variables are stings in the sequencer, you can then do a test like
ODBLOOKUP ..., index
if ($index == "not found")
...
> "odb key not found" sets "index" to ?
If the odb key is not found, the sequencer aborts.
> link to documentation?
The documentation is where it always has been:
https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#Sequencer_Commands
/Stefan |
2897
|
15 Nov 2024 |
Konstantin Olchanski | Info | New sequencer command ODBLOOKUP | > A new sequencer command "ODBLOOKUP" has been implemented, which does a lookup of a string in a string
> array in the ODB given by a path and returns its index as a number. If we have for example an array
>
> /Examples/Names
> [0] Hello
> [1] Test
> [2] Other
>
> and do a
>
> ODBLOOKUP "/Examples/Names", "Test", index
>
> we get a index equal 1.
>
"value not found" sets "index" to ?
"odb key not found" sets "index" to ?
link to documentation?
K.O. |
|