ID |
Date |
Author |
Topic |
Subject |
2751
|
03 May 2024 |
Stefan Ritt | Bug Report | Params not initialized when starting sequencer | Ok, here is the complete code to reproduce the problem. Load parameter_test.msl which includes functions.msl. From the screenshot you see the variables containing
garbage, and you also see that from the ODB screenshot. For completeness, I added Sequencer.json which contains the whole sequencer tree.
The interesting thing is that this works sometimes, and sometimes not. I'm not sure if this in the GUI or in the sequencer program, so we have to sort out who can
fix it ;-)
Best,
Stefan |
Attachment 1: param_test.msl
|
INCLUDE functions
PARAM lv_start, "Start of LV", 1.8
PARAM lv_stop, "Stop of LV", 2.1
PARAM lv_step, "Step of LV", 0.02
n_iterations = (($lv_stop - $lv_start)/$lv_step)
MSG "Parameters:"
MSG $lv_start
MSG $lv_stop
MSG $lv_step
MSG $n_iterations
MSG "Start of looping"
LOOP n, $n_iterations
lv_now = $lv_start + $n * $lv_step
MSG $lv_now
WAIT SECONDS, 1
ENDLOOP
|
Attachment 2: functions.msl
|
SUBROUTINE sub1
WAIT seconds, 1
ENDSUBROUTINE
SUBROUTINE sub2
WAIT seconds, 1
ENDSUBROUTINE
SUBROUTINE sub3
WAIT seconds, 1
ENDSUBROUTINE
SUBROUTINE sub4
WAIT seconds, 1
ENDSUBROUTINE
SUBROUTINE sub5
WAIT seconds, 1
ENDSUBROUTINE
SUBROUTINE sub6
WAIT seconds, 1
ENDSUBROUTINE
|
Attachment 3: Sequencer.json
|
{
"/MIDAS version": "2.1",
"/filename": "Sequencer.json",
"/ODB path": "/Sequencer",
"State": {
"New File/key": {
"type": 8,
"access_mode": 7,
"last_written": 1714720819
},
"New File": false,
"Path/key": {
"type": 12,
"item_size": 256,
"access_mode": 7,
"last_written": 1714720819
},
"Path": "",
"Filename/key": {
"type": 12,
"item_size": 256,
"access_mode": 7,
"last_written": 1714720819
},
"Filename": "param_test.msl",
"SFilename/key": {
"type": 12,
"item_size": 256,
"access_mode": 7,
"last_written": 1714720819
},
"SFilename": "/Users/ritt/online/userfiles/sequencer/param_test.msl",
"Next Filename/key": {
"type": 12,
"num_values": 10,
"item_size": 256,
"access_mode": 7,
"last_written": 1714720819
},
"Next Filename": [
"",
"",
"",
"",
"",
"",
"",
"",
"",
""
],
"Error/key": {
"type": 12,
"item_size": 256,
"access_mode": 7,
"last_written": 1714720819
},
"Error": "",
"Error line/key": {
"type": 7,
"access_mode": 7,
"last_written": 1714720819
},
"Error line": 0,
"SError line/key": {
"type": 7,
"access_mode": 7,
"last_written": 1714720819
},
"SError line": 0,
"Message/key": {
"type": 12,
"item_size": 256,
"access_mode": 7,
"last_written": 1714720819
},
"Message": "",
"Message Wait/key": {
"type": 8,
"access_mode": 7,
"last_written": 1714720819
},
"Message Wait": false,
"Running/key": {
"type": 8,
"access_mode": 7,
"last_written": 1714720819
},
"Running": true,
"Finished/key": {
"type": 8,
"access_mode": 7,
"last_written": 1714720819
},
"Finished": false,
"Paused/key": {
"type": 8,
"access_mode": 7,
"last_written": 1714720819
},
"Paused": false,
"Debug/key": {
"type": 8,
"access_mode": 7,
"last_written": 1714720819
},
"Debug": false,
"Current line number/key": {
"type": 7,
"access_mode": 7,
"last_written": 1714720819
},
"Current line number": 46,
"SCurrent line number/key": {
"type": 7,
"access_mode": 7,
"last_written": 1714720819
},
"SCurrent line number": 20,
"Follow Libraries/key": {
"type": 8,
"access_mode": 7,
"last_written": 1714720819
},
"Follow Libraries": true,
"Stop after run/key": {
"type": 8,
"access_mode": 7,
"last_written": 1714720819
},
"Stop after run": false,
"Transition request/key": {
"type": 8,
"access_mode": 7,
"last_written": 1714720819
},
"Transition request": false,
"Loop start line/key": {
"type": 7,
"num_values": 10,
"access_mode": 7,
"last_written": 1714720819
},
"Loop start line": [
43,
0,
0,
0,
0,
0,
0,
0,
0,
0
],
"SLoop start line/key": {
"type": 7,
"num_values": 10,
"access_mode": 7,
"last_written": 1714720819
},
"SLoop start line": [
17,
0,
0,
0,
0,
0,
0,
0,
0,
0
],
"Loop end line/key": {
"type": 7,
"num_values": 10,
"access_mode": 7,
"last_written": 1714720819
},
"Loop end line": [
47,
0,
0,
0,
0,
0,
0,
0,
0,
0
],
"SLoop end line/key": {
"type": 7,
"num_values": 10,
"access_mode": 7,
"last_written": 1714720819
},
"SLoop end line": [
21,
0,
0,
0,
0,
0,
0,
0,
0,
0
],
"Loop counter/key": {
"type": 7,
"num_values": 10,
"access_mode": 7,
"last_written": 1714720819
},
"Loop counter": [
6,
0,
0,
0,
0,
0,
0,
0,
0,
0
],
"Loop n/key": {
"type": 7,
"num_values": 10,
"access_mode": 7,
"last_written": 1714720819
},
"Loop n": [
0,
0,
0,
0,
0,
0,
0,
0,
0,
0
],
"Subdir/key": {
"type": 12,
"item_size": 256,
"access_mode": 7,
"last_written": 1714720819
},
"Subdir": "",
"Subdir end line/key": {
"type": 7,
"access_mode": 7,
"last_written": 1714720819
},
"Subdir end line": 0,
"Subdir not notify/key": {
"type": 7,
"access_mode": 7,
"last_written": 1714720819
},
"Subdir not notify": 0,
"If index/key": {
"type": 7,
"access_mode": 7,
"last_written": 1714720819
},
"If index": 0,
"If line/key": {
"type": 7,
"num_values": 10,
"access_mode": 7,
"last_written": 1714720819
},
"If line": [
0,
0,
0,
0,
0,
0,
0,
0,
0,
0
],
"If else line/key": {
"type": 7,
"num_values": 10,
"access_mode": 7,
"last_written": 1714720819
},
"If else line": [
0,
0,
0,
0,
0,
... 379 more lines ...
|
Attachment 4: Screenshot_2024-05-03_at_09.19.29.png
|
|
Attachment 5: Screenshot_2024-05-03_at_09.20.47.png
|
|
2752
|
03 May 2024 |
Luigi Vigani | Bug Report | Params not initialized when starting sequencer | It is pretty much the same as Stefan, I attach here the screenshots. Also in my case it works sometimes, and sometimes partially (one or 2 params, like in
attachment 3).
> Could you please export and send me the /Sequencer ODB tree (or just /Sequencer/Param and /Sequencer/Variables) in both cases while the sequence is running.
>
> thanks,
> Zaher
>
>
> > Good afternoon,
> >
> > After updating Midas to the latest develop commit
> > (0f5436d901a1dfaf6da2b94e2d87f870e3611cf1) we found out a bug when starting
> > sequencer. If we have a simple loop from start value to stop value and step
> > size, just printing the value at each iteration, we see everything good (see
> > first attachment). Then we included another script though, which contains
> > several subroutines we defined for our detector, and we try to run the same
> > script. Unfortunately after this the parameters seem uninitialized, and the
> > value at each loop does not make sense (see second attachment). Also, sometimes
> > when pressing run the set parameter window would pop-up, but sometimes not.
> >
> > The script is this one:
> >
> > >>>
> > COMMENT Test script to check for a specific bug
> >
> > INCLUDE global_basic_functions
> >
> > #CALL setup_paths
> > #CALL generate_DUT_params
> >
> > PARAM lv_start, "Start of LV", 1.8
> > PARAM lv_stop, "Stop of LV", 2.1
> > PARAM lv_step, "Step of LV", 0.02
> >
> > n_iterations = (($lv_stop - $lv_start)/$lv_step)
> >
> > MSG "Parameters:"
> > MSG $lv_start
> > MSG $lv_stop
> > MSG $lv_step
> > MSG $n_iterations
> >
> > MSG "Start of looping"
> >
> > LOOP n, $n_iterations
> > lv_now = $lv_start + $n * $lv_step
> > MSG $lv_now
> > WAIT SECONDS, 1
> > ENDLOOP
> > <<<
> >
> > and the only difference comes from commenting the line:
> >
> > >>>
> > INCLUDE global_basic_functions
> > <<<
> >
> > as global_basic_functions is defined as a LIBRARY and it includes 75 (!)
> > subroutines...
> >
> > Is it possible that when loading a large script it messes up the loading of
> > parameters?
> >
> > Thank you very much,
> > Regards,
> > Luigi. |
Attachment 1: seq1.PNG
|
|
Attachment 2: seq2.PNG
|
|
Attachment 3: seq3.PNG
|
|
2755
|
03 May 2024 |
Zaher Salman | Bug Report | Params not initialized when starting sequencer | I have been able to reproduce the problem only once. From what I see, it seems that the Variables ODB tree is not initialized properly from the Param tree. Below are the messages from the failed run compared to a successful one. As far as I could see, the javascript code does not change anything in the Variables ODB tree (only monitors it). The actual changes are done by the sequencer program, or am I wrong?
Failed run:
16:14:25.849 2024/05/03 [Sequencer,INFO] + 3 *
16:14:24.722 2024/05/03 [Sequencer,INFO] + 2 *
16:14:23.594 2024/05/03 [Sequencer,INFO] + 1 *
16:14:23.592 2024/05/03 [Sequencer,INFO] Start of looping
16:14:23.591 2024/05/03 [Sequencer,INFO] (( - )/)
16:14:23.591 2024/05/03 [Sequencer,INFO]
16:14:23.590 2024/05/03 [Sequencer,INFO]
16:14:23.590 2024/05/03 [Sequencer,INFO]
16:14:23.589 2024/05/03 [Sequencer,INFO] Parameters:
16:14:23.562 2024/05/03 [Sequencer,TALK] Sequencer started with script "testpars.msl".
Successful run:
16:15:37.472 2024/05/03 [Sequencer,INFO] 1.820000
16:15:37.471 2024/05/03 [Sequencer,INFO] Start of looping
16:15:37.471 2024/05/03 [Sequencer,INFO] 15
16:15:37.470 2024/05/03 [Sequencer,INFO] 0.020000
16:15:37.470 2024/05/03 [Sequencer,INFO] 2.100000
16:15:37.469 2024/05/03 [Sequencer,INFO] 1.800000
16:15:37.469 2024/05/03 [Sequencer,INFO] Parameters:
16:15:37.450 2024/05/03 [Sequencer,TALK] Sequencer started with script "testpars.msl". |
2756
|
03 May 2024 |
Stefan Ritt | Bug Report | Params not initialized when starting sequencer | Ahh, that rings a bell:
1) JS opens start dialog box
2) User enters parameters and presses start
3) JS writes parameters
4) JS starts sequencer
5) Sequencer copies parameters to variables
Now how do you handle 3) and 4). Just issue two mjsonrpc commands together? What then could happen is that 4) is executed before 3) and we get the garbage.
You have to do 3) and WAIT for the return ("then" in the JS promise), and only then issue 4) from there.
Stefan |
2757
|
03 May 2024 |
Zaher Salman | Bug Report | Params not initialized when starting sequencer | Thanks for the hint Stefan. I pushed a possible fix but I cannot test it since I cannot reproduce the issue.
> Ahh, that rings a bell:
>
> 1) JS opens start dialog box
> 2) User enters parameters and presses start
> 3) JS writes parameters
> 4) JS starts sequencer
> 5) Sequencer copies parameters to variables
>
> Now how do you handle 3) and 4). Just issue two mjsonrpc commands together? What then could happen is that 4) is executed before 3) and we get the garbage.
> You have to do 3) and WAIT for the return ("then" in the JS promise), and only then issue 4) from there.
>
> Stefan |
2758
|
03 May 2024 |
Stefan Ritt | Bug Report | Params not initialized when starting sequencer | Seems to me like the problem happens less frequently, but I still see it (1 out of 5 or so). The fact that /Sequencer/Params/Value is empty tells me that the GUI
has the problem and not the sequencer side.
Stefan |
Attachment 1: Screenshot_2024-05-03_at_18.19.52.png
|
|
2763
|
10 May 2024 |
Zaher Salman | Bug Report | Params not initialized when starting sequencer | I think that I finally managed to fix the problem. The default values of the parameters are now written first in one go, then the sequencer waits for confirmation that everything is completed before proceeding. Please test and let me know if there are still any issues.
Zaher |
Draft
|
13 May 2024 |
Luigi Vigani | Bug Report | Params not initialized when starting sequencer | [quote="Zaher Salman"]I think that I finally managed to fix the problem. The default values of the parameters are now written first in one go, then the sequencer waits for confirmation that everything is completed before proceeding. Please test and let me know if there are still any issues. Zaher [/quote] |
2765
|
13 May 2024 |
Luigi Vigani | Bug Report | Params not initialized when starting sequencer |
Zaher Salman wrote: | I think that I finally managed to fix the problem. The default values of the parameters are now written first in one go, then the sequencer waits for confirmation that everything is completed before proceeding. Please test and let me know if there are still any issues.
Zaher |
Hi Zaher,
It seems fixed to me as well! Thanks a lot!
Luigi. |
2767
|
16 May 2024 |
Konstantin Olchanski | Bug Report | midas alarm borked condition evaluation | I am updating the TRIUMF IRIS experiment to the latest version of MIDAS. I see following error messages in midas.log:
19:06:16.806 2024/05/16 [mhttpd,ERROR] [odb.cxx:6967:db_get_data_index,ERROR] index (29) exceeds array length (20) for key
"/Equipment/Beamline/Variables/Measured"
19:06:15.806 2024/05/16 [mhttpd,ERROR] [odb.cxx:6967:db_get_data_index,ERROR] index (30) exceeds array length (20) for key
"/Equipment/Beamline/Variables/Measured"
The errors are correct, there is only 20 elements in that array. The errors are coming every few seconds, they spam midas.log. How do I fix
them? Where do they come from? There is no additional diagnostics or information to go from.
In the worst case, they come from some custom web page that reads wrong index variables from ODB. mhttpd currently provides no diagnostics to
find out which web page could be causing this.
But maybe it's internal to MIDAS? I save odb to odb.json, "grep Measured odb.json" yields:
iris00:~> grep Measured odb.json
"Condition" : "/Equipment/Beamline/Variables/Measured[29] > 1e-5",
"Condition" : "/Equipment/Beamline/Variables/Measured[30] < 0.5",
So wrong index errors is coming from evaluated alarms.
ODB "/Alarms/Alarm system active" is set to "no" (alarm system is disabled), the errors are coming.
ODB "/Alarms/Alarms/TP4/Active" is set to "no" (specific alarm is disabled), the errors are coming.
WTF? (and this is recentish borkage, old IRIS MIDAS had the same wrong index alarms and did not generate these errors).
Breakage:
- where is the error message "evaluated alarm XXX cannot be computed because YYY cannot be read from ODB!"
- disabled alarm should not be computed
- alarm system is disabled, alarms should not be computed
K.O.
P.S. I am filing bug reports here, I cannot be bothered with the 25-factor authentication to access bitbucket. |
2768
|
17 May 2024 |
Stefan Ritt | Bug Report | midas alarm borked condition evaluation | This is a common problem I also encountered in the past. You get a low-level ODB access error (could also be a read of a non-existing key) and you
have no idea where this comes from. Could be the alarm system, a mhttpd web page, even some user code in a front-end over which the midas library
has no control.
One option would be to add a complete stack dump to each of these error (ROOT does something like that), but I hear already people shouting "my
midas.log is flooded with stack dumps!". So what I do in this case is I run a midas program in the debugger and set a breakpoint (in your case at
odb.cxx:6967). If the breakpoint triggers, I inspect the stack and find out where this comes from.
Not that I print a stack dump for such error in the odbxx API. This goes to stdout, not the midas log, and it helped me in the past. Unfortunately
stack dumps work only under linux (not MacOSX), and they do not contain all information a debugger can show you.
It is not true that alarm conditions are evaluated when the alarm system is off. I just tried and it works fine. The code is here:
alarm.cxx:591
/* check global alarm flag */
flag = TRUE;
size = sizeof(flag);
db_get_value(hDB, 0, "/Alarms/Alarm system active", &flag, &size, TID_BOOL, TRUE);
if (!flag)
return AL_SUCCESS;
so no idea why you see this error if you correctly st "Alarm system active" to false.
Stefan |
2769
|
17 May 2024 |
Konstantin Olchanski | Bug Report | midas alarm borked condition evaluation | > This is a common problem I also encountered in the past. You get a low-level ODB access error (could also be a read of a non-existing key) and you
> have no idea where this comes from. Could be the alarm system, a mhttpd web page, even some user code in a front-end over which the midas library
> has no control.
committed a partial fix, added an error message in alarm condition evaluation code to report alarm name and odb paths when we cannot get something from
ODB. Now at least midas.log gives some clue that ODB errors are coming from alarms.
and the errors are actually coming from the alarms web page.
the alarms web page shows all the alarms even if alarms are disabled and it shows evaluated alarm conditions and current values even for alarms that
are disabled.
I could change it to show "disabled" for disabled alarms, but I think it would not be an improvement,
right now it is quite convenient to see the alarm values for disabled/inactive alarms,
it is easy to see if they will immediately trip if I enable them. Hiding the values would make
them blind.
And I think I know what caused the original problem in IRIS experiment, I think the list of EPICS variables got truncated from 30 to 20 and EPICS
values 29 and 30 used in the alarm conditions have become lost.
So the next step is to fix feepics to not truncate the list of variables (right now it is hardwired to 20 variables) and restore
the lost variable definition from a saved odb dump.
K.O.
>
> One option would be to add a complete stack dump to each of these error (ROOT does something like that), but I hear already people shouting "my
> midas.log is flooded with stack dumps!". So what I do in this case is I run a midas program in the debugger and set a breakpoint (in your case at
> odb.cxx:6967). If the breakpoint triggers, I inspect the stack and find out where this comes from.
>
> Not that I print a stack dump for such error in the odbxx API. This goes to stdout, not the midas log, and it helped me in the past. Unfortunately
> stack dumps work only under linux (not MacOSX), and they do not contain all information a debugger can show you.
>
> It is not true that alarm conditions are evaluated when the alarm system is off. I just tried and it works fine. The code is here:
>
> alarm.cxx:591
>
> /* check global alarm flag */
> flag = TRUE;
> size = sizeof(flag);
> db_get_value(hDB, 0, "/Alarms/Alarm system active", &flag, &size, TID_BOOL, TRUE);
> if (!flag)
> return AL_SUCCESS;
>
> so no idea why you see this error if you correctly st "Alarm system active" to false.
>
> Stefan |
Draft
|
17 May 2024 |
Konstantin Olchanski | Bug Report | odbedit load into the wrong place | Trying to restore IRIS ODB was a nasty surprise, old save files are in .odb format and odbedit "load xxx.odb" does an unexpected thing.
mkdir tmp
cd tmp
load odb.xml loads odb.xml into the current directory "tmp"
load odb.json same thing
load odb.odb loads into "/" unexpectedly overwriting everything with old data |
170
|
22 Oct 2004 |
Konstantin Olchanski | Bug Fix | mhttpd message colouring | I commited a fix to mhttpd logic that decides which messages should be shown in
"red" colour- before, any message with square brackets and colons would be
highlighted in red. Now only messages matching the pattern [...:...] are
highlighted. The decision logic was moved into a function message_red(). K.O. |
174
|
09 Nov 2004 |
Pierre-Andre Amaudruz | Bug Fix | New transition scheme | Problem:
If cm_set_transition_sequence() is used for changing the sequence number, the
command odbedit> start/stop/resume/pause -v report the propre sequence but the
action on the client side is actually not performed!
Fix:
Local transition table updated in midas.c (1.226)
Note:
The transition number under /system/clients/<pid>/transition...
is used internally. Changing it won't have any effect on the client action
if sequence number is not registered. |
200
|
25 Feb 2005 |
Konstantin Olchanski | Bug Fix | fixed: double free in FORMAT_MIDAS ybos.c causing lazylogger crashes | We stumbled upon and fixed a "double free" bug in src/ybos.c causing crashes in
lazylogger writing .mid files in the FORMAT_MIDAS format (why does it use
ybos.c? Pierre says- for generic file i/o). Why this code had ever worked before
remains a mystery. K.O. |
211
|
05 May 2005 |
Konstantin Olchanski | Bug Fix | fix: minor bit rot in the example experiment | I fixed some minor bit rot in the example experiment: a few minor Makefile
problems, make the analyzer use the current histogram creation macros, etc. I
also added startup and shutdown scripts. These will be documented as we work
through them with our Summer student. K.O. |
212
|
02 Aug 2005 |
Konstantin Olchanski | Bug Fix | fix odb corruption when running analzer for the first time | I have been plagued by ODB corruption when I run the analyzer for the first time
after setting up the new experiment. Some time ago, I traced this to
mana.c::book_ttree() and now I found and fixed the bug, fix now commited to
midas cvs. In book_ttree(), db_find("/Analyzer/Bank switches") was returning an
error and setting hkey to zero. Then we called db_open_record() with hkey==0,
which cased ODB corruption later on. The normal db_validate_hkey() did not catch
this because it considers hkey==0 to be valid (when most likely it is not). K.O. |
216
|
18 Aug 2005 |
Konstantin Olchanski | Bug Fix | fix race condition between clients on run start/stop, pause/resume | It turns out that the new priority sequencing of run state transitions had a
flaw: the frontends, the analyzer and the logger all registered at priority 500
and were invoked in essentially a random order. For example the frontend could
get a begin-run transition before the logger and so start sending data before
the logger opened the output file. Same for the analyzer and same for the end of
run. Also the sequencing for pause/resume run and begin/end run was different
when the two pairs ought to have identical sequencing.
I now commited changes to mana.c and mlogger.c changing their transition sequencing:
start and resume:
200 - logger (mlogger.c, no change)
300 - analyzer (mana.c, was 500)
500 - frontends (mfe.c, no change)
stop and pause:
500 - frontends (mfe.c, no change)
700 - analyzer (mana.c, was 500)
800 - mlogger (mlogger.c, was 500)
P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
see an anomaly in the analyzer, where it receives data events after the
end-of-run transition.
K.O. |
219
|
01 Sep 2005 |
Stefan Ritt | Bug Fix | fix race condition between clients on run start/stop, pause/resume | > It turns out that the new priority sequencing of run state transitions had a
> flaw: the frontends, the analyzer and the logger all registered at priority 500
> and were invoked in essentially a random order. For example the frontend could
> get a begin-run transition before the logger and so start sending data before
> the logger opened the output file. Same for the analyzer and same for the end of
> run. Also the sequencing for pause/resume run and begin/end run was different
> when the two pairs ought to have identical sequencing.
>
> I now commited changes to mana.c and mlogger.c changing their transition sequencing:
>
> start and resume:
> 200 - logger (mlogger.c, no change)
> 300 - analyzer (mana.c, was 500)
> 500 - frontends (mfe.c, no change)
>
> stop and pause:
> 500 - frontends (mfe.c, no change)
> 700 - analyzer (mana.c, was 500)
> 800 - mlogger (mlogger.c, was 500)
>
> P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
> see an anomaly in the analyzer, where it receives data events after the
> end-of-run transition.
>
> K.O.
Thanks for fixing that bug. It happend because during the implementatoin of the priority
sequencing we have up the pre/post tansition, which took care of the proper sequence
between the logger, frontend and analyzer. The way you modified the sequence is
absolutely correct. It is important to have >10 numbers "around" the frontends (like
450...550) in case one has an experiment with >10 frontends which need to make a
transition in a certain sequence (like the DANCE experiment in Los Alamos). |
|