ID |
Date |
Author |
Topic |
Subject |
2751
|
03 May 2024 |
Stefan Ritt | Bug Report | Params not initialized when starting sequencer |
Ok, here is the complete code to reproduce the problem. Load parameter_test.msl which includes functions.msl. From the screenshot you see the variables containing
garbage, and you also see that from the ODB screenshot. For completeness, I added Sequencer.json which contains the whole sequencer tree.
The interesting thing is that this works sometimes, and sometimes not. I'm not sure if this in the GUI or in the sequencer program, so we have to sort out who can
fix it ;-)
Best,
Stefan |
2752
|
03 May 2024 |
Luigi Vigani | Bug Report | Params not initialized when starting sequencer |
It is pretty much the same as Stefan, I attach here the screenshots. Also in my case it works sometimes, and sometimes partially (one or 2 params, like in
attachment 3).
> Could you please export and send me the /Sequencer ODB tree (or just /Sequencer/Param and /Sequencer/Variables) in both cases while the sequence is running.
>
> thanks,
> Zaher
>
>
> > Good afternoon,
> >
> > After updating Midas to the latest develop commit
> > (0f5436d901a1dfaf6da2b94e2d87f870e3611cf1) we found out a bug when starting
> > sequencer. If we have a simple loop from start value to stop value and step
> > size, just printing the value at each iteration, we see everything good (see
> > first attachment). Then we included another script though, which contains
> > several subroutines we defined for our detector, and we try to run the same
> > script. Unfortunately after this the parameters seem uninitialized, and the
> > value at each loop does not make sense (see second attachment). Also, sometimes
> > when pressing run the set parameter window would pop-up, but sometimes not.
> >
> > The script is this one:
> >
> > >>>
> > COMMENT Test script to check for a specific bug
> >
> > INCLUDE global_basic_functions
> >
> > #CALL setup_paths
> > #CALL generate_DUT_params
> >
> > PARAM lv_start, "Start of LV", 1.8
> > PARAM lv_stop, "Stop of LV", 2.1
> > PARAM lv_step, "Step of LV", 0.02
> >
> > n_iterations = (($lv_stop - $lv_start)/$lv_step)
> >
> > MSG "Parameters:"
> > MSG $lv_start
> > MSG $lv_stop
> > MSG $lv_step
> > MSG $n_iterations
> >
> > MSG "Start of looping"
> >
> > LOOP n, $n_iterations
> > lv_now = $lv_start + $n * $lv_step
> > MSG $lv_now
> > WAIT SECONDS, 1
> > ENDLOOP
> > <<<
> >
> > and the only difference comes from commenting the line:
> >
> > >>>
> > INCLUDE global_basic_functions
> > <<<
> >
> > as global_basic_functions is defined as a LIBRARY and it includes 75 (!)
> > subroutines...
> >
> > Is it possible that when loading a large script it messes up the loading of
> > parameters?
> >
> > Thank you very much,
> > Regards,
> > Luigi. |
2755
|
03 May 2024 |
Zaher Salman | Bug Report | Params not initialized when starting sequencer |
I have been able to reproduce the problem only once. From what I see, it seems that the Variables ODB tree is not initialized properly from the Param tree. Below are the messages from the failed run compared to a successful one. As far as I could see, the javascript code does not change anything in the Variables ODB tree (only monitors it). The actual changes are done by the sequencer program, or am I wrong?
Failed run:
16:14:25.849 2024/05/03 [Sequencer,INFO] + 3 *
16:14:24.722 2024/05/03 [Sequencer,INFO] + 2 *
16:14:23.594 2024/05/03 [Sequencer,INFO] + 1 *
16:14:23.592 2024/05/03 [Sequencer,INFO] Start of looping
16:14:23.591 2024/05/03 [Sequencer,INFO] (( - )/)
16:14:23.591 2024/05/03 [Sequencer,INFO]
16:14:23.590 2024/05/03 [Sequencer,INFO]
16:14:23.590 2024/05/03 [Sequencer,INFO]
16:14:23.589 2024/05/03 [Sequencer,INFO] Parameters:
16:14:23.562 2024/05/03 [Sequencer,TALK] Sequencer started with script "testpars.msl".
Successful run:
16:15:37.472 2024/05/03 [Sequencer,INFO] 1.820000
16:15:37.471 2024/05/03 [Sequencer,INFO] Start of looping
16:15:37.471 2024/05/03 [Sequencer,INFO] 15
16:15:37.470 2024/05/03 [Sequencer,INFO] 0.020000
16:15:37.470 2024/05/03 [Sequencer,INFO] 2.100000
16:15:37.469 2024/05/03 [Sequencer,INFO] 1.800000
16:15:37.469 2024/05/03 [Sequencer,INFO] Parameters:
16:15:37.450 2024/05/03 [Sequencer,TALK] Sequencer started with script "testpars.msl". |
2756
|
03 May 2024 |
Stefan Ritt | Bug Report | Params not initialized when starting sequencer |
Ahh, that rings a bell:
1) JS opens start dialog box
2) User enters parameters and presses start
3) JS writes parameters
4) JS starts sequencer
5) Sequencer copies parameters to variables
Now how do you handle 3) and 4). Just issue two mjsonrpc commands together? What then could happen is that 4) is executed before 3) and we get the garbage.
You have to do 3) and WAIT for the return ("then" in the JS promise), and only then issue 4) from there.
Stefan |
2757
|
03 May 2024 |
Zaher Salman | Bug Report | Params not initialized when starting sequencer |
Thanks for the hint Stefan. I pushed a possible fix but I cannot test it since I cannot reproduce the issue.
> Ahh, that rings a bell:
>
> 1) JS opens start dialog box
> 2) User enters parameters and presses start
> 3) JS writes parameters
> 4) JS starts sequencer
> 5) Sequencer copies parameters to variables
>
> Now how do you handle 3) and 4). Just issue two mjsonrpc commands together? What then could happen is that 4) is executed before 3) and we get the garbage.
> You have to do 3) and WAIT for the return ("then" in the JS promise), and only then issue 4) from there.
>
> Stefan |
2758
|
03 May 2024 |
Stefan Ritt | Bug Report | Params not initialized when starting sequencer |
Seems to me like the problem happens less frequently, but I still see it (1 out of 5 or so). The fact that /Sequencer/Params/Value is empty tells me that the GUI
has the problem and not the sequencer side.
Stefan |
2763
|
10 May 2024 |
Zaher Salman | Bug Report | Params not initialized when starting sequencer |
I think that I finally managed to fix the problem. The default values of the parameters are now written first in one go, then the sequencer waits for confirmation that everything is completed before proceeding. Please test and let me know if there are still any issues.
Zaher |
Draft
|
13 May 2024 |
Luigi Vigani | Bug Report | Params not initialized when starting sequencer |
[quote="Zaher Salman"]I think that I finally managed to fix the problem. The default values of the parameters are now written first in one go, then the sequencer waits for confirmation that everything is completed before proceeding. Please test and let me know if there are still any issues. Zaher [/quote] |
2765
|
13 May 2024 |
Luigi Vigani | Bug Report | Params not initialized when starting sequencer |
Zaher Salman wrote: | I think that I finally managed to fix the problem. The default values of the parameters are now written first in one go, then the sequencer waits for confirmation that everything is completed before proceeding. Please test and let me know if there are still any issues.
Zaher |
Hi Zaher,
It seems fixed to me as well! Thanks a lot!
Luigi. |
2767
|
16 May 2024 |
Konstantin Olchanski | Bug Report | midas alarm borked condition evaluation |
I am updating the TRIUMF IRIS experiment to the latest version of MIDAS. I see following error messages in midas.log:
19:06:16.806 2024/05/16 [mhttpd,ERROR] [odb.cxx:6967:db_get_data_index,ERROR] index (29) exceeds array length (20) for key
"/Equipment/Beamline/Variables/Measured"
19:06:15.806 2024/05/16 [mhttpd,ERROR] [odb.cxx:6967:db_get_data_index,ERROR] index (30) exceeds array length (20) for key
"/Equipment/Beamline/Variables/Measured"
The errors are correct, there is only 20 elements in that array. The errors are coming every few seconds, they spam midas.log. How do I fix
them? Where do they come from? There is no additional diagnostics or information to go from.
In the worst case, they come from some custom web page that reads wrong index variables from ODB. mhttpd currently provides no diagnostics to
find out which web page could be causing this.
But maybe it's internal to MIDAS? I save odb to odb.json, "grep Measured odb.json" yields:
iris00:~> grep Measured odb.json
"Condition" : "/Equipment/Beamline/Variables/Measured[29] > 1e-5",
"Condition" : "/Equipment/Beamline/Variables/Measured[30] < 0.5",
So wrong index errors is coming from evaluated alarms.
ODB "/Alarms/Alarm system active" is set to "no" (alarm system is disabled), the errors are coming.
ODB "/Alarms/Alarms/TP4/Active" is set to "no" (specific alarm is disabled), the errors are coming.
WTF? (and this is recentish borkage, old IRIS MIDAS had the same wrong index alarms and did not generate these errors).
Breakage:
- where is the error message "evaluated alarm XXX cannot be computed because YYY cannot be read from ODB!"
- disabled alarm should not be computed
- alarm system is disabled, alarms should not be computed
K.O.
P.S. I am filing bug reports here, I cannot be bothered with the 25-factor authentication to access bitbucket. |
2768
|
17 May 2024 |
Stefan Ritt | Bug Report | midas alarm borked condition evaluation |
This is a common problem I also encountered in the past. You get a low-level ODB access error (could also be a read of a non-existing key) and you
have no idea where this comes from. Could be the alarm system, a mhttpd web page, even some user code in a front-end over which the midas library
has no control.
One option would be to add a complete stack dump to each of these error (ROOT does something like that), but I hear already people shouting "my
midas.log is flooded with stack dumps!". So what I do in this case is I run a midas program in the debugger and set a breakpoint (in your case at
odb.cxx:6967). If the breakpoint triggers, I inspect the stack and find out where this comes from.
Not that I print a stack dump for such error in the odbxx API. This goes to stdout, not the midas log, and it helped me in the past. Unfortunately
stack dumps work only under linux (not MacOSX), and they do not contain all information a debugger can show you.
It is not true that alarm conditions are evaluated when the alarm system is off. I just tried and it works fine. The code is here:
alarm.cxx:591
/* check global alarm flag */
flag = TRUE;
size = sizeof(flag);
db_get_value(hDB, 0, "/Alarms/Alarm system active", &flag, &size, TID_BOOL, TRUE);
if (!flag)
return AL_SUCCESS;
so no idea why you see this error if you correctly st "Alarm system active" to false.
Stefan |
170
|
22 Oct 2004 |
Konstantin Olchanski | Bug Fix | mhttpd message colouring |
I commited a fix to mhttpd logic that decides which messages should be shown in
"red" colour- before, any message with square brackets and colons would be
highlighted in red. Now only messages matching the pattern [...:...] are
highlighted. The decision logic was moved into a function message_red(). K.O. |
174
|
09 Nov 2004 |
Pierre-Andre Amaudruz | Bug Fix | New transition scheme |
Problem:
If cm_set_transition_sequence() is used for changing the sequence number, the
command odbedit> start/stop/resume/pause -v report the propre sequence but the
action on the client side is actually not performed!
Fix:
Local transition table updated in midas.c (1.226)
Note:
The transition number under /system/clients/<pid>/transition...
is used internally. Changing it won't have any effect on the client action
if sequence number is not registered. |
200
|
25 Feb 2005 |
Konstantin Olchanski | Bug Fix | fixed: double free in FORMAT_MIDAS ybos.c causing lazylogger crashes |
We stumbled upon and fixed a "double free" bug in src/ybos.c causing crashes in
lazylogger writing .mid files in the FORMAT_MIDAS format (why does it use
ybos.c? Pierre says- for generic file i/o). Why this code had ever worked before
remains a mystery. K.O. |
211
|
05 May 2005 |
Konstantin Olchanski | Bug Fix | fix: minor bit rot in the example experiment |
I fixed some minor bit rot in the example experiment: a few minor Makefile
problems, make the analyzer use the current histogram creation macros, etc. I
also added startup and shutdown scripts. These will be documented as we work
through them with our Summer student. K.O. |
212
|
02 Aug 2005 |
Konstantin Olchanski | Bug Fix | fix odb corruption when running analzer for the first time |
I have been plagued by ODB corruption when I run the analyzer for the first time
after setting up the new experiment. Some time ago, I traced this to
mana.c::book_ttree() and now I found and fixed the bug, fix now commited to
midas cvs. In book_ttree(), db_find("/Analyzer/Bank switches") was returning an
error and setting hkey to zero. Then we called db_open_record() with hkey==0,
which cased ODB corruption later on. The normal db_validate_hkey() did not catch
this because it considers hkey==0 to be valid (when most likely it is not). K.O. |
216
|
18 Aug 2005 |
Konstantin Olchanski | Bug Fix | fix race condition between clients on run start/stop, pause/resume |
It turns out that the new priority sequencing of run state transitions had a
flaw: the frontends, the analyzer and the logger all registered at priority 500
and were invoked in essentially a random order. For example the frontend could
get a begin-run transition before the logger and so start sending data before
the logger opened the output file. Same for the analyzer and same for the end of
run. Also the sequencing for pause/resume run and begin/end run was different
when the two pairs ought to have identical sequencing.
I now commited changes to mana.c and mlogger.c changing their transition sequencing:
start and resume:
200 - logger (mlogger.c, no change)
300 - analyzer (mana.c, was 500)
500 - frontends (mfe.c, no change)
stop and pause:
500 - frontends (mfe.c, no change)
700 - analyzer (mana.c, was 500)
800 - mlogger (mlogger.c, was 500)
P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
see an anomaly in the analyzer, where it receives data events after the
end-of-run transition.
K.O. |
219
|
01 Sep 2005 |
Stefan Ritt | Bug Fix | fix race condition between clients on run start/stop, pause/resume |
> It turns out that the new priority sequencing of run state transitions had a
> flaw: the frontends, the analyzer and the logger all registered at priority 500
> and were invoked in essentially a random order. For example the frontend could
> get a begin-run transition before the logger and so start sending data before
> the logger opened the output file. Same for the analyzer and same for the end of
> run. Also the sequencing for pause/resume run and begin/end run was different
> when the two pairs ought to have identical sequencing.
>
> I now commited changes to mana.c and mlogger.c changing their transition sequencing:
>
> start and resume:
> 200 - logger (mlogger.c, no change)
> 300 - analyzer (mana.c, was 500)
> 500 - frontends (mfe.c, no change)
>
> stop and pause:
> 500 - frontends (mfe.c, no change)
> 700 - analyzer (mana.c, was 500)
> 800 - mlogger (mlogger.c, was 500)
>
> P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
> see an anomaly in the analyzer, where it receives data events after the
> end-of-run transition.
>
> K.O.
Thanks for fixing that bug. It happend because during the implementatoin of the priority
sequencing we have up the pre/post tansition, which took care of the proper sequence
between the logger, frontend and analyzer. The way you modified the sequence is
absolutely correct. It is important to have >10 numbers "around" the frontends (like
450...550) in case one has an experiment with >10 frontends which need to make a
transition in a certain sequence (like the DANCE experiment in Los Alamos). |
229
|
17 Oct 2005 |
Exaos Lee | Bug Fix | "make install" error under MacOS X |
Under MacOS X, "make install" will cours an error like this:
...
install: darwin/bin/dio: No such file or directory
make: *** [install] Error 71
This can be fixed as the following diff:
404,405c404,405
< $(BIN_DIR)/mcnaf: $(UTL_DIR)/mcnaf.c $(DRV_DIR)/camac/camacrpc.c
< $(CC) $(CFLAGS) $(OSFLAGS) -o $@ $(UTL_DIR)/mcnaf.c $(DRV_DIR)/camac/camacrpc.c $(LIB) $(LIBS)
---
> $(BIN_DIR)/mcnaf: $(UTL_DIR)/mcnaf.c $(DRV_DIR)/bus/camacrpc.c
> $(CC) $(CFLAGS) $(OSFLAGS) -o $@ $(UTL_DIR)/mcnaf.c $(DRV_DIR)/bus/camacrpc.c $(LIB) $(LIBS)
438c438,439
< @for i in mserver mhttpd odbedit mlogger ; \
---
>
> @for i in mserver mhttpd odbedit mlogger dio ; \
444,447d444
< chmod +s $(SYSBIN_DIR)/mhttpd
<
< ifeq ($(OSTYPE),linux)
< install -v -m 755 $(BIN_DIR)/dio $(SYSBIN_DIR)
449c446
< endif
---
> chmod +s $(SYSBIN_DIR)/mhttpd
|
235
|
23 Nov 2005 |
Stefan Ritt | Bug Fix | Endian swapping in mana.c |
It was reported that following code in mana.c :
/* swap event header if in wrong format */
if (pevent->serial_number > 0x1000000) {
WORD_SWAP(&pevent->event_id);
WORD_SWAP(&pevent->trigger_mask);
DWORD_SWAP(&pevent->serial_number);
DWORD_SWAP(&pevent->time_stamp);
DWORD_SWAP(&pevent->data_size);
}
does not work correctly for events having a true serial number above 16777216 (=0x10000000). After some considerations, I concluded that there is no good way to determine automatically the endian format of midas events, without adding another field in the header, which would break the compatibility with all recorded data up to date. I therefore changed the above code to
/* swap event header if in wrong format */
#ifdef SWAP_EVENTS
WORD_SWAP(&pevent->event_id);
WORD_SWAP(&pevent->trigger_mask);
DWORD_SWAP(&pevent->serial_number);
DWORD_SWAP(&pevent->time_stamp);
DWORD_SWAP(&pevent->data_size);
#endif
So if one wants to analyze events with the midas analyzer on a PC system for example where the events come from a VxWorks system with the opposite endian encoding, one has to set the flag -DSWAP_EVENTS when compiling the analyzer for that type of analysis. |