Entry  10 Mar 2022, Gennaro Tortone, Bug Report, Python ODB watch 
Hi,

I have an issue with ODB watch on MIDAS Python library;

I wrote a simple frontend that reads/writes FPGA registers through 
ODB keys (simplified version at the link below):

https://gist.github.com/gtortone/cd035a9ac4ea7a78ea9cd931e80e2c75

Everything works fine, but there is a boolean array
in Settings (Enable ADC sampling) that I need to "toggle" 
(19 bits to 0, then 19 bits to 1). This operation is handled by
detailed_settings_changed_func, which writes the value of the
toggled bit to the FPGA.

The issue is that if I quickly toggle the boolean array by
odbedit:

set "/Equipment/odbtest/Settings/Enable ADC sampling[0-18]" 0
set "/Equipment/odbtest/Settings/Enable ADC sampling[0-18]" 1

I see in the Python script the following list of callbacks:

detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[0] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[1] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[2] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[3] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[4] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[5] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[6] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[7] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[8] - new value 1    ***
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[9] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[10] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[11] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[12] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[13] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[14] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[15] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[16] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[17] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[18] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[0] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[1] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[2] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[3] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[4] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[5] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[6] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[7] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[8] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[9] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[10] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[11] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[12] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[13] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[14] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[15] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[16] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[17] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[18] - new value 1

It seems that the second write operation "overlaps" the first one...

The same behavior is not observed using a 'watch' in odbedit...

I can overcome this problem by using the register value as the ODB key, avoiding the 
array of booleans... but I report this issue as a "possible" bug/limitation of the Python ODB watch;

Cheers,
Gennaro
    Reply  16 Mar 2022, Ben Smith, Bug Report, Python ODB watch 
> It seems that the second write operation "overlaps" the first one...

Hi Gennaro,

In principle the same issue can happen in C++ code, but it is much less likely as the callbacks get executed more quickly (partly due to C++/python in general, and partly because the python code does some extra work to make the interface more user-friendly). The C++ code at the end of this message adds a 100ms sleep to the callback and can result in output like this when you do quick edits of "Test[0-19]" in odbedit.

Element 1 is 0
Element 2 is 0
Element 3 is 0
Element 4 is 0
Element 5 is 0
Element 6 is 0
Element 7 is 1
Element 8 is 1
Element 9 is 1
etc...


I agree that this can be a really nasty source of bugs if you need to react to every change. I'll add a warning to the python docstrings, but I can't think of a way to make this more robust at the midas level - I think we'd need some sort of ODB "snapshot" system...









#include "midas.h"

void watch_fn(HNDLE hDB, HNDLE hKey, int index, void *info) {
  DWORD data = 0;
  INT buf_size = sizeof(data);
  db_get_data_index(hDB, hKey, &data, &buf_size, index, TID_DWORD);
  printf("Element %d is %u\n", index, data);
  ss_sleep(100);
}

int main() {
  HNDLE hDB, hClient, hTestKey;

  std::string host, expt;
  cm_get_environment(&host, &expt);
  cm_connect_experiment(host.c_str(), expt.c_str(), "test_odb", nullptr);
  cm_get_experiment_database(&hDB, &hClient);

  static const DWORD numValues = 20;
  DWORD data[numValues] = {};
  db_set_value(hDB, 0, "Test", data, sizeof(DWORD) * numValues, numValues, TID_DWORD);
  db_find_key(hDB, 0, "Test", &hTestKey);
  db_watch(hDB, hTestKey, watch_fn, nullptr);

  printf("Press any key to exit loop...\n");

  while (!ss_kbhit()) {
    cm_yield(1);
  }

  db_unwatch_all();
  db_delete_key(hDB, hTestKey, FALSE);
  cm_disconnect_experiment();

  return 0;
}
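
As an aside, a generic workaround sketch (not tested here; the array size and the hardware call are placeholders): if the frontend only needs to end up matching the current ODB state, the callback can ignore the per-index value and re-read the whole array with db_get_data(), so missed intermediate callbacks no longer matter.

void watch_full_state(HNDLE hDB, HNDLE hKey, int index, void *info) {
  BOOL enable[19];                                    // matches "Enable ADC sampling[19]"
  INT size = sizeof(enable);
  db_get_data(hDB, hKey, enable, &size, TID_BOOL);    // snapshot of the whole array
  for (int i = 0; i < 19; i++) {
    // apply_to_fpga(i, enable[i]);                   // hypothetical hardware call
  }
}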
       Reply  21 Mar 2022, Stefan Ritt, Bug Report, Python ODB watch 
What you describe is a well-known problem with the ODB. At PSI we have similar issues. There are
two approaches to solve it:

1) Write values one-by-one to the ODB, but do not trigger a watch update for each of them. In the
sequencer, this can be achieved with the ODBSET command (see https://daq00.triumf.ca/MidasWiki/index.php/Sequencer 
and the last paragraph to the right of the ODBSET command). You use notify=0 for all set commands except
the last one, where you use notify=1. In the C++ API, you can use db_set_data_index1(), which has
this notify flag as the last parameter (see the sketch after this list).

2) You add intelligence to your front-end. If you get a watch update, you do not apply it
directly to the hardware, but put it into a FIFO. Once you have not received any further updates for a
certain period (1 s is a good value), you empty the FIFO and apply all settings at once.
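
A minimal C++ sketch of approach 1 (my sketch, assuming db_set_data_index1() takes the same
arguments as db_set_data_index() plus the notify flag at the end):

#include "midas.h"

// Write a boolean array element by element, but fire the watch callbacks
// only once, on the last element (notify = TRUE).
void set_bool_array_single_notify(HNDLE hDB, HNDLE hKey, const BOOL *values, int n)
{
   for (int i = 0; i < n; i++) {
      BOOL notify = (i == n - 1);
      db_set_data_index1(hDB, hKey, &values[i], sizeof(BOOL), i, TID_BOOL, notify);
   }
}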

Both methods have been used at PSI successfully, although 1) is much easier to implement, especially
if you use the midas sequencer.

Stefan
Entry  03 Mar 2022, Konstantin Olchanski, Info, manalyzer updated 
manalyzer was updated to latest version. mostly multi-threading improvements from 
Joseph and myself. K.O.
Entry  03 Mar 2022, Konstantin Olchanski, Info, zlib required, lz4 internal 
as of commit 8eb18e4ae9c57a8a802219b90d4dc218eb8fdefb, the gzip compression
library is required, not optional.

this fixes midas and manalyzer mis-build if the system gzip library
is accidentally not installed. (is there any situation where
gzip library is not installed on purpose?)

midas internal lz4 compression library was renamed to mlz4 to avoid collision
against system lz4 library (where present). lz4 files from midasio are now
used, lz4 files in midas/include and midas/src are removed.

I see that on recent versions of ubuntu we could switch to the system version 
of the lz4 library. however, on centos-7 systems it is usually not present
and it still is a supported and widely used platform, so we stay
with the midas-internal library for now.

K.O.
Entry  23 Feb 2022, Stefan Ritt, Info, Midas slow control event generation switched to 32-bit banks 
The midas slow control system class drivers automatically read their equipment and generate events containing midas banks. So far these have been 16-bit banks created with bk_init(). But now more and more experiments use large numbers of channels, so the 16-bit address space is exceeded. Until last week there was not even a check for this, leading to unpredictable crashes.

Therefore I switched the bank generation in the drivers generic.cxx, hv.cxx and multi.cxx to 32-bit banks via bk_init32(). This should in principle be transparent, since the midas bank functions automatically detect the bank type during reading. But I thought I'd let everybody know just in case.
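
For reference, a minimal sketch of what the change looks like in a readout routine (placeholder bank name and data, not the actual class-driver code); readers detect the bank format automatically:

#include "midas.h"

INT read_slow_event(char *pevent, INT off)
{
   float *pdata;
   bk_init32(pevent);                          // was bk_init(pevent), with its 16-bit size field
   bk_create(pevent, "DEMO", TID_FLOAT, (void **)&pdata);
   *pdata++ = 123.4f;                          // ... fill the channel values here ...
   bk_close(pevent, pdata);
   return bk_size(pevent);
}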

Stefan
Entry  08 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue mslerror.PNG
Hi all,

I am having some issues getting the ODBINC command to work within the MIDAS 
sequencer. I am trying to increment one of the ODB values but it returns a 
data-type size mismatch error (see attached image).

All the ODB variables are of MIDAS data type FLOAT and should all be 32-bit values, 
but for some reason MIDAS thinks they are 4-bit values. I have tried creating 
new ODB keys of type INT32, UINT32 and DOUBLE, but they all give the same error.

If anybody has any suggestions I would really appreciate some help:)

Thanks



 
    Reply  08 Feb 2022, Konstantin Olchanski, Bug Fix, ODBINC/Sequencer Issue 
Please post the output of odbedit "ls -l" for /eq/ar.../variables. (you posted the 
variable name as an image, and I cannot cut-and-paste the odb path!). BTW data size 4 is 
correct, 4 bytes for INT32/UINT32/FLOAT. For DOUBLE it should be 8. For you it prints 32 
and this is wrong, we need to see the output of "ls -l".
K.O.
       Reply  09 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> Please post the output of odbedit "ls -l" for /eq/ar.../variables. (you posted the 
> variable name as an image, and I cannot cut-and-paste the odb path!). BTW data size 4 is 
> correct, 4 bytes for INT32/UINT32/FLOAT. For DOUBLE it should be 8. For you it prints 32 
> and this is wrong, we need to see the output of "ls -l".
> K.O.

Hi,

Thanks for getting back to me regarding this. The output of "ls -l" is:

[local:mu3eMSci:S]/>cd Equipment/ArduinoTestStation/Variables
[local:mu3eMSci:S]Variables>ls -l
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
_T_                             FLOAT   1     4     1h   0   RWD  20.93
_F_                             FLOAT   1     4     1h   0   RWD  12.8
_P_                             FLOAT   1     4     1h   0   RWD  56
_S_                             FLOAT   1     4     1h   0   RWD  5
_H_                             FLOAT   1     4     60h  0   RWD  44.74
_B_                             FLOAT   1     4     60h  0   RWD  18.54
_A_                             FLOAT   1     4     1h   0   RWD  14.41
_RH_                            FLOAT   1     4     1h   0   RWD  41.81
_AT_                            FLOAT   1     4     1h   0   RWD  20.46
SP                              INT16   1     2     1h   0   RWD  10

Many Thanks
Jago
          Reply  09 Feb 2022, Konstantin Olchanski, Bug Fix, ODBINC/Sequencer Issue 
> 
> [local:mu3eMSci:S]/>cd Equipment/ArduinoTestStation/Variables
> [local:mu3eMSci:S]Variables>ls -l
> Key name                        Type    #Val  Size  Last Opn Mode Value
> ---------------------------------------------------------------------------
> _T_                             FLOAT   1     4     1h   0   RWD  20.93
> _F_                             FLOAT   1     4     1h   0   RWD  12.8
> _P_                             FLOAT   1     4     1h   0   RWD  56
> _S_                             FLOAT   1     4     1h   0   RWD  5
> _H_                             FLOAT   1     4     60h  0   RWD  44.74
> _B_                             FLOAT   1     4     60h  0   RWD  18.54
> _A_                             FLOAT   1     4     1h   0   RWD  14.41
> _RH_                            FLOAT   1     4     1h   0   RWD  41.81
> _AT_                            FLOAT   1     4     1h   0   RWD  20.46
> SP                              INT16   1     2     1h   0   RWD  10
> 

This looks okey, so we still have no explanation for your error. Please post your sequencer 
script?

K.O.
             Reply  10 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
I tried following script:

ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10

LOOP 10
  WAIT seconds, 3
  ODBINC /Equipment/ArduinoTestStation/Variables/_S_
ENDLOOP

and it worked as expected. So I conclude the problem must be in your script. Probably a typo in 
the ODB path, pointing to a 32-byte string instead of to a 4-byte float.

Stefan  
                Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> I tried following script:
> 
> ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10
> 
> LOOP 10
>   WAIT seconds, 3
>   ODBINC /Equipment/ArduinoTestStation/Variables/_S_
> ENDLOOP
> 
> and it worked as expected. So I conclude the problem must be in your script. Probably a typo in 
> the ODB path pointing to a 32-byte string instead to a 4-byte float.
> 
> Stefan  

Hi Stefan,

Cheers for the reply. I believe the syntax we are using is correct. I have tried copying the script 
you used above and it results in the same error as before. Perhaps something is going wrong 
elsewhere - I will take another look today.

Jago
                   Reply  14 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
Just post here a minimal script which produces the error, so that I can try myself.

... and make sure that you have the latest develop version of midas.

Stefan
                      Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> Just post here a minimal script which produces the error, so that I can try myself.
> 
> ... and make sure that you have the latest develop version of midas.
> 
> Stefan

Here is the simplest script which produces the error:

WAIT seconds, 3
ODBINC /Equipment/ArduinoTestStation/Variables/_S_

I noticed that "Jacob Thorne" in the forum had the same issue as us in November last 
year. Indeed we have not installed any later versions of MIDAS since then, so we will 
double check we have the latest version.

Jago
                         Reply  14 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
> I noticed that "Jacob Thorne"  in the forum had the same issue as us in Novemeber last 
> year. Indeed we have not installed any later versions of MIDAS since then so we will 
> double check we have the latest version.

As you see from my reply to Jacob, the bug has been fixed in midas since then, so just 
update.

Stefan
                            Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> > I noticed that "Jacob Thorne"  in the forum had the same issue as us in Novemeber last 
> > year. Indeed we have not installed any later versions of MIDAS since then so we will 
> > double check we have the latest version.
> 
> As you see from my reply to Jacob, the bug has been fixed in midas since then, so just 
> update.
> 
> Stefan

We have tried updating using both:

git submodule update --init --recursive

and:

git pull --recurse-submodules

But the error still persists. Is there another way to update which we are missing?

Cheers
Jago
                               Reply  15 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
> But the error still persists. Is there another way to update which we are missing?

The bug was definitively fixed in this modification:

https://bitbucket.org/tmidas/midas/commits/5f33f9f7f21bcaa474455ab72b15abc424bbebf2

You probably forgot to compile/install correctly after your pull. If you start "odbedit" and do 
a "ver" you see which git revision you are currently running. Make sure to get this output:

MIDAS version:      2.1
GIT revision:       Fri Feb 11 08:56:02 2022 +0100 - midas-2020-08-a-509-g585faa96 on branch 
develop
ODB version:        3


Stefan
                                  Reply  16 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> > But the error still persists. Is there another way to update which we are missing?
> 
> The bug was definitively fixed in this modification:
> 
> https://bitbucket.org/tmidas/midas/commits/5f33f9f7f21bcaa474455ab72b15abc424bbebf2
> 
> You probably forgot to compile/install correctly after your pull. Of you start "odbedit" and do 
> a "ver" you see which git revision you are currently running. Make sure to get this output:
> 
> MIDAS version:      2.1
> GIT revision:       Fri Feb 11 08:56:02 2022 +0100 - midas-2020-08-a-509-g585faa96 on branch 
> develop
> ODB version:        3
> 
> 
> Stefan

We were having some problems compiling but have got it sorted now - thanks for your help :)

Jago
             Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> > 
> > [local:mu3eMSci:S]/>cd Equipment/ArduinoTestStation/Variables
> > [local:mu3eMSci:S]Variables>ls -l
> > Key name                        Type    #Val  Size  Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > _T_                             FLOAT   1     4     1h   0   RWD  20.93
> > _F_                             FLOAT   1     4     1h   0   RWD  12.8
> > _P_                             FLOAT   1     4     1h   0   RWD  56
> > _S_                             FLOAT   1     4     1h   0   RWD  5
> > _H_                             FLOAT   1     4     60h  0   RWD  44.74
> > _B_                             FLOAT   1     4     60h  0   RWD  18.54
> > _A_                             FLOAT   1     4     1h   0   RWD  14.41
> > _RH_                            FLOAT   1     4     1h   0   RWD  41.81
> > _AT_                            FLOAT   1     4     1h   0   RWD  20.46
> > SP                              INT16   1     2     1h   0   RWD  10
> > 
> 
> This looks okey, so we still have no explanation for your error. Please post your sequencer 
> script?
> 
> K.O.

Hey, thanks for getting back to me

We are fairly confident the syntax is correct. Having tried the test script posted by Stefan:

> ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10
> 
> LOOP 10
>   WAIT seconds, 3
>   ODBINC 

The same error is returned:/

We will take another look today.

Jago
Entry  02 Dec 2021, Alexey Kalinin, Bug Report, some frontend kicked by cm_periodic_tasks 
Hello,
We have a small experiment with a MIDAS-based DAQ.
The Status page shows:
ES	ESFrontend@192.168.0.37	207	0.2	0.000
Trigger06	Sample Frontend06@192.168.0.37	1.297M	0.3	0.000
Trigger01	Sample Frontend01@192.168.0.37	1.297M	0.3	0.000
Trigger16	Sample Frontend16@192.168.0.37	1.297M	0.3	0.000
Trigger38	Sample Frontend38@192.168.0.37	1.297M	0.3	0.000
Trigger37	Sample Frontend37@192.168.0.37	1.297M	0.3	0.000
Trigger03	Sample Frontend03@192.168.0.38	1.297M	0.3	0.000
Trigger07	Sample Frontend07@192.168.0.38	1.297M	0.3	0.000
Trigger04	Sample Frontend04@192.168.0.38	59898	0.0	0.000
Trigger08	Sample Frontend08@192.168.0.38	59898	0.0	0.000
Trigger17	Sample Frontend17@192.168.0.38	59898	0.0	0.000


And SYSTEM buffers page shows:
ESFrontend  1968  198  47520  0  0x00000000  0  193 ms
Sample Frontend06  1332547  1330826  379729872  0  0x00000000  0  1.1 sec
Sample Frontend16  1332542  1330839  361988208  0  0x00000000  0  94 ms
Sample Frontend37  1332530  1330841  337798408  0  0x00000000  0  1.1 sec
Sample Frontend01  1332543  1330829  467136688  0  0x00000000  0  34 ms
Sample Frontend38  1332528  1330830  291453608  0  0x00000000  0  1.1 sec
Sample Frontend04  63254  61467  20882584  0  0x00000000  0  208 ms
Sample Frontend08  63262  61476  27904056  0  0x00000000  0  205 ms
Sample Frontend17  63271  61473  20433840  0  0x00000000  0  213 ms
Sample Frontend03  1332549  1330818  386821728  0  0x00000000  0  82 ms
Sample Frontend07  1332554  1330821  462210896  0  0x00000000  0  37 ms
Logger  968742  0w+9500418r  0w+2718405736r  0  0x00000000  0  GET_ALL Used 0 bytes 0.0%  303 ms
rootana  254561  0w+29856958r  0w+8718288352r  0  0x00000000  0  762 ms


The problem is that eventually some of the frontends are closed with the message:
19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist

in the meantime the mserver is logging:
mserver started interactively
mserver will listen on TCP port 1175
double free or corruption (!prev)
double free or corruption (!prev)
free(): invalid next size (normal)
double free or corruption (!prev)


I can find some correlation with the number of events / event size produced by the 
frontend, because it fails when they become big enough. 

frontend scheme is like this:

poll event time set to 0;

poll_event{
//if buffer not transferred return (continue cutting the main buffer)
//read main buffer from hardware
//buffer not transfered
}

read event{
// cut the main buffer to subevents (cut one event from main buffer) return;
//if (last subevent) {buffer transfered ;return}
}

What is strange to me is that 2 frontends (1 per remote PC) are causing this.

Also, I'm executing one frontend code with the -i # flag, setting the event ID in 
frontend_init, and using the SYSTEM buffer for all.

Is there something I'm missing?
Thanks. 
A.
    Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, some frontend kicked by cm_periodic_tasks 
> The problem is that eventually some of frontend closed with message 
> :19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
> 'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist

This message means what it says. A client was registered with the SYSMSG buffer and this 
client had pid 9789. At some point some other client (rootana, in this case) checked it and 
process pid 9789 was no longer running. (it then proceeded to remove the registration).

There are 2 possibilities:
- simplest: your frontend has crashed. best to debug this by running it inside gdb, wait for 
the crash.
- unlikely: reported pid is bogus, real pid of your frontend is different, the client 
registration in SYSMSG is corrupted. this would indicate massive corruption of midas shared 
memory buffers, not impossible if your frontend misbehaves and writes to random memory 
addresses. ODB has protection against this (normally turned off, easy to enable, set ODB 
"/experiment/protect odb" to yes), shared memory buffers do not have protection against this 
(should be added?).

Do this. When you start your frontend, write down its pid; when you see the crash message, 
confirm the pid number printed is the same. As an additional test, run your frontend inside gdb; 
after it crashes, you can print the stack trace, etc.

> 
> in the meantime mserver loggging :
> mserver started interactively
> mserver will listen on TCP port 1175
> double free or corruption (!prev)
> double free or corruption (!prev)
> free(): invalid next size (normal)
> double free or corruption (!prev)
> 

Are these "double free" messages coming from the mserver or from your frontend? (i.e. you run 
them in different terminals, not all in the same terminal?).

If messages are coming from the mserver, this confirms possibility (1),
except that for frontends connected remotely, the pid is the pid of the mserver,
and what we see are crashes of mserver, not crashes of your frontend. These are much harder to 
debug.

You will need to enable core dumps (ODB /Experiment/Enable core dumps set to "y"),
confirm that core dumps work (i.e. "killall -SEGV mserver", observe core files are created
in the directory where you started the mserver), reproduce the crash, run "gdb mserver 
core.NNNN", run "bt" to print the stack trace, post the stack trace here (or email to me 
directly).

>
> I can find some correlation between number of events/event size produced by 
> frontend, cause its failed when its become big enough. 
> 

There is no limit on event size or event rate in midas, you should not see any crash
regardless of what you do. (there is a limit of event size, because an event has
to fit inside an event buffer and event buffer size is limited to 2 GB).

Obviously you hit a bug in mserver that makes it crash. Let's debug it.

One thing to try is set the write cache size to zero and see if your crash goes away. I see
some indication of something rotten in the event buffer code if write cache is enabled. This
is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
where odb settings have no effect depending on value of "equipment_common_overwrite").

>
> frontend scheme is like this:
> 

Best if you use the tmfe c++ frontend, event data handling is much simpler and we do not
have to debug the convoluted old code in mfe.c.

K.O.

>
> poll event time set to 0;
> 
> poll_event{
> //if buffer not transferred return (continue cutting the main buffer)
> //read main buffer from hardware
> //buffer not transfered
> }
> 
> read event{
> // cut the main buffer to subevents (cut one event from main buffer) return;
> //if (last subevent) {buffer transfered ;return}
> }
> 
> What is strange to me that 2 frontends (1 per remote pc) causing this.
> 
> Also, I'm executing one FEcode with -i # flag , put setting eventid in 
> frontend_init , and using SYSTEM buffer for all.
> 
> Is there something I'm missing?
> Thanks. 
> A.
       Reply  11 Feb 2022, Alexey Kalinin, Bug Report, some frontend kicked by cm_periodic_tasks 
Thanks for the answer.
As soon as I can (possibly in a month) I'll try the suggestions below:

> One thing to try is set the write cache size to zero and see if your crash goes away. I see
> some indication of something rotten in the event buffer code if write cache is enabled. This
> is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
> where odb settings have no effect depending on value of "equipment_common_overwrite").

I tried to change this ODB setting for one of the frontends via mhttpd/browser, but eventually it goes back 
to the default value (1000, as I remember). However, this frontend has the minimum rate, 50 DWORDs per ~10 sec, 
and depending on the cache size its data appears in mdump only once per 31 events, but all of them are there. 
So it is a different story, but maybe it has the same solution of playing with the Write Cache Size.    


The double free messages come from the mserver terminal. 
All of the frontends are remote.
I can't exclude crashes of the frontend, but when I run ./frontend -i 1 (2, 3, etc.) that means I run 
one code for all of them, and only several of them cause the crash. I also found that the crash in the frontend 
happened while it was doing nothing with the collected data (last event reached and new data not yet ready), 
but while it was watching for ODB changes. I mean it crashes inside (while {odb_changes(value in watchdog)}), 
and I don't know what else happened meanwhile with the cached buffer.

The future plan is to use an event builder for the frontends once the data/signals are perfectly reasonable, 
i.e. without broken events. For now I am somewhat worried about whether one of the frontends will skip one of 
the events inside its buffer.


Thanks for showing me how to dig into this.
A.    

> > The problem is that eventually some of frontend closed with message 
> > :19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
> > 'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist
> 
> This messages means what it says. A client was registered with the SYSMSG buffer and this 
> client had pid 9789. At some point some other client (rootana, in this case) checked it and 
> process pid 9789 was no longer running. (it then proceeded to remove the registration).
> 
> There is 2 possibilities:
> - simplest: your frontend has crashed. best to debug this by running it inside gdb, wait for 
> the crash.
> - unlikely: reported pid is bogus, real pid of your frontend is different, the client 
> registration in SYSMSG is corrupted. this would indicate massive corruption of midas shared 
> memory buffers, not impossible if your frontend misbehaves and writes to random memory 
> addresses. ODB has protection against this (normally turned off, easy to enable, set ODB 
> "/experiment/protect odb" to yes), shared memory buffers do not have protection against this 
> (should be added?).
> 
> Do this. When you start your frontend, write down it's pid, when you see the crash message, 
> confirm pid number printed is the same. As additional test, run your frontend inside gdb, 
> after it crashes, you can print the stack trace, etc.
> 
> > 
> > in the meantime mserver loggging :
> > mserver started interactively
> > mserver will listen on TCP port 1175
> > double free or corruption (!prev)
> > double free or corruption (!prev)
> > free(): invalid next size (normal)
> > double free or corruption (!prev)
> > 
> 
> Are these "double free" messages coming from the mserver or from your frontend? (i.e. you run 
> them in different terminals, not all in the same terminal?).
> 
> If messages are coming from the mserver, this confirms possibility (1),
> except that for frontends connected remotely, the pid is the pid of the mserver,
> and what we see are crashes of mserver, not crashes of your frontend. These are much harder to 
> debug.
> 
> You will need to enable core dumps (ODB /Experiment/Enable core dumps set to "y"),
> confirm that core dumps work (i.e. "killall -SEGV mserver", observe core files are created
> in the directory where you started the mserver), reproduce the crash, run "gdb mserver 
> core.NNNN", run "bt" to print the stack trace, post the stack trace here (or email to me 
> directly).
> 
> >
> > I can find some correlation between number of events/event size produced by 
> > frontend, cause its failed when its become big enough. 
> > 
> 
> There is no limit on event size or event rate in midas, you should not see any crash
> regardless of what you do. (there is a limit of event size, because an event has
> to fit inside an event buffer and event buffer size is limited to 2 GB).
> 
> Obviously you hit a bug in mserver that makes it crash. Let's debug it.
> 
> One thing to try is set the write cache size to zero and see if your crash goes away. I see
> some indication of something rotten in the event buffer code if write cache is enabled. This
> is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
> where odb settings have no effect depending on value of "equipment_common_overwrite").
> 
> >
> > frontend scheme is like this:
> > 
> 
> Best if you use the tmfe c++ frontend, event data handling is much simpler and we do not
> have to debug the convoluted old code in mfe.c.
> 
> K.O.
> 
> >
> > poll event time set to 0;
> > 
> > poll_event{
> > //if buffer not transferred return (continue cutting the main buffer)
> > //read main buffer from hardware
> > //buffer not transfered
> > }
> > 
> > read event{
> > // cut the main buffer to subevents (cut one event from main buffer) return;
> > //if (last subevent) {buffer transfered ;return}
> > }
> > 
> > What is strange to me that 2 frontends (1 per remote pc) causing this.
> > 
> > Also, I'm executing one FEcode with -i # flag , put setting eventid in 
> > frontend_init , and using SYSTEM buffer for all.
> > 
> > Is there something I'm missing?
> > Thanks. 
> > A.
Entry  28 May 2021, Joseph McKenna, Bug Report, History plots deceiving users into thinking data is still logging flatline.pngflatline.png
I have been trying to fix this myself but my javascript isn't strong... The 'new' history plot renderer fills in missing data with the last ODB value, even when this value is very old; elog:2180/1 shows this. The data logging stopped, but the history plot can fool users into thinking data is still logging (the export button also generates CSVs with entries every 10 seconds).

Grepping through the history files behind the scenes, I found only one match for an example variable from this plot, so it looks like there are no entries after March 24th (although I may be mistaken, I've not studied the history file data structure in detail), i.e. this is an artifact of mhistory.js rather than the mlogger... Have I missed something simple?

Would it be possible to not draw the line if there have been no data points for a significant time? Or maybe render a dashed line that doesn't export to CSV? Thanks in advance.

Edit: I see certificate errors on this forum and I think it's preventing me from uploading an image... inlining it into the text here:
    Reply  28 May 2021, Stefan Ritt, Bug Report, History plots deceiving users into thinking data is still logging 

This is a known problem and I'm working on it. See the discussion at: 

https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for

Stefan

       Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, History plots deceiving users into thinking data is still logging 
https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for

this problem is a blocker for the next midas release.

the best I can tell, current development version of midas writes history data incorrectly,
but I do not have time to look at it at this moment.

I recommend that people use the latest released version, midas-2020-12. (this is what we have on alphag and 
should have in alpha2).

midas-2020-12 uses mlogger from midas-2020-08.

If I cannot find time to figure out what is going on in the mlogger,
the next release may have to be done the same way (with mlogger from midas-2020-08).

K.O.
          Reply  10 Feb 2022, Stefan Ritt, Bug Report, History plots deceiving users into thinking data is still logging 
The problem was fixed in commit 825935dc in Oct. 2021 and it has been running fine since then at PSI. If TRIUMF people 
agree, we can close that issue and proceed.

Stefan
Entry  30 Sep 2021, Francesco Renga, Forum, OPC client within MIDAS 
Dear all,
     I need to integrate communication with an OPC UA server into my MIDAS 
project. My plan is to develop an OPC UA client as a "device" in 
midas/drivers/device.

Two questions:

1) Is anybody aware of some similar effort for some other project, so that I can 
get some example?

2) What could be the more appropriate driver's class to be used? generic.cxx? 
multi.cxx?

Thank you for your help,
           Francesco
    Reply  10 Feb 2022, Francesco Renga, Forum, OPC client within MIDAS opc.cxxopc.h
Dear all,
        I finally succeeded in getting a working driver for the communication with an OPC 
UA server. It is based on the open62541 library and I use it in combination with the 
generic.h driver class. This is still a crude implementation, but let me post it here, 
maybe it can be useful to somebody else.

BTW, if there is somebody more skilled than me with OPC UA and MIDAS drivers, who is 
willing to give suggestions for improving the implementation, it would be extremely 
appreciated.

Best Regards,
      Francesco



> Dear all,
>      I need to integrate in my MIDAS project the communication with an OPC UA 
> server. My plan is to develop an OPC UA client as a "device" in 
> midas/drivers/device.
> 
> Two questions:
> 
> 1) Is anybody aware of some similar effort for some other project, so that I can 
> get some example?
> 
> 2) What could be the more appropriate driver's class to be used? generic.cxx? 
> multi.cxx?
> 
> Thank you for your help,
>            Francesco
Entry  07 Feb 2022, Konstantin Olchanski, Forum, MidasWiki moved from ladd00 to daq00.triumf.ca and updated to MediaWiki 1.35 
MidasWiki moved from ladd00 (obsolete SL6) to daq00.triumf.ca (Ubuntu LTS 20.04) 
and updated from obsolete MediaWiki LTS 1.27.7 to MediaWiki LTS 1.35, supported 
until mid-2023, see https://www.mediawiki.org/wiki/Version_lifecycle

Old URL https://midas.triumf.ca and https://midas.triumf.ca/MidasWiki/...
redirect to new URL https://daq00.triumf.ca/MidasWiki/index.php/Main_Page

All old links and bookmarks should continue to work (via redirect).

To report problems with this MediaWiki instance and to request
any changes in configuration or installed extensions, please reply to this
message here.

K.O.
Entry  26 Jan 2022, Frederik Wauters, Forum, .gz files 
I adapted our analyzer to compile against the manalyzer included in the midas repo.

All our data files are .mid.gz, which now fail to process :(

frederik@frederik-ThinkPad-T550:~/new_daq/build/analyzer$ ./analyzer -e100 -s100 ../../run_backup_11783.mid.gz 
...
...n
Registered modules: 1
file[0]: ../../run_backup_11783.mid.gz
Setting up the analysis!
TMReadEvent: error: short read 0 instead of -1193512213

This happens in the TMEvent* TMReadEvent(TMReaderInterface* reader) function in the midasio.cxx file.

Reading the unzipped files works. But we have always processed our .gz files directly, for the unzipping we would need ~2x disk space.

Am I doing something wrong? I see that there is some activity on lz4 in the midasio repo, is gunzip next?
    Reply  26 Jan 2022, Konstantin Olchanski, Forum, .gz files 
> I adapted our analyzer to compile against the manalyzer included in the midas repo.
> TMReadEvent: error: short read 0 instead of -1193512213

I think this problem is fixed in the latest version of midasio and manalyzer, but this update
was not pulled into midas yet. (Canada is in the middle of a covid wave since December).

What happens is you do not have the gzip library installed on your computer and
your analyzer is built without support for gzip.

The fix is done the hard way, the gzip library is no longer optional, but required.

You do not say what linux you use, so I cannot give exact instructions, but for:
ubuntu: apt -y install libz-dev
centos7: installed by default
centos8: installed by default
debian11/raspbian: same as ubuntu

K.O.
       Reply  31 Jan 2022, Frederik Wauters, Forum, .gz files 
> > I adapted our analyzer to compile against the manalyzer included in the midas repo.
> > TMReadEvent: error: short read 0 instead of -1193512213
> 
> I think this problem is fixed in the latest version of midasio and manalyzer, but this update
> was not pulled into midas yet. (Canada is in the middle of a covid wave since December).
> 
> What happens is you do not have the gzip library installed on your computer and
> your analyzer is built without support for gzip.
> 
> The fix is done the hard way, the gzip library is no longer optional, but required.
> 
> You do not say what linux you use, so I cannot give exact instructions, but for:
> ubuntu: apt -y install libz-dev
> centos7: installed by default
> centos8: installed by default
> debian11/raspbian: same as ubuntu
> 
> K.O.

My libz under ubuntu

-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") 
-- MIDAS: Found ZLIB version 1.2.11

I got both the manalyzer example and mine going with
* the latest midas dev
* + the latest manalyzer (cf6c233)
* + almost the latest midasio (568a617; otherwise I get a linking error:

./libmidas.a(midasio.cxx.o): In function `Lz4Error(int)':
midasio.cxx:(.text+0x359): undefined reference to `MLZ4F_getErrorName(unsigned long)'

So this works, I will assume that in the near future this all will come together in the standard midas release.

thanks  
Entry  29 Jan 2022, Isaac Labrie Boulay, Forum, MIDAS and GRIF-16 digitizer (Standalone Mode). 
Hi all,

I was sent a version of the frontend for the TIGRESS Detector lab setup so that 
I can test detectors using a GRIF-16 digitizer in standalone mode.

I followed the GRIF-16 wiki (https://grsi.wiki.triumf.ca/wiki/GRIF-16#One-
level_operation) to set up the GRIF-16 through the web page. The digitized data is 
supposed to come in on my UDP port 8800, but it is never received by the 
frontend.

Here's the readout scheme:
// readout sequence ...
// poll_event() true (if still have data in buffer or testmsg() true)
// -> read_trigger_event() -> read_grifc_event() - re-buffers into midas events
//                         -> grifc_eventread()  - returns single grif fragment
//                         -> grifc_dataread()   - returns single net-pkt 


Here's poll_event():
INT poll_event(INT source, INT count, BOOL test)
{
   int i, have_data=0;

   for(i=0; i<count; i++){
      if( data_available ){ break; }
      have_data = ( testmsg(data_socket, 0) > 0 );
      if( have_data && !test ){ break; }
   }
   return( (have_data || data_available) && !test );
}

This being said, testmsg() always indicates that no data is available, and "data_available" is only set 
to TRUE when there is leftover data after a GRIF-C reading (I'm obviously not 
using a GRIF-C).

I know that when the GRIF-16 is in standalone mode, MIDAS does not change the GRIF-16's 
settings based on the ODB; they have to be set through the GRIF-16 web page. Is the 
user frontend code even responsible for the GRIF-16 data readout in standalone 
mode? If not, could it just be that my UDP offloader is incorrectly setup?

Here are its current settings:

SETTINGS/UDP
- Offloader: ON
- Dst IP: my IP
- Dst Port: 8800 (DATA_PORT)

SETTINGS/MIDAS
- Use MIDAS: OFF
- MIDAS Hostname: my hostname
- MIDAS IP: same as Dst IP from UDP settings
- Dst Port: 8080 (I'm assuming that this is the mhttpd port)

Again, the frontend runs but I get 0 events. What might I be missing?

Thanks for helping me out!

Isaac
Entry  22 Oct 2021, Francesco Renga, Forum, mhttpd error 
Dear all,
      I am trying to make the MIDAS web server for my DAQ project accessible from other machines. In the ODB, I activated the necessary flags:

[local:CYGNUS_RD:S]/WebServer>ls
Enable localhost port           y
localhost port                  8080
localhost port passwords        n
Enable insecure port            y
insecure port                   8081
insecure port passwords         y
insecure port host list         y
Enable https port               y
https port                      8443
https port passwords            y
https port host list            n
Host list
                                localhost
Enable IPv6                     y
Proxy                           
mime.types

Following the instructions on the Wiki I enabled the SSL support. When running mhttpd, I get these messages:

Mongoose web server will use HTTP Digest authentication with realm "CYGNUS_RD" and password file "/home/cygno/DAQ/online/htpasswd.txt"
Mongoose web server will use the hostlist, connections will be accepted only from: localhost
Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
Mongoose web server listening on http address "[::1]:8080", passwords OFF, hostlist OFF
Mongoose web server listening on http address "8081", passwords enabled, hostlist enabled
[mhttpd,ERROR] [mhttpd.cxx:19166:mongoose_listen,ERROR] Cannot mg_bind address "[::0]:8081"
Mongoose web server will use https certificate file "/home/cygno/DAQ/online/ssl_cert.pem"
Mongoose web server listening on https address "8443", passwords enabled, hostlist OFF
[mhttpd,ERROR] [mhttpd.cxx:19166:mongoose_listen,ERROR] Cannot mg_bind address "[::0]:8443"

and the server is not accessible from other machines. Any suggestions on how to solve or better investigate this problem?

Thank you very much,
         Francesco
    Reply  22 Oct 2021, Stefan Ritt, Forum, mhttpd error 
> Enable IPv6                     y

Probably the IPv6 problem, see here elog:2269

I asked to turn off IPv6 by default, or at least mention this in the documentation,
but unfortunately nothing happened.

Stefan
       Reply  25 Oct 2021, Francesco Renga, Forum, mhttpd error 
It worked, thank you very much!

Francesco

> > Enable IPv6                     y
> 
> Probably the IPv6 problem, see here elog:2269
> 
> I asked to turn off IPv6 by default, or at least mention this in the documentation,
> but unfortunately nothing happened.
> 
> Stefan
       Reply  26 Jan 2022, Konstantin Olchanski, Forum, mhttpd error 
> > Enable IPv6                     y
> 
> Probably the IPv6 problem, see here elog:2269
> 
> I asked to turn off IPv6 by default, or at least mention this in the documentation,
> but unfortunately nothing happened.

But IPv4 and IPv6 code is completely separate, if IPv6 bind fails, IPv4 should still 
work.

This is all very strange.

It does not help that the OP does not say in which way things do not work,
"the server is not accessible from other machines" is not an error message
reported by any browser, and we do not know what URL he is using
to access mhttpd - http: or https:

Also he is enabling the "insecure" port 8081, I am pretty sure the documentation
is pretty clear, either use the secure https port or the insecure port,
but not both at the same time.

In any case, I see the current version of mongoose has removed support
for password files, so all this stuff will likely be reworked
and in the end mhttpd will only listen on localhost ports. To make it "accessible
to other machines", one will have to use the apache https proxy (or mtpcproxy from 
midas).

K.O.
Entry  29 Oct 2021, Kushal Kapoor, Bug Report, Unknown Error 319 from client Screenshot_2021-10-26_114015.png
I’m trying to run MIDAS using a frontend code/client named “fetiglab”. The run stops 
after 2-3 seconds with an error saying “Unknown error 319 from client ‘fetiglab’ on 
localhost”.

The frontend code compiled without any errors and MIDAS reads the frontend 
successfully; this only happens when I start a new run in MIDAS. Here are a few 
more details from the terminal:

11:46:32 [fetiglab,ERROR] [odb.cxx:11268:db_get_record,ERROR] struct size 
mismatch for "/" (expected size: 1, size in ODB: 41920)

11:46:32 [Logger,INFO] Deleting previous file 
"/home/rcmp/online3/run00621_000.root"

11:46:32 [ODBEdit,ERROR] [midas.cxx:5073:cm_transition,ERROR] transition START 
aborted: client "fetiglab" returned status 319

11:46:32 [ODBEdit,ERROR] [midas.cxx:5246:cm_transition,ERROR] Could not start a 
run: cm_transition() status 319, message 'Unknown error 319 from client 
'fetiglab' on host "localhost"'

TR_STARTABORT transition: cleanup after failure to start a run

I’ve also enclosed a screenshot of the same; any suggestions would be highly 
appreciated. Thanks.
    Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Unknown Error 319 from client 
> I’m trying to run MIDAS using a frontend code/client named “fetiglab”. Run stops 
> after 2/3sec with an error saying “Unknown error 319 from client “fetiglab” on 
> localhost.

actually run never starts.

> 11:46:32 [fetiglab,ERROR] [odb.cxx:11268:db_get_record,ERROR] struct size 
> mismatch for "/" (expected size: 1, size in ODB: 41920)

this is the error that causes run start to fail. for reasons unknown
your frontend is trying to do a db_get_record() from "/" (ODB root top directory).

if this is an mfe.c frontend, I do not think I have ever seen it do something
like this.

so, a puzzle.

K.O.
Entry  01 Dec 2021, Lars Martin, Bug Report, Off-by-one in sequencer documentation 
The documentation for the sequencer loop says:

<quote>
LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
inside the loop via $name.
</quote>

In fact the loop variable runs from 1...n, as can be seen by running this exciting 
sequencer code:

1 COMMENT "Figuring out MSL"
2 
3 LOOP n,4
4   MESSAGE $n,1
5 ENDLOOP
    Reply  02 Dec 2021, Stefan Ritt, Bug Report, Off-by-one in sequencer documentation 
> The documentation for the sequencer loop says:
> 
> <quote>
> LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
> can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
> inside the loop via $name.
> </quote>
> 
> In fact the loop variable runs from 1...n, as can be seen by running this exciting 
> sequencer code:
> 
> 1 COMMENT "Figuring out MSL"
> 2 
> 3 LOOP n,4
> 4   MESSAGE $n,1
> 5 ENDLOOP

Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.

Stefan
       Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Off-by-one in sequencer documentation 
> > 3 LOOP n,4
> > 4   MESSAGE $n,1
> > 5 ENDLOOP
> 
> Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.

Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.

K.O.
          Reply  26 Jan 2022, Stefan Ritt, Bug Report, Off-by-one in sequencer documentation 
> Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.

   for (i=1 ; i<=10 ; i++);     ;-)
             Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Off-by-one in sequencer documentation 
> > Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.
> 
>    for (i=1 ; i<=10 ; i++);     ;-)

Similar code made big news just recently: (scroll down to the example main() program)

https://blog.qualys.com/vulnerabilities-threat-research/2022/01/25/pwnkit-local-privilege-escalation-vulnerability-discovered-in-polkits-pkexec-cve-2021-4034

I forget if the FORTRAN rules were "loop once" or "never loop" or if it was different
between Fortran-4, fortran-77, DEC extensions and IBM extension, or if it was a compiler switch.

We should check that we do something reasonable with such loops to zero:

LOOP n,0
   MESSAGE $n,1
ENDLOOP

P.S. Yup. "man g77" option "-fonetrip".

K.O.
Entry  09 Nov 2021, Francesco Renga, Forum, Issue in data writing speed 
Dear all,
       I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a short time, 
but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
explain this behavior? Is there any way to improve it?

Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
to improve it, it would be really appreciated.

Thank you very much,
          Francesco



  const char* pSrc = (const char*)bufframe.buf;

  for(int y = 0; y < bufframe.height; y++ ){

    //Copy one row
    const unsigned short* pDst = (const unsigned short*)pSrc;

    //go through the row
    for(int x = 0; x < bufframe.width; x++ ){

      WORD tmpData = *pDst++; 

      *pdata++ = tmpData;

    }

    pSrc += bufframe.rowbytes;

  }
 
    Reply  10 Nov 2021, Stefan Ritt, Forum, Issue in data writing speed 
Midas uses various buffers (in the frontend, at the server side before the SYSTEM buffer, the SYSTEM buffer itself, and on the 
logger before writing to disk). All these buffers are in RAM and have fast access, so you can fill them pretty quickly. When
they are full, the logger writes to disk, which is slower. So I believe at 2 Hz your disk can keep up with your writing 
speed, but at 4 Hz (4 Mpixel x 2 bytes x 4 Hz = 32 MB/sec) your disk starts slowing down the writing process. Now 32 MB/s is 
pretty slow for a disk, so I presume you have turned compression on, which takes quite some time.

To verify this, disable logging. Then disable compression and keep logging. Then report back here again.

> Dear all,
>        I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
> I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
> writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
> into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a sort time, 
> but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
> the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
> of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
> explain this behavior? Is there any way to improve it?
> 
> Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
> to improve it, it would be really appreciated.
> 
> Thank you very much,
>           Francesco
> 
> 
> 
>   const char* pSrc = (const char*)bufframe.buf;
> 
>   for(int y = 0; y < bufframe.height; y++ ){
> 
>     //Copy one row
>     const unsigned short* pDst = (const unsigned short*)pSrc;
> 
>     //go through the row
>     for(int x = 0; x < bufframe.width; x++ ){
> 
>       WORD tmpData = *pDst++; 
> 
>       *pdata++ = tmpData;
> 
>     }
> 
>     pSrc += bufframe.rowbytes;
> 
>   }
>  
    Reply  26 Jan 2022, Konstantin Olchanski, Forum, Issue in data writing speed 
Francesco, when you say "writing an event is slow", do you mean it in the frontend
or in the output data file?

Stefan is quite right about the data file, it can take seconds between generating
an event in the frontend and seeing it written to the data file. (if compression
buffers are too big, an event can sit there forever, until pushed out by next events
or by run stop).

But maybe you see this on the frontend side.

What you are looking at is "real time" performance of the frontend and of the linux kernel.

The mfe.c frontend has many problems with real time performance, it can stall and take a long
time between calls to read_event(), for many reasons.

There are ways around that, but it is simpler to switch to the tmfe c++ frontend
that was designed for good real time performance.

In the tmfe frontend, if you use the polled equipment and enable the poll thread,
your frontend will be limited only by the linux kernel real time performance (i.e.
on a single-core CPU, other programs will delay execution of your frontend
and you will see it as long delays (usec, millisec) between calls to your read_event()).

Next limit to real time performance (common to mfe.c and tmfe frontends) is the writing
of event data to the midas shared event buffer. One has to lock the shared memory semaphore
and this has to wait until other users of the event buffer finish their reading
or writing and unlock it. Arbitrary amount of time (usec, millisec, sec) can pass.
(there is also problems with "fairness" of the linux semaphores, a different story, again).

Making things more interesting, midas event buffers implement a write cache (default size 100 kbytes),
events smaller than the cache are quickly accumulated (no need to lock the shared memory semaphore),
then flushed to shared memory when the cache is full. This is done to reduce the number
of shared memory semaphore locks per event, in the case of very high rate of very small events.

Solution to all this is to use 2 threads: read the data from hardware in one thread and write the data to midas
in a different thread. Between the threads would be an event fifo (circular buffer in mfe.c,
std::deque<EVENT> in tmfe c++ frontends).
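
As a rough illustration (a generic C++ sketch, not the actual mfe.c or tmfe code; the hardware readout and the midas send call are placeholders), the two-thread pattern looks like this:

#include <condition_variable>
#include <deque>
#include <mutex>
#include <vector>

struct Event { std::vector<char> data; };      // placeholder event type

static std::deque<Event>       fifo;
static std::mutex              fifo_mutex;
static std::condition_variable fifo_cv;

// Reader thread: talks only to the hardware, never blocks on the midas buffer.
void reader_thread()
{
   for (;;) {
      Event e;                                  // e.data = read_camera();  (hypothetical)
      std::lock_guard<std::mutex> lock(fifo_mutex);
      fifo.push_back(std::move(e));
      fifo_cv.notify_one();
   }
}

// Writer thread: absorbs any delay from the shared event buffer semaphore.
void writer_thread()
{
   for (;;) {
      std::unique_lock<std::mutex> lock(fifo_mutex);
      fifo_cv.wait(lock, [] { return !fifo.empty(); });
      Event e = std::move(fifo.front());
      fifo.pop_front();
      lock.unlock();
      // bm_send_event(...) or the equipment's SendEvent() would go here
   }
}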

For remote connected frontends, things are a bit different. Event data is written directly
into the TCP socket and as long as socket buffers are big enough, there is no real-time delays,
unless SYSTEM buffer is very congested and mserver does not read the TCP socket quickly enough.
So depending on event size, data rate and tcp socket buffer size, the extra 2nd thread
may not be necessary and poll thread real time performance may be good enough.

I hope this clarifies the situation somewhat.

K.O.

> Dear all,
>        I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
> I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
> writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
> into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a short time, 
> but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
> the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
> of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
> explain this behavior? Is there any way to improve it?
> 
> Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
> to improve it, it would be really appreciated.
> 
> Thank you very much,
>           Francesco
> 
> 
> 
>   const char* pSrc = (const char*)bufframe.buf;
> 
>   for(int y = 0; y < bufframe.height; y++ ){
> 
>     //Copy one row
>     const unsigned short* pDst = (const unsigned short*)pSrc;
> 
>     //go through the row
>     for(int x = 0; x < bufframe.width; x++ ){
> 
>       WORD tmpData = *pDst++; 
> 
>       *pdata++ = tmpData;
> 
>     }
> 
>     pSrc += bufframe.rowbytes;
> 
>   }
>  
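
As an aside on the copy loop quoted above: if pdata points to contiguous 16-bit (WORD) storage
for the whole frame and bufframe.rowbytes is at least width*sizeof(WORD), each row can be copied
in a single call. This is only a sketch, reusing the variable names from the quoted code and not
tested against the actual camera SDK; as discussed above, the copy itself is unlikely to be the
cause of the delays, but it keeps the readout code shorter:

   #include <cstring>

   const char* pSrc = (const char*)bufframe.buf;
   const size_t rowsize = (size_t)bufframe.width * sizeof(WORD);

   for (int y = 0; y < bufframe.height; y++) {
      std::memcpy(pdata, pSrc, rowsize);   // copy one full row of 16-bit pixels
      pdata += bufframe.width;             // advance the destination by one row
      pSrc  += bufframe.rowbytes;          // rowbytes may include row padding
   }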
       Reply  26 Jan 2022, Konstantin Olchanski, Forum, Issue in data writing speed 
> Francesco, when you say "writing an event is slow", do you mean it in the frontend
> or in the output data file?

Another explanation just occurred to me. We do not know your event size and we do not
know the size of your SYSTEM buffer. But if you have an unlucky combination,
this can happen:

Consider this: the event size is 6 Mbytes and the buffer size is 8 Mbytes, enough space for only 1 event.

The first event is written quickly (the buffer is empty).
The second event will be delayed: there is not enough free space in the buffer, so we have
to wait for mlogger to finish reading the first event.

The same thing happens if the event size is 3 Mbytes: the first 2 events will write quickly,
but writing the 3rd event will be delayed until mlogger does its thing.

The mlogger reads the SYSTEM buffer "fast" and "quickly", but it can be delayed
for a number of reasons, e.g. handling a history event, a delay writing to disk,
a delay writing to network-connected storage, etc.

In general, it is best to size the SYSTEM buffer to hold about 1 second's worth
of data (at the average event size and average rate). If your event size is 4 Mbytes and you
record them at 10/sec, the SYSTEM buffer should be at least 40 Mbytes big. (This is
set in ODB /Experiment/Buffer Sizes. MIDAS event buffer size is limited to 2 GBytes.)
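
For example, from the command line (odbedit "set" syntax; the exact key name may differ in your
ODB, and the new size typically only takes effect once the buffer's shared memory is recreated):

   odbedit -c 'set "/Experiment/Buffer Sizes/SYSTEM" 41943040'    # 40 Mbytes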

K.O.
Entry  09 Nov 2021, Hunter Lowe, Forum, MityCAMAC Login  
Hello all,

I've recently acquired a MityCAMAC system that was built at TRIUMF and I'm 
having issues accessing it over ethernet.

The system: Ubuntu VM inside Windows 10 machine.

I've tried reconfiguring the network settings for the VM but nmap and arp/ip 
commands have yielded me no results in finding the crate controller. 

I was getting help from Pierre Amaudruz but I think he is now busy for some 
time. I have the mac address of the crate controller and its name. The 
controller seems to initialize fine inside of the CAMAC crate. The windows side 
of the workstation also tells me that an unknown network is in fact connected.

I suspect I either need to do something with an ssh key (which I thought we 
accomplished but maybe not), or perhaps the domain name in the controller needs 
to be changed. 

If anybody has experience working with MityARM I would appreciate any advice I 
could get.

Best,
Hunter Lowe
UNBC Graduate Physics
    Reply  11 Nov 2021, Thomas Lindner, Forum, MityCAMAC Login  
Hi Hunter 

This sounds like a TRIUMF-specific problem,
not a MIDAS problem. Please email me directly
and we can try to solve it.

Thomas Lindner
TRIUMF DAQ


> Hello all,
> 
> I've recently acquired a MityCAMAC system that was built at TRIUMF and I'm 
> having issues accessing it over ethernet.
> 
> The system: Ubuntu VM inside Windows 10 machine.
> 
> I've tried reconfiguring the network settings for the VM but nmap and arp/ip 
> commands have yielded me no results in finding the crate controller. 
> 
> I was getting help from Pierre Amaudruz but I think he is now busy for some 
> time. I have the mac address of the crate controller and its name. The 
> controller seems to initialize fine inside of the CAMAC crate. The windows side 
> of the workstation also tells me that an unknown network is in fact connected.
> 
> I suspect I either need to do something with an ssh key (which I thought we 
> accomplished but maybe not), or perhaps the domain name in the controller needs 
> to be changed. 
> 
> If anybody has experience working with MityARM I would appreciate any advice I 
> could get.
> 
> Best,
> Hunter Lowe
> UNBC Graduate Physics
       Reply  26 Jan 2022, Konstantin Olchanski, Info, MityCAMAC Login  
For those curious about CAMAC controllers, this one was built around 2014 to 
replace the aging CAMAC A1/A2 controllers (parallel and serial) in the TRIUMF 
cyclotron controls system (around 50 CAMAC crates). It implements both the main
and the auxiliary controller modes (single-width and double-width modules).

The design predates Altera Cyclone-5 SoC and has separate
ARM processor (TI 335x) and Cyclone-4 FPGA connected by GPMC bus.

The ARM processor boots a Linux kernel and CentOS-7 userland from an SD card;
the FPGA boots from its own EPCS flash.

A user program running on the ARM processor (e.g. a MIDAS frontend)
initiates CAMAC operations and the FPGA executes them. Quite simple.

K.O.
Entry  16 Dec 2021, Zaher Salman, Forum, Device driver for modbus 
Dear all, does anyone have an example of a device driver using modbus or modbus tcp to communicate with a device, and is willing to share it? Thanks.
    Reply  26 Jan 2022, Konstantin Olchanski, Forum, Device driver for modbus 
> Dear all, does anyone have an example of a device driver using modbus or modbus tcp to communicate with a device, and is willing to share it? Thanks.

I have not seen any modbus devices recently, so all my code and examples are quite old.

Basic modbus/tcp communication driver is in the midas repo:

daq00:midas$ find . | grep -i modbus
./drivers/divers/ModbusTcp.cxx
./drivers/divers/ModbusTcp.h
daq00:midas$ 

This driver worked for communication to a modbus PLC (T2K/ND280/TPC experiment in Japan).

An example program to use this driver and test modbus communication is here:
https://bitbucket.org/expalpha/agdaq/src/master/src/modbus.cxx

Because, in the end, we do not have any modbus devices in any recent experiment,
I do not have an example of using this driver in a midas frontend. Sorry.
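
For reference, a minimal Modbus/TCP "read holding registers" (function 03) exchange looks like
the sketch below. This is generic socket code, not the ModbusTcp class mentioned above, and all
names are made up for illustration; a real implementation should loop on read() until the full
response arrives and check the transaction id:

   #include <arpa/inet.h>
   #include <netinet/in.h>
   #include <sys/socket.h>
   #include <unistd.h>
   #include <cstdint>

   // Read 'count' holding registers starting at 'addr' from host:502, unit id 1.
   // Returns the number of registers read, or -1 on error.
   int modbus_read_holding(const char* host, uint16_t addr, uint16_t count, uint16_t* out)
   {
      int fd = socket(AF_INET, SOCK_STREAM, 0);
      if (fd < 0) return -1;

      sockaddr_in sa{};
      sa.sin_family = AF_INET;
      sa.sin_port   = htons(502);                  // standard Modbus/TCP port
      inet_pton(AF_INET, host, &sa.sin_addr);
      if (connect(fd, (sockaddr*)&sa, sizeof(sa)) < 0) { close(fd); return -1; }

      // MBAP header (7 bytes) + PDU: function 03, start address, register count
      uint8_t req[12] = {
         0x00, 0x01,                               // transaction id
         0x00, 0x00,                               // protocol id (always 0)
         0x00, 0x06,                               // length: unit id + 5-byte PDU
         0x01,                                     // unit id
         0x03,                                     // function code 03
         uint8_t(addr  >> 8), uint8_t(addr  & 0xFF),
         uint8_t(count >> 8), uint8_t(count & 0xFF)
      };
      if (write(fd, req, sizeof(req)) != (ssize_t)sizeof(req)) { close(fd); return -1; }

      // Response: MBAP (7) + function (1) + byte count (1) + 2 bytes per register
      uint8_t resp[7 + 2 + 2*125];
      ssize_t n = read(fd, resp, sizeof(resp));
      close(fd);
      if (n < 9 || resp[7] != 0x03) return -1;     // short read or exception reply

      int nreg = resp[8] / 2;
      for (int i = 0; i < nreg && i < count; i++)
         out[i] = uint16_t((resp[9 + 2*i] << 8) | resp[10 + 2*i]);
      return nreg;
   }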

K.O.
Entry  19 Nov 2021, Jacob Thorne, Forum, Sequencer error with ODB Inc 
Hi,

I am having problems with the midas sequencer; here is my code:

1  COMMENT "Example to move a Standa stage"
2  RUNDESCRIPTION "Example movement sequence - each run is one position of a single stage
3  
4  PARAM numRuns
5  PARAM sequenceNumber
6  PARAM RunNum
7  
8  PARAM positionT2
9  PARAM deltapositionT2
10 
11 ODBSet "/Runinfo/Run number", $RunNum
12 ODBSet "/Runinfo/Sequence number", $sequenceNumber
13 
14 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Type of Measurement", 2
15 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Number of Time Bins", 10
16 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Number of Sweeps", 1
17 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Dwell Time", 100000
18 
19 ODBSet "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position", $positionT2
20 
21 LOOP $numRuns
22  WAIT ODBvalue, "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Ready", ==, 1
23  TRANSITION START
24  WAIT ODBvalue, "/Equipment/Neutron Detector/Statistics/Events sent", >=, 1
25  WAIT ODBvalue, "/Runinfo/State", ==, 1
26  WAIT ODBvalue, "/Runinfo/Transition in progress", ==, 0
27  TRANSITION STOP
28  ODBInc "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position", $deltapositionT2
29 
30 ENDLOOP
31 
32 ODBSet "/Runinfo/Sequence number", 0

The issue comes with line 28: the ODBInc does not work; regardless of what number I put in, I get the following error:

[Sequencer,ERROR] [odb.cxx:7046:db_set_data_index1,ERROR] "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position" invalid element data size 32, expected 4

I don't see why this should happen; the format is correct and the number that I input is an int.

Sorry if this is a basic question.

Jacob
    Reply  02 Dec 2021, Stefan Ritt, Forum, Sequencer error with ODB Inc 
Thanks for reporting that bug. Indeed there was a problem in the sequencer code which I fixed now. Please try the updated develop branch.

Stefan
Entry  29 Oct 2021, Frederik Wauters, Bug Report, midas::odb::iterator + operator 
I have a 16-element array odb key

{"FIR Energy", {
            {"Energy Gap Value", std::array<uint32_t,16>(10) },

I can get the maximum of this array like 


uint32_t max_value = *std::max_element(values.begin(),values.end());

but when I need the maximum of a sub range

uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);

I get

/home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
  584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
      |                                                ~~~~~~~~~~~~~~^~
      |                                                            |  |
      |                                                            |  int
      |               

As the + operator is overloaded for midas::odb::iterator, I expected this to work.

(and yes, I can find the max element by accessing the elements one by one)
    Reply  29 Oct 2021, Frederik Wauters, Bug Report, midas::odb::iterator + operator | work around 
ok, so retrieving it as a std::array (as it was defined) does not work

    std::array<uint32_t,16> avalues = settings["FIR Energy"]["Energy Gap Value"];

but retrieving it as a std::vector does, and then I have a standard C++ iterator which I can use with std algorithms

    std::vector<uint32_t> values = settings["FIR Energy"]["Energy Gap Value"];
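
Putting the two together (same key names as above, just a sketch), the sub-range maximum then
compiles, because std::vector iterators are random access:

    #include <algorithm>
    #include <vector>

    std::vector<uint32_t> values = settings["FIR Energy"]["Energy Gap Value"];
    uint32_t max_value = *std::max_element(values.begin(), values.begin() + 4);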

> I have a 16-element array odb key
> 
> {"FIR Energy", {
>             {"Energy Gap Value", std::array<uint32_t,16>(10) },
> 
> I can get the maximum of this array like 
> 
> 
> uint32_t max_value = *std::max_element(values.begin(),values.end());
> 
> but when I need the maximum of a sub range
> 
> uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);
> 
> I get
> 
> /home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
>   584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
>       |                                                ~~~~~~~~~~~~~~^~
>       |                                                            |  |
>       |                                                            |  int
>       |               
> 
> As the + operator is overloaded for midas::odb::iterator, I expected this to work.
> 
> (and yes, I can find the max element by accessing the elements one by one)
Entry  25 Oct 2021, Francesco Renga, Forum, Logger crash 
Hello,
     I'm experiencing crashes of the mlogger program on the time scale of a couple 
of days. The only messages from MIDAS are:

05:34:47.336 2021/10/24 [mhttpd,INFO] Client 'Logger' (PID 14281) on database 
'ODB' removed by db_cleanup called by cm_periodic_tasks (idle 10.2s,TO 10s)
05:34:47.335 2021/10/24 [mhttpd,INFO] Client 'Logger' on buffer 'SYSMSG' removed 
by cm_periodic_tasks (idle 10.2s, timeout 10s)

Any suggestion to further investigate this issue?

Thank you very much,
         Francesco
    Reply  25 Oct 2021, Stefan Ritt, Forum, Logger crash 
The short term solution would be to increase the logger timeout in the ODB under

/Programs/Logger/Watchdog timeout

and set it to 60000 (one minute). But that is curing just the symptoms. It would be 
interesting to understand the cause of this error. Probably the logger takes more than 10 
seconds to start or stop the run. The reason could be that the history grew too big (which is 
what we have right now in MEG II), or some disk problems. But that needs detailed debugging on 
the logger side.
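
For example (odbedit "set" syntax, value in milliseconds):

   odbedit -c 'set "/Programs/Logger/Watchdog timeout" 60000'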

Stefan