Back Midas Rome Roody Rootana
  Midas DAQ System, Page 118 of 146  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
ID Date Authordown Topic Subject
  2711   14 Feb 2024 Konstantin OlchanskiBug Fixadded ubuntu-22 to nightly build on bitbucket, now need python!
> Are we running these tests as part of the nightly build on bitbucket? They would be part of 
> the "make test" target. Correct python dependancies may need to be added to the bitbucket OS 
> image in bitbucket-pipelines.yml. (This is a PITA to get right).

I added ubuntu-22 to the nightly builds.

but I notice the build says "no python" and I am not sure what packages I need to install for 
midas python to work.

Ben, can you help me with this?

https://bitbucket.org/tmidas/midas/pipelines/results/1106/steps/%7B9ef2cf97-bd9f-4fd3-9ca2-9c6aa5e20828%7D

K.O.
  2712   14 Feb 2024 Konstantin OlchanskiInfobitbucket permissions
I pushed some buttons in bitbucket user groups and permissions to make it happy 
wrt recent changes.

The intended configuration is this:

- two user groups: admins and developers
- admins has full control over the workspace, project and repositories ("Admin" 
permission)
- developers have push permission for all repositories (not the "create 
repository" permission, this is limited to admins) ("Write" permission).
- there seems to be a quirk, admins also need to be in the developers group or 
some things do not work (like "run pipeline", which set me off into doing all 
this).
- admins "Admin" permission is set at the "workspace" level and is inherited 
down to project and repository level.
- developers "Write" permission is set at the "project" level and is inherited 
down to repository level.
- individual repositories in the "MIDAS" project also seem to have explicit 
(non-inhertited) permissions, I think this is redundant and I will probably 
remove them at some point (not today).

K.O.
  2713   15 Feb 2024 Konstantin OlchanskiForumnumber of entries in a given ODB subdirectory ?
> > > For ODB keys of type TID_KEY, the value num_values IS the number of subkeys. 
> > 
> > this logic makes sense, however it doesn't seem to be consistent with the printout of the test example
> > at the end of https://daq00.triumf.ca/elog-midas/Midas/240203_095803/a.cc . The printout reports 
> > 
> > key.num_values = 1, but the actual number of subkeys = 6, and all subkeys being of TID_KEY type
> > 
> > I'm certain that the ODB subtree in question was not accessed concurrently during the test.
> 
> You are right, num_values is always 1 for TID_KEYS. The number of subkeys is stored in 
> 
>   ((KEYLIST *) ((char *)pheader + pkey->data))->num_keys
> 
> Maybe we should add a function to return this. But so far db_enum_key() was enough.
> 
> Stefan

I would rather add a function that atomically returns an std::vector<KEY>. number of entries
is vector size, entry names are in key.name. If you need to do something with an entry,
like iterate a subdirectory, you have to go by name (not by HNDLE), and if somebody deleted
it, you get an error "entry deleted, tough!", (HNDLE becomes invalid without any error message about it, 
subsequent db_get_data() likely returns gibberish, subsequent db_set_data() likely corrupts ODB).

K.O.
  2714   15 Feb 2024 Konstantin OlchanskiForumnumber of entries in a given ODB subdirectory ?
> > You are right, num_values is always 1 for TID_KEYS. The number of subkeys is stored in 
> >   ((KEYLIST *) ((char *)pheader + pkey->data))->num_keys
> > Maybe we should add a function to return this. But so far db_enum_key() was enough.

Hmm... is there any use case where you want to know the number of directory entries, but you will not iterate 
over them later?

K.O.
  2722   08 Mar 2024 Konstantin OlchanskiInfoMIDAS frontend for WIENER L.V. P.S. and VME crates
Our MIDAS frontend for WIENER power supplies is now available as a standalone git repository.

https://bitbucket.org/ttriumfdaq/fewienerlvps/src/master/

This frontend use the snmpwalk and snmpset programs to talk to the power supply.

Also included is a simple custom web page to display power supply status and to turn things on and off.

This frontend was originally written for the T2K/ND280 experiment in Japan.

In addition to controlling Wiener low voltage power supplies, it was also used to control the ISEG MPOD high 
voltage power supplies.

In Japan, ISEG MPOD was (still is) connected to the MicroMegas TPC and is operated in a special "spark counting" 
mode. This spark counting code is still present in this MIDAS frontend and can be restored with a small amount of 
work.

K.o.
  2724   11 Mar 2024 Konstantin OlchanskiBug ReportAutostart program
> It seems that if a frontend is started automatically by using Program->Auto start then the status page does not show it as started. This is since the FE name has a number after the name. If I stop and start manually then the status page shows the correct state of the FE. Am I doing something wrong or is this a bug somewhere?

Zaher, please read https://daq00.triumf.ca/elog-midas/Midas/919

K.O.
  2726   18 Mar 2024 Konstantin OlchanskiBug ReportMidas (manalyzer) + ROOT 6.31/01 - compilation error
> [ 34%] Linking CXX executable manalyzer_test.exe
> /usr/bin/ld: /home/astrocent/workspace/root/root_install/lib/libRIO.so: undefined 
> reference to 
> `std::condition_variable::wait(std::unique_lock<std::mutex>&)@GLIBCXX_3.4.30'
> collect2: error: ld returned 1 exit status

the error is actually in ROOT, libRIO does not find someting in the standard library.

one possible source of trouble is mismatched compilation flags, to debug this, please 
use "make cmake" and email me (or post here) the full output. (standard cmake hides 
all compiler information to make it easier to debug such problems).

since this is a prerelease of ROOT 6.32 (which in turn fixes major breakage on MacOS) 
and you built it from sources, can you confirm for me that it actually works, you can 
run "root -l somefile.root", open the tbrowser, look at some plots? this is to 
eliminate the possibility that your ROOR is miscompiled.

hmm... also please run "make cmake -k", let's see is linking of rmlogger also fails.

K.O.
  2728   19 Mar 2024 Konstantin OlchanskiBug ReportMidas (manalyzer) + ROOT 6.31/01 - compilation error
ok, thank you for your information. I cannot reproduce this problem, I use vanilla Ubuntu 
LTS 22, ROOT binary kit root_v6.30.02.Linux-ubuntu22.04-x86_64-gcc11.4 from root.cern.ch 
and latest midas from git.

something is wrong with your ubuntu or with your c++ standard library or with your ROOT.

a) can you try with root_v6.30.02.Linux-ubuntu22.04-x86_64-gcc11.4 from root.cern.ch
b) for the midas build, please run "make cclean; make cmake -k" and email me (or post 
here) the complete output.

K.O.
  2729   19 Mar 2024 Konstantin OlchanskiBug ReportMidas (manalyzer) + ROOT 6.31/01 - compilation error
> ok, thank you for your information. I cannot reproduce this problem, I use vanilla Ubuntu 
> LTS 22, ROOT binary kit root_v6.30.02.Linux-ubuntu22.04-x86_64-gcc11.4 from root.cern.ch 
> and latest midas from git.
> 
> something is wrong with your ubuntu or with your c++ standard library or with your ROOT.
> 
> a) can you try with root_v6.30.02.Linux-ubuntu22.04-x86_64-gcc11.4 from root.cern.ch
> b) for the midas build, please run "make cclean; make cmake -k" and email me (or post 
> here) the complete output.

also, please email me the output of these commands on your machine:

daq00:midas$ ls -l /lib/x86_64-linux-gnu/libstdc++*
lrwxrwxrwx 1 root root      19 May 13  2023 /lib/x86_64-linux-gnu/libstdc++.so.6 -> libstdc++.so.6.0.30
-rw-r--r-- 1 root root 2260296 May 13  2023 /lib/x86_64-linux-gnu/libstdc++.so.6.0.30
daq00:midas$ 

and

daq00:midas$ ldd $ROOTSYS/bin/rootreadspeed 
	linux-vdso.so.1 (0x00007ffe6c399000)
	libTree.so => /daq/cern_root/root_v6.30.02.Linux-ubuntu22.04-x86_64-gcc11.4/lib/libTree.so (0x00007f67e53b5000)
	libRIO.so => /daq/cern_root/root_v6.30.02.Linux-ubuntu22.04-x86_64-gcc11.4/lib/libRIO.so (0x00007f67e4fb9000)
	libCore.so => /daq/cern_root/root_v6.30.02.Linux-ubuntu22.04-x86_64-gcc11.4/lib/libCore.so (0x00007f67e4b08000)
	libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f67e48bd000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f67e489b000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f67e4672000)
	libNet.so => /daq/cern_root/root_v6.30.02.Linux-ubuntu22.04-x86_64-gcc11.4/lib/libNet.so (0x00007f67e458b000)
	libThread.so => /daq/cern_root/root_v6.30.02.Linux-ubuntu22.04-x86_64-gcc11.4/lib/libThread.so (0x00007f67e4533000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f67e444c000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f67e5599000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f67e43d6000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f67e43b8000)
	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f67e438d000)
	libxxhash.so.0 => /lib/x86_64-linux-gnu/libxxhash.so.0 (0x00007f67e4378000)
	liblz4.so.1 => /lib/x86_64-linux-gnu/liblz4.so.1 (0x00007f67e4358000)
	libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x00007f67e4289000)
	libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007f67e41e3000)
	libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007f67e3d9f000)
daq00:midas$ 

K.O.
  2731   01 Apr 2024 Konstantin OlchanskiInfoxz-utils bomb out, compression benchmarks
you may have heard the news of a major problem with the xz-utils project, authors of the popular "xz" file compression, 
https://nvd.nist.gov/vuln/detail/CVE-2024-3094

the debian bug tracker is interesting reading on this topic, "750 commits or contributions to xz by Jia Tan, who backdoored it", 
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1068024

and apparently there is problems with the deisng of the .xz file format, making it vulnerable to single-bit errors and unreliable checksums,
https://www.nongnu.org/lzip/xz_inadequate.html

this moved me to review status of file compression in MIDAS.

MIDAS does not use or recommend xz compression, MIDAS programs to not link to xz and lzma libraries provided by xz-utils.

mlogger has built-in support for:
- gzip-1, enabled by default, as the most safe and bog-standard compression method
- bzip2 and pbzip2, as providing the best compression
- lz4, for high data rate situations where gzip and bzip2 cannot keep up with the data

compression benchmarks on an AMD 7700 CPU (8-core, DDR5 RAM) confirm the usual speed-vs-compression tradeoff:

note: observe how both lz4 and pbzip2 compress time is the time it takes to read the file from ZFS cache, around 6 seconds.
note: decompression stacks up in the same order: lz4, gzip fastest, pbzip2 same speed using 10x CPU, bzip2 10x slower uses 1 CPU.
note: because of the fast decompression speed, gzip remains competitive.

no compression: 6 sec, 270 MiBytes/sec,
lz4, bpzip2:    6 sec, same, (pbzip2 uses 10 CPU vs lz4 uses 1 CPU)
gzip -1:       21 sec,  78 MiBytes/sec
bzip2:         70 sec,  23 MiBytes/sec (same speed as pbzip2, but using 1 CPU instead of 10 CPU)

file sizes:

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ ls -lSr test.mid*
-rw-r--r-- 1 dsdaqdev users  483319523 Apr  1 14:06 test.mid.bz2
-rw-r--r-- 1 dsdaqdev users  631575929 Apr  1 14:06 test.mid.gz
-rw-r--r-- 1 dsdaqdev users 1002432717 Apr  1 14:06 test.mid.lz4
-rw-r--r-- 1 dsdaqdev users 1729327169 Apr  1 14:06 test.mid
(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ 

actual benchmarks:

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ /usr/bin/time cat test.mid > /dev/null
0.00user 6.00system 0:06.00elapsed 99%CPU (0avgtext+0avgdata 1408maxresident)k

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ /usr/bin/time gzip -1 -k test.mid
14.70user 6.42system 0:21.14elapsed 99%CPU (0avgtext+0avgdata 1664maxresident)k

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ /usr/bin/time lz4 -k -f test.mid
2.90user 6.44system 0:09.39elapsed 99%CPU (0avgtext+0avgdata 7680maxresident)k

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ /usr/bin/time bzip2 -k -f test.mid
64.76user 8.81system 1:13.59elapsed 99%CPU (0avgtext+0avgdata 8448maxresident)k

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ /usr/bin/time pbzip2 -k -f test.mid
86.76user 15.39system 0:09.07elapsed 1125%CPU (0avgtext+0avgdata 114596maxresident)k

decompression benchmarks:

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ /usr/bin/time lz4cat  test.mid.lz4 > /dev/null
0.68user 0.23system 0:00.91elapsed 99%CPU (0avgtext+0avgdata 7680maxresident)k

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ /usr/bin/time zcat  test.mid.gz > /dev/null
6.61user 0.23system 0:06.85elapsed 99%CPU (0avgtext+0avgdata 1408maxresident)k

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ /usr/bin/time bzcat  test.mid.bz2 > /dev/null
27.99user 1.59system 0:29.58elapsed 99%CPU (0avgtext+0avgdata 4656maxresident)k

(vslice) dsdaqdev@dsdaqgw:/zdata/vslice$ /usr/bin/time pbzip2 -dc test.mid.bz2 > /dev/null
37.32user 0.56system 0:02.75elapsed 1377%CPU (0avgtext+0avgdata 157036maxresident)k

K.O.
  2733   02 Apr 2024 Konstantin OlchanskiInfoSequencer editor
> Stefan and I have been working on improving the sequencer editor ...

Looks grand! Congratulations with getting it completed. The previous version was 
my rewrite of the old generated-C pages into html+javascript, nothing to write 
home about, I even kept the 1990-ies-style html formatting and styling as much as 
possible.

K.O.
  2734   02 Apr 2024 Konstantin OlchanskiBug ReportMidas (manalyzer) + ROOT 6.31/01 - compilation error
> I found solution for my trouble. With MIDAS and ROOT everything is OK,
> the trobule was with my Ubuntu enviroment.

Congratulations with figuring this out.

BTW, this is the 2nd case of contaminated linker environment I run into in the last 30 days. We 
just had a problem of "cannot link MIDAS with ROOT" (resolving by "make cmake NO_ROOT=1 NO_CURL=1 
NO_MYSQL=1").

This all seems to be a flaw in cmake, it reports "found ROOT at XXX", "found CURL at YYY", "found 
MYSQL at ZZZ", then proceeds to link ROOT, CURL and MYSQL libraries from somewhere else, 
resulting in shared library version mismatch.

With normal Makefiles, this is fixable by changing the link command from:

g++ -o rmlogger ... -LAAA/lib -LXXX/lib -LYYY/lib -lcurl -lmysql -lROOT

into explicit

g++ -o rmlogger ... -LAAA/lib XXX/lib/libcurl.a YYY/lib/libmysql.a ...

defeating the bogus CURL and MYSQL libraries in AAA.

With cmake, I do not think it is possible to make this transformation.

Maybe it is possible to add a cmake rules to at least detect this situation, i.e. compare library 
paths reported by "ldd rmlogger" to those found and expected by cmake.

K.O.
  2735   04 Apr 2024 Konstantin OlchanskiInfoMIDAS RPC data format
I am not sure I have seen this documented before. MIDAS RPC data format.

1) RPC request (from client to mserver), in rpc_call_encode()

1.1) header:

4 bytes NET_COMMAND.header.routine_id is the RPC routine ID
4 bytes NET_COMMAND.header.param_size is the size of following data, aligned to 8 bytes

1.2) followed by values of RPC_IN parameters:

arg_size is the actual data size
param_size = ALIGN8(arg_size)

for TID_STRING||TID_LINK, arg_size = 1+strlen()
for TID_STRUCT||RPC_FIXARRAY, arg_size is taken from RPC_LIST.param[i].n
for RPC_VARARRAY|RPC_OUT, arg_size is pointed to by the next argument
for RPC_VARARRAY, arg_size is the value of the next argument
otherwise arg_size = rpc_tid_size()

data encoding:

RPC_VARARRAY:
4 bytes of ALIGN8(arg_size)
4 bytes of padding
param_size bytes of data

TID_STRING||TID_LINK:
param_size of string data, zero terminated

otherwise:
param_size of data

2) RPC dispatch in rpc_execute

for each parameter, a pointer is placed into prpc_param[i]:

RPC_IN: points to the data inside the receive buffer
RPC_OUT: points to the data buffer allocated inside the send buffer
RPC_IN|RPC_OUT: data is copied from the receive buffer to the send buffer, prpc_param[i] is a pointer to the copy in the send buffer

prpc_param[] is passed to the user handler function.

user function reads RPC_IN parameters by using the CSTRING(i), etc macros to dereference prpc_param[i]

user function modifies RPC_IN|RPC_OUT parameters pointed to by prpc_param[i] (directly in the send buffer)

user function places RPC_OUT data directly to the send buffer pointed to by prpc_param[i]

size of RPC_VARARRAY|RPC_OUT data should be written into the next/following parameter.

3) RPC reply

3.1) header:

4 bytes NET_COMMAND.header.routine_id contains the value returned by the user function (RPC_SUCCESS)
4 bytes NET_COMMAND.header.param_size is the size of following data aligned to 8 bytes

3.2) followed by data for RPC_OUT parameters:

data sizes and encodings are the same as for RPC_IN parameters.

for variable-size RPC_OUT parameters, space is allocated in the send buffer according to the maximum data size
that the user code expects to receive:

RPC_VARARRAY||TID_STRING: max_size is taken from the first 4 bytes of the *next* parameter
otherwise: max_size is same as arg_size and param_size.

when encoding and sending RPC_VARARRAY data, actual data size is taken from the next parameter, which is expected to be 
TID_INT32|RPC_IN|RPC_OUT.

4) Notes:

4.1) RPC_VARARRAY should always be sent using two parameters:

a) RPC_VARARRAY|RPC_IN is pointer to the data we are sending, next parameter must be TID_INT32|RPC_IN is data size
b) RPC_VARARRAY|RPC_OUT is pointer to the data buffer for received data, next parameter must be TID_INT32|RPC_IN|RPC_OUT before the call should 
contain maximum data size we expect to receive (size of malloc() buffer), after the call it may contain the actual data size returned
c) RPC_VARARRAY|RPC_IN|RPC_OUT is pointer to the data we are sending, next parameter must be TID_INT32|RPC_IN|RPC_OUT containing the maximum 
data size we are expected to receive.

4.2) during dispatching, RPC_VARARRAY|RPC_OUT and TID_STRING|RPC_OUT both have 8 bytes of special header preceeding the actual data, 4 bytes of 
maximum data size and 4 bytes of padding. prpc_param[] points to the actual data and user does not see this special header.

4.3) when encoding outgoing data, this special 8 byte header is removed from TID_STRING|RPC_OUT parameters using memmove().

4.4) TID_STRING parameters:

TID_STRING|RPC_IN can be sent using oe parameter
TID_STRING|RPC_OUT must be sent using two parameters, second parameter should be TID_INT32|RPC_IN to specify maximum returned string length
TID_STRING|RPC_IN|RPC_OUT ditto, but not used anywhere inside MIDAS

4.5) TID_IN32|RPC_VARARRAY does not work, corrupts following parameters. MIDAS only uses TID_ARRAY|RPC_VARARRAY

4.6) TID_STRING|RPC_IN|RPC_OUT does not seem to work.

4.7) RPC_VARARRAY does not work is there is preceding TID_STRING|RPC_OUT that returned a short string. memmove() moves stuff in the send buffer, 
this makes prpc_param[] pointers into the send buffer invalid. subsequent RPC_VARARRAY parameter refers to now-invalid prpc_param[i] pointer to 
get param_size and gets the wrong value. MIDAS does not use this sequence of RPC parameters.

4.8) same bug is in the processing of TID_STRING|RPC_OUT parameters, where it refers to invalid prpc_param[i] to get the string length.

K.O.
  2736   15 Apr 2024 Konstantin OlchanskiBug Reportopen MIDAS RPC ports
we had a bit of trouble with open network ports recently and I now think security of MIDAS RPC 
ports needs to be tightened.

TL;DR, this is a non-trivial network configuration problem, TL required, DR up to you.

as background, right now we have two settings in ODB, "/expt/security/enable non-localhost 
RPC" set to "no" (the default) and set to "yes". Set to "no" is very secure, all RPC sockets 
listen only on the "localhost" interface (127.0.0.1) and do not accept connections from other 
computers. Set to "yes", RPC sockets accept connections from everywhere in the world, but 
immediately close them without reading any data unless connection origins are listed in ODB 
"/expt/security/RPC hosts" (white-listed).

the problem, one. for security and robustness we place most equipments on a private network 
(192.168.1.x). MIDAS frontends running on these equipments must connect to MIDAS running on 
the main computer. This requires setting "enable non-localhost RPC" to "yes" and white-listing 
all private network equipments. so far so good.

the problem, one, continued. in this configuration, the MIDAS main computer is usually also 
the network gateway (with NAT, IP forwarding, DHCP, DNS, etc). so now MIDAS RPC ports are open 
to all external connections (in the absence of restrictive firewall rules). one would hope for 
security-through-obscurity and expect that "external threat actors" will try to bother them, 
but in reality we eventually see large numbers of rejected unwanted connections logged in 
midas.log (we log the first 10 rejected connections to help with maintaining the RPC 
connections white-list).

the problem, two. central IT do not like open network ports. they run their scanners, discover 
the MIDAS RPC ports, complain about them, require lengthy explanations, etc.

it would be much better if in the typical configuration, MIDAS RPC ports did not listen on 
external interfaces (campus network). only listen on localhost and on private network 
interfaces (192.168.1.x).

I am not yet of the simplest way to implement this. But I think this is the direction we 
should go.

P.S. what about firewall rules? two problems: one: from statistic-of-one, I make mistakes 
writing firewall rules, others also will make mistakes, a literally fool-proof protection of 
MIDAS RPC ports is needed. two: RHEL-derived Linuxes by-default have restrictive firewall 
rules, and this is good for security, except that there is a failure mode where at boot time 
something can go wrong and firewall rules are not loaded at all. we have seen this happen. 
this is a complete disaster on a system that depends on firewall rules for security. better to 
have secure applications (TCP ports protected by design and by app internals) with firewall 
rules providing a secondary layer of protection.

P.P.S. what about MIDAS frontend initial connection to the mserver? this is currently very 
insecure, but the vulnerability window is very small. Ideally we should rework the mserver 
connection to make it simpler, more secure and compatible with SSH tunneling.

P.P.S. Typical network diagram:

internet - campus firewall - campus network - MIDAS host (MIDAS) - 192.168.1.x network - power 
supplies, digitizers, MIDAS frontends.

P.P.S. mserver connection sequence:

1) midas frontend opens 3 tcp sockets, connections permitted from anywhere
2) midas frontend opens tcp socket to main mserver, sends port numbers of the 3 tcp sockets
3) main mserver forks out a secondary (per-client) mserver
4) secondary mserver connects to the 3 tcp sockets of the midas frontend created in (1)
5) from here midas rpc works
6) midas frontend loads the RPC white-list
7) from here MIDAS RPC sockets are secure (protected by the white-list).

(the 3 sockets are: RPC recv_sock, RPC send_sock and event_sock)

P.P.S. MIDAS UDP sockets used for event buffer and odb notifications are secure, they bind to 
localhost interface and do not accept external connections.

K.O.
  2738   24 Apr 2024 Konstantin OlchanskiInfoMIDAS RPC data format
> 4.5) TID_IN32|RPC_VARARRAY does not work, corrupts following parameters. MIDAS only uses TID_ARRAY|RPC_VARARRAY

fixed in commit 0f5436d901a1dfaf6da2b94e2d87f870e3611cf1, TID_ARRAY|RPC_VARARRAY was okey (i.e. db_get_value()), bug happened only if rpc_tid_size() 
is not zero.

> 
> 4.6) TID_STRING|RPC_IN|RPC_OUT does not seem to work.
> 
> 4.7) RPC_VARARRAY does not work is there is preceding TID_STRING|RPC_OUT that returned a short string. memmove() moves stuff in the send buffer, 
> this makes prpc_param[] pointers into the send buffer invalid. subsequent RPC_VARARRAY parameter refers to now-invalid prpc_param[i] pointer to 
> get param_size and gets the wrong value. MIDAS does not use this sequence of RPC parameters.
> 
> 4.8) same bug is in the processing of TID_STRING|RPC_OUT parameters, where it refers to invalid prpc_param[i] to get the string length.

fixed in commits e45de5a8fa81c75e826a6a940f053c0794c962f5 and dc08fe8425c7d7bfea32540592b2c3aec5bead9f

K.O.
  2739   24 Apr 2024 Konstantin OlchanskiInfoMIDAS RPC add support for std::string and std::vector<char>
I now fully understand the MIDAS RPC code, had to add some debugging printfs, 
write some test code (odbedit test_rpc), catch and fix a few bugs.

Fixes for the bugs are now committed.

Small refactor of rpc_execute() should be committed soon, this removes the 
"goto" in the memory allocation of output buffer. Stefan's original code used a 
fixed size buffer, I later added allocation "as-neeed" but did not fully 
understand everything and implemented it as "if buffer too small, make it 
bigger, goto start over again".

After that, I can implement support for std::string and std::vector<char>.

The way it looks right now, the on-the-wire data format is flexible enough to 
make this change backward-compatible and allow MIDAS programs built with old 
MIDAS to continue connecting to the new MIDAS and vice-versa.

MIDAS RPC support for std::string should let us improve security by removing 
even more uses of fixed-size string buffers.

Support for std::vector<char> will allow removal of last places where 
MAX_EVENT_SIZE is used and simplify memory allocation in other "give me data" 
RPC calls, like RPC_JRPC and RPC_BRPC.

K.O.
  2766   14 May 2024 Konstantin OlchanskiInfoROOT v6.30.6 requires libtbb-dev
root_v6.30.06.Linux-ubuntu22.04-x86_64-gcc11.4 the libtbb-dev package.

This is a new requirement and it is not listed in the ROOT dependancies page (I left a note on the ROOT forum, hopefully it will be 
fixed quickly). https://root.cern/install/dependencies/

Symptom is when starting ROOT, you get an error:
cling::DynamicLibraryManager::loadLibrary(): libtbb.so.12: cannot open shared object file: No such file or directory
and things do not work.

Fix is to:
apt install libtbb-dev

K.O.
  2767   16 May 2024 Konstantin OlchanskiBug Reportmidas alarm borked condition evaluation
I am updating the TRIUMF IRIS experiment to the latest version of MIDAS. I see following error messages in midas.log:

19:06:16.806 2024/05/16 [mhttpd,ERROR] [odb.cxx:6967:db_get_data_index,ERROR] index (29) exceeds array length (20) for key 
"/Equipment/Beamline/Variables/Measured"

19:06:15.806 2024/05/16 [mhttpd,ERROR] [odb.cxx:6967:db_get_data_index,ERROR] index (30) exceeds array length (20) for key 
"/Equipment/Beamline/Variables/Measured"

The errors are correct, there is only 20 elements in that array. The errors are coming every few seconds, they spam midas.log. How do I fix 
them? Where do they come from? There is no additional diagnostics or information to go from.

In the worst case, they come from some custom web page that reads wrong index variables from ODB. mhttpd currently provides no diagnostics to 
find out which web page could be causing this.

But maybe it's internal to MIDAS? I save odb to odb.json, "grep Measured odb.json" yields:
iris00:~> grep Measured odb.json 
        "Condition" : "/Equipment/Beamline/Variables/Measured[29] > 1e-5",
        "Condition" : "/Equipment/Beamline/Variables/Measured[30] < 0.5",

So wrong index errors is coming from evaluated alarms.
ODB "/Alarms/Alarm system active" is set to "no" (alarm system is disabled), the errors are coming.
ODB "/Alarms/Alarms/TP4/Active" is set to "no" (specific alarm is disabled), the errors are coming.
WTF? (and this is recentish borkage, old IRIS MIDAS had the same wrong index alarms and did not generate these errors).

Breakage:
- where is the error message "evaluated alarm XXX cannot be computed because YYY cannot be read from ODB!"
- disabled alarm should not be computed
- alarm system is disabled, alarms should not be computed

K.O.

P.S. I am filing bug reports here, I cannot be bothered with the 25-factor authentication to access bitbucket.
  2769   17 May 2024 Konstantin OlchanskiBug Reportmidas alarm borked condition evaluation
> This is a common problem I also encountered in the past. You get a low-level ODB access error (could also be a read of a non-existing key) and you 
> have no idea where this comes from. Could be the alarm system, a mhttpd web page, even some user code in a front-end over which the midas library 
> has no control.

committed a partial fix, added an error message in alarm condition evaluation code to report alarm name and odb paths when we cannot get something from 
ODB. Now at least midas.log gives some clue that ODB errors are coming from alarms.

and the errors are actually coming from the alarms web page.

the alarms web page shows all the alarms even if alarms are disabled and it shows evaluated alarm conditions and current values even for alarms that 
are disabled.

I could change it to show "disabled" for disabled alarms, but I think it would not be an improvement,
right now it is quite convenient to see the alarm values for disabled/inactive alarms,
it is easy to see if they will immediately trip if I enable them. Hiding the values would make
them blind.

And I think I know what caused the original problem in IRIS experiment, I think the list of EPICS variables got truncated from 30 to 20 and EPICS 
values 29 and 30 used in the alarm conditions have become lost.

So the next step is to fix feepics to not truncate the list of variables (right now it is hardwired to 20 variables) and restore
the lost variable definition from a saved odb dump.

K.O.






> 
> One option would be to add a complete stack dump to each of these error (ROOT does something like that), but I hear already people shouting "my 
> midas.log is flooded with stack dumps!". So what I do in this case is I run a midas program in the debugger and set a breakpoint (in your case at 
> odb.cxx:6967). If the breakpoint triggers, I inspect the stack and find out where this comes from.
> 
> Not that I print a stack dump for such error in the odbxx API. This goes to stdout, not the midas log, and it helped me in the past. Unfortunately 
> stack dumps work only under linux (not MacOSX), and they do not contain all information a debugger can show you.
> 
> It is not true that alarm conditions are evaluated when the alarm system is off. I just tried and it works fine. The code is here:
> 
> alarm.cxx:591
> 
>       /* check global alarm flag */
>       flag = TRUE;
>       size = sizeof(flag);
>       db_get_value(hDB, 0, "/Alarms/Alarm system active", &flag, &size, TID_BOOL, TRUE);
>       if (!flag)
>          return AL_SUCCESS;
> 
> so no idea why you see this error if you correctly st "Alarm system active" to false.
> 
> Stefan
  2770   17 May 2024 Konstantin OlchanskiBug Reportodbedit load into the wrong place
Trying to restore IRIS ODB was a nasty surprise, old save files are in .odb format and odbedit "load xxx.odb" 
does an unexpected thing.

mkdir tmp
cd tmp
load odb.xml loads odb.xml into the current directory "tmp"
load odb.json same thing
load odb.odb loads into "/" unexpectedly overwriting everything in my ODB with old data

this makes it impossible for me to restore just /equipment/beamline from old .odb save file (without 
overwriting all of my odb with old data).

I look inside db_paste() and it looks like this is intentional, if ODB path names in the odb save file start 
with "/" (and they do), instead of loading into the current directory it loads into "/", overwriting existing 
data.

The fix would be to ignore the leading "/", always restore into the current directory. This will make odbedit 
load consistent between all 3 odb save file formats.

Should I apply this change?

K.O.
ELOG V3.1.4-2e1708b5