Midas DAQ System ELOG
ID   Date   Author   Topic   Subject
2770   17 May 2024   Konstantin Olchanski   Bug Report   odbedit load into the wrong place
Trying to restore the IRIS ODB produced a nasty surprise: the old save files are in .odb format, and odbedit "load xxx.odb" 
does an unexpected thing.

mkdir tmp
cd tmp
load odb.xml    - loads odb.xml into the current directory "tmp"
load odb.json   - same thing
load odb.odb    - loads into "/", unexpectedly overwriting everything in my ODB with old data

This makes it impossible for me to restore just /equipment/beamline from an old .odb save file (without 
overwriting all of my ODB with old data).

I looked inside db_paste() and it looks like this is intentional: if ODB path names in the .odb save file start 
with "/" (and they do), instead of loading into the current directory it loads into "/", overwriting existing 
data.

The fix would be to ignore the leading "/" and always restore into the current directory. This would make odbedit 
"load" consistent between all 3 ODB save file formats.
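
A minimal sketch of the idea (make_relative() is a hypothetical helper, not the actual db_paste() code):

[CODE]
// hypothetical helper illustrating the proposed fix: strip the leading "/"
// so that path names from the .odb save file resolve relative to the
// current ODB directory instead of the ODB root
static const char *make_relative(const char *path)
{
   while (path[0] == '/')
      path++;
   return path; // "/equipment/beamline" becomes "equipment/beamline"
}
[/CODE]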

Should I apply this change?

K.O.
2771   17 May 2024   Konstantin Olchanski   Bug Report   midas alarm borked condition evaluation
> 
> And I think I know what caused the original problem in IRIS experiment, I think the list of EPICS variables got truncated from 30 to 20 and EPICS 
> values 29 and 30 used in the alarm conditions have become lost.
> 
> So the next step is to fix feepics to not truncate the list of variables (right now it is hardwired to 20 variables) and restore
> the lost variable definition from a saved odb dump.

For the record: I restored the old feepics ODB settings; my EPICS variables now have the correct size and the alarm now works correctly.

I also updated the example feepics to read the number of EPICS variables from ODB instead of always truncating it to 20 (IRIS MIDAS had a local change 
setting the number of variables to 40).

I think I will make no more changes to the alarms and leave well enough alone.

K.O.
2780   24 May 2024   Konstantin Olchanski   Info   added ubuntu 22 arm64 cross-compilation
Ubuntu 22 has almost everything necessary to cross-build arm64 MIDAS frontends:

# apt install g++-12-aarch64-linux-gnu gcc-12-aarch64-linux-gnu-base libstdc++-12-dev-arm64-cross
$ aarch64-linux-gnu-gcc-12 -o ttcp.aarch64 ttcp.c -static

to cross-build MIDAS:

make arm64_remoteonly -j

run programs from $MIDASSYS/linux-arm64-remoteonly/bin
link frontends to libraries in $MIDASSYS/linux-arm64-remoteonly/lib
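
For example, a hypothetical frontend cross-link (myfe.cxx and the library list are assumptions; check $MIDASSYS/linux-arm64-remoteonly/lib for what is actually built):

[CODE]
# hypothetical example, myfe.cxx is a placeholder frontend source file
aarch64-linux-gnu-g++-12 -o myfe.aarch64 myfe.cxx \
    -I$MIDASSYS/include \
    -L$MIDASSYS/linux-arm64-remoteonly/lib -lmfe -lmidas -lpthread -lrt -lutil
[/CODE]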

Ubuntu 22 does not provide an arm64 libz.a; as a workaround, I build a fake one (we do not have HAVE_ZLIB anymore...). Or you 
can link to the libz.a from your arm64 linux image, assuming its include/zlib.h is compatible.

K.O.
2781   29 May 2024   Konstantin Olchanski   Info   MIDAS RPC add support for std::string and std::vector<char>
This is moving slowly. I now have RPC caller side support for std::string and 
std::vector<char>. RPC server side is next. K.O.
2782   02 Jun 2024   Konstantin Olchanski   Info   MIDAS RPC data format
> MIDAS RPC data format.
> 3) RPC reply
> 3.1) header:
> 3.2) followed by data for RPC_OUT parameters:
> 
> data sizes and encodings are the same as for RPC_IN parameters.

Correction:

RPC_VARARRAY data encoding for data returned by RPC is different from data sent to RPC:

4 bytes of arg_size (before the 8-byte alignment; for data sent to RPC, it is instead 4 bytes of param_size, after the 8-byte alignment)
4 bytes of padding
arg_size bytes of data
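
A sketch of the reply-side encoding (function and field names here are illustrative, not the actual midas.cxx identifiers):

[CODE]
#include <cstdint>
#include <cstring>

// illustrative encoder for RPC_VARARRAY data returned by RPC:
// 4 bytes of arg_size, 4 bytes of padding, then the data itself;
// returns the number of bytes written into buf
static size_t encode_vararray_reply(char *buf, const void *data, uint32_t arg_size)
{
   memcpy(buf, &arg_size, 4); // arg_size, before the 8-byte alignment
   memset(buf + 4, 0, 4);     // 4 bytes of padding
   memcpy(buf + 8, data, arg_size);
   return 8 + arg_size;
}
[/CODE]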

K.O.

P.S. bug/discrepancy caught by GCC/LLVM address sanitizer.
2806   19 Aug 2024   Konstantin Olchanski   Release   kernel-module-universe updated to -KO7
The linux kernel driver for the Universe-II VME to PCI bridge is updated to 
version -KO7. It now builds and runs with Debian-12 stock kernel 6.1.0-22-686.

I pxe boot (isolinux/pxelinux) the linux kernel and NFS-mount the stock 32-bit 
Debian-12 userland. Userland tarball is available by request. PXE and NFS-Root 
configuration is written up on the wiki at daq.triumf.ca, example config files 
are available on request.

https://daq00.triumf.ca/DaqWiki/index.php/Ubuntu#setup_diskless_network_booting
https://daq00.triumf.ca/DaqWiki/index.php/VME-CPU

The Debian-11 kernel also works (use the -KO6 driver if -KO7 bombs), but the Debian-11 
kernel with Debian-12 userland and an Ubuntu-22 NFS server fails with "file too 
big" errors; as best I can tell, this has to do with old 32-bit kernels getting 
unhappy about 64-bit NFS inode numbers.

Cross-compilation from 64-bit Ubuntu-22 to 32-bit VME processors running 32-bit 
Debian-12 is written up here:
https://daq00.triumf.ca/DaqWiki/index.php/Ubuntu#32-bit_intel_cross-compiler

To cross-build 32-bit MIDAS for a 32-bit VME processor, use "make linux32", or build 
it natively (pretty slow on a 1 GHz Pentium-III).

K.O.
2807   19 Aug 2024   Konstantin Olchanski   Release   kernel-module-universe updated to -KO7
> The linux kernel driver for the Universe-II VME to PCI bridge is updated to 
> version -KO7. It now builds and runs with Debian-12 stock kernel 6.1.0-22-686.

I have a report that this driver might work on 64-bit VME CPUs (minus a bug in the 
MIDAS VME library). I do not have such hardware, cannot test, cannot confirm. (All our 
64-bit VME CPUs have the tsi148 bridge and run Ubuntu kernels and userland).

https://daq00.triumf.ca/elog-midas/Midas/2566

K.O.
2808   19 Aug 2024   Konstantin Olchanski   Release   kernel-module-universe updated to -KO7
> > The linux kernel driver for the Universe-II VME to PCI bridge is updated to 
> > version -KO7. It now builds and runs with Debian-12 stock kernel 6.1.0-22-686.

Ahem, and the location is:

git clone https://daq00.triumf.ca/~olchansk/git/kernel-module-universe.git

K.O.
2830   11 Sep 2024   Konstantin Olchanski   Info   News MSCB++ API
> Here is some example code:
> 
> #include "mscbxx.h"
>    f = m["In0"];     // name access
>    m["In0"] = 1.234;
> Any feedback is welcome.

Where is the example of error handling?

K.O.
2831   11 Sep 2024   Konstantin Olchanski   Forum   Python frontend rate limitations?
> I'm trying to get a sense of the rate limitations of a python frontend.

1) python is single-threaded; for ultimate performance, a MIDAS frontend (or any DAQ 
application) has to be multithreaded:
a) a thread with a busy loop that reads the data and places it into a FIFO
b) a thread that reads data from the FIFO and sends it to the SYSTEM buffer shared memory or to 
the mserver
c) a thread to respond to begin-run, end-run, etc. RPCs
d) probably a thread to recycle memory from thread (b) back to thread (a), if per-event 
malloc()/free() adds too much overhead (a sketch of threads (a) and (b) follows below)
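
A minimal sketch of threads (a) and (b), assuming a mutex-protected std::deque as the FIFO; read_hardware() and send_event() are stand-ins for the real readout and for bm_send_event()/the mserver connection:

[CODE]
#include <atomic>
#include <condition_variable>
#include <cstdio>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

static std::deque<std::vector<char>> gFifo; // the FIFO between (a) and (b)
static std::mutex gFifoMutex;
static std::condition_variable gFifoCv;
static std::atomic<bool> gRunning{true};

// stand-ins for the real AXI/VME readout and for bm_send_event()/mserver
static std::vector<char> read_hardware() { return std::vector<char>(1024); }
static void send_event(const std::vector<char>& e) { printf("sent %d bytes\n", (int)e.size()); }

static void reader_thread() // thread (a): busy loop, read data, push into FIFO
{
   while (gRunning) {
      std::vector<char> event = read_hardware();
      std::lock_guard<std::mutex> lock(gFifoMutex);
      gFifo.push_back(std::move(event));
      gFifoCv.notify_one();
   }
}

static void sender_thread() // thread (b): pop events, ship them out
{
   while (gRunning) {
      std::unique_lock<std::mutex> lock(gFifoMutex);
      gFifoCv.wait(lock, [] { return !gFifo.empty() || !gRunning; });
      while (!gFifo.empty()) {
         std::vector<char> event = std::move(gFifo.front());
         gFifo.pop_front();
         lock.unlock();
         send_event(event); // send outside the lock
         lock.lock();
      }
   }
}

int main()
{
   std::thread a(reader_thread), b(sender_thread); // plus RPC thread (c), etc.
   a.join();
   b.join();
   return 0;
}
[/CODE]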

2) data readout. A C++ AXI bus access compiles into 1 instruction and results in 1 AXI 
bus operation. Comparable python code likely has much more overhead, which slows you down.

3) event bank filling. A C++ for() loop compiles into very compact machine code (example below); a 
python loop cannot, because each array element can be an arbitrary data type. This slows you down.
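
For example, a typical compact C++ bank-filling loop (bk_init32a/bk_create/bk_close are the standard MIDAS bank API; the constant data is a stand-in for the real readout):

[CODE]
#include "midas.h"

// compact C++ bank-filling loop; the loop counter stands in
// for the actual hardware readout
INT read_trigger_event(char *pevent, INT off)
{
   DWORD *pdata;
   bk_init32a(pevent);
   bk_create(pevent, "ADC0", TID_DWORD, (void **)&pdata);
   for (int i = 0; i < 64; i++)
      *pdata++ = i;
   bk_close(pevent, pdata);
   return bk_size(pevent);
}
[/CODE]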

bottom line, there is a reason high speed data acquisitions are written in C/C++, not 
in shell, perl, tcl/tk, or (today's favourite) python.

> The C++ frontend is about 100 times faster in both data and event rates.

This is as expected. You can probably improve python code to get closer to 10 times 
slower than C++. But consider:

a) will it be "fast enough" for the task?
b) learning C++ and optimizing python to within "2-3-10x slower than C++" may involve a 
similar amount of time and effort.

And you have not looked at the real-time properties of your frontend. You may discover 
that it's actually faster than you think, but occasionally stops for a millisecond (or 
two, or a hundred). Some applications are notorious for running memory garbage collection 
just at the wrong time.

I am working right now on exactly this problem. I have a 1 GHz ARM CPU (Cyclone-V FPGA) 
and I need to push data out at 100 Mbytes/sec while avoiding any bad real-time dropouts 
that would cause the FPGA data FIFO to overflow. And I only have 2 CPU cores: 1 to read the 
FPGA FIFO, 1 to run the TCP/IP stack and the ethernet driver. No way this can be done in 
python.

K.O.
2832   11 Sep 2024   Konstantin Olchanski   Forum   Python frontend rate limitations?
> 
> poll(INT count) {
>    for (i=0 ; i<count ; i++)
>       if (new_event())
>          return TRUE;
>    return FALSE;
> }

in the c++ frontend (tmfe.h) this loop usually runs in a separate thread, and I am now working on the linux magic to assign this thread maximum 
uninterruptible priority. Otherwise, on my Cyclone-V FPGA SoC, I see 1-10 msec dropouts, I think from taking ethernet interrupts.
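
One standard way to do this on Linux is pthread_setschedparam() with SCHED_FIFO; a minimal sketch (not necessarily what tmfe.h will end up doing):

[CODE]
#include <pthread.h>
#include <sched.h>
#include <cstdio>

// raise the calling thread to maximum real-time priority;
// needs root or CAP_SYS_NICE, otherwise pthread_setschedparam() fails
static void make_realtime()
{
   sched_param sp = {};
   sp.sched_priority = sched_get_priority_max(SCHED_FIFO);
   int status = pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
   if (status != 0)
      fprintf(stderr, "pthread_setschedparam() error %d\n", status);
}
[/CODE]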

K.O.
2833   11 Sep 2024   Konstantin Olchanski   Forum   Python frontend rate limitations?
> > I'm trying to get a sense of the rate limitations of a python frontend.

forgot one more:

the c++ toolchain comes with extensive profiler tools aimed at answering the question "why is my 
program so slow, where is it spending all the time?". Some of these tools go all the way down to 
the hardware level and report CPU cache misses, TLB flushes, context switches and any other 
hardware events that interrupt or slow down computations. The programmer then uses this 
information to restructure the code to avoid the worst slowdowns (i.e. avoid branch 
mispredictions, avoid cache misses, etc).
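
For example, a typical profiling session with the Linux "perf" tool (./myfrontend is a placeholder for your program):

[CODE]
# where is my program spending its time?
perf record -g ./myfrontend
perf report

# count hardware events such as cache misses and branch mispredictions
perf stat -e cache-misses,branch-misses ./myfrontend
[/CODE]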

I doubt the python toolchain will ever have profiler tools as good.

K.O.
2834   11 Sep 2024   Konstantin Olchanski   Bug Report   Multiple issues with mhist
I think I can offer some insight into your problems:

1) your mhist crash is due to the ODB timeout; it is probably set to 30 seconds in ODB /programs/mhist. You will 
have to make it bigger.

2) 1.5 years of files: yes. I have 10 years of files for ALPHA at CERN, and the number of files is a problem. 
But it should be better than the old system with 3 files per day (1000 files per year).

One solution you can try is symlinks. Assuming you have 10 years of history files in 10 per-year directories, you 
symlink as many of them as you need into the "current" directory, then remove the symlinks afterwards (see the example below).

Why remove the symlinks? I use "ls" to read the list of history files, and Unix/Linux does not have a syscall to 
"give me the 100 files with the newest mtime". I have to read the whole directory, and that takes forever (with ZFS 
on HDD); it is quick with ZFS on SSD if the ZFS cache is hot (you can have a cron job run "ls" every 5 minutes to 
keep the ZFS cache hot).
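
A hypothetical example of the workaround (paths and dates are illustrative):

[CODE]
# expose one archived year to the history reader, then clean up,
# so that "ls" of the current history directory stays fast
cd /data2/history
ln -s 2022/mhf_*.dat .
mhist -e "Xenon" -v "Det XeTmp 0-0" -s 220101 -p 221231
find . -maxdepth 1 -type l -delete   # removes only the symlinks
[/CODE]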

Now that I wrote the above, I think I see a way to make it "automatic", let me ponder this. (plus I always 
wanted to implement compressed history files (using "free" lz4)).

K.O.



[quote="Lukas"]I am having some trouble with mhist. I suppose that the problems are at least partially due to our specific 
needs which might exceed what has been tested. For context, in MEG II we have some 10^4 history variables in ~30 
different events. 

1. mhist -l crashes. After displaying around 7000 lines, I get the following error message:
[CODE]
[mhist,ERROR] [midas.cxx:5949:bm_validate_client_index,ERROR] My client index 10 in buffer 'SYSMSG' 
is invalid: client name '', pid 0 should be my pid 3773321
[mhist,ERROR] [midas.cxx:5952:bm_validate_client_index,ERROR] Maybe this client was removed by a 
timeout. See midas.log. Cannot continue, aborting...
Aborted (core dumped)
[/CODE]
Timing the execution shows around 33 seconds before the process is aborted.

I'm not sure if this would actually fix the problem, but while trying to circumvent the issue, I tried the 
following: [CODE]mhist -e "Xenon" -l[/CODE] This doesn't seem to be implemented. Listing only the variables of a 
single event would be nice 
regardless of our specific issue.

2. mhist and history files.
We have a directory with about 2500 history files (mhf_...dat) for the past 1.5 years. Older 
history files are archived in other directories with similar numbers of files. When trying to access them, I 
encountered two issues:
It seems like it is not possible to pass a "history directory" as an argument. To dump the history for a full 
year in the archive directory, I would need to run mhist many times with -f and then combine all the dumps.

If it really does not work, please consider this a feature request.
Also, even using single files does not work at the moment:
[CODE]
$ mhist -e "Xenon" -v "Det XeTmp 0-0" -t 100000 -s 200101 -p 250101 -f 
/data2/history/2022/mhf_1644698398_20220212_xenon.dat
ID 980316009, Aug 13 19:10:56, size 1851749486
[/CODE]
This command was supposed to show me the rough time frame covered in this particular history file. I was 
informed that the history files are in the new "FILE" format and mhist might not work with them properly.

tl;dr
[LIST]
[*] Bug: mhist -l crashes
[*] Bug: mhist -f does not work with "FILE" history format
[*] Feature request: mhist -e "Name" -l to only show variables of event "Name"
[*] Feature request: Set temporary history dir with a flag
[/LIST]

Lukas[/quote]
2835   11 Sep 2024   Konstantin Olchanski   Suggestion   Improve Event Documentation
> I am writing a Rust based midas file reader however it was kind of hard to understand the full midas file 
> structure from the documentation.

MIDAS is old-school, when the code was the documentation.

This is very noticeable when you try to document things in MIDAS (as I have done many times).

For the MIDAS data format, at file level and bank level, it is best if you look at my midasio library (included with the MIDAS 
git clone) and translate it to Rust directly. I think a Rust version of the C++ midasio would be very welcome.

Many data fields in MIDAS files are mysterious and I reverse-engineered them the best I could.

The main problems were:
- data padding
- "length" fields include padding or not?
- identification of big-endian vs little-endian data
- probably something I forget
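
For reference, the event header layout, paraphrased from midas.h (check the header file for the authoritative definition):

[CODE]
#include <cstdint>

// MIDAS event header, paraphrased from midas.h; data_size counts the
// bytes that follow the header, not the header itself
struct EventHeader {
   uint16_t event_id;
   uint16_t trigger_mask;
   uint32_t serial_number;
   uint32_t time_stamp;   // unix time
   uint32_t data_size;    // size of the data following this header
};
[/CODE]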

K.O.
2836   11 Sep 2024   Konstantin Olchanski   Info   Help parsing scdms_v1 data?
Look at the C++ implementation of the MIDAS data file reader, the code is very 
simple to follow.

Depending on how old your data files are, you may run into a problem with 
misaligned 32-bit data banks. The latest MIDAS creates BANK32A events where all 
banks are aligned to 64 bits. The old BANK32 format had banks alternating between 
aligned and misaligned. Hopefully you do not have any data in the old 16-bit BANK format.
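
For reference, the three bank header layouts, paraphrased from midas.h (check the header file for the authoritative definitions):

[CODE]
#include <cstdint>

struct Bank16 {       // "BANK": 8-byte header, 16-bit size field
   char     name[4];
   uint16_t type;     // TID_xxx data type
   uint16_t data_size;
};

struct Bank32 {       // "BANK32": 12-byte header, so consecutive banks
   char     name[4];  // alternate between 64-bit aligned and misaligned
   uint32_t type;
   uint32_t data_size;
};

struct Bank32a {      // "BANK32A": 16-byte header, every bank is
   char     name[4];  // aligned to 64 bits
   uint32_t type;
   uint32_t data_size;
   uint32_t reserved;
};
[/CODE]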

If you successfully make a data format description file for MIDAS, please post 
it here for the next user.

K.O.



[quote="Adrian Fisher"]Hi! I'm working on creating a ksy file to help with 
parsing some data, but I'm having trouble finding some information. Right now, I 
have it set up very rudimentary - it grabs the event header and then uses the 
data bank size to grab the size of the data, but then I'm needing additional 
padding after the data bank to reach the next event.
However, there's some irregularity in the "padding" between data banks that I 
haven't been able to find any documentation for. For some reason, after the data 
banks, there's sections of data of either 168 or 192 bytes, and it's seemingly 
arbitrary which size is used. 
I'm just wondering if anyone has any information about this so that I'd be able 
to make some more progress in parsing the data.
The data I'm working with can be found at https://github.com/det-lab/dataReaderWriter/blob/master/data/07180808_1735_F0001.mid.gz
And the ksy file that I've created so far is at https://github.com/det-lab/dataReaderWriter/blob/master/kaitai/ksy/scdms_v1.ksy

There's also a block of data after the odb that runs for 384 bytes that I'm 
unsure the purpose of, if anyone could point me to some information about that.

Thank you![/quote]
2837   11 Sep 2024   Konstantin Olchanski   Info   mana.cxx
> Ok, no relevant complains so far, so I removed mana and rmana from the CMake build 
> process, but left the file mana.cxx still in the repository for educational 
> purposes ;-)

+1

K.O.
2838   11 Sep 2024   Konstantin Olchanski   Forum   "Safe" abort of sequencer scripts
> We often use the MIDAS sequencer to temporarily control detector settings, such as:
> 
> * <change some setting>
> * WAIT 60 seconds
> * <revert setting to original value>
> 
> The question arises of what happens if the sequencer scripts gets aborted during that wait, preventing the value from being reset.

Common problem. Go has an elegant solution: the "defer" keyword.

https://go.dev/tour/flowcontrol/12
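
For comparison, the C++ analogue of "defer" is an RAII scope guard; a sketch of the idea (not an existing MIDAS sequencer feature):

[CODE]
#include <functional>

// runs its cleanup function when the scope is left, even on early return
struct Defer {
   std::function<void()> f;
   ~Defer() { f(); }
};

void example()
{
   // <change some setting>
   Defer revert{[]() { /* <revert setting to original value> */ }};
   // WAIT 60 seconds; the revert runs no matter how this scope is left
}
[/CODE]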

K.O.
2839   12 Sep 2024   Konstantin Olchanski   Bug Fix   bitbucket builds repaired
bitbucket builds work again, also added ubuntu-24 and almalinux-9.

two problems fixed:
- cmake file in examples/experiment was replaced by a non-working version
- unannounced change of strlcpy() to mstrlcpy() broke "make remoteonly"

P.S. I should also fix the rootana and the roody bitbucket builds.

K.O.
2840   13 Sep 2024   Konstantin Olchanski   Bug Fix   rootana bitbucket build fixed
rootana bitbucket build is fixed, only a few minor build problems. I am using the 
official root docker image (which turned out to not work right out of the box 
because of a missing libvdt-dev package). K.O.
2841   13 Sep 2024   Konstantin Olchanski   Bug Report   mfe.cxx with RO_STOPPED and EQ_POLLED
> > I noticed that a check was added to mfe.cxx in 1961af0d6:

This is the reason I recommend against using mfe.c-based frontends. There was never any
proper documentation on how they work and what the different settings in ODB common
and elsewhere do. My attempts to document it by reverse-engineering were only partially
successful. Since then a number of changes were made that are also hard-to-impossible
to document.

I recommend that everybody use the new c++ tmfe frontend, which was designed for easy documentation
and explanation. See tmfe.md for the full documentation.

(Pending improvements are to integrate TMEvent support and to add the data-transmit thread and event FIFO.)

K.O.