Back Midas Rome Roody Rootana
  Midas DAQ System, Page 144 of 161  Not logged in ELOG logo
ID Date Author Topicdown Subject
  3215   27 Apr 2026 Konstantin OlchanskiBug Reportincreasing the max number of hot links in ODB
> Indeed, updating MIDAS clients on each and every RPI etc in a running experiment may be a real challenge.

actually, only local clients must be rebuilt, remote clients connecting to the mserver do not care about ODB 
internal structure.

> Thinking forward - would it help if the ODB clients, upon initial connection but before doing anything else 
> were reading the ODB parameters from the ODB itself, so the clients were "learning" about the ODB structure 
> dynamically, at run time? Or that knowledge has to be static ?

unfortunately, the "open records" structure is allocated at compile-time inside the ODB header,
making any change to this would break binary compatibility.

I think it is possible to allocate "space for additional open records" in the ODB data area
and have the ODB open records code use it in addition to the compile-time allocated
space in the database header. This would also work for extending MAX_CLIENTS.

Of course in this approach, old midas clients would see only the clients and open records
in the database header, new midas clients would see the additional data.

It is not super hard to add this code...

K.O.
  3216   27 Apr 2026 Konstantin OlchanskiBug Reportincreasing the max number of hot links in ODB
> I wonder why one needs more than 256 hotlinks at all.

I confirm that ALPHA is running with MAX_OPEN_RECORDS changed from 256 to 2048,
this is the only experiment I know of that had to increase any MIDAS ODB defaults.

The reason for this is mlogger, it opens an open record for each variable in each equipment.

This should be changed to 1 db_watch per equipment. We talked about it, but I guess we never did it.

I think this task just went almost to the top of my MIDAS to-do list.

K.O.
  3217   27 Apr 2026 Nick HastingsBug Reportincreasing the max number of hot links in ODB
For the record, the ND280 (T2K near detector) MIDAS GSC was initially set up
with MAX_OPEN_RECORDS = 2560 and MAX_CLIENTS = 128.

In 2023 one fairly simple part of the detector was replaced with several 
other more complex systems (many more midas frontends, equipments, and
variables being logged) so we updated MAX_OPEN_RECORDS = 4096 and 
MAX_CLIENTS = 256.

Nick.
  3218   27 Apr 2026 Pavel MuratBug Reportincreasing the max number of hot links in ODB
> I wonder why one needs more than 256 hotlinks at all. Please note that with the odbxx "watch" API, you can hotline a whole subdirectory, and get notified if ANY of the 
> underlying values or subdirectories change. In principle, one could have one hotlink to "/" and see all changes in the ODB (although that does not make sense and might slow 
> down ODB access a bit).

Thanks ! - I didn't know that. I did run into a number of hotlinks limit via mlogger which complained about not being able to create a hotlink 
to yet another event. Doubling the default value of MAX_OPEN_RECORDS solved the problem. 

I don't know the exact arithmetic defining the number of hotlinks in the system, but my today's case is a case of 
- 36 (linux servers) +18 (RPI) monitoring frontends managing one or several different equipment items each. 
- Each equipment item sends to ODB at least one monitoring event 
- in addition, each frontend created an individual hotlink for handling interactive commands 
- for MAX_OPEN_RECORDS=256, 4 equipment items per frontend easily make it into the dangerous zone. 

"Equipment items" also include the online processes running on the distributed computing farm processing the data .. 
(we are not using MIDAS event building capabilities)
 
> 
> Try the odbxx_test.cpp example in MIDAS. In line 210 it puts a single hotlink to /Experiment. If you change anything under /Experiment, the program gets notified. By checking the 
> path of the changed ODB entry, it can figure out which of the subways have been changed:
> 
>    // watch ODB key for any change with lambda function
>    midas::odb ow("/Experiment");
>    ow.watch([](midas::odb &o) {
>       std::cout << "Value of key \"" + o.get_full_path() + "\" changed to " << o << std::endl;
>    });
> 
> 
> Maybe that would solve your problem without having to change the maximum number of hotlinks.

I'll see how much mileage one can make here, but so far it looks that it is the number of various monitoring events 
handled by the mlogger which  drives the number of hotlinks 

-- thanks, regards, Pavel
  3219   27 Apr 2026 Pavel MuratBug Reportincreasing the max number of hot links in ODB
> > I wonder why one needs more than 256 hotlinks at all.
> 
> I confirm that ALPHA is running with MAX_OPEN_RECORDS changed from 256 to 2048,
> this is the only experiment I know of that had to increase any MIDAS ODB defaults.
> 
> The reason for this is mlogger, it opens an open record for each variable in each equipment.
> 
> This should be changed to 1 db_watch per equipment. We talked about it, but I guess we never did it.
> 
> I think this task just went almost to the top of my MIDAS to-do list.

I definitely had many more than 256 variables successfully monitored with MAX_OPEN_RECORDS=256.
Is it possible that mlogger creates a hotlink per monitoring event, not per variable ? 
- I think, that would make more sense in almost any scenario... 

-- thanks, regards, Pavel
  3220   27 Apr 2026 Pavel MuratBug Reportincreasing the max number of hot links in ODB
> > Indeed, updating MIDAS clients on each and every RPI etc in a running experiment may be a real challenge.
> 
> actually, only local clients must be rebuilt, remote clients connecting to the mserver do not care about ODB 
> internal structure.

thanks! I see - local clients do know about the memory mapping, remote ones - don't  

> unfortunately, the "open records" structure is allocated at compile-time inside the ODB header,
> making any change to this would break binary compatibility.

right, I guess, what I had in mind would require the very first fODB record to be a format descriptor, 
and that would be a breaking change... Anyway, the practical part of the problem is addressed, 
so I just add here a link which contains an answer to the original posting (I found it only after the fact):
 
https://daq00.triumf.ca/MidasWiki/index.php/FAQ#Increasing_Number_of_Hot-links

-- thanks again, regards, Pavel
  3224   21 May 2026 Konstantin OlchanskiBug Reportincompatible ODB XML dumps
While testing manalyzer, I found that it dies from an exception on odbxx, error message is "/home/olchansk/packages/midas/include/odbxx.h:1231: No "handle" 
attribute found in XML data".

Indeed, my data file is very old and it's XML ODB dump does not have the "handle" attribute:

daq00:midas$ more ~/git/midas/manalyzer/run9402bor.xml 

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- created by MXML on Tue Aug 11 14:47:16 2020 -->
<odb root="/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://midas.psi.ch/odb.xsd">
  <dir name="Experiment">
    <key name="ODB timeout" type="INT32">10000</key>

While current MIDAS XML ODB dumps have it:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- created by MXML on Thu May 21 20:37:06 2026 -->
<odb root="/" filename="odb.xml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:noNamespaceSchemaLocation="/home/olchansk/packages/midas/odb.xsd">
  <dir name="System" handle="135320">
    <dir name="Flush" handle="135408">
      <key name="Flush period" type="UINT32" handle="135496">60</key>


And odbxx requires this attribute unconditionally:

            if (mxml_get_attribute(node, "handle") == nullptr)
               mthrow("No \"handle\" attribute found in XML data");
            o->set_hkey(std::stoi(std::string(mxml_get_attribute(node, "handle"))));

The "handle" attribute was added to XML ODB dumps in September 2024 (not sure to what purpose, JSON ODB dumps do not have a "handle" attribute):

git blame src/odb.cxx
...
dd23558fbd src/odb.cxx (Stefan Ritt 2024-09-20 15:30:00 +0200  9387) mxml_write_attribute(writer, "handle", std::to_string(hKey).c_str());

This change makes MIDAS data files written before this date un-analyzable (unless odbxx is turned off).

I can prevent manalyzer from crashing by catching the exception, but I think it is better if odbxx code is updated to accept the pre-Sep-2024 ODB XML data 
format (which were valid XML ODB dumps when they were made and users are stuck with them inside compress binary MIDAS data files).

P.S. also please check the odbxx code for other crashes on malformed XML ODB dumps, it should complain, fail to load the dump, but not core dump or abort. 
Malformed ODB dumps is not a theoretical situation, I am currently looking at MIDAS data files (mid.lz4) that have invalid JSON ODB dumps created from 
corrupted ODB. Luckily the JSON parser handles this gracefully, does not crash manalyzer and I can look at the data. I did have to go 10 runs into the past 
to find an uncorrupted ODB dump to reload a good ODB. Fixes to the JSON encoder and fixes for corrupt ODB are in progress.

K.O.
  3232   09 Jun 2026 Stefan RittBug Reportincompatible ODB XML dumps
I fixed that by not requiring the handle explicitly:

           if (mxml_get_attribute(node, "handle") != nullptr)
              o->set_hkey(std::stoi(std::string(mxml_get_attribute(node, "handle"))));

The reason for the handle is the following: If you attach an midas::odb object to a very large subtree of the ODB, this can take very long since each
ODB element requires a few RPC roundtrips. In Mu3e, this took up to one minute. 

To overcome the problem, we can initialize an midas::odb object remotely via an XML tree. The server creates a huge XML object, sends it over
a single RPC command, and the client re-creates the midas::odb tree from the XML object. Since each online midas::odb need the ODB handle for
watch functions etc. the handle got added to the XML file.

For a XML file, this makes no sense of course, so now it's optional with the code change above.

P.S.: The odbxx code does not core dump, it just produces an error which looks like core dump. This is an uncaught exception. Normal exceptions
just abort the program without much information. The odbxx exceptions add a stack dump if available on that OS. That makes it easier to debug.
If you do not want the abort, catch the exception. 

Stefan
  3235   25 Jun 2026 Konstantin OlchanskiBug Reportincompatible ODB XML dumps
> I fixed that by not requiring the handle explicitly ...
>
> [it was...] an uncaught exception. If you do not want the abort, catch the exception. 

Hi, Stefan! Thank you for fixing this!

The issue was not the exception, but the failure to load the XML ODB dump from an (immutable) data file.

This should now be fixed (TBC), so all good now.

K.O.
  170   22 Oct 2004 Konstantin OlchanskiBug Fixmhttpd message colouring
I commited a fix to mhttpd logic that decides which messages should be shown in
"red" colour- before, any message with square brackets and colons would be
highlighted in red. Now only messages matching the pattern [...:...] are
highlighted. The decision logic was moved into a function message_red(). K.O.
  174   09 Nov 2004 Pierre-Andre AmaudruzBug FixNew transition scheme
Problem:
If cm_set_transition_sequence() is used for changing the sequence number, the 
command odbedit> start/stop/resume/pause -v report the propre sequence but the
action on the client side is actually not performed!

Fix:
Local transition table updated in midas.c (1.226)

Note:
The transition number under /system/clients/<pid>/transition...
is used internally. Changing it won't have any effect on the client action
if sequence number is not registered.
  200   25 Feb 2005 Konstantin OlchanskiBug Fixfixed: double free in FORMAT_MIDAS ybos.c causing lazylogger crashes
We stumbled upon and fixed a "double free" bug in src/ybos.c causing crashes in
lazylogger writing .mid files in the FORMAT_MIDAS format (why does it use
ybos.c? Pierre says- for generic file i/o). Why this code had ever worked before
remains a mystery. K.O.
  211   05 May 2005 Konstantin OlchanskiBug Fixfix: minor bit rot in the example experiment
I fixed some minor bit rot in the example experiment: a few minor Makefile
problems, make the analyzer use the current histogram creation macros, etc. I
also added startup and shutdown scripts. These will be documented as we work
through them with our Summer student. K.O.
  212   02 Aug 2005 Konstantin OlchanskiBug Fixfix odb corruption when running analzer for the first time
I have been plagued by ODB corruption when I run the analyzer for the first time
after setting up the new experiment. Some time ago, I traced this to
mana.c::book_ttree() and now I found and fixed the bug, fix now commited to
midas cvs. In book_ttree(), db_find("/Analyzer/Bank switches") was returning an
error and setting hkey to zero. Then we called db_open_record() with hkey==0,
which cased ODB corruption later on. The normal db_validate_hkey() did not catch
this because it considers hkey==0 to be valid (when most likely it is not). K.O.
  216   18 Aug 2005 Konstantin OlchanskiBug Fixfix race condition between clients on run start/stop, pause/resume
It turns out that the new priority sequencing of run state transitions had a
flaw: the frontends, the analyzer and the logger all registered at priority 500
and were invoked in essentially a random order. For example the frontend could
get a begin-run transition before the logger and so start sending data before
the logger opened the output file. Same for the analyzer and same for the end of
run. Also the sequencing for pause/resume run and begin/end run was different
when the two pairs ought to have identical sequencing.

I now commited changes to mana.c and mlogger.c changing their transition sequencing:

start and resume:
200 - logger (mlogger.c, no change)
300 - analyzer (mana.c, was 500)
500 - frontends (mfe.c, no change)

stop and pause:
500 - frontends (mfe.c, no change)
700 - analyzer (mana.c, was 500)
800 - mlogger (mlogger.c, was 500)

P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
see an anomaly in the analyzer, where it receives data events after the
end-of-run transition.

K.O.
  219   01 Sep 2005 Stefan RittBug Fixfix race condition between clients on run start/stop, pause/resume
> It turns out that the new priority sequencing of run state transitions had a
> flaw: the frontends, the analyzer and the logger all registered at priority 500
> and were invoked in essentially a random order. For example the frontend could
> get a begin-run transition before the logger and so start sending data before
> the logger opened the output file. Same for the analyzer and same for the end of
> run. Also the sequencing for pause/resume run and begin/end run was different
> when the two pairs ought to have identical sequencing.
> 
> I now commited changes to mana.c and mlogger.c changing their transition sequencing:
> 
> start and resume:
> 200 - logger (mlogger.c, no change)
> 300 - analyzer (mana.c, was 500)
> 500 - frontends (mfe.c, no change)
> 
> stop and pause:
> 500 - frontends (mfe.c, no change)
> 700 - analyzer (mana.c, was 500)
> 800 - mlogger (mlogger.c, was 500)
> 
> P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
> see an anomaly in the analyzer, where it receives data events after the
> end-of-run transition.
> 
> K.O.

Thanks for fixing that bug. It happend because during the implementatoin of the priority
sequencing we have up the pre/post tansition, which took care of the proper sequence
between the logger, frontend and analyzer. The way you modified the sequence is
absolutely correct. It is important to have >10 numbers "around" the frontends (like
450...550) in case one has an experiment with >10 frontends which need to make a
transition in a certain sequence (like the DANCE experiment in Los Alamos).
  229   17 Oct 2005 Exaos LeeBug Fix"make install" error under MacOS X
Under MacOS X, "make install" will cours an error like this:
...
install: darwin/bin/dio: No such file or directory
make: *** [install] Error 71

This can be fixed as the following diff:
404,405c404,405
< $(BIN_DIR)/mcnaf: $(UTL_DIR)/mcnaf.c $(DRV_DIR)/camac/camacrpc.c
<       $(CC) $(CFLAGS) $(OSFLAGS) -o $@ $(UTL_DIR)/mcnaf.c $(DRV_DIR)/camac/camacrpc.c $(LIB) $(LIBS)
---
> $(BIN_DIR)/mcnaf: $(UTL_DIR)/mcnaf.c $(DRV_DIR)/bus/camacrpc.c
>       $(CC) $(CFLAGS) $(OSFLAGS) -o $@ $(UTL_DIR)/mcnaf.c $(DRV_DIR)/bus/camacrpc.c $(LIB) $(LIBS)
438c438,439
<       @for i in mserver mhttpd odbedit mlogger ; \
---
> 
>       @for i in mserver mhttpd odbedit mlogger dio ; \
444,447d444
<       chmod +s $(SYSBIN_DIR)/mhttpd
< 
< ifeq ($(OSTYPE),linux)
<       install -v -m 755 $(BIN_DIR)/dio $(SYSBIN_DIR)
449c446
< endif
---
>       chmod +s $(SYSBIN_DIR)/mhttpd
  235   23 Nov 2005 Stefan RittBug FixEndian swapping in mana.c
It was reported that following code in mana.c :
  /* swap event header if in wrong format */
  if (pevent->serial_number > 0x1000000) {
     WORD_SWAP(&pevent->event_id);
     WORD_SWAP(&pevent->trigger_mask);
     DWORD_SWAP(&pevent->serial_number);
     DWORD_SWAP(&pevent->time_stamp);
     DWORD_SWAP(&pevent->data_size);
  }

does not work correctly for events having a true serial number above 16777216 (=0x10000000). After some considerations, I concluded that there is no good way to determine automatically the endian format of midas events, without adding another field in the header, which would break the compatibility with all recorded data up to date. I therefore changed the above code to
  /* swap event header if in wrong format */
#ifdef SWAP_EVENTS
  WORD_SWAP(&pevent->event_id);
  WORD_SWAP(&pevent->trigger_mask);
  DWORD_SWAP(&pevent->serial_number);
  DWORD_SWAP(&pevent->time_stamp);
  DWORD_SWAP(&pevent->data_size);
#endif

So if one wants to analyze events with the midas analyzer on a PC system for example where the events come from a VxWorks system with the opposite endian encoding, one has to set the flag -DSWAP_EVENTS when compiling the analyzer for that type of analysis.
  252   07 May 2006 Konstantin OlchanskiBug FixUpdate & add VME drivers
I commited fixes for a few minor compilation errors in the VME drivers
(vmicvme.c, etc)
I also added new drivers for the v513 latch and v560 scaler that I wrote for
CERN-ALPHA.

(Maybe I should mention that we also have drivers for the SIS 3820 multiscaler,
the v895 VME discriminator and a few more modules. Will commit them as they mature).

K.O.
  256   18 May 2006 Konstantin OlchanskiBug Fixremoved a few "//" comments to fix compilation on VxWorks
Our VxWorks C compiler (gcc-2.8-something) does not like the "//" comments. Luckily, on VxWorks, we 
only compile a small subset of midas, so there is no point in banning all "//" comments. But I did have to 
convert a couple of them to /* commens */ in odb.c to make it compile. Changes to odb.c commited. K.O.
ELOG V3.1.6-083448f7