23 Apr 2026, Pavel Murat, Bug Report, increasing the max number of hot links in ODB
|
Dear MIDAS experts,
when I attempted to increase the max number of hotlinks in ODB , defined as
#define MAX_OPEN_RECORDS 256 /**< number of open DB records */
I started running into an assertion in midas/src/odb.cxx
https://bitbucket.org/tmidas/midas/src/fa5457b5274a6b42c5ed8b6dea5e3cdd43de38fe/src/odb.cxx#lines-1525 :
assert(sizeof(DATABASE_CLIENT) == 2112);
is it possible that the size of the DATABASE_CLIENT structure should be checked against 64+sizeof(OPEN_RECORD)*MAX_OPEN_RECORDS ?
- 64 clearly can be expressed in a better maintainable form
UPDATE: similar consideration holds for the size of the DATABLE_HEADER structure, which is also checked against a constant
https://bitbucket.org/tmidas/midas/src/fa5457b5274a6b42c5ed8b6dea5e3cdd43de38fe/src/odb.cxx#lines-1526
-- many thanks, regards, Pavel
|
25 Apr 2026, Pavel Murat, Bug Report, increasing the max number of hot links in ODB
|
I see - thank you for the explanation!
Indeed, updating MIDAS clients on each and every RPI etc in a running experiment may be a real challenge.
Thinking forward - would it help if the ODB clients, upon initial connection but before doing anything else
were reading the ODB parameters from the ODB itself, so the clients were "learning" about the ODB structure
dynamically, at run time? Or that knowledge has to be static ?
-- thanks, regards, Pavel |
27 Apr 2026, Pavel Murat, Bug Report, increasing the max number of hot links in ODB
|
> I wonder why one needs more than 256 hotlinks at all. Please note that with the odbxx "watch" API, you can hotline a whole subdirectory, and get notified if ANY of the
> underlying values or subdirectories change. In principle, one could have one hotlink to "/" and see all changes in the ODB (although that does not make sense and might slow
> down ODB access a bit).
Thanks ! - I didn't know that. I did run into a number of hotlinks limit via mlogger which complained about not being able to create a hotlink
to yet another event. Doubling the default value of MAX_OPEN_RECORDS solved the problem.
I don't know the exact arithmetic defining the number of hotlinks in the system, but my today's case is a case of
- 36 (linux servers) +18 (RPI) monitoring frontends managing one or several different equipment items each.
- Each equipment item sends to ODB at least one monitoring event
- in addition, each frontend created an individual hotlink for handling interactive commands
- for MAX_OPEN_RECORDS=256, 4 equipment items per frontend easily make it into the dangerous zone.
"Equipment items" also include the online processes running on the distributed computing farm processing the data ..
(we are not using MIDAS event building capabilities)
>
> Try the odbxx_test.cpp example in MIDAS. In line 210 it puts a single hotlink to /Experiment. If you change anything under /Experiment, the program gets notified. By checking the
> path of the changed ODB entry, it can figure out which of the subways have been changed:
>
> // watch ODB key for any change with lambda function
> midas::odb ow("/Experiment");
> ow.watch([](midas::odb &o) {
> std::cout << "Value of key \"" + o.get_full_path() + "\" changed to " << o << std::endl;
> });
>
>
> Maybe that would solve your problem without having to change the maximum number of hotlinks.
I'll see how much mileage one can make here, but so far it looks that it is the number of various monitoring events
handled by the mlogger which drives the number of hotlinks
-- thanks, regards, Pavel |
27 Apr 2026, Pavel Murat, Bug Report, increasing the max number of hot links in ODB
|
> > I wonder why one needs more than 256 hotlinks at all.
>
> I confirm that ALPHA is running with MAX_OPEN_RECORDS changed from 256 to 2048,
> this is the only experiment I know of that had to increase any MIDAS ODB defaults.
>
> The reason for this is mlogger, it opens an open record for each variable in each equipment.
>
> This should be changed to 1 db_watch per equipment. We talked about it, but I guess we never did it.
>
> I think this task just went almost to the top of my MIDAS to-do list.
I definitely had many more than 256 variables successfully monitored with MAX_OPEN_RECORDS=256.
Is it possible that mlogger creates a hotlink per monitoring event, not per variable ?
- I think, that would make more sense in almost any scenario...
-- thanks, regards, Pavel |
27 Apr 2026, Pavel Murat, Bug Report, increasing the max number of hot links in ODB
|
> > Indeed, updating MIDAS clients on each and every RPI etc in a running experiment may be a real challenge.
>
> actually, only local clients must be rebuilt, remote clients connecting to the mserver do not care about ODB
> internal structure.
thanks! I see - local clients do know about the memory mapping, remote ones - don't
> unfortunately, the "open records" structure is allocated at compile-time inside the ODB header,
> making any change to this would break binary compatibility.
right, I guess, what I had in mind would require the very first fODB record to be a format descriptor,
and that would be a breaking change... Anyway, the practical part of the problem is addressed,
so I just add here a link which contains an answer to the original posting (I found it only after the fact):
https://daq00.triumf.ca/MidasWiki/index.php/FAQ#Increasing_Number_of_Hot-links
-- thanks again, regards, Pavel |
09 Dec 2003, Paul Knowles, , db_close_record non-local/non-return
|
Hi All,
I have found a weird one:
The following code executes on the frontend machine in the
frontend_exit() routine, and connects to the odb running on
another separate machine:
...
cm_msg(MINFO,__func__, "line %d", __LINE__);
cm_get_experiment_database(&hdb, NULL);
cm_msg(MINFO,__func__, "line %d", __LINE__);
status = db_find_key(hdb, 0, "/Experiment/Run Parameters", &hkey);
cm_msg(MINFO,__func__, "line %d, hkey=%d, status=%d",
__LINE__, hkey, status);
checkstat("db_find_key returned status %d", status);
cm_msg(MINFO,__func__, "line %d", __LINE__);
status = db_close_record(hdb, hkey);
/* NOTREACHED!! the above call to db_close_record
doesn't return!
*/
cm_msg(MINFO,__func__, "line %d, status=%d", __LINE__, status);
checkstat("db_close_record returned status %d", status);
checkstat is a macro that does the following:
#define checkstat(format, arg...)\
do{ if(status != DB_SUCCESS) {\
cm_msg(MERROR, __func__, format, ## arg);\
return FE_ERR_ODB;}}while(0)
The key exists, and the status of the search is 1
(i.e., DB_SUCCESS) and rest of the code tries to run. What gets
really weird is that the db_close_record _doesn't_ _return_.
The code following the NOTREACHED comment just doesn't get
called. I get the message from the __LINE__ just in front
of the call, but not the message afterwards (cm_msg and printf
were tried). Somehow db_close_record is causing a non-local
exit or signal or something. No error message is printed and the
frontend continues to exit with exit code 0. But, since the rest
of my frontend_exit/odb closing doesn't happen, the odb is left in
a lost state requiring a cleanup. If I comment out the calls to
db_close_record, the rest of my frontend_exit runs normally
and the cm_disconnect_experiment() in mfe.c eventually closes my
open records correctly (I expect, anyway) and this is the present
workaround i am using. The terror i have is that several of my
hotlinked callback routines will call the close_record routine
when resetting illegal values. No end of hilarity will result there...
I was using the same code in the frontend under 1.9.2 and
have only recently upgraded to 1.9.3-? tarball from PAA and
there were no problems using the 1.9.2 code: this is a 1.9.3
issue.
I have localized the weirdness to what I think is the RPC interface.
Running the nullfrontend (no camac access) on the same machine as
hosts the ODB I can make the problem appear and disappear in the
following way:
(odb is local on machine ``monet'')
nullfe -h monet -e acqmonad : db_close_record will get lost
nullfe -e acqmonad : db_close_record works as expected.
I've tried also with the patch for the 256 byte odb string bug since
many of the open records have strings of that length, but that isn't
it. The only substancial looking change to mserver from 1.9.2 to 1.9.3
is the SIGPIPE ignore and that doesn't look like a good candidate either.
Can this be that some of the
#IFDEF LOCAL_ROUTINES
that got moved about in odb.c and others
are causing the remote call to get confused?
Clearly the answer is to just use stable and happy 1.9.2, but the
people for whom I am working now really want to use ROOT for
an analyzer...
cheers,
.p.
Paul Knowles. phone: 41 26 300 90 64
email: Paul.Knowles@unifr.ch Fax: 41 26 300 97 47
finger me at pexppc33.unifr.ch for more contact information |
18 Dec 2003, Paul Knowles, , Poll about default indent style
|
Hi Stefan,
> once and forever, I am considering using the "indent" program which comes
> with every linux installation. Running indent regularly on all our code
> ensures a consistent look.
I think this can be called a Good Thing.
> The "-kr" style does the standard K&R style,
> but used tabs (which is not good), and does a 4-column
> indention which is I think too much. So I would propose
> following flags:
> indent -kr -nut -i2 -di8 -bad <filename.c>
(some of this is a repeat from an earlier mail to SR):
You might also want a -l90 for a longer line length than 75
characters. K&R style with indentation from 5 to 8 spaces
is a good indicator of complexity: as soon as 40 characters
of code wind up unreadably squashed to the right of the
screen, you have to refactor to have less indentation
levels. This means you wind up rolling up the inner parts
of deeply nested conditionals or loops as separate
functions, making the whole code easier to understand.
I think that setting -i2 is ``going around the problem''
of deep nesting. If you really need to keep the indentation
tabs less than 4 (8 is ideal) because your code is falling off the
right edge of the screen, you are indented too deeply. Why do
I say that? There is the famous ``7+-1'' idea that you can hold
in you head only 7 ideas (give or take one) at any time. I'm not
that smart and I top out at about 5: So for example, a conditional
in a loop in a conditional in a switch is about as deep a level
of nesting as I can easily understand (remember that I also have
to hold the line i'm working on as well): that's 4 levels, plus one for the
function itself and we are at 40 characters away from the right edge
of the screen using -i8 and have some 40 characters available for writing code
(how often is a line of code really longer than about 40 characters?).
On top of that, the indentation is easily seen so you know immediately
wheather you are at the upper conditional, or inner conditional. A -i2
just doesn't make the difference big enough. -i5 is a happy balance
with enough visual clue as to the indentation level, but leaves you 50
to 60 characters for the code line itself.
However, if you are indenting very deeply, then the poor reader can't hold
on to the context: there are more than 6 or 7 things to keep in mind.
In those cases, roll up the inner levels as a separate function and
call it that way. The inner complexity of the nested statements gets
nicely abstracted and then dumb people like me can understand what
you are doing.
So, in brief: indent is a good idea, and -in with n>=4 will be best.
I don't think -i2 will lend itself to making the code so much easier
to read.
thanks for listening.
.p. |
07 Aug 2019, Paolo Baesso, Bug Report, ROOTANA bug?
|
Hi,
I posted on the ROOTANA elog but there seems to be little activity there...
Could someone confirm if this is a bug?
https://midas.triumf.ca/elog/Rootana/14
Another user replied that they are encountering the same issue, so I think it is unlikely it is just our installation.
While ROOTANA is unusable for us, I tried to use the example Frontend and Analyzer (under the Experiment source folder). The analyzer does not seem to do much though. A root file is produced but nothing is placed into it. Is that normal?
Any help would be welcome. |
21 May 2024, Nikolay, Bug Report, experiment from midas/examples
|
There are 2 bugs in midas/examples/experiment:
1) In fronted bank named "PRDC" is created for scaler event. But in analyzer
module scaler.cxx the bank named "SCLR" is searched for the same event.
2) In mana.cxx linked from analyzer.cxx is "Invalid name "/Analyzer/Tests/Always
true/Rate [Hz]" passed to db_create_key: should not contain "["".
Looks like ODB doesn't like '[', ']' characters. |
02 May 2023, Niklaus Berger, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option
|
Thanks for all the helpful hints. When finally managing to evade all timeouts and attach the debugger in just the right moment, we find that we get a segfault in mserver at L827:
case RPC_DB_COPY_XML:
status = db_copy_xml(CHNDLE(0), CHNDLE(1), CSTRING(2), CPINT(3), CBOOL(4));
Some printf debugging then pointed us to the fact that the culprit is the pointer de-referencing in CBOOL(4). This in turn can be traced back to mrpc.cxx L282 ff, where the line with the arrow was missing:
{RPC_DB_COPY_XML, "db_copy_xml",
{{TID_INT32, RPC_IN},
{TID_INT32, RPC_IN},
{TID_ARRAY, RPC_OUT | RPC_VARARRAY},
{TID_INT32, RPC_IN | RPC_OUT},
-> {TID_BOOL, RPC_IN},
{0}}},
If we put that in, the mserver process completes peacfully and we get a segfault in the client ("Wrong key type in XML file") which we will attempt to debug next. Shall I create a pull request for the additional RPC argument or will you just fix this on the fly? |
02 May 2023, Niklaus Berger, Forum, Problem with running midas odbxx frontends on a remote machine using the -h option
|
And now we also fixed the client segfault, odb.cxx L8992 also needs to know about the header:
if (rpc_is_remote())
return rpc_call(RPC_DB_COPY_XML, hDB, hKey, buffer, buffer_size, header);
(last argument was missing before). |
26 Jul 2019, Nik Berger, Bug Report, History/Endianness
|
Hi,
I have a bank of floats with slow control values that I store to the history and
ODB. When reading the history, both in the webbrowser and with mhist, the floats
get read with the wrong endianness; under /equipment/variables in the ODB they
however display correctly. System is a an intel OpenSuse linux box. Any ideas?
Thanks
Nik |
06 Oct 2019, Nik Berger, Bug Report, History data size mismatch
|
Logging a list of variables to the history via links in the history ODB subtree,
we get messages as follows at every run start:
19:43:24.009 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
19:43:24.008 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
The history calculates the size of a record from the size of the individual variables, (history_schema.cxx, L2666 ff), whereas the ODB delivers the data aligned/padded to the size of the largest value in the record.
In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit), leading to a padded response from the ODB, 4 byte longer than the history expects.
Quick fix: Add another 32 bit dummy variable to the history. Gets rid of the error messages...
Should probably be fixed at a deeper level... |
10 Oct 2019, Nik Berger, Bug Report, History data size mismatch
|
>I wonder why do you this via ODB links. The "standard" way of writing to the history should be to create events for an equipment and flag this equipment as being written to the
>history. All variables under /Equipment/<name>/Variables then automatically go into the history and you don't have to worry about ODB links. Only variables not fitting the
>equipment/variables scheme should be dealt with via ODB links, like variables under equipment/statistics or parameters in another ODB tree. In a typical midas experiment, only
>very few variables typically go into the 'System' event. This is however probably not a solution to your problem. If you have a similar structure (doubles plus an odd number of floats)
>under 'variables', you might get the same error.
>
> In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit)
>
We do this in the MuX DAQ and mix things that come directly from MIDAS (the MIDAS trigger rate) and things from the
analyzer (rates in the self-triggering detectors) and some temperatures from yet somewhere else. Yes, we could have
kept that apart, yes, in this case a double would also work (and not break things), but a bug is a bug...
I could think of senisble use cases where doubles and ints are mixed and I also know quite a few areas where it makes
sense to use floats...
Nik |
10 Jun 2025, Nik Berger, Bug Report, History variables with leading spaces
|
By accident we had history variables with leading spaces. The history schema check then decides that this is a new variable (the leading space is not read from the history file) and starts a new file. We found this because the run start became slow due to the many, many history files created. It would be nice to just get an error if one has a malformed variable name like this.
How to reproduce: Try to put a variable with a leading space in the name into the history, repeatedly start runs.
Sugested fix: Produce an error if a history variable has a leading space. |
28 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies 
|
Hello experts,
I have been writing a SC frontend for a powersupply. I have used the model
where the frontend can be started with "-i n" option so that each fe can
control a different supply. During the development/testing of the program I
would normally only run a single instance with "-i 1". However when I started
a second instance with "-i 2" I found problems with the history plots that
were being made for the original "-i 1" instance. The variable being plotted
seemed to randomly jump between the value from the "-i 1" instance and
the "-i 2" instance. confirmed that the "correct" values exist for each
frontend in the odb under /Equipment/Foo01/Variables and
/Equipment/Foo02/Variables
This is also not just a plotting artifact since I was also
able to see the two different values by running mhist.
I saw this behaviour using midas-2019-03 and also the head of the development
branch (686e4de2b55023b0d1936c60bcf4767c5e6caac0 from just under 48 hours ago).
I was able to reproduce this with a stripped down frontend that just
sets a variable that is equal to its frontend_index. Please find the code
and Makefile attached. Presumably I've done something wrong in my
implementation that hopefully a more experienced person can spot quite
quickly, but please let me know if any more information is needed.
I have seen this behaviour on both Debian 10 and on a CentOS 7 Singularity
image running on top of Debian 10.
Thanks,
Nick.
P.S. I made the topic of this post "Forum" and not "Bug Report" since I
expect the root of this problem is somewhere between the keyboard and chair. |
28 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hi Stefan,
thanks for you quick reply.
> My first question would be why are you using several font-ends at all?
Becuase I was following the model used for many of the frontends for the ND280 FGD.
> That makes things more
> complicated than needed. In the normal FE framework, you can define either several equipment
> served by one frontend, or even one equipment linked to several devices. In the MEG experiment
> we have one slow control frontend controlling ~100 devices without problem. In the old days
there
> was a problem that some slow devices could throttle the readout, but since the invention of
multi-
> threaded slow control equipment, each device gets its own thread so they don't block each
other.
Perhaps things have changed in the 10 years since the FGD SC code was written. I can do it
differently but doing it that way seemed naturual since around 90% of the frontend code that I
have see does it that way.
Nick. |
29 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hi LCP,
thanks for the suggestion and link. Unfortunatly I don't think this explains it.
Nick. |
29 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hi Ben,
thanks for your reply. I can confirm that your suggested workaround does indeed
make the problem dissapear.
I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
Thanks,
Nick. |
01 Sep 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indicies
|
Hi Ben,
thanks for your reply. I can confirm that your suggested workaround does indeed
make the problem dissapear.
I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
Thanks,
Nick. |
|