Back Midas Rome Roody Rootana
  Midas DAQ System, Page 45 of 146  Not logged in ELOG logo
    Reply  27 Nov 2020, Konstantin Olchanski, Info, Equipment "common" settings in ODB 
> Today I addressed a topic which bugged me since long time.

Right. No easy subject. For me, too, this has been a problem in MIDAS for a long time.

> Now, on each start of the frontend program, the values from equipment[] are written to 
> the ODB. They are still "live". If one changes them when the frontend is 
> running, that change takes effect immediately. But on the next restart of the 
> frontend, the old values from equipment[] is put back there.

There is a downside from this behaviour.

If some values in equipment/common are "live" and the user is expected to change them,
the user will be unpleasantly surprised when their changes magically disappear (after reboot,
after frontend crash, after run restart if experiment requires restarting some frontends
before starting a new run).

This change will also break some experiments that rely in things like specifying
event buffer names through ODB. But experiments can adapt and specify buffer names
through command line switch instead of ODB.

This new way also it makes the "live" Common/Period unusable. Sure I can speed up or slow
down a frontend even during the run, but if my change does not "stick", what good is it?

Personally, I think there is no easy solution for all these troubles.

I would advocate the following approach:

- think of MIDAS as a "mature" system,
- treasure backward compatibility
- (if we must break backward compatibility to introduce a new "must have" improvement, so be it)
- document how things work. if it is clearly written down what different fields in "common" do, fewer people 
"get burned" by unexpected or illogical things. (and any non-trivial system has plenty of those).

Going back to ODB equipment/common, my experience with midas and odb tells me
that one should avoid mixing together ODB entries set by user and ODB entries set by code.

For example, separating them as equipment/settings and equipment/variables works well. Mixing
them as in equipment/common and sequencer/state causes trouble.

So perhaps we should split Equipment/common into two pieces, user settable fields like
"Period" and "event buffer name" would move to equipment/settings or whatever.

This will open the discussion of which items in equipment/common should be user settable,
and some people would want event buffer specified in the code to prevail, while other
people would want the name from odb to prevail, and both are valid but conflicting preferences.

Or we could bite the bullet and say, equipment/common is controlled by the frontend code,
the user should not change it. (and mark it read-only in ODB).

For all the pain this may cause, at least this will make it self-consistent.

Per this proposal, in addition to Stefan's change, the hotlink on equipment/common goes away,
"period" is no longer "live" and the whole subdirectory is made "read-only".

K.O.
    Reply  27 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
Ok, so what about the following proposal:

- I change back the mfe.cxx code to behave like before (ODB has precedence and does not get overwritten when the 
front-end restarts)

- I add a global flag

BOOL equipment_common_overwrite;

and pre-set it to FALSE;

- So if nothing is changed the flag stays false and ODB keeps precedence

- If a frontend wants to overwrite equipment/common on each start, the user sets

BOOL equipment_common_overwrite = TRUE;

near the equipment[] structure in the front-end code. 

- If the flag is true, the mfe.cxx init code copies the equipment[] structure to the ODB on each frontend start

I believe this way we can keep backward compatibility, and add the new way with minimal effort. The only downside 
is that all frontends on this plane have to add at least "BOOL equipment_common_overwrite = FALSE;" in their 
code.

I know global variables are evil, but this way the user can just add the line above to the equipment[] array, so 
one sees this when one edits the equipment[] array, giving motivation to change as needed. So the code would be



BOOL equipment_common_overwrite = TRUE;

EQUIPMENT equipment[] = {
 ....
}



An alternative way would be to add a function

  set_equipment_common_overwrite(TRUE);

into the frontend_init() code. That's somehow cleaner (still needs an internal global variable), but it has to go 
into frontend_init() so won't be at the same place as the EQUIPMENT list in the frontend.

Thoughts?

Best,
Stefan
    Reply  27 Nov 2020, Konstantin Olchanski, Info, Equipment "common" settings in ODB 
Yes, I think this will work.

For old mfe.c frontends, global variable set to "do it the new way" should be okey,
new experiments will have it the new way. Old experiments, will be forced to add a one-line definition
of this global variable (otherwise mfe.o will not link), at that time they get to chose "new way" or "old way".

For the new TMFE c++ frontend, this will work naturally when they create the Equipment Common object,
in the object constructor, you can see how it explicitly honors or overwrites the ODB common entries.

The TMFE frontend does not do a live "period", so there should be no issue with that.

Should I open a bitbucket issue "update TMFE frontend to new Equipment/Common scheme", to make sure
I do not forget about it?

K.O.
    Reply  30 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
Ok, I implemented it the following way:

- Added a boolean flag "equipment_common_overwrite", which must be contained in EACH frontend, preferably just 
before the EQUIPMENT structure, such as:

BOOL equipment_common_overwrite = TRUE;

EQUIPMENT equipment[] = {
...
};

- If that flag is TRUE, then the contents of the "equipment" structure is copied to the ODB on each start of the 
front-end

- If the flag is FALSE, then the ODB values are kept on the start of the front-end

The setting of the flag depends now on the philosophy of the experiment. Some experiments say that everything 
needed should be in the front-end code, so when it starts everything gets set correctly. They don't change the 
values in the ODB, but in the frontend code, which then goes into their repository. Other experiments just need 
some default values from the frontend code, and the fine-tune things by changing values in the ODB. These 
experiments should set this flag to FALSE.

*****

Please note that EVERY frontend now needs this flag, so all of you have to add it to all of your front-ends, 
otherwise the front-end will not compile! I could not figure out how to this could be done without this 
requirement, since you can define a global variable only once.

*****


Stefan
    Reply  30 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
One more change: 

After using the new code for some hours, we realized that the "enabled" flag should not come from the frontend code, 
but always be defined by the ODB. So if you quickly have to disable some equipment because the associated hardware is 
off, you want to change this flag only in the ODB and not have to recompile the frontend. So we exclude that flag from 
being set by the frontend. It is anyhow special, because one sees all disable equipment in the main midas status page, 
so one knows what's on and what's off.

Please comment here if you think that change causes problem. Anyhow it's working now for the enabled flag as before 
all these changes.

Stefan
    Reply  30 Nov 2020, Konstantin Olchanski, Info, Equipment "common" settings in ODB 
> One more change: 
> 
> After using the new code for some hours, we realized that the "enabled" flag should not come from the frontend code, 
> but always be defined by the ODB. So if you quickly have to disable some equipment because the associated hardware is 
> off, you want to change this flag only in the ODB and not have to recompile the frontend. So we exclude that flag from 
> being set by the frontend. It is anyhow special, because one sees all disable equipment in the main midas status page, 
> so one knows what's on and what's off.
> 
> Please comment here if you think that change causes problem. Anyhow it's working now for the enabled flag as before 
> all these changes.
> 

Good catch. I still think this is fundamentally impossible to "get right". But good, you
are now in the same boat with me. The documentation will read: "if flag is TRUE, these data fields
are read from ODB, if flag is FALSE, those other fields are read from ODB". I will have to check
how this will work out for the TMFE C++ frontend (I think both mfe.c and TMFE frontends should
work "the same").

I think we have at least one month to play with this, I do not think we can do the next release
of midas until January.

K.O.
Entry  30 Nov 2020, Konstantin Olchanski, Info, more wisdom from linux kernel people 
As you may know, I am a big fan of two software projects - the linux kernel and ROOT. The linux kernel is one of 
the few software projects "done right". ROOT is where normal people try to "get it right" with real-world level 
of success. I use both softwares daily and I try to apply their ways and methods to MIDAS as much as I can.

So just in time for our discussion of array indexes, a talk by gregkh shows
up on slashdot. The title is "how to keep your users happy". (Nobody
ever wants to be nasty to their users, but do read his talk).

https://git.sr.ht/~gregkh/presentation-application_summit/tree/main/keep_users_happy.pdf

The talk refers to some older stuff, still relevant, of course, in case you miss the links
in the pdf file, here they are:

https://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html
https://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html
https://ozlabs.org/~rusty/ols-2003-keynote/img0.html (click on "continue" to see next page)

K.O.
Entry  06 Jan 2021, Isaac Labrie Boulay, Info, Recovering a corrupted ODB using odbinit. 
Hi all,

I am currently trying to recover my corrupted ODB using odbinit and I am still 
getting issues after doing 'odbinit --cleanup' and trying to reload the saved 
ODB (last.json). Here is the output:


************************************************
(odbinit cleanup) Note* the ERROR in system.cxx
************************************************


[caendaq@cu332 ANIS]$ odbinit --cleanup
Checking environment... experiment name is "ANIS", remote hostname is ""
Checking command line... experiment "ANIS", cleanup 1, dry_run 0, create_exptab                                
0, create_env 0
Checking MIDASSYS....../home/caendaq/packages/midas
Checking exptab... experiments defined in exptab file "/home/caendaq/ANIS/exptab                               
":
0: "ANIS" <-- selected experiment
 

Checking exptab... selected experiment "ANIS", experiment directory "/home/caend                               
aq/ANIS/"
 

Checking experiment directory "/home/caendaq/ANIS/"
Found existing ODB save file: "/home/caendaq/ANIS/.ODB.SHM"
 

Checking shared memory...
Deleting old ODB shared memory...
[system.cxx:1052:ss_shm_delete,ERROR] shm_unlink(/1001_ANIS_ODB__home_caendaq_AN                               
IS_) errno 2 (No such file or directory)
Good: no ODB shared memory
Deleting old ODB semaphore...
Deleting old ODB semaphore... create status 1, delete status 1
Preserving old ODB save file /home/caendaq/ANIS/.ODB.SHM" to "/home/caendaq/ANIS                               
/.ODB.SHM.1609951022"
 

Checking ODB size...
Requested ODB size is 0 bytes (0.00B)
ODB size file is "/home/caendaq/ANIS//.ODB_SIZE.TXT"
Saved ODB size from "/home/caendaq/ANIS//.ODB_SIZE.TXT" is 1048576 bytes (1.05MB                               
)
We will initialize ODB for experiment "ANIS" on host "" with size 1048576 bytes                                
(1.05MB)

 
Creating ODB...
Creating ODB... db_open_database() status 302
Saving ODB...
Saving ODB... db_close_database() status 1
Connecting to experiment...
 

Connected to ODB for experiment "ANIS" on host "" with size 1048576 bytes (1.05M                               
B)
Checking experiment name... status 1, found "ANIS"
Disconnecting from experiment...
 

Done
 

****************************************
(Loading the last copy of my ODB)
*************************************



[caendaq@cu332 data]$ odbedit
[local:ANIS:S]/>load last.json
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink 
callback
 


**********************************************
(Now trying to run my frontend and analyzer)
*********************************************
 


[caendaq@cu332 ANIS]$ ./start_daq.sh

mlogger: no process found
fevme: no process found
manalyzer.exe: no process found
manalyzer_example_cxx.exe: no process found
roody: no process found
[ODBEdit,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is 
corrupted, mismatch of buffer name in shared memory ""
 

11:38:30 [ODBEdit,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" 
is corrupted, mismatch of buffer name in shared memory ""
Becoming a daemon...
Becoming a daemon...
Please point your web browser to http://localhost:8081
To look at live histograms, run: roody -Hlocalhost
Or run: mozilla http://localhost:8081
[caendaq@cu332 ANIS]$ Frontend name          :     fevme
Event buffer size      :     1048576
User max event size    :     204800
User max frag. size    :     1048576
# of events per buffer :     5
 

Connect to experiment ANIS...

OK

[fevme,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is 
corrupted, mismatch of buffer name in shared memory ""
[fevme,ERROR] [mfe.cxx:596:register_equipment,ERROR] Cannot open event buffer 
"SYSTEM" size 33554432, bm_open_buffer() status 219

Has anyone ever encountered these issues?

Thanks for your time.

Isaac
Entry  21 Jan 2021, Thomas Lindner, Info, Using external ELOG with newer mhttpd 
A warning, in case others have the same problem I had.

In the past you could configure mhttpd so that the 'Elog' button would redirect to an external ELOG server; to do this you only needed to create and set the ODB variable '/Elog/URL' to the URL of your external ELOG server.

But with the newer MIDAS you need to set two ODB variables:

* "/Elog/URL" needs to be set to the URL of the external ELOG.
* "/Elog/External Elog" needs to be set to 'y'

I hadn't noticed this and was confused why my Elog button wasn't working after upgrading MIDAS.

MIDAS documentation was updated to reflect this change:

https://midas.triumf.ca/MidasWiki/index.php/Electronic_Logbook_(ELOG)
Entry  02 Mar 2021, Konstantin Olchanski, Info, shortest possible sleep 
since I am implementing a polled equipment, I was curious what is the smallest possible sleep time on current computers.

in current UNIX, there are 2 system calls available for sleeping: select() (with microsecond granularity) and nanosleep() (with nanosecond granularity).

So I wrote a little test program to check it out (progs/test_sleep).

First, Linux result using select(). Typical run on AMD 3700X CPU (4.1 GHz turbo boost) with Ubuntu LTS 20, linux kernel 5.8:

daq13:midas$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1003368.855 usec actual, 100336.885 usec actual per loop, oversleep 336.885 usec, 0.3%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1008512.020 usec actual, 10085.120 usec actual per loop, oversleep 85.120 usec, 0.9%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1062137.842 usec actual, 1062.138 usec actual per loop, oversleep 62.138 usec, 6.2%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1528650.999 usec actual, 152.865 usec actual per loop, oversleep 52.865 usec, 52.9%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  6250898.123 usec actual, 62.510 usec actual per loop, oversleep 52.510 usec, 525.1%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total, 54056918.144 usec actual, 54.057 usec actual per loop, oversleep 53.057 usec, 5305.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,   210875.988 usec actual, 0.211 usec actual per loop, oversleep 0.111 usec, 110.9%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total,   204804.897 usec actual, 0.205 usec actual per loop, oversleep 0.195 usec, 1948.0%
daq13:midas$ 

How to read this:

First line is 10 sleeps of 100 ms, for a total of 1 sec. this actually sleeps for a bit longer,
average over-sleep is 300 usec out of 100 ms is 0.3%.

Next few lines use progressively shorter sleep, 10 ms, 1 ms and 0.1 ms. over-sleep is consistently around 50-60 usec,
which I conclude to be this linux sleep granularity.

Last two lines try sleep for 0.1 usec and 0.01 usec, resulting in a zero-time sleep of select(),
so we just measure the average time cost of a linux syscall, around 200 ns in this machine.

Going to different machines:

Intel E-2236 (4.8 GHz tutboboost), Ubuntu LTS 20, linux kernel 5.8: over-sleep is 60 usec, zero-sleep is 400 ns.
Intel E-2226G (same, see arc.intel.com), CentOS-7, linux kernel 3.10: over-sleep is 60 usec, zero-sleep is 600 ns.
VME processor (2 GHz Intel T7400), Ubuntu 20, linux kernel 5.8: over-sleep is 60 usec, zero-sleep is 1700 ns.

This is pretty consistent, select() over-sleep is 60 usec on all hardware, zero-sleep tracks CPU GHz ratings.

Next, MacOS result, MacBookAir2020, MacOS 10.15.7, CPU 1.2 GHz i7-1060G7:

4ed0:midas olchansk$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1031108.856 usec actual, 103110.886 usec actual per loop, oversleep 3110.886 usec, 3.1%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1091104.984 usec actual, 10911.050 usec actual per loop, oversleep 911.050 usec, 9.1%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1270800.829 usec actual, 1270.801 usec actual per loop, oversleep 270.801 usec, 27.1%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1370345.116 usec actual, 137.035 usec actual per loop, oversleep 37.035 usec, 37.0%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  1706473.112 usec actual, 17.065 usec actual per loop, oversleep 7.065 usec, 70.6%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total,  5150341.034 usec actual, 5.150 usec actual per loop, oversleep 4.150 usec, 415.0%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,   595654.011 usec actual, 0.596 usec actual per loop, oversleep 0.496 usec, 495.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total,   591560.125 usec actual, 0.592 usec actual per loop, oversleep 0.582 usec, 5815.6%
4ed0:midas olchansk$ 

things are quite different here, OS is Mach microkernel with an oldish FreeBSD UNIX single-server (from NextSTEP),
so the sleep granularity is different, better than linux. zero-sleep still measures the syscall time, 600 ns on this machine.

Next we measure the same using the nansleep() syscall.

daq13:midas$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1004133.940 usec actual, 100413.394 usec actual per loop, oversleep 413.394 usec, 0.4%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1046117.067 usec actual, 10461.171 usec actual per loop, oversleep 461.171 usec, 4.6%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1096894.979 usec actual, 1096.895 usec actual per loop, oversleep 96.895 usec, 9.7%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1526744.843 usec actual, 152.674 usec actual per loop, oversleep 52.674 usec, 52.7%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  6250154.018 usec actual, 62.502 usec actual per loop, oversleep 52.502 usec, 525.0%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total, 53344123.125 usec actual, 53.344 usec actual per loop, oversleep 52.344 usec, 5234.4%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total, 52641665.936 usec actual, 52.642 usec actual per loop, oversleep 52.542 usec, 52541.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total, 52637501.001 usec actual, 52.638 usec actual per loop, oversleep 52.628 usec, 526275.0%
daq13:midas$ 

Here everything is simple. sleep longer than 1000 usec works the same as select(), sleep for shorter than 100 usec sleeps for 52 usec, regardless of what 
we ask for.

MacOS does no better, long sleeps are same as select(), sleeps is 1 usec or less sleep for too long. no improvement over select().

4ed0:midas olchansk$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1023327.827 usec actual, 102332.783 usec actual per loop, oversleep 2332.783 usec, 2.3%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1130330.086 usec actual, 11303.301 usec actual per loop, oversleep 1303.301 usec, 13.0%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1333846.807 usec actual, 1333.847 usec actual per loop, oversleep 333.847 usec, 33.4%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1402330.160 usec actual, 140.233 usec actual per loop, oversleep 40.233 usec, 40.2%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  2034706.831 usec actual, 20.347 usec actual per loop, oversleep 10.347 usec, 103.5%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total,  6646192.074 usec actual, 6.646 usec actual per loop, oversleep 5.646 usec, 564.6%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,  7556284.189 usec actual, 7.556 usec actual per loop, oversleep 7.456 usec, 7456.3%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total, 15720005.035 usec actual, 15.720 usec actual per loop, oversleep 15.710 usec, 157100.1%
4ed0:midas olchansk$ 

On Linux, strace tells us that the actual syscall behind nanosleep() is this:
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=10000}, 0x7fffc159e200) = 0

Let's try it directly... result is the same.
Let's try it with CLOCK_MONOTONIC... result is the same.

The man page of clock_nanosleep() specifies that this syscall always suspends the calling thread,
so what we see here is the Linux scheduler tick size.

Bottom line.

On current linux, shortest sleep is around 100 usec both select() and nanosleep().
On MacOS, shortest sleep is down to 5 usec using select(), but I cannot tell if CPU sleeps or busy-loops.

select() is still the best syscall for sleeping.

K.O.
    Reply  02 Mar 2021, Stefan Ritt, Info, shortest possible sleep 
Why do you need that? Periodic equipment typically runs ever ten seconds or so, meaning one can do this easily in a scheduler.

For polled equipment, you don't want to sleep at all. Because if you sleep, you might miss an event. That's why I put my poll in mfe.c into a for() loop. No 
sleep, maximum polling rate. I just double checked on my macbook air. 

- If poll is always false (no event available), the loop executes 50M times in 100ms (calibrated during startup of the frontend). That means one iteration 
takes 2ns (!). So if an event occurs, the readout is started with a 2ns overhead. No sleep can beat that. In a real world application, one has to add of course 
the VME access or so to poll for the event.

- If poll is always true, the framework generates about 700k events each second (returning jus a few bytes of event data).

So if one adds any sleep here, things can get only worse, so I don't see the point for that. Of course polling eats one kernel at 100%, but these days every 
CPU has more than one, even my 800 MHz Xilinx embedded ARM CPU (Zynq).

Best,
Stefan
    Reply  03 Mar 2021, Konstantin Olchanski, Info, shortest possible sleep 
> Why do you need that?

UNIX/POSIX advertises functions for sleeping in microseconds and nanoseconds,
for sure it is interesting to know what they actually do and what happens
when you ask them to sleep for 1 microsecond or 1 nanosecond.

To sleep or not to sleep that is a question.

But if I do decide to sleep, and I call the sleep function, I want to know what actually happens.

Now I do and I share it with all.

On current Linux, shortest sleep is around 60 usec. select() with sleep
shorter than that will not sleep at all, nanosleep() will always sleep for
the shortest amount.

P.S. For fans of interrupts ("because they are fast"), sleep waiting for interrupt
probably has same latency/granularity as above (60 usec), so if I drive a DMA engine
and I except the DMA transfer to complete under 60 usec, I should use a busy loop
to poll the "DMA done" bit instead of going to sleep and wait for the DMA interrupt.

K.O.
    Reply  30 Mar 2021, Konstantin Olchanski, Info, INT64/UINT64/QWORD not permitted in ODB and history... Change of TID_xxx data types 
> We have to request of a 64-bit integer data type to be included in MIDAS banks.
> Since 64-bit integers are on some systems "long" and on other systems "long long",
> I decided to create the two new data types
> 
> TID_INT64
> TID_UINT64
> 

These 64-bit data types do not work with ODB and they do not work with the MIDAS history.

As of commits on 30 March 2021, mlogger will refuse to write them to the history and 
db_create_key() will refuse to create them in ODB.

Why these limitations:

a1) all reading of history is done using the "double" data type, IEEE-754 double precision 
floating point numbers have around 53 bits of precision and are too small to represent all 
possible values of 64-bit integers.
a2) SQL, SQLite and FILE history know nothing about reading and writing 64-bit integer data 
types (this should be easy to fix, as long as MySQL/MariaDB and PostgresQL support it)

b1) in ODB, odbedit and mhttd web pages do not display INT64/UINT64/QWORD data
b2) ODB save and restore from odb, xml and json formats most likely does not work for these 
data types

Fixing all this is possible, with a medium amount of work. As long as somebody needs it. 
Display of INT64/UINT64/QWORD on history plots will probably forever be truncated to 
"double" precision.

K.O.
Entry  04 Apr 2021, Konstantin Olchanski, Info, bk_init32a data format 
In April 4th 2020 Stefan added a new data format that fixes the well known problem with alternating banks being 
misaligned against 64-bit addresses. (cannot find announcement on this forum. midas commit 
https://bitbucket.org/tmidas/midas/commits/541732ea265edba63f18367c7c9b8c02abbfc96e)

This brings the number of midas data formats to 3:

bk_init: bank_header_flags set to 0x0000001 (BANK_FORMAT_VERSION)
bk_init32: bank_header_flags set to 0x0000011 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT)
bk_init32a: bank_header_flags set to 0x0000031 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT | BANK_FORMAT_64BIT_ALIGNED;

TMEvent (midasio and manalyzer) support for "bk_init32a" format added today (commit 
https://bitbucket.org/tmidas/midasio/commits/61b7f07bc412ea45ed974bead8b6f1a9f2f90868)

TMidasEvent (rootana) support for "bk_init32a" format added today (commit 
https://bitbucket.org/tmidas/rootana/commits/3f43e6d30daf3323106a707f6a7ca2c8efb8859f)

ROOTANA should be able to handle bk_init32a() data now.

TMFE MIDAS c++ frontend switched from bk_init32() to bk_init32a() format (midas commit 
https://bitbucket.org/tmidas/midas/commits/982c9c2f8b1e329891a782bcc061d4c819266fcc)

K.O.
    Reply  04 Apr 2021, Konstantin Olchanski, Info, Change of TID_xxx data types 
> 
> To be consistent, I renamed the old types:
> 
> TID_DWORD      -> TID_UINT32
> TID_INT        -> TID_INT32
> 

this created an incompatibility with old XML save files,
old versions of midas cannot load new XML save files,
variable types have changed i.e. from "INT" to "INT32".

it would have been better if XML save files kept using the old names.

now packages that read midas XML files also need updating.

specifically, in ROOTANA:
- the old TVirtualOdb/XmlOdb.cxx (no longer used, deleted),
- mvodb/mxmlodb.cxx

K.O.
Entry  05 Apr 2021, Konstantin Olchanski, Info, blog - convert mfe frontend to tmfe c++ framework 
notes from converting ALPHA-g chronobox frontend fechrono to tmfe c++ framework.

the chronobox device is a timestamp/low resolution tdc/scaler/generic TTL and ECL io
mainboard with an altera DE10_NANO plugin board. it has a cyclone-5 FPGA SOC running Raspbian linux.
FPGA communication is done by avalon-bus memory mapped registers, main data readout
is PIO from an FPGA 32-bit wide FIFO (no DMA yet).

- login to main computer (daq16)
- cd packages
- git clone https://bitbucket.org/tmidas/midas midas-develop
- cd midas-develop
- make mini ### creates linux-x86_64/{bin,lib}
- ssh agdaq@cb02 ### private network
- cd ~/packages/midas-develop
- make mini ### creates linux-linux-armv7l/{bin/lib}
- cd ~/online/chronobox_software
- cat fechrono.cxx ~/packages/midas-develop/progs/tmfe_example_everything.cxx > fechrono_tmfe.cxx
- edit fechrono_tmfe.cxx:

- rename "FeEverything" to "FeChrono"
- copy contents of frontend_init() to HandleFrontendInit()
- copy contents of frontend_exit() to HandleFrontendExit()
- replace get_frontend_index() with fFe->fFeIndex
- replace "return SUCCESS" with return TMFeOk()
- replace "return !SUCCESS" with return TMFeErrorMessage("boo!!!")
- this frontend has 3 indexed equipments, copy EqEverything 3 times, rename EqEverything to EqCbHist, EqCbms, EqCbFlow
- copy contents of begin_of_run() to EqCbHist::HandleBeginRun()
- copy contents of end_of_run() to EqCbHist::HandleEndRun()
- pause_run(), resume_run() are empty, delete all HandlePauseRun() and all HandleResumeRun()
- frontend_loop() is empty, delete
- poll_event() and interrupt_configure() are empty, delete
- delete all HandleStartAbortRun(), delete all calls to RegisterTransitionStartAbort();
- examine equipment[]:
- "cbhist%02d" - periodic, copy contents of read_cbhist() to EqCbHist::HandlePeriodic()
- "cbms%02d" - polled, copy contents of read_cbms_fifo() to EqCbms::HandlePollRead()
- "cbflow%02d" - periodic, copy contents of read_flow() to EqCbFlow::HandlePeriodic()
- delete unused HandlePoll(), HandlePollRead() and HandlePeriodic()
- replace bk_init32() with "size_t event_size = 100*1024; char* event = (char*)malloc(event_size); ComposeEvent(event, 
event_size); BkInit(event, event_size);"
- replace bk_create(pevent) with BkOpen(event)
- replace bk_close(pevent, ...) with BkClose(event, ...)
- replace "return bk_size(pevent)" with "EqSendEvent(event); free(event);"
- remove unused example SendData()
- if there linker complains about references to "hDB", add "HNDLE hDB" is global scope, add "hDB = fMfe->fDB"
- replace set_equipment_status() with EqSetStatus()
- move equipment configuration from the equipment[] array to the equipment constructors
- remove unused HandleRpc()
- remove unused HandleBeginRun() and unused HandleEndRun()
- remove all example code from HandleInit(), breakup frontend_init() code into per-equipment HandleInit() functions
- EqCbms::HandlePoll() replace all example code with "return true"
- if desired, replace ODB functions from utils.cxx with MVOdb RI(), RD(), etc
- if desired, replace cm_msg() with Msg() and delete "const char* frontend_name"
- update FeChrono() constructor:
      FeSetName("fechrono%02d");
      FeAddEquipment(new EqCbHist("cbhist%02d", __FILE__));
      FeAddEquipment(new EqCbms("cbms%02d", __FILE__));
      FeAddEquipment(new EqCbFlow("cbflow%02d", __FILE__));
- build:
g++ -std=c++11 -Wall -Wuninitialized -g -Ialtera -Dsoc_cv_av -I/home/agdaq/packages/midas-develop/include -
I/home/agdaq/packages/midas-develop/mvodb -c fechrono_tmfe.cxx
g++ -o fechrono_tmfe.exe -std=c++11 -Wall -Wuninitialized -g -Ialtera -Dsoc_cv_av -I/home/agdaq/packages/midas-develop/include 
-I/home/agdaq/packages/midas-develop/mvodb fechrono_tmfe.o utils.o cb.o /home/agdaq/packages/midas-develop/linux-
armv7l/lib/libmidas.a -lm -lz -lutil -lnsl -lpthread -lrt
- run:
- bombs on bm_set_cache_size(), reduce default cache size, old mserver cannot deal with the new default size, set 
fEqConfWriteCacheSize = 100*1024;
- run:
- prints too many messages, comment out print "HandlePollRead!"
- run:
- good now!

success, was not too bad.

also:
- replace gHaveRun with fMfe->fStateRunning
- replace gRunNumber with fMfe->fRunNumber

see tmfe.md section "variables provided by the framework"

K.O.
    Reply  05 Apr 2021, Konstantin Olchanski, Info, blog - convert mfe frontend to tmfe c++ framework 
Result is here:
https://bitbucket.org/expalpha/chronobox_software/src/master/fechrono_tmfe.cxx

Original code is in fechrono.cxx. Not super pretty, but representative of most mfe-based frontends
we see around here. A good example of why the old mfe.c structure no longer works so well for us.

After conversion to tmfe, we do not win a beauty contest yet, but the path for further
clean up and refactoring into better c++ is quite clear. (And it is very obvious where
the missing "event object" wants to be here)

K.O.
    Reply  13 Apr 2021, Konstantin Olchanski, Info, bk_init32a data format 
Until commit a4043ceacdf241a2a98aeca5edf40613a6c0f575 today, mdump mostly did not work with bank32a data.
K.O.


> In April 4th 2020 Stefan added a new data format that fixes the well known problem with alternating banks being 
> misaligned against 64-bit addresses. (cannot find announcement on this forum. midas commit 
> https://bitbucket.org/tmidas/midas/commits/541732ea265edba63f18367c7c9b8c02abbfc96e)
> 
> This brings the number of midas data formats to 3:
> 
> bk_init: bank_header_flags set to 0x0000001 (BANK_FORMAT_VERSION)
> bk_init32: bank_header_flags set to 0x0000011 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT)
> bk_init32a: bank_header_flags set to 0x0000031 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT | BANK_FORMAT_64BIT_ALIGNED;
> 
> TMEvent (midasio and manalyzer) support for "bk_init32a" format added today (commit 
> https://bitbucket.org/tmidas/midasio/commits/61b7f07bc412ea45ed974bead8b6f1a9f2f90868)
> 
> TMidasEvent (rootana) support for "bk_init32a" format added today (commit 
> https://bitbucket.org/tmidas/rootana/commits/3f43e6d30daf3323106a707f6a7ca2c8efb8859f)
> 
> ROOTANA should be able to handle bk_init32a() data now.
> 
> TMFE MIDAS c++ frontend switched from bk_init32() to bk_init32a() format (midas commit 
> https://bitbucket.org/tmidas/midas/commits/982c9c2f8b1e329891a782bcc061d4c819266fcc)
> 
> K.O.
    Reply  14 Apr 2021, Stefan Ritt, Info, INT64/UINT64/QWORD not permitted in ODB and history... Change of TID_xxx data types 
> These 64-bit data types do not work with ODB and they do not work with the MIDAS history.

They were never meant to work with the history. They were primarily implemented to put large 64-
bit data words into midas banks. We did not yet have a request to put these values into the ODB. 
Once such a request comes, we can address this.

Stefan
Entry  06 May 2021, Ben Smith, Info, New feature in odbxx that works like db_check_record() 
For those unfamiliar, odbxx is the interface that looks like a C++ map, but automatically syncs with the ODB - https://midas.triumf.ca/MidasWiki/index.php/Odbxx.

I've added a new feature that is similar to the existing odb::connect() function, but works like the old db_check_record(). The new odb::connect_and_fix_structure() function:
- keeps the existing value of any keys that are in both the ODB and your code
- creates any keys that are in your code but not yet in the ODB
- deletes any keys that are in the ODB but not your code
- updates the order of keys in the ODB to match your code

This will hopefully make it easier to automate ODB structure changes when you add/remove keys from a frontend.

The new feature is currently in the develop branch, and should be included in the next release.
ELOG V3.1.4-2e1708b5