Entry  19 Aug 2022, Konstantin Olchanski, Bug Fix, "Detected duplicate or non-monotonous data" in history files 
a serious (but rare) bug was fixed in the history reader. an unlucky experiment would see 
errors about "Detected duplicate or non-monotonous data" in some history file, previously 
worked around by removing/renaming the offending file. (reported by MEG experiment)

it turns out there was nothing wrong with the data files (good), but there
was a nasty bug in the history reader. it did not ensure that we read history
files in chronological order. under some conditions order of files could be
reversed, older files would be read after newer files and trip the built-in
protection against returning non-monotonically increasing history data to the user.

fixed commit 
https://bitbucket.org/tmidas/midas/commits/9893f85ebe33e96cc63f501a0f89e1f8932c894d

for more details, see https://bitbucket.org/tmidas/midas/issues/350/file-history-non-monotonic-time

K.O.
    Reply  23 Aug 2022, Konstantin Olchanski, Bug Fix, "Detected duplicate or non-monotonous data" in history files 
> a serious (but rare) bug was fixed in the history reader.

previous fix was incomplete. please update to git commit
https://bitbucket.org/tmidas/midas/commits/b343c3c98e4e6fd00a00cf686c74c7ccc6da0c63

K.O.
       Reply  17 Nov 2022, Konstantin Olchanski, Bug Fix, "Detected duplicate or non-monotonous data" in history files 
> > a serious (but rare) bug was fixed in the history reader.
> previous fix was incomplete. please update to git commit
> https://bitbucket.org/tmidas/midas/commits/b343c3c98e4e6fd00a00cf686c74c7ccc6da0c63

a race condition between reading the history file in mhttpd and writing the history file in 
mlogger was accidentally introduced. mhttpd would report spurious errors about "timestamp 
is after last timestamp".

fixed, please update to git commit
https://bitbucket.org/tmidas/midas/commits/7a9f6e0c58ffddcacb9ee19934ce3e2033a805ef

fix race condition in history file reader - a race condition was accidentally introduced - 
first the reader remembers the history file size and the time of the last entry, then it 
goes to read the file and bombs if mlogger has added more entries in the meantime - their 
time is after the remembered time of the last entry, and the error "timestamp is after last 
timestamp" is triggered.

K.O.
Entry  11 Nov 2022, Frederik Wauters, Bug Fix, O_CREAT in open in split.cxx 
midas currently does not compile on linux

/usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:24: error: call to ‘__open_missing_mode’ declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
   50 |    __open_missing_mode ();

giving the mode is mandatory: https://man7.org/linux/man-pages/man2/open.2.html

fix is to give open in midas/examples/lowlevel/split.cxx a default mode, e.g. 006600
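
For illustration, a minimal sketch of the corrected call (the file name and mode value here 
are illustrative, not necessarily what was committed):

#include <fcntl.h>

// open() with O_CREAT needs the third (mode) argument:
int fd = open("run00001.mid", O_CREAT | O_WRONLY | O_TRUNC, 0644);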
    Reply  12 Nov 2022, Stefan Ritt, Bug Fix, O_CREAT in open in split.cxx 
> midas currently does not compile on linux
> 
> /usr/include/x86_64-linux-gnu/bits/fcntl2.h:50:24: error: call to ‘__open_missing_mode’ declared with attribute error: open with O_CREAT or O_TMPFILE in second argument needs 3 arguments
>    50 |    __open_missing_mode ();
> 
> giving the mode is mandatory: https://man7.org/linux/man-pages/man2/open.2.html
> 
> fix is to give open in midas/examples/lowlevel/split.cxx a default mode, e.g. 006600

Thanks. Fixed.

Stefan
       Reply  17 Nov 2022, Konstantin Olchanski, Bug Fix, O_CREAT in open in split.cxx 
> > midas currently does not compile on linux
> > fix is to give open in midas/examples/lowlevel/split.cxx a default mode, e.g. 006600

I got more warnings from split.cxx, looked at the code and saw so many problems that it is easier
to delete it than to fix it.

The end-of-file check is done incorrectly (read() returning 0, -1 or a short read is not handled),
there is a memory overrun if the given file name is longer than 80 bytes, there is no check for a
valid event length read from the file, and so on and so on.

A better example for reading and writing midas files is in midasio/test_midasio.cxx. Proper c++ coding, and can read compressed files.

K.O.
Entry  05 Nov 2022, Zaher Salman, Suggestion, histories capture 'ruy' 
The histories capture key events from 'r', 'u', 'y' and 'Escape' for various functions like rescaling etc. However, this also means that if we include an editable modbvalue and a history in the same custom page, changing the modbvalue to something that includes 'r', 'u' or 'y' is not possible.

In mhistory.js we have

// Keyboard event handler (has to be on the window!)
window.addEventListener("keydown", this.keyDown.bind(this));

I am not sure why it "has to be on the window". For now, I am bypassing this issue by changing the event listener to "keyup" but maybe there is a more elegant solution for this. Adding the event listener to the div element that includes the history does not seem to work.
Entry  22 Oct 2022, Lars Martin, Suggestion, read_only odbxx? 
I really like the concept of the odbxx interface.
I think it would be a nice feature if one could have a read_only connection, e.g. by declaring a "const midas::odb".
Just for fun I tried if this already works, but the compiler doesn't allow const midas::odb for e.g. the [] operator. I'm guessing this would be non-trivial to implement, but I like the idea of certain Midas clients being able to read the odb without risking corruption.
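
For concreteness, a sketch of the kind of code I tried (key names illustrative); it fails 
because midas::odb has no const overload of operator[]:

#include "odbxx.h"

void read_only_example() {
   const midas::odb o("/Runinfo");   // const object
   int run = o["Run number"];        // error: operator[] is not const
}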
    Reply  24 Oct 2022, Stefan Ritt, Suggestion, read_only odbxx? 
> I really like the concept of the odbxx interface.
> I think it would be a nice feature if one could have a read_only connection, e.g. by declaring a "const midas::odb".
> Just for fun I tried if this already works, but the compiler doesn't allow const midas::odb for e.g. the [] operator. I'm guessing this would be non-trivial to implement, but I like the idea of certain Midas clients being able to read the odb without risking corruption.

Having a "const midas::odb" probably does not work (at least I would not know how to implement that).

But I could make an internal flag analogous to the auto refresh flags. So you would have

    o.set_write_protect(true);

to turn on write protection. Would that work for you?

Best,
Stefan
       Reply  26 Oct 2022, Lars Martin, Suggestion, read_only odbxx? 
> Having a "const midas::odb" probably does not work (at least I would not know how to implement that).
> 
> But I could make an internal flag analogous to the auto refresh flags. So you would have
> 
>     o.set_write_protect(true);
> 
> to turn on write protection. Would that work for you?

Absolutely. Looking at the underlying code I was also at a loss how const would work.
I'm mostly just interested in having small clients that only read from the odb (for whatever reason) without risking corrupting it by messing something 
up in the code, especially since such small clients are almost by definition hacked together quickly on the fly.
          Reply  29 Oct 2022, Stefan Ritt, Suggestion, read_only odbxx? 
Ok, I implemented and committed that. Just call

o.set_write_protect(true)

on a key you don't want to modify. Any subsequent write to it then throws an exception.
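
A minimal usage sketch (assuming a running experiment; the exact exception type is not 
spelled out above, so std::exception is caught here, and it is assumed the protection 
covers writes through the object):

#include "midas.h"
#include "odbxx.h"
#include <iostream>

int main() {
   cm_connect_experiment("", "", "test", NULL);

   midas::odb o("/Experiment");
   o.set_write_protect(true);      // protect this key

   try {
      o["Name"] = "new name";      // expected to throw now
   } catch (const std::exception &e) {
      std::cout << "write blocked: " << e.what() << std::endl;
   }

   cm_disconnect_experiment();
   return 0;
}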

Best,
Stefan
Entry  14 Oct 2022, Lars Martin, Suggestion, Allow onchange to refer to arbitrary js function 
Maybe this is already possible, I have a hard time understanding the mhttpd source code.

I would like to use a function defined in the <script> block of my custom page as an onchange callback.

Specific example:
I have a modbthermo that I would like to change to three different colours for "too cold", "just right", and "too hot" (measuring porridge, presumably). The examples only show the explicit (condition)?(val1):(val2) syntax, which doesn't allow more than two values, so I had hoped to replace

onchange="this.dataset.color=this.value > 40?'red':'blue';"

with something like

onchange="this.dataset.color=check_Temp(this.value);"

or

onchange="check_Temp(this.value, this.dataset.color);"

if that's easier somehow. The function itself would then return the colour string, or set the color argument to that string (I'm not sure if JS passes references or just values.)

Is this a possibility?
    Reply  14 Oct 2022, Ben Smith, Info, Allow onchange to refer to arbitrary js function 
> I would like to use a function defined in the <script> block of my custom page as an onchange callback.
>
> Is this a possibility?

Yes, this is already possible. An example was shown in the "modb" section of the custom page documentation, but not in the "Changing properties of controls dynamically" section. I've updated the wiki with an example.

https://daq00.triumf.ca/MidasWiki/index.php/Custom_Page#Changing_properties_of_controls_dynamically
       Reply  22 Oct 2022, Lars Martin, Info, Allow onchange to refer to arbitrary js function 
I figured I wasn't the first to have this idea.
Works great, thanks!
          Reply  22 Oct 2022, Lars Martin, Info, Allow onchange to refer to arbitrary js function 
Actually, now that I look again, there is a mistake in the instructions:
you say

onchange="this.dataset.color=check_therm(this)"

but check_therm doesn't return anything and instead changes the color itself. So you either want the function to return the string and use the above assignment, or use the function you provide and use

onchange="check_therm(this)"
Entry  10 Oct 2022, Zaher Salman, Suggestion, JSON-RPC function to read files mjsonrpc_user.cxx
Hello,

The midas sequencer uses the function js_seq_list_files to get a list of files in the /Sequencer/State/Path with extension *.msl. It would be nice to generalize this function to be able to read files with other (or any) extension.

Based on js_seq_list_files, I added a function js_any_list_files in mjsonrpc_user.cxx (attached) which does the job. Maybe a better/safer implementation can be made in midas. Are there any plans to do this?

thanks.
Entry  21 Aug 2022, Joseph McKenna, Suggestion, mvodb functionality to get the 'LastWritten' property of a key 


I want to read data from the ODB with the mvodb interface in one of my frontends; it's useful to know how old that data is, so I prototyped the functionality in a pull request to mvodb:

https://bitbucket.org/tmidas/mvodb/pull-requests/2/add-readkeylastwritten-function-to-extract
    Reply  22 Aug 2022, Stefan Ritt, Suggestion, mvodb functionality to get the 'LastWritten' property of a key 
> I want to read data from the ODB with the mvodb interface in one of my frontends; it's useful to know how old that data is, so I prototyped the functionality in a pull request to mvodb:
> 
> https://bitbucket.org/tmidas/mvodb/pull-requests/2/add-readkeylastwritten-function-to-extract

Thanks for raising that point. I realized that the odbxx API was also missing that functionality, so I added it: 
https://bitbucket.org/tmidas/midas/commits/6991a92c19292eaf67721cb80f182c61db077f45

Best,
Stefan
Entry  15 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back under control, I have the following message in the console

17:21:28.821 Script terminated by timeout at:
MhistoryGraph.prototype.drawTAxis@http://lem03.psi.ch:8081/mhistory.js:2828:7
MhistoryGraph.prototype.draw@http://lem03.psi.ch:8081/mhistory.js:1792:9
mhistory.js:2828:7

Any ideas how to resolve this??
    Reply  15 Aug 2022, Stefan Ritt, Bug Report, firefox hangs due to mhistory 
> Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back under control, I have the following message in the console
> 
> 17:21:28.821 Script terminated by timeout at:
> MhistoryGraph.prototype.drawTAxis@http://lem03.psi.ch:8081/mhistory.js:2828:7
> MhistoryGraph.prototype.draw@http://lem03.psi.ch:8081/mhistory.js:1792:9
> mhistory.js:2828:7
> 
> Any ideas how to resolve this??

I have to reproduce the problem. Can you send me the full URL from your browser when you see that problem? Probably you have some "special" axis limits, so we don't see that 
problem anywhere else.

Stefan
       Reply  16 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
> > Firefox is hanging/becoming unresponsive due to javascript code. After stopping the script manually to get firefox back under control, I have the following message in the console
> > 
> > 17:21:28.821 Script terminated by timeout at:
> > MhistoryGraph.prototype.drawTAxis@http://lem03.psi.ch:8081/mhistory.js:2828:7
> > MhistoryGraph.prototype.draw@http://lem03.psi.ch:8081/mhistory.js:1792:9
> > mhistory.js:2828:7
> > 
> > Any ideas how to resolve this??
> 
> I have to reproduce the problem. Can you send me the full URL from your browser when you see that problem? Probably you have some "special" axis limits, so we don't see that 
> problem anywhere else.
> 
> Stefan

Hi Stefan and Konstantin,

The URL (reachable only within PSI) is http://lem03.psi.ch:8081/?cmd=custom&page=Mudas
Firefox is version 91.12.0esr (64-bit), but I had similar issues with chrome/chromium too.
The hangs seem to happen randomly so I have not been able to reproduce it yet. 
I have histories here http://lem03.psi.ch:8081/?cmd=custom&page=Mudas&tab=3 (30 minutes each), and I also have histories popping up in modals, though they do not cause any issues. 
I'll try to reproduce it in the coming few days and report again.

thanks,
Zaher
          Reply  16 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
I found the bug. The problem is triggered by resizing the firefox window. This calls a function that is supposed to change the size of the history plot; it works well when the history plots are visible, but not if the history plots are hidden in a javascript tab (not another firefox tab).

Is there a clean way to resize the history plot if the parent div changes size?? The offending code is
mhist[i].mhg = new MhistoryGraph(mhist[i]);
mhist[i].mhg.initializePanel(i);
mhist[i].mhg.resize();
mhist[i].resize = function () {
   mhis.mhg.resize();
};
             Reply  17 Aug 2022, Stefan Ritt, Bug Report, firefox hangs due to mhistory 
The problem lies in your function mhistory_init_one() in Mudas.js:1965. You can only call "new MhistoryGraph(e)" with an element "e" which is something like

<div class="mjshistory" data-group="..." data-panel="..." data-base-u-r-l="https://host.psi.ch/?cmd=history" title="">

Please note the "data-base-u-r-l". This gets automatically added by the function mhistory_init() in mhistory.js:48. The URL is necessary so that the upper right button in a history graph works, which goes to a history page showing only the current graph.

In your function mhistory_init_one() you forgot the call

   mhist.dataset.baseURL = baseURL; 

where baseURL has to come from the current address bar like

   let baseURL = window.location.href;
   if (baseURL.indexOf("?cmd") > 0)
      baseURL = baseURL.substr(0, baseURL.indexOf("?cmd"));
   baseURL += "?cmd=history";

If you duplicate some functionality from mhistory.js, please make sure to duplicate it completely.

Best,
Stefan
                Reply  17 Aug 2022, Zaher Salman, Bug Report, firefox hangs due to mhistory 
> The problem lies in your function mhistory_init_one() in Mudas.js:1965. You can only call "new MhistoryGraph(e)" with an element "e" which is something like
> 
> <div class="mjshistory" data-group="..." data-panel="..." data-base-u-r-l="https://host.psi.ch/?cmd=history" title="">
> 
> Please note the "data-base-u-r-l". This gets automatically added by the function mhistory_init() in mhistory.js:48. The URL is necessary so that the upper right button in a history graph works, which goes to a history page showing only the current graph.
> 
> In your function mhistory_init_one() you forgot the call
> 
>    mhist.dataset.baseURL = baseURL; 
> 
> where baseURL has to come from the current address bar like
> 
>    let baseURL = window.location.href;
>    if (baseURL.indexOf("?cmd") > 0)
>       baseURL = baseURL.substr(0, baseURL.indexOf("?cmd"));
>    baseURL += "?cmd=history";
> 
> If you duplicate some functionality from mhistory.js, please make sure to duplicate it completely.
> 

Thanks Stefan, but this was not the problem since I am setting the baseURL. You may have looked at the code during my debugging.

Some of my histories are placed in an IFrame object. I eventually realized that my code fails when it tries to resize a history which is placed in an invisible IFrame. I resolved the issue by making sure that I am resizing plots only if they are in a visible IFrame.
  
                   Reply  17 Aug 2022, Stefan Ritt, Bug Report, firefox hangs due to mhistory 
> Some of my histories are placed in an IFrame object. I eventually realized that my code fails 
> when it tries to resize a history which is placed in an invisible IFrame. I resolved the issue 
> by making sure that I am resizing plots only if they are in a visible IFrame.

Just to be clear: You could resolve everything on your side, or do you need to change anything in mhistory.js?

Just a tip: IFrames are not good to put anything in. I recommend instead to dynamically create a <div> element, 
append it to the document body, make it floating and initially invisible. Then put everything inside that div. Have
a look at how control.js does it. This takes fewer resources than a complete IFrame and is much easier to handle.

Stefan
          Reply  16 Aug 2022, Konstantin Olchanski, Bug Report, firefox hangs due to mhistory 
> > > Firefox is hanging/becoming unresponsive due to javascript code.
> 
> The URL (reachable only within PSI) is http://lem03.psi.ch:8081/?cmd=custom&page=Mudas

so malfunction is not in the midas history page, but in a custom page. I could help you debug it,
but you would have to provide the complete source code (javascript and html).

> Firefox is version 91.12.0esr (64-bit), but I had similar issues with chrome/chromium too.

my firefox is 103.something. when you say google-chrome has "similar issues",
I read it as "google-chrome does not show this same bug, but shows some other
bug somewhere else". (if I misread you, you have to write better).

but this gives you a front to attack your bugs. basically all browsers should render your
custom page exactly the same (unless you use some obscure or experimental feature, which I
recommend against).

so you tweak your page to identify the source of different rendering results and try to eliminate it;
hopefully by the time you get your page to render exactly the same everywhere, all the real bugs
have been shaken out, too. (this is similar to debugging a c++ program by compiling
it on linux, mac, windows, vax, raspberry pi, etc and checking that you get the same result everywhere).

> The hangs seem to happen randomly so I have not been able to reproduce it yet.

I find that javascript debuggers are not set up to debug hangs. I think the debugger runs partially
inside the same javascript engine you are debugging, so both hang and debugging becomes impossible.

(latest google-chrome has another improvement, all pages from the same computer run in the same
javascript engine, so if one midas page stops (on exception or because I debug it), all midas pages
stop and I have to run two different browsers if I want to debug (e.g.) a history page crash
and look at odb at the same time. fun).

K.O.
Entry  05 Aug 2022, Stefan Ritt, Info, Information for midas updates through git 
Several submodules of midas have been re-organized, so if you want to pull the 
newest version, you need a 

git pull --recurse-submodules
git submodule update --init --recursive

before you can build again. To do this automatically the next time, you can do

git config submodule.recurse true

which needs git 2.14 or later. I hope this works for everybody. If there is a 
better way to do that (I'm not a big expert on git) please reply here.

Stefan
    Reply  08 Aug 2022, Konstantin Olchanski, Info, Information for midas updates through git 
> git pull --recurse-submodules
> git submodule update --init --recursive
> git config submodule.recurse true

does not work for me, macos 12.4 git 2.32.1.

after I set "submodule.recurse true", I still have to type "git submodule update --init 
--recursive"; without --recursive, mscb/mxml is empty and the build bombs.

P.S. the underlying issue is that the mxml submodule is now included twice 
(midas/mxml and midas/mscb/mxml) and there is nothing to enforce that both copies are 
the same. (No idea what happens if the two mxml's are different).
       Reply  08 Aug 2022, Stefan Ritt, Info, Information for midas updates through git 
> after I set "submodule.recurse true", I still have to type "git submodule update --init 
> --recursive"; without --recursive, mscb/mxml is empty and the build bombs.

Indeed, doesn't work for me either. If some git guru has some more insight, please post 
here!

 
> P.S. the underlying issue is that the mxml submodule is now included twice 
> (midas/mxml and midas/mscb/mxml) and there is nothing to enforce that both copies are 
> the same. (No idea what happens if the two mxml's are different).

The version of each mxml is defined by the last commit of the parent repository, which contains 
the hash of the submodule version. If we have to update mxml for some reason, we also have to 
commit mscb with the new version, and then midas with the same version of mxml. If one 
then checks out midas with

git clone https://bitbucket.org/tmidas/midas --recursive

one gets the same version of mxml in both places.
Entry  08 Aug 2022, Konstantin Olchanski, Info, odb disallow key names that start or end with spaces 
while testing the new odb editor, we ran into a number of problems with key names 
that start or end with spaces. we cannot think of any valid use case for such key 
names (subdirectories and variables) and we think they could only have been 
created by mistake. ODB now disallows such names.
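
for illustration, a hedged sketch of what the new check rejects (paths illustrative, error 
handling omitted):

#include "midas.h"

void create_keys_example(HNDLE hDB) {
   // accepted:
   db_create_key(hDB, 0, "/Test/good_name", TID_INT);
   // now rejected by db_validate_name(): leading/trailing spaces
   db_create_key(hDB, 0, "/Test/ bad name ", TID_INT);
}

K.O.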
Entry  08 Aug 2022, Konstantin Olchanski, Info, midas on ubuntu LTS 22.04 
reporting that as of commit 78f707c0686d22f8329c7a1f1c46d7dccf35ceff, midas builds 
without errors or warnings on Ubuntu LTS 22.04, 20.04, CentOS-7 and MacOS 12.4. 
(except for some warnings from mscb and msc). K.O.
Entry  06 Aug 2022, Stefan Ritt, Info, Improvement of odbxx API 
While the odbxx API has been successfully used since the last months, a potential 
problem with large ODBs surfaced. If you have lots of data in the ODB and load it 
into an object like

midas::odb o("/Equipment");

this might take quite long, since each ODB value is fetched separately, which is 
very quick on a local machine but can take a long time over a client-server connection. 
For large experiments this can take up to minutes (!).

To get rid of this problem, the underlying object model has been modified. When an 
object is instantiated like above, then the whole ODB tree is fetched in an XML 
buffer in a single transfer, which even for large ODBs usually takes much less 
than a second. Then the XML buffer is decomposed on the client side and converted 
into the proper midas::odb objects. In one case this gave an improvement from 35 
seconds to 0.5 seconds which is significant. To enable the new method, the object 
can be created with a flag like

midas::odb o("/Equipment", true);

which then switches to the new method. One has to take care not to fool oneself 
(like I did) by printing the object like

midas::odb o("/Equipment", true);
std::cout << o << std::endl;

because each read access to any sub-object of o causes a separate read request to 
the server, which again can take a long time. Therefore, one has to switch off the auto 
refresh via

midas::odb o("/Equipment", true);
o.set_auto_refresh_read(false);
std::cout << o << std::endl;

Accessing any sub-object of o then does not cause a client-server request, which 
is not necessary if all objects have just been pulled from the server before. If 
one however keeps the object in memory for a long time, one has to be aware that 
it only contains "old" values from the time of instantiation. If one needs more 
current ODB values, the auto read refresh has to be turned on again.

Stefan
    Reply  08 Aug 2022, Stefan Ritt, Info, Improvement of odbxx API 
After some thought, I changed the API again and removed the flag in the constructor,
so the system now automatically chooses the best algorithm depending on whether the client
is connected locally or remotely. So in all cases you again use the old syntax:

midas::odb o("/Equipment");



Stefan
Entry  18 Jul 2022, Konstantin Olchanski, Release, midas-2022-05 
There is a release branch for midas-2022-05 and a corresponding git tag midas-2022-05-b. 
This branch is known to be stable and is working well for the ALPHA 
experiment at CERN. The latest update to this branch fixes two problems in the 
mserver (an rpc timeout and a use-after-free internal error).

https://bitbucket.org/tmidas/midas/branch/release/midas-2022-05
K.O.
Entry  25 Jun 2022, Joseph McKenna, Bug Report, RPC timeout for manalyzer over network 

In ALPHA, I get RPC timeouts running a (reasonably heavy) analyzer on a remote machine (connected directly via a ~30 meter 10Gbe Ethernet cable) after ~5 minutes of running. If I run the analyzer locally, I do not see a timeout...

gdb trace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff5d35859 in __GI_abort () at abort.c:79
#2  0x00005555555a2a22 in rpc_call (routine_id=11111) at /home/alpha/packages/midas/src/midas.cxx:13866
#3  0x000055555562699d in bm_receive_event_rpc (buffer_handle=buffer_handle@entry=2, buf=buf@entry=0x0, buf_size=buf_size@entry=0x0, ppevent=ppevent@entry=0x0, pvec=pvec@entry=0x7fffffffd700,
    timeout_msec=timeout_msec@entry=100) at /home/alpha/packages/midas/src/midas.cxx:10510
#4  0x0000555555631082 in bm_receive_event_vec (buffer_handle=2, pvec=pvec@entry=0x7fffffffd700, timeout_msec=timeout_msec@entry=100) at /home/alpha/packages/midas/src/midas.cxx:10794
#5  0x0000555555673dbb in TMEventBuffer::ReceiveEvent (this=this@entry=0x555557388b30, e=e@entry=0x7fffffffd700, timeout_msec=timeout_msec@entry=100) at /home/alpha/packages/midas/src/tmfe.cxx:312
#6  0x0000555555607b56 in ReceiveEvent (b=0x555557388b30, e=0x7fffffffd6c0, timeout_msec=100) at /home/alpha/packages/midas/manalyzer/manalyzer.cxx:1411
#7  0x000055555560d8dc in ProcessMidasOnlineTmfe (args=..., progname=<optimized out>, hostname=<optimized out>, exptname=<optimized out>, bufname=<optimized out>, event_id=<optimized out>,
    trigger_mask=<optimized out>, sampling_type_string=<optimized out>, num_analyze=0, writer=<optimized out>, multithread=<optimized out>, profiler=<optimized out>,
    queue_interval_check=<optimized out>) at /home/alpha/packages/midas/manalyzer/manalyzer.cxx:1534
#8  0x000055555560f93b in manalyzer_main (argc=<optimized out>, argv=<optimized out>) at /usr/include/c++/9/bits/basic_string.h:2304
#9  0x00007ffff5d37083 in __libc_start_main (main=0x5555555b1130 <main(int, char**)>, argc=8, argv=0x7fffffffdda8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fffffffdd98) at ../csu/libc-start.c:308
#10 0x00005555555b184e in _start () at /usr/include/c++/9/bits/stl_vector.h:94

Any suggestions? Many thanks
    Reply  18 Jul 2022, Konstantin Olchanski, Bug Report, RPC timeout for manalyzer over network 
> In ALPHA, I get RPC timeouts running a (reasonably heavy) analyzer on a remote machine (connected directly via a ~30 meter 10Gbe Ethernet cable) after ~5 minutes of running. If I run the analyzer locally, I do not see a timeout...

there is a subtle bug in the mserver. under rare conditions, ss_suspend() will recurse in an unexpected way
and mserver will go to sleep waiting for data from a udp socket (data that will never arrive, so it sleeps forever).
the remote client will see this as an rpc timeout. in my tests (and in ALPHA-g at CERN, as reported by Joseph),
I see this rare condition happen about every 5 minutes. this is the first time we have become aware of this
problem in normal use; the best I can tell, this bug has been in the mserver since day one.

commit https://bitbucket.org/tmidas/midas/commits/fbd06ad9d665b1341bd58b0e28d6625877f3cbd0
to develop and
to release/midas-2022-05

The stack trace that shows the mserver hang/crash (sleep() is the stand-in for the sleep-forever socket read).

(gdb) bt
#0  0x00007f922c53f9e0 in __nanosleep_nocancel () from /lib64/libc.so.6
#1  0x00007f922c53f894 in sleep () from /lib64/libc.so.6
#2  0x0000000000451922 in ss_suspend (millisec=millisec@entry=100, msg=msg@entry=1) at /home/agmini/packages/midas/src/system.cxx:4433
#3  0x0000000000411d53 in bm_wait_for_more_events_locked (pbuf_guard=..., pc=pc@entry=0x7f920639b93c, timeout_msec=timeout_msec@entry=100, 
    unlock_read_cache=unlock_read_cache@entry=1) at /home/agmini/packages/midas/src/midas.cxx:9429
#4  0x00000000004238c3 in bm_fill_read_cache_locked (timeout_msec=100, pbuf_guard=...) at /home/agmini/packages/midas/src/midas.cxx:9003
#5  bm_read_buffer (pbuf=pbuf@entry=0xdf8b50, buffer_handle=buffer_handle@entry=2, bufptr=bufptr@entry=0x0, buf=buf@entry=0x7f9203d75020, 
    buf_size=buf_size@entry=0x7f920639aa20, vecptr=vecptr@entry=0x0, timeout_msec=timeout_msec@entry=100, convert_flags=0, 
    dispatch=dispatch@entry=0) at /home/agmini/packages/midas/src/midas.cxx:10279
#6  0x0000000000424161 in bm_receive_event (buffer_handle=2, destination=0x7f9203d75020, buf_size=0x7f920639aa20, timeout_msec=100)
    at /home/agmini/packages/midas/src/midas.cxx:10649
#7  0x0000000000406ae4 in rpc_server_dispatch (index=11111, prpc_param=0x7ffcad70b7a0) at /home/agmini/packages/midas/progs/mserver.cxx:575
#8  0x000000000041ce9c in rpc_execute (sock=10, buffer=buffer@entry=0xe11570 "g+", convert_flags=0)
    at /home/agmini/packages/midas/src/midas.cxx:15003
#9  0x000000000041d7a5 in rpc_server_receive_rpc (idx=idx@entry=0, sa=0xde6ba0) at /home/agmini/packages/midas/src/midas.cxx:15958
#10 0x0000000000451455 in ss_suspend (millisec=millisec@entry=1000, msg=msg@entry=0) at /home/agmini/packages/midas/src/system.cxx:4575
#11 0x000000000041deb2 in rpc_server_loop () at /home/agmini/packages/midas/src/midas.cxx:15907
#12 0x0000000000405266 in main (argc=9, argv=<optimized out>) at /home/agmini/packages/midas/progs/mserver.cxx:390
(gdb) 

K.O.
Entry  19 Jun 2022, Francesco Renga, Forum, Alarm on variable not updating 
Dear all,
I have an ODB equipment that sometimes loses the connection with the hardware, so that the variables are not updated anymore. The connection can be restored by restarting the frontend. It would be useful to have an alarm based on the time since the last update of some variable (i.e. the alarm is triggered if the variable is not updated for more than X seconds). Is there a method to implement such an alarm in MIDAS?

Thank you very much,
Francesco
    Reply  20 Jun 2022, Stefan Ritt, Forum, Alarm on variable not updating 
There are two functions to do that: one checks the last write access, the other checks the last write access only while a run is in progress. The alarm condition looks like:

access(/Equipment/.../Variables/Input[10]) > 60

which will cause an alarm if the Input[10] is not written for more than 60 seconds. The other function which checks the run status as well is like:

access_running(...odb key...) > 60

You can actually see an example on the MEG alarm page.

Rather than having an alarm for that, I would however recommend that you program your frontend such that it realizes when it loses the connection, then automatically tries to reconnect or triggers an alarm itself (a so-called "internal" alarm). This is also how the MSCB system works, and it is much more robust.
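
For illustration, a hedged sketch of triggering such an internal alarm from the frontend 
(alarm name, message and condition string are made up):

#include "midas.h"

// called periodically from the frontend; names are illustrative
void check_hardware_link(BOOL connected)
{
   if (!connected)
      al_trigger_alarm("HW Link", "Lost connection to hardware",
                       "Alarm", "connection lost", AT_INTERNAL);
}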

Stefan
Entry  20 Jun 2022, jianrun, Bug Report, Error in "midas/src/mana.cxx" 
Dear Midas developers,

When we are running the examples in $MIDASSYS/examples/experiment/, we met some 
problems when analyzing the results:
1. When we analyze the data using the analyzer: ./analyzer -i run00001.mid -o 
run00001.rz, we find some bugs:
"
Root server listening on port 9090...
Running analyzer offline. Stop with "!"
[Analyzer,ERROR] [mana.cxx:1832:bor,ERROR] HBOOK support is not compiled in
[Analyzer,INFO] Set run number 6 in ODB
Load ODB from run 6...OK
run00006.mid:2680  events, 0.00s
"
We think this occurs in "midas/src/mana.cxx". How can we solve this?

2. When we analyze the above data, an error also occurs: 
[Analyzer,ERROR] [odb.cxx:847:db_validate_name,ERROR] Invalid name 
"/Analyzer/Tests/Always true/Rate [Hz]" passed to db_create_key_wlocked: should 
not contain "["

We simply fixed that by replacing "Rate [Hz]" with "Rate" in test_write in 
midas/src/mana.cxx.
We are curious whether you can fix the problem permanently in the next version, 
or whether we are not running the code properly. Thanks!
Entry  15 Apr 2019, Konstantin Olchanski, Info, switch of MIDAS to C++ 
For a long time now we have kept the core of midas (odb.c, midas.c, etc) compatible with plain C, and by default 
we have built the MIDAS library using the plain C compiler. Over time, we have switched most MIDAS programs 
(mhttpd, mlogger, mdump, odbedit, etc) to C++ (with happy results). (and for a long time now, all of MIDAS 
could be built as C++, even if the default build remained plain C).

The main reason for keeping the core of MIDAS as C has been to allow writing MIDAS frontends in C - for 
example, in environments with no C++ compilers or no C++ runtime (VxWorks) or where C++ had too much 
overhead (small memory machines, etc).

Today, all concerns against using C++ seem to have receded into the past. C++ compilers are now always 
available, even for small embedded systems. C++ overheads are now well understood and one can easily write 
C++ code that is as efficient as C for using limited CPU and memory resources. (While at the same time, today's 
embedded systems tend to have more CPU and RAM than "big" MIDAS DAQ machines had in the past - 1GHz 
CPU, 1GB RAM is pretty typical for embedded ARM).

As examples of small hardware where MIDAS frontends written in C++ worked just fine, consider the T2K ND280 
FGD data collector running on XILINX FPGA with a 300MHz PowerPC and 128 Mbytes of RAM (standard Linux 
kernel) and the GRIFFIN Clock distribution module control running on a Microsemi FPGA with a 300MHz ARM 
CPU (ucLinux without an MMU). More typical Cyclone-5 ARM SoCs with 1GB RAM and 1GHz CPU run standard 
Linux (CentOS7) and can build MIDAS natively (no need for cross-compiling).

With the removal of the requirement to make it possible to write MIDAS frontends in C, we can switch the MIDAS 
default build to C++ and start using C++ features in the MIDAS API (std::string, std::vector, etc).

Next to consider is "which C++ should we use?".

K.O.
    Reply  15 Apr 2019, Konstantin Olchanski, Info, switch of MIDAS to C++, which C++? 
>
> With the removal of the requirement to make it possible to write MIDAS frontends in C, we can switch the MIDAS 
> default build to C++ and start using C++ features in the MIDAS API (std::string, std::vector, etc).
> 

Consider the most basic C++ construct, std::string, and observe how many member functions are annotated "c++11", "c++17", etc:
https://en.cppreference.com/w/cpp/string/basic_string

For MIDAS this means that we cannot target "a" C++ or "the" C++, we have to choose between C++ "before C++11", C++11, C++17 
(plus the incoming C++20).

For example, the ROOT 6 package requires C++11 *and* g++ >= 4.8.

Now consider the platforms we use at TRIUMF:

- Linux RHEL/SL/CentOS6 - gcc 4.4.7, no C++11.
- Linux RHEL/SL/CentOS7 - gcc 4.8.5, full C++11, no C++14, no C++17
- Ubuntu 18.04.2 LTS - gcc 7.3.0, full C++11, full C++14, "experimental" C++17.
- MacOS 10.13 - llvm 10.0.0 (clang-1000.11.45.5), full C++11, full C++14, full C++17.

(see here for GCC C++ support: https://gcc.gnu.org/projects/cxx-status.html)
(see here for LLVM clang c++ support: https://clang.llvm.org/cxx_status.html)

As is easy to see from the std::string reference, C++17 adds a large number of very useful new features.

Alas, at TRIUMF we still run MIDAS on many SL6 machines where C++11 and C++17 is not normally available. I estimate another 1-2 
years before all our SL6 machines are upgraded to RHEL/SL/CentOS7 (or Ubuntu LTS).

This means we cannot use C++11 and C++17 in MIDAS yet. We are stuck with pre-C++11 for now.

Remarks:
- there will be trouble right away as both Stefan and myself do MIDAS development on MacOS where full C++17 is available and is 
tempting to use. (as they say, watch this space)
- it is possible to install a newer C++ compiler into RHEL/SL/CentOS 6 and 7 systems, but we are loath to require this (same as we 
are loath to require cmake for building MIDAS) - the "I" in MIDAS means integrated, meaning "does not require installing 100 
additional packages before one can use it".
- the MS Windows situation is unclear, but since one has to install the C++ compiler as an additional package anyway, I do not see 
any problem with requiring C++17 support, with a choice of MS compilers, GCC and LLVM. I doubt we will support anything older 
than Windows 10.

K.O.
       Reply  15 Apr 2019, Konstantin Olchanski, Info, switch of MIDAS to C++, how much C++? 
> >
> > With the removal of the requirement to make it possible to write MIDAS frontends in C, we can switch the MIDAS 
> > default build to C++ and start using C++ features in the MIDAS API (std::string, std::vector, etc).
> > 

C++ is a big animal. Obviously we want to use std::string, std::vector and similar improvements over plain C (we already use "//" for comments).

But in keeping with the Camel's nose fable (https://en.wikipedia.org/wiki/Camel%27s_nose), there are some parts of C++ we definitely do not want to use in MIDAS. Even the C++ FAQ talks 
about "evil features", see https://isocpp.org/wiki/faq/big-picture#use-evil-things-sometimes

Here is my list of things to use and to avoid. Comments on this are very welcome - as everybody's experience with C++ is different (and everybody's experience is very valuable and very 
welcome).

- std::string, std:vector, etc are in. I am already using them in the MIDAS API (midas.h)
- extern "C" is out, everything has to be C++, will remove "extern "C"" from all midas header files.
- exceptions are out, see https://stackoverflow.com/questions/1736146/why-is-exception-handling-bad
- std::thread and std::mutex are in, at least for writing new frontends, but see discussion of "cannot use c++11". (maybe replace ss_mutex_xxx() with our own std::mutex look-alike).
- heavy use of templates and heavy use of argument overloading is out - just by looking at the code, impossible to tell what function will be called
- "auto" is on probation. I need to know if "auto v=f()" is an integer or a double when I write "auto w=v/2" or "auto w=v/2.0". see 
https://softwareengineering.stackexchange.com/questions/180216/does-auto-make-c-code-harder-to-understand
- unreadable gibberish is out (lambdas, etc)
- C-style malloc()/free() is in. C++ new and delete are okay, but "delete[]" confuses me.
- C-style printf() is in. C++ cout and "<<" gunk provide no way to easily format the output for easy reading.
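
as a tiny illustration of the "auto" concern above (a sketch, nothing midas-specific):

double f() { return 5.0; }

int main() {
   auto v = f();   // nothing at the call site says whether v is int or double
   auto w = v / 2; // 2.5 because f() returns double; would be 2 for an int f()
   return (int)w;
}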

K.O.
          Reply  16 Apr 2019, Pintaudi Giorgio, Info, switch of MIDAS to C++, how much C++? 
Dear Konstantin,

even if I am still quite young and have only limited experience (but not zero), I would like to give my two cents. I have reflected a bit on the C++ issue, also because I am developing a 
brand new MIDAS interface for the WAGASCI-T2K experiment, and I feel that the future of MIDAS could influence the future of our DAQ system, too. I'll start from the conclusions: I completely 
agree with you on a practical level, even if I kind of disagree on an "ethical" level.

What you propose in essence is to migrate the MIDAS core from pure C to a version of C with some fancy C++ features. Let's say a kind of C+ with only one plus. Theoretically speaking, even though 
on the surface C and C++ are very similar, they are completely different languages that require different mindsets (and I am sure that everyone is aware of it). This is the reason why, even though I 
would have preferred to develop the MIDAS frontend for our experiment in C++, I chose to stick to pure C: I feel that MIDAS is still very C-like in its architecture (or from what
I can see from the documentation). So I wanted to "keep on track" for better internal coherence. What I mean is that, if someone told me to port a C project of mine to C++, I would end up 
rewriting it almost completely, instead of just modifying it (I really don't know how much of the MIDAS core has been written with C++ in mind, so if a large part of it is already C++-like, 
please ignore my comment above).

Anyway, on a practical level, I completely agree with your approach, because I imagine that a complete rewrite of MIDAS is off the table but, at the same time, some new C++ features like 
better string and vector handling are very tempting to use. Moreover, in general, physicists are more familiar with the C syntax than with the C++ one (but thanks to ROOT that is changing). As 
for the use of MIDAS in embedded devices, I have no experience so I refrain from judging. So, in the particular case of MIDAS, what you propose is probably the best and only option.

As far as the C++ standard to adopt, I would say that the C++11 standard is the best fit for the T2K experiment since the official OS for T2K is CentOS7 and, out of the box, it supports C++11 
only. Anyway, I acknowledge that there are many other experiments and requirements. For the records, I do development on Ubuntu 18.04.

Best regards
Giorgio
          Reply  17 Apr 2019, John M O'Donnell, Info, switch of MIDAS to C++, how much C++? 
some semi-random thoughts:

no templates strictly means you can't use std::string, std::vector etc.

printf is in any case part of C++ (#include <cstdio>), but std::ostreams can be faster (with std::cout, std::endl causes buffer flushing, whereas "\n" does not flush the buffer, but printf
always flushes the buffer), and formatting is possible (though very long winded).  printf does not allow printing things other than simple data, e.g. given BANK_HEADER* bh, there is no printf( "%?", *bh);

I've been writing all our DAQ code in C++ for a while now.

> > >
> > > With the removal of the requirement to make it possible to write MIDAS frontends in C, we can switch the MIDAS 
> > > default build to C++ and start using C++ features in the MIDAS API (std::string, std::vector, etc).
> > > 
> 
> C++ is a big animal. Obviously we want to use std::string, std::vector and similar improvements over plain C (we already use "//" for comments).
> 
> But in keeping with the Camel's nose fable (https://en.wikipedia.org/wiki/Camel%27s_nose), there are some parts of C++ we definitely do not want to use in MIDAS. Even the C++ FAQ talks 
> about "evil features", see https://isocpp.org/wiki/faq/big-picture#use-evil-things-sometimes
> 
> Here is my list of things to use and to avoid. Comments on this are very welcome - as everybody's experience with C++ is different (and everybody's experience is very valuable and very 
> welcome).
> 
> - std::string, std:vector, etc are in. I am already using them in the MIDAS API (midas.h)
> - extern "C" is out, everything has to be C++, will remove "extern "C"" from all midas header files.
> - exceptions are out, see https://stackoverflow.com/questions/1736146/why-is-exception-handling-bad
> - std::thread and std::mutex are in, at least for writing new frontends, but see discussion of "cannot use c++11". (maybe replace ss_mutex_xxx() with our own std::mutex look-alike).
> - heavy use of templates and heavy use of argument overloading is out - just by looking at the code, impossible to tell what function will be called
> - "auto" is on probation. I need to know if "auto v=f()" is an integer or a double when I write "auto w=v/2" or "auto w=v/2.0". see 
> https://softwareengineering.stackexchange.com/questions/180216/does-auto-make-c-code-harder-to-understand
> - unreadable gibberish is out (lambdas, etc)
> - C-style malloc()/free() is in. C++ new and delete are okay, but "delete[]" confuses me.
> - C-style printf() is in. C++ cout and "<<" gunk provide no way to easily format the output for easy reading.
> 
> K.O.
             Reply  22 Apr 2019, Pintaudi Giorgio, Info, switch of MIDAS to C++, how much C++? example.cpp
Dear Konstantin and others,
our recent discussion stimulated my curiosity and I wrote a small frontend for the trigger board of our experiment in C++.
The underlying hardware details are not relevant here. I would just like to briefly report and discuss what I found out.

I have written all the frontend files (but the bus driver) in C++11:
  • my_frontend.cpp
  • driver/class/my_class_driver.cpp
  • driver/device/my_device_driver.cpp

All went quite smoothly, but I feel that the overall structure is still very C-like (that may be a good thing or a bad thing depending on the point of view).
As far as I know, the MIDAS frontend mfe.c has still only the C version (I couldn't find any mfe.cxx). This means that all the points of contact between the MIDAS frontend code and the user
frontend code must be C compatible (no C++ features or name mangling). To accomplish this I needed to slightly modify the midas.h header file like this:
@@ -1141,7 +1141,13 @@ typedef struct eqpmnt {
+#ifdef __cplusplus
+extern "C" {
+#endif
 INT device_driver(DEVICE_DRIVER *device_driver, INT cmd, ...);
+#ifdef __cplusplus
+}
+#endif

I also tested the new strcomb1 function and it seems to work OK.

I have attached a source file to show how I implemented the device driver in C++. The code is not meant to be compilable: it is just to show how I implemented it. This is the most C++-like syntax that I could come up with. Feel free to comment on it, and if you think that it could be improved, let me know.

Best Regards
Giorgio
                Reply  23 Apr 2019, Konstantin Olchanski, Info, switch of MIDAS to C++, how much C++? 
> Dear Konstantin and others, our recent discussion stimulated my curiosity and I wrote a small frontend for the trigger board of our 
experiment in C++.

Yay!

> my_frontend.cpp

In MIDAS we are using .cxx, not .cpp, per ROOT coding convention https://root.cern.ch/coding-conventions

> the overall structure is still very C-like

this is object-oriented programming done in C. (actually C++ looks exactly the same if you look behind the curtain)

right now we do not hope to rewrite the slow control class driver framework in C++, but if somebody does it,
we should be happy to add it to midas.

for the mfe.c framework, I have a new C++ class based frontend framework in development (and already in use
in the ALPHA-g experiment at CERN). There are a number of loose ends to polish before I can add it to midas.
And as usual, the last 10% of the work consumes 90% of the time.

> the MIDAS frontend mfe.c has still only the C version (I couldn't find any mfe.cxx). 
> This means that all the points of contact between the MIDAS frontend code and the user frontend code must be C compatible
> (no C++ features or name mangling).

this will change with the switch to C++, mfe.c will become mfe.cxx and I shall add the required definitions to mfe.h (or midas.h, TBD)

> To accomplish this I needed to slightly modify the midas.h header file like this:
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
>  INT device_driver(DEVICE_DRIVER *device_driver, INT cmd, ...);

I intend for all "extern "C"" to go away, everything will use the C++ linkage (and name mangling). This will break existing frontends
and I will need to write clear instructions on converting them to the new scheme.

> I also tested the new strcomb1 function and it seems to work OK.

good.

> I have attached a source file to show how I implemented the device driver in C++

Yup, looks familiar, I have a couple of C++ frontends written like this, too.

K.O.
       Reply  11 May 2019, Konstantin Olchanski, Info, switch of MIDAS to C++, which C++? 
> [which c++]
> 
> - Linux RHEL/SL/CentOS6 - gcc 4.4.7, no C++11.
> - Linux RHEL/SL/CentOS7 - gcc 4.8.5, full C++11, no C++14, no C++17
>

The construct I now always use:

class X {
   int a = 0; // do not leave data members uninitialized, see "Non-static data member initializers", N2756 and N2628
};

is only available starting from gcc 4.7, see https://gcc.gnu.org/projects/cxx-status.html

Another nail into the coffin of "pre c++11" c++ and el < el7.

Hmm...

K.O.
    Reply  22 May 2019, Konstantin Olchanski, Info, switch of MIDAS to C++ 
> switch MIDAS to C++

switch to C++ will proceed as follows:

- create a new branch off develop (feature/switch_to_cxx)
- remove all extern "C", ifdef c++, etc
- switch Makefile from gcc to g++
- test
- merge into develop
- before merge, tag the last "C" midas
- cut a new release branch (tentatively feature/midas-2019-06)

the last recommended "pre-C++" midas will remain the midas-2019-03 release (where we can retroactively apply bug fixes, as I just did a few minutes ago).

K.O.
       Reply  05 Jun 2019, Konstantin Olchanski, Info, MIDAS switched to C++ 
The last bits of code to switch MIDAS to C++ have been committed, see tag midas-2019-05-cxx.

Since the cmake conversion is still in progress, for now, I recommend using the old "make" build for trying this update.

From the switch to C++, the biggest change is the requirement that frontend programs be built and linked
using the C++ compiler. Since mfe.o and the rest of MIDAS are built with C++, building frontends
with C is no longer possible.

To help with this, I will post a short guide for converting C frontends to C++.

K.O.
          Reply  17 May 2022, Razvan Stefan Gornea, Info, MIDAS switched to C++ 
Hi, I have three naive questions about this:
 - have you posted somewhere this guide about converting C frontends to C++?
 - it was mentioned previously that there will be a 'tag the last "C" midas', which version is it?
 - it means that even a simple example like odb_test.c cannot be compiled anymore? Even when using g++?

Something like

g++ -I $HOME/daq/packages/midas/include/ -L $HOME/daq/packages/midas/lib/ odb_test.c -l midas

is expected to fail, or is it just me glitching? Is it because of thread library differences?

Thanks!


> The last bits of code to switch MIDAS to C++ have been committed, see tag midas-2019-05-cxx.
> 
> Since the cmake conversion is still in progress, for now, I recommend using the old "make" build for trying this update.
> 
> From the switch to C++, the biggest change is the requirement that frontend programs be built and linked
> using the C++ compiler. Since mfe.o and the rest of MIDAS are built with C++, building frontends
> with C is no longer possible.
> 
> To help with this, I will post a short guide for converting C frontends to C++.
> 
> K.O.
             Reply  17 May 2022, Konstantin Olchanski, Info, MIDAS switched to C++ 
> Hi, I have three naive questions about this:

all good questions, ask more of them.

>  - have you posted somewhere this guide about converting C frontends to C++?

yes, in this elog here I posted a guide for converting C mfe.c frontends to C++ and
a guide for converting an mfe.c frontend to a C++ TMFE frontend. please use the "find" function;
if you cannot find them, let me know and I will look for them for you.

>  - it was mentioned previously that there will be a 'tag the last "C" midas', which version is it?

correct. please run "git tag"; tags before "midas-2019-05-cxx" are "C", after are "C++".

>  - it means that even a simple example like odb_test.c cannot be compiled anymore? Even when using g++?
> g++ -I $HOME/daq/packages/midas/include/ -L $HOME/daq/packages/midas/lib/ odb_test.c -l midas
> is expected to fail, or is it just me glitching? Is it because of thread library differences?

yes, it is expected to fail: you have spaces after "-I", "-L" and "-l", which is incorrect g++ command syntax. after
correcting this, it may or may not work depending on what you have inside odb_test.c. I would be happy
to help you debug this, but please start a separate thread instead of necroposting into the C++ announcements.

K.O.
                Reply  17 May 2022, Ben Smith, Info, MIDAS switched to C++ 
>  - have you posted somewhere this guide about converting C frontends to C++?

There's documentation in the wiki at:
https://daq00.triumf.ca/MidasWiki/index.php/Changelog#2019-06

It includes a step-by-step guide of how to upgrade, what changes need to be made to frontends, and common issues that people had.
Entry  08 May 2022, Stefan Ritt, Info, RO_STOPPED with triggered events 
We had issues in one of our experiments where people used RO_STOPPED in the 
equipment list together with triggered events (EQ_USER). If events are sent while 
a run is stopped, this leads to many unexpected results, so I added a check to 
the mfe.cxx code which prevents RO_STOPPED (or RO_ALWAYS, which includes 
RO_STOPPED) together with the EQ_TRIGGERED, EQ_INTERRUPT, EQ_MULTITHREAD and EQ_USER 
types of events.

I have now received complaints that some old front-ends are not running any more since they 
use RO_ALWAYS together with triggered events. Can the authors of these frontends 
please tell me the rationale for why this is needed; then I can maybe add a better 
fix for that.

Stefan
    Reply  08 May 2022, Konstantin Olchanski, Info, RO_STOPPED with triggered events 
> some old front-ends are not running any more since they use RO_ALWAYS together with triggered events.

I confirm, if you have mfe.c frontends that have RO_ALWAYS, after you update MIDAS, 
some of these frontends will fail to start.
https://bitbucket.org/tmidas/midas/commits/1961af0d657e4f76ab9db17f9b70c0c492172b6d

tmfe c++ frontends do not have this restriction but by default only read data when run 
is active (per-equipment fEqConfReadOnlyWhenRunning default is true).

K.O.
       Reply  16 May 2022, Konstantin Olchanski, Info, RO_STOPPED with triggered events 
> > some old front-ends are not running any more since they use RO_ALWAYS together with triggered events.
> 
> I confirm, if you have mfe.c frontends that have RO_ALWAYS, after you update MIDAS, 
> some of these frontends will fail to start.
> https://bitbucket.org/tmidas/midas/commits/1961af0d657e4f76ab9db17f9b70c0c492172b6d
> 
> tmfe c++ frontends do not have this restriction but by default only read data when run 
> is active (per-equipment fEqConfReadOnlyWhenRunning default is true).

As of commit 
https://bitbucket.org/tmidas/midas/commits/28d9c96bd6d4f65346ebcd6a04492ea764c90823 mfe.c 
frontends will no longer fail to start. an error will still be issued "Equipment \"%s\" 
contains RO_STOPPED or RO_ALWAYS. This can lead to undesired side-effect and should be 
removed."

BTW 1:

Some of our old frontends use EQ_MULTITHREAD to implement multithreaded periodic equipments. 
They do not generate any events when there is no run (some of them do not generate any 
events at all). Now they will start printing this error message, for no reason. (no, we will 
not be rewriting them just to get rid of this message. life is too short).

BTW 2:

the c++ tmfe frontend does not have any protections against these "undesired side-effects".

What are these undesired side effects and should we add protection against them?

K.O.
          Reply  17 May 2022, Stefan Ritt, Info, RO_STOPPED with triggered events 
> > > some old front-ends are not running any more since they use RO_ALWAYS together with triggered events.
> > 
> > I confirm, if you have mfe.c frontends that have RO_ALWAYS, after you update MIDAS, 
> > some of these frontends will fail to start.
> > https://bitbucket.org/tmidas/midas/commits/1961af0d657e4f76ab9db17f9b70c0c492172b6d
> > 
> > tmfe c++ frontends do not have this restriction but by default only read data when run 
> > is active (per-equipment fEqConfReadOnlyWhenRunning default is true).
> 
> As of commit 
> https://bitbucket.org/tmidas/midas/commits/28d9c96bd6d4f65346ebcd6a04492ea764c90823 mfe.c 
> frontends will no longer fail to start. an error will still be issued "Equipment \"%s\" 
> contains RO_STOPPED or RO_ALWAYS. This can lead to undesired side-effect and should be 
> removed."
> 
> BTW 1:
> 
> Some of our old frontends use EQ_MULTITHREAD to implement multithreaded periodic equipments. 
> They do not generate any events when there is no run (some of them do not generate any 
> events at all). Now they will start printing this error message, for no reason. (no, we will 
> not be rewriting them just to get rid of this message. life is too short).
> 
> BTW 2:
> 
> the c++ tmfe frontend does not have any protections against these "undesired side-effects".
> 
> What are these undesired side effects and should we add protection against them?
> 
> K.O.

The undesired side-effects are the following: The logger tries to collect all events at the end of 
the run by emptying the SYSTEM buffer. If events keep coming after the run is stopped, this loop in 
the logger can become an endless loop, crashing the whole experiment in the end. 

Another issue (and actually the reason for this change) is the function receive_trigger_event() in 
mfe.cxx, which gets confused if events are still coming in after a run has been stopped and 
actually enters an infinite loop.

Combining EQ_MULTITHREAD with EQ_PERIODIC or EQ_SLOW is a wrong parameter combination as written in 
the documentation. If one wants to have multi-threaded slow control events, one has to use the 
DF_MULTITHREAD flag in the DEVICE_DRIVER structure.

Having triggered events sent to the system after a run has been stopped is something I would 
consider simply wrong. Why should we ever use a run start/stop if events are always flowing? Adding 
protections in all places for this case is certainly much more work than just changing one flag for 
frontends which now produce this error message for a wrong parameter combination.
Entry  24 Apr 2022, Konstantin Olchanski, Bug Fix, mserver buffer overrun and crash 
There is a memory allocation bug in the mserver.

ALIGN8() was missing when receiving events from the event socket and the data buffer 
was allocated 4 bytes too short, but only for some received events and only in 
a very unlucky sequence of received events. The result was a rare but obnoxious crash 
of the fevme frontend in alpha-2 at CERN. (we do not see any crash from this in 
alpha-g or anywhere else, as best I can tell).

fixed in commit 4dc06ba47ff7caa5251fd8c48d8533f35799f3a6.

If you use the mserver, please update to this commit or apply following patch in 
midas.cxx:

-   int bufsize = sizeof(INT) + event_size;
+   int bufsize = sizeof(INT) + total_size;
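For context, ALIGN8() rounds a size up to the next multiple of 8; a minimal sketch of the 
corrected allocation (assuming the standard midas.h macro, not a verbatim copy of the 
midas.cxx code):

#include "midas.h"   /* for ALIGN8(x): (((x)+7) & ~7) */

/* the receive buffer must cover the 8-byte-aligned total size, not just
   the raw event size, otherwise the final (up to 4) alignment padding
   bytes overrun the allocation */
int total_size = ALIGN8(event_size);
int bufsize = sizeof(INT) + total_size;
char *buf = (char *) malloc(bufsize);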

K.O.
    Reply  16 May 2022, Konstantin Olchanski, Bug Fix, mserver buffer overrun and crash 
> There is a memory allocation bug in the mserver.

Fix for this problem introduced a new problem, an infinite loop in bm_flush_cache, 
bitbucket bugs https://bitbucket.org/tmidas/midas/issues/339/infinite-loop-in-
mserver-due-to-mfes and https://bitbucket.org/tmidas/midas/issues/331/stuck-
semaphore-of-system-buffer

This is now fixed and the buffer write cache logic and size was rejigged
according to calculations in https://daq00.triumf.ca/elog-midas/Midas/2401

Event buffer write cache (as set via ODB Equipment/Common and via 
bm_set_cache_size()) now takes 2 possible values:
0 - write cache is disabled, and
MIN_WRITE_CACHE_SIZE (10 Mbytes) - the minimum permitted cache size.
Bigger cache sizes are permitted, up to buffer_size/3, but are probably not useful 
if my calculations are right; smaller cache sizes are generally not useful either.
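For example, a frontend could request the new cache size explicitly; a minimal sketch 
(assuming the bm_set_cache_size() call from midas.h; argument types may differ between 
midas versions):

#include "midas.h"

/* disable the read cache, request the minimum permitted write cache size */
int status = bm_set_cache_size(buffer_handle, 0, MIN_WRITE_CACHE_SIZE);
if (status != BM_SUCCESS)
   cm_msg(MERROR, "frontend_init", "cannot set write cache size, status %d", status);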

mfe.c and tmfe c++ frontends updated to request the new write cache size by default.

if events are getting stuck in the write cache for too long, instead of reducing the 
cache size, one should increase the frequency of bm_flush_cache() calls (1/sec by 
default).

commit 373bcc3ab7f83c3c7bf6c051c237de043a982502

K.O.
Entry  13 May 2022, Konstantin Olchanski, Info, analysis of corner cases in event buffer write cache 
introduction:

to remember, bm_send_event() writes an event to the write cache, bm_flush_cache() 
writes the contents of the write cache into the shared memory event buffer, buffer 
free space is consumed. in the usual case, mlogger is reading events from the shared 
memory event buffer, buffer free space is released. there is also a read cache, not 
part of this discussion.

the purpose of the write cache is to reduce contention for the shared memory 
semaphore. in the case of large number of small events, semaphore is locked per 
cache-flush, instead of per-event. correct tuning of write cache and event size can 
reduce lock rate from >100 kHz to around 100 Hz or lower.

analysis:

for correct operation of bm_send_event() under all conditions we need to consider 
all corner cases:

1) no write cache: (cache size set to 0)

- event_size > buffer_size -> reject the event (obviously)
- event_size > 0.5 * buffer_size -> only 1 event fits into the buffer, next write 
will stall until mlogger reads the previous event (sequential operation, bad)
- event_size < 0.3 * buffer_size -> at least 2 events fit into the buffer (good)

decision: limit event size to 0.5 to 0.3 * buffer_size (current limit is 0.5 * 
buffer_size, I think).

consequence: buffer size limit is 2 Gbytes (32-bit byte offsets, code is only 31-
bit-clean), max event size is between 1 Gbytes and 0.6 Gbytes.

2) writing to write cache:

- event_size > cache_size -> flush cache, write event to directly to buffer
- event_size > 0.5 * cache_size -> inefficient use of cache: write to cache, next 
event does not fit, flush to buffer, repeat. no gain in semaphore locking (bad), one 
additional memcpy() (event to cache and cache to buffer) (bad)
- event_size < 0.3 * cache_size -> multiple events fit into cache, but probably no 
gain in semaphore locking

decision: events that are bigger than 0.3 to 0.1 * cache_size should not go through 
the cache. (flush cache, write directly to buffer).
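a sketch of this routing decision (illustrative only, not the actual bm_send_event() code):

/* route an event directly to the buffer or through the write cache */
if (event_size > max_event_size) {
   return BM_INVALID_SIZE;                 /* case (1): reject oversize event */
} else if (cache_size == 0 || event_size > 0.3 * cache_size) {
   bm_flush_cache(buffer_handle, BM_WAIT); /* case (2): big event bypasses the
                                              cache, write directly to buffer */
   /* ... write event into the shared memory buffer ... */
} else {
   /* ... append event to the write cache ... */
}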

3) flush write cache to buffer:

- cache_size > buffer_size -> cannot flush in 1 operation, must have a loop and 
flush the cache in pieces
- cache_size between 0.5 and 1.0 * buffer_size -> can flush in 1 operation, but must 
wait for mlogger to fully empty the buffer (sequential operation, bad)
- cache size < 0.3 * buffer_size -> can flush in 1 operation, at least 2 "flushes" 
fit inside the buffer (good)

decision: limit write cache size to 0.3 * buffer_size. (current limit is 
0.25*buffer_size).

consequences:

- write cache size limit is 0.3..0.25 * 2GB = 0.6..0.5 Gbytes
- cached event size limit is 0.3..0.1 * 0.5 GBytes = 150..50 Mbytes
- minimum number of cached events: 3 to 10
- semaphore locks reduced: 3 to 10 locks become 1 lock (all events cached),
4 to 11 locks become 2 locks (big event causes cache flush).

4) complications:

- there is a periodic 1/second bm_flush_cache() that flushes the cache early and 
reduces its efficiency (but it is needed to avoid having data stuck in the cache for a 
long time)
- if multiple frontends use large write cache (~ 0.3..0.5 * buffer_size), again, 
sequential operation can happen (bad)
- write cache is per-frontend, not per-equipment. if different equipments request 
different cache sizes, mfe.c and tmfe c++ frontends complain about this, but the 
user has to sort it out.

K.O.
    Reply  16 May 2022, Konstantin Olchanski, Info, analysis of corner cases in event buffer write cache 
> for correct operation of bm_send_event() under all conditions we need to ...

to continue computation from last message:

default SYSTEM buffer size: 32 MiBytes
default max event size: 4 MiBytes

hard max buffer size: 2 Gbytes (code is only 31-bit-clean)
hard max event size: 2 Gbytes (code is only 31-bit-clean)

max event size currently: 32 Mbytes (same as buffer size)
max event size per (1) in previous post: 32*0.5..0.3 = 16..9 MiBytes

number of default-max-size events buffered: 32/4 = 8.
number of per (1) max-size events buffered: 2 or 3
number of current max-size events buffered: 0 (bad, frontend is serialized with mlogger)

default write cache size: 100 kbytes

max write cache size currently: buffer size / 4 = 32/4 = 8 MiBytes
max write cache size per (3) in previous post: buffer_size / 3 = 10 Mbytes
hard max write cache size per (3): 2 Gbytes/3 = 600 Mbytes

max size of cached events:

current: 100 kbytes (same as cache size)
per (2) in previous post: 0.1..0.3 * cache size = 10..30 kbytes
per (2), 1 Mbyte cache: 0.1..0.3 * cache size = 100..300 kbytes
hard max size: 0.1..0.3 * hard_max_cache_size = 0.1..0.3 * 600 = 60..180 Mbytes.

max data rate before event buffer semaphore locking rate exceeds 100 Hz:

1 kbyte events, no write cache: 100 kbytes/sec
1 kbyte events, 100 kbyte cache: 100 events cached, cache flush rate 100 Hz -> 100*1kbyte*100Hz -> 10 Mbytes/sec
1 kbyte events, 1 Mbyte cache: 1000 events cached, cache flush rate 100 Hz -> 100 Mbytes/sec (1gige ethernet)
N kbyte events, 1 Mbyte cache: same thing (data rate is limited by cache flush rate 100 Hz)
100 kbyte events, 1 Mbyte cache, not cached per (2): 100kbyte*100Hz = 10 Mbytes/sec
300 kbyte events, 1 Mbyte cache, not cached per (2): 300kbyte*100Hz = 30 Mbytes/sec
N00 kbyte events: N0 Mbytes/sec (500->50, etc)
1 kbyte events, 10 Mbyte cache: 10000 events cached, cache flush rate 100 Hz -> 1000 Mbytes/sec (10gige ethernet)
N kbyte events, 10 Mbyte cache: same thing (data rate is limited by cache flush rate 100 Hz)
1000 kbyte events, 10 Mbyte cache, not cached per (2): 1000kbyte*100Hz = 100 Mbytes/sec
3000 kbyte events, 10 Mbyte cache, not cached per (2): 3000kbyte*100Hz = 300 Mbytes/sec
N000 kbyte events: N00 Mbytes/sec (4000->400, 5000->500, etc)
default max event size: 4 Mibytes*100Hz = 400 Mbytes/sec (exceeds 1gige ethernet)
hard max event size (divided by 10 to buffer 10 events): 200 Mbytes*100Hz -> 20 Gbytes/sec

max event rate before event buffer semaphore locking rate exceeds 100 Hz:

1 kbyte events, no write cache: 100 Hz (obviously)
1 kbyte events, 100 kbyte cache: 100 events cached, cache flush rate 100 Hz -> 10 kHz
1 kbyte events, 1 Mbyte cache: 1000 events cached, cache flush rate 100 Hz -> 100 kHz
N kbyte events, 1 Mbyte cache: 1000/N events cached, cache flush rate 100 Hz -> 100/N kHz
1 kbyte events, 10 Mbyte cache: 10000 events cached, cache flush rate 100 Hz -> 1000 kHz
N kbyte events, 10 Mbyte cache: 10000/N events cached, cache flush rate 100 Hz -> 1000/N kHz
100 kbyte events, not cached per (2): 100 Hz (obviously)
300 kbyte events, not cached per (2): 100 Hz (obviously)
default max event size: 100 Hz (obviously)
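the arithmetic behind both tables is simply the cache size (or, for uncached events, the event size)
multiplied by the 100 Hz flush/lock rate; a tiny illustrative sketch (variable names are mine, not midas code):

/* rate limits at a given cache flush rate */
double flush_rate = 100.0;     /* Hz, target max semaphore locking rate */
double cache_size = 1e6;       /* bytes, e.g. 1 Mbyte write cache */
double event_size = 1e3;       /* bytes, e.g. 1 kbyte events */
double max_data_rate_cached   = cache_size * flush_rate;                /* 100 Mbytes/sec */
double max_data_rate_uncached = event_size * flush_rate;                /* event bypasses cache */
double max_event_rate_cached  = (cache_size / event_size) * flush_rate; /* 100 kHz */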

K.O.
       Reply  16 May 2022, Konstantin Olchanski, Info, analysis of corner cases in event buffer write cache 
> > for correct operation of bm_send_event() under all conditions we need to ...
> to continue computation from last message:

if I got my numbers right, for present-day hardware (1gige/10gige data rates, 100 Hz max locking rate), we should 
increase the default buffer write cache size from 100 kbytes to 10 Mbytes.

this cache size will permit processing of the full mix of small/big events
at the full mix of event rates without exceeding the 100 Hz semaphore locking rate.

with the 10 Mbyte write cache, default event buffer size should be 30-40 Mbytes (current size is 33 Mbytes, so does 
not need to change).

this computation is for 1 writer (1 reader, mlogger). it is a typical case for our experiments.

multiple writers can run into contention for event buffer space.

consider 10 writers want to flush their 10 Mbyte write cache all at the same time:

if buffer size is the default 33 Mbytes, the first 3 writers will have successful write cache flush,
but the other 7 will stall, there is no space in the buffer, we have to wait for mlogger to free
some (mlogger writing X Mbytes/sec will take Y milliseconds to liberate 10 Mbytes of space for the 4th writer
to successfully flush, writers 5..10 are still stalled).

but a system with 10 writers, each writing at 10 Mbytes/sec (1 Hz default cache flush rate), i.e.
100 Mbytes/sec total, will likely have a SYSTEM buffer size of at least 200-300 Mbytes (to buffer
1-2 seconds of data against any delays in writing to disk/network storage).

so there should be no problem in practice.

K.O.
Entry  06 May 2022, Stefan Ritt, Info, Increased timeout for program shut down 
We had the problem in our lab that a frontend took about 6 seconds to gracefully 
shut down, mainly it needed to park some motors. I found that the shutdown command 
had a hard-coded timeout of 5 seconds, after which the frontend gets killed, and 
cannot finish the park operation. I change the code so that the client timeout 
stored in the ODB is taken instead of the hard-coded 5 seconds. This allows each 
client to fine-tune its timeout, to allow graceful shutdown, but also not let the 
user wait too long if the client gets stuck and needs a hard kill.

The default timeout for mfe.cxx based frontends has been changed to 10 seconds 
now, but in the frontend_init function this can be changed by the user code 
easily.
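For example, a frontend that needs more time to shut down could raise its own timeout; a minimal 
sketch (assuming the standard cm_set_watchdog_params() call; whether mfe.cxx propagates exactly 
this value to the ODB may differ):

#include "midas.h"

INT frontend_init()
{
   /* allow up to 30 seconds for a graceful shutdown (e.g. parking motors) */
   cm_set_watchdog_params(TRUE, 30000);
   return SUCCESS;
}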

I hope this change does not trigger any bad side effects, but if it does, please 
report here.

Stefan
Entry  04 May 2022, Konstantin Olchanski, Bug Fix, mysql history update 
the code for writing midas history to mysql has been updated to work against 
MYSQL 8.0.23 (CERN ALPHA-2):

- as ever, mysql reports inconsistent data types (I create a column with type 
"integer", mysql reports it has type "int" and so forth), so the special kludge to 
take care of this had to be tweaked.

- this caused some columns to be marked "inactive" and the code to "reactivate" 
them was missing (fixed)

- binary history event data size was computed incorrectly for events with 
"inactive" columns (fixed) and caused assert() failure and mlogger crash.

- mysql read of column definitions for history event "system" (as in 
/history/links/system) bombed because of incorrect quoting (worked before, why? 
why bombed now?). this caused duplicate columns to be created in mysql table 
"system" and mlogger bomb-out with complaint about "duplicated columns" 
(actually the error message was missing, so it was a silent bomb-out). quoting 
fixed, missing error message fixed, but cleanup of duplicate columns has to be 
done by hand. in case of alpha-2 the fix was to remove the unused 
/history/links/system.

if you are using mysql history please update or patch src/history_schema.cxx.

commit 9d17d2fef233cf457121ca7c2a283c4c76ed33bc

K.O.
Entry  30 Apr 2022, Konstantin Olchanski, Info, added web pages for "show odb clients" and "show open records" 
for a long time, midas web pages have been missing the equivalent of odbedit 
"scl" and "sor" to display current odb clients and current odb open records.

this is now added as buttons "show open records" and "show odb clients" in the 
odb editor page.

as in odbedit, "sor" shows open records under the current subtree, i.e. if you 
are looking at /equipment, you will not see open records for /experiment. to see 
all open records, go to "/".

commit b1ab7e67ecf785744fff092708d8389f222b14a4

K.O.
    Reply  04 May 2022, Stefan Ritt, Info, added web pages for "show odb clients" and "show open records" 
Concerning the "scl" page, we are currently having a discussion. At the moment, one can 
see midas clients in three different places:

1) the main status page at the bottom, only names and hosts are there

2) the programs page, where one can also start/stop program

3) now the new page "Show ODB clients" in the ODB editor page, which shows also the 
alive status, PID and timeout

I'm thinking that three locations are two too many, so we are considering merging the 
three pages into one. That would mean that 1) goes away, and the "Programs" page will 
show more information. We have some rare cases where programs are removed from 
/System/Clients in the ODB but are still attached to the ODB. For those "zombies" we would 
add a "hard kill" function.

I would like to hear feedback from the midas community before we proceed with the 
plans. Anybody desperately in need of the programs shown on the status page?

Best,
Stefan
Entry  01 May 2022, Konstantin Olchanski, Info, added web page for "mdump" 
added JSON RPC for bm_receive_event() and added a web page for "mdump".

the event dump is a hex dump for now.

if somebody can contribute a javascript decoder for midas bank format, it would be greatly appreciated.

otherwise, I will eventually write my own decoder library patterned on midasio.h and midasio.cxx.

as of commit 5882d55d1f5bbbdb0d9238ada639e63ac27d8825
K.O.
    Reply  01 May 2022, Konstantin Olchanski, Info, added web page for "mdump" 
> added JSON RPC for bm_receive_event()

there is a number of problems with implementing bm_receive_event() as a RPC:

1) mhttpd has only 1 event buffer read pointer for all javascript connections; if two browser tabs are 
running mdump, they will "steal" events from each other.
2) javascript connections are state-less and we cannot specify per-connection event_id and trigger_mask 
filters to bm_receive_event(). our bm_request_event() has to be for all event_id and all trigger_mask.
3) for same reason, we cannot have some requests to be GET_ALL, some to be GET_RECENT and some to be 
GET_OLD (if GET_OLD is ever implemented).

Problem (1) is hard to fix. The only solution I can see is to have mhttpd keep its own event buffer that can 
somehow track which events have been sent to which javascript connection.

The same scheme allows implementing GET_ALL and per-connection event_id and trigger_mask filters.

The difficulty is in detecting javascript connections that are no longer active, so that their event 
requests and the events we have buffered for them can be deleted. Unlike proper rpc clients, javascript 
browser tabs can be closed without warning and without an opportunity to tell the rpc server that they 
are closed and gone.

K.O.
    Reply  01 May 2022, Konstantin Olchanski, Info, added web page for "mdump" 
> added a web page for "mdump".

missing functions:
- get a list of existing event buffers (should read event buffer names from /Experiment/Buffer sizes)
- selector box to select event buffer
- button for "get next" and "get new" (should call bm_skip_event() before bm_receive_event())
- entry fields for event_id and trigger_mask event filter
- check box for "keep getting new data" and entry field for update frequency
- (eventually) entry field for bank name filter

K.O.
       Reply  02 May 2022, Stefan Ritt, Info, added web page for "mdump" 
Here are some of my thoughts:

- I volunteer to write the JavaScript midas bank decoder. Just a couple of pure javascript functions, no 
midasio.cxx library needed.

- If different javascript connections "steal" events from each other, I would not be concerned. Actually I 
would rather like that all connections see the SAME event. So mhttpd keeps one event and serves it to all 
links, so displays are consistent. If a browser wants to see the "next" event, it sends the old serial 
number and says "please send next event AFTER serial number". If the serial number is larger than the 
event in the buffer, mhttpd fetches a new event and puts it into its buffer.

- Since javascript connections are connectionless, I would rather pass event_id and trigger_mask with each 
request. Then mhttpd can retrieve events until event_id and trigger_mask match, then serve that event. 
Since reading events from a midas buffer is fast (many 10'000s of events per second), there won't be much of 
a delay.

- GET_ALL does not make sense for browsers, you don't want to slow down any frontend. If someone wants to 
do histogramming in the browser, then GET_SOME (which is kind of GET_OLD) would make sense, but most of 
the cases we have some single event display, and there a GET_RECENT is most appropriate.
Entry  30 Apr 2022, Giovanni Mazzitelli, Forum, S3 Object Storage 
Dear all,
We are storing raw MIDAS files to S3 Object Storage, but MIDAS files are not 
optimised for readout from this kind of storage. Is there any workaround or planned 
evolution of the midas raw output or, beyond a simulated posix fs, a midas 
python library optimised to stream data from S3 (it is not really clear to me if this 
is possible)?
    Reply  30 Apr 2022, Konstantin Olchanski, Forum, S3 Object Storage 
> We are storing raw MIDAS files to S3 Object Storage, but MIDAS files are not 
> optimised for readout from this kind of storage. Is there any workaround or planned 
> evolution of the midas raw output or, beyond a simulated posix fs, a midas 
> python library optimised to stream data from S3 (it is not really clear to me if this 
> is possible)?

We have plans for adding S3 object storage support to lazylogger, but have not gotten 
around to it yet.

We do not plan to add this in mlogger. mlogger works well for writing data to locally-
attached storage (local ext4, XFS, ZFS) but always runs into problems with timeouts and 
delays when writing to anything network-attached (even writing to NFS).

I envision that each midas raw data file (mid.gz or mid.lz4 or mid.bz2) will
be stored as an S3 object and there will be some kind of directory object
to map object ids to run and subrun numbers.

Choice of best file size is open, normally we use subruns to limit file size to 1-2 
Gbytes. If cloud storage prefers some other object size, we can easily go up to 10 
Gbytes and down to "a few megabytes" (ODB dumps will have to be turned off for this).

Other than that, in your view, what else is needed to optimize midas files for storage 
in the Amazon S3 cloud?

P.S. For reading files from the cloud, code needs to be written and added to 
midasio/midasio.cxx, for example, see the code that is already there for reading ssh-
attached files and dcache/dccp-attached files. (CERN EOS files can be read directly 
from POSIX mount point /eos).

K.O.
       Reply  01 May 2022, Giovanni Mazzitelli, Forum, S3 Object Storage 
> > We are storing raw MIDAS files to S3 Object Storage, but MIDAS files are not 
> > optimised for readout from this kind of storage. Is there any workaround or planned 
> > evolution of the midas raw output or, beyond a simulated posix fs, a midas 
> > python library optimised to stream data from S3 (it is not really clear to me if this 
> > is possible)?
> 
> We have plans for adding S3 object storage support to lazylogger, but have not gotten 
> around to it yet.
> 
> We do not plan to add this in mlogger. mlogger works well for writing data to locally-
> attached storage (local ext4, XFS, ZFS) but always runs into problems with timeouts and 
> delays when writing to anything network-attached (even writing to NFS).
> 
> I envision that each midas raw data file (mid.gz or mid.lz4 or mid.bz2) will
> be stored as an S3 object and there will be some kind of directory object
> to map object ids to run and subrun numbers.
> 
> Choice of best file size is open, normally we use subruns to limit file size to 1-2 
> Gbytes. If cloud storage prefers some other object size, we can easily go up to 10 
> Gbytes and down to "a few megabytes" (ODB dumps will have to be turned off for this).
> 
> Other than that, in your view, what else is needed to optimize midas files for storage 
> in the Amazon S3 cloud?
> 
> P.S. For reading files from the cloud, code needs to be written and added to 
> midasio/midasio.cxx, for example, see the code that is already there for reading ssh-
> attached files and dcache/dccp-attached files. (CERN EOS files can be read directly 
> from POSIX mount point /eos).
> 
> K.O.

thanks, 
actually I made a small workaround with the python boto3 library, with files of any size (with 
the obvious limitation of the time one has to wait), e.g.:

import gzip
import boto3
from io import BytesIO

key = 'TMP/run00060.mid.gz'

# "creds" is my credential helper for the INFN cloud IAM
aws_session = creds.assumed_session("infncloud-iam")
s3 = aws_session.client('s3', endpoint_url="https://minio.cloud.infn.it/", 
                        config=boto3.session.Config(signature_version='s3v4'), verify=True)

# download the whole object and wrap it in a file-like buffer
s3_obj = s3.get_object(Bucket='cygno-data', Key=key)
buf = BytesIO(s3_obj["Body"]._raw_stream.data)

# iterate over midas events (MidasSream is my stream reader)
for event in MidasSream(gzip.GzipFile(fileobj=buf)):
    if event.header.is_midas_internal_event():
        print("Saw a special event")
        continue

    bank_names = ", ".join(b.name for b in event.banks.values())
    print("Event # %s of type ID %s contains banks %s" % (event.header.serial_number, 
          event.header.event_id, bank_names))
    ....


In MidasSream I just bypass the open, and the code works, but obviously this way I 
need to have the whole buffer in memory, and it takes time to fetch it all. I was interested to 
understand if someone has already developed streaming event by event (preferably in python but 
not mandatory). I'll look at the code you pointed out.
Thanks, G. 
 
Entry  16 Mar 2022, Stefan Ritt, Info, New midas sequencer version 
A new version of the midas sequencer has been developed and is now available in the 
develop/seq_eval branch. Many thanks to Lewis Van Winkle and his TinyExpr library 
(https://codeplea.com/tinyexpr), which has now been integrated into the sequencer 
and allows arbitrary math expressions. Here is a complete list of new features:


* Math is now possible in all expressions, such as "x = $i*3 + sin($y*pi)^2", or 
in "ODBSET /Path/value[$i*2+1], 10"


* "SET <var>,<value>" can be written as "<var>=<value>", but the old syntax is 
still possible.


* There are new functions ODBCREATE and ODBDELETE to create and delete ODB keys, 
including arrays


* Variable arrays are now possible, like "a[5] = 0" and "MESSAGE $a[5]"


If the branch works for us in the next days and I don't get complaints from 
others, I will merge the branch into develop next week.

Stefan
    Reply  22 Mar 2022, Stefan Ritt, Info, New midas sequencer version 
After several days of testing in various experiments, the new sequencer has
been merged into the develop branch. One more feature was added. The path to
the ODB can now contain variables which are substituted with their values.
Instead of writing

ODBSET /Equipment/XYZ/Setting/1/Switch, 1
ODBSET /Equipment/XYZ/Setting/2/Switch, 1
ODBSET /Equipment/XYZ/Setting/3/Switch, 1

one can now write

LOOP i, 3
   ODBSET /Equipment/XYZ/Setting/$i/Switch, 1
ENDLOOP

Of course it is not possible for me to test every possible script. So if you 
have issues with the new sequencer, please don't hesitate to report them 
back to me.

Best,
Stefan
       Reply  15 Apr 2022, Stefan Ritt, Info, New midas sequencer version sequencer.pdf
I prepared some slides about the new features of the sequencer and post them here so 
people can have a quick look and get some inspiration.

Stefan
Entry  05 Apr 2013, Konstantin Olchanski, Info, ODB JSON support 
odbedit can now save ODB in JSON-formatted files. (JSON is a popular data encoding standard associated 
with Javascript). The intent is to eventually use the ODB JSON encoder in mhttpd to simplify passing of 
ODB data to custom web pages. In mhttpd I also intend to support the JSON-P variation of JSON (via the 
jQuery "callback=?" notation).

JSON encoding implementation follows specifications at:
http://json.org/
http://www.json-p.org/
http://api.jquery.com/jQuery.getJSON/  (seek to JSONP)

The result passes validation by:
http://jsonlint.com/

Added functions:
   INT EXPRT db_save_json(HNDLE hDB, HNDLE hKey, const char *file_name);
   INT EXPRT db_copy_json(HNDLE hDB, HNDLE hKey, char **buffer, int *buffer_size, int *buffer_end, int 
save_keys, int follow_links);

For example of using this code, see odbedit.c and odb.c::db_save_json().
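A minimal usage sketch:

#include "midas.h"

HNDLE hDB, hKey;
cm_get_experiment_database(&hDB, NULL);
db_find_key(hDB, 0, "/test", &hKey);
db_save_json(hDB, hKey, "test.js");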

Example json file (shown below, after the notes):

Notes:
1) hex numbers are quoted "0x1234" - JSON does not permit "hex numbers", but Javascript will 
automatically convert strings containing hex numbers into proper integers.
2) "double" is encoded with full 15 digit precision, "float" with full 7 digit precision. If floating point values 
are actually integers, they are encoded as integers (10.0 -> "10" if (value == (int)value)).
3) in this example I deleted all the "name/key" entries except for "stringvalue" and "sbyte2". I use the 
"/key" notation for ODB KEY data because the "/" character cannot appear inside valid ODB entry names. 
Normally, depending on the setting of "save_keys" argument, KEY data is present or absent for all entries.

ladd03:midas$ odbedit
[local:testexpt:S]/>cd /test
[local:testexpt:S]/test>save test.js
[local:testexpt:S]/test>exit
ladd03:midas$ more test.js
# MIDAS ODB JSON
# FILE test.js
# PATH /test
{
  "test" : {
    "intarr" : [ 15, 0, 0, 3, 0, 0, 0, 0, 0, 9 ],
    "dblvalue" : 2.2199999999999999e+01,
    "fltvalue" : 1.1100000e+01,
    "dwordvalue" : "0x0000007d",
    "wordvalue" : "0x0141",
    "boolvalue" : true,
    "stringvalue" : [ "aaa123bbb", "", "", "", "", "", "", "", "", "" ],
    "stringvalue/key" : {
      "type" : 12,
      "num_values" : 10,
      "item_size" : 1024,
      "last_written" : 1288592982
    },
    "byte1" : 10,
    "byte2" : 241,
    "char1" : "1",
    "char2" : "-",
    "sbyte1" : 10,
    "sbyte2" : -15,
    "sbyte2/key" : {
      "type" : 2,
      "last_written" : 1365101364
    }
  }
}

svn rev 5356
K.O.
    Reply  10 May 2013, Konstantin Olchanski, Info, mhttpd JSON support 
> odbedit can now save ODB in JSON-formatted files.
> Added functions:
>    INT EXPRT db_save_json(HNDLE hDB, HNDLE hKey, const char *file_name);
>    INT EXPRT db_copy_json(HNDLE hDB, HNDLE hKey, char **buffer, int *buffer_size, int *buffer_end, int  save_keys, int follow_links);
> 

Added JSON encoding format to Javascript ODBCopy() ("jcopy"). Use format="json"; the Javascript example page has been updated with an example.

Also updated db_copy_json():
- always return NUL-terminated string
- "save_keys" values: 0 - do not save any KEY data, 1 - save all KEY data, 2 - save only KEY.last_written

odb.c, mhttpd.cxx, example.html
svn rev 5362
K.O.
       Reply  17 May 2013, Konstantin Olchanski, Info, mhttpd JSON-P support 
> 
> Added JSON encoding format to Javascript ODBCopy(path,format) ("jcopy"). Use format="json"; the Javascript example page has been updated with an example.
> 

More ODBCopy() expansion: format="json-p" returns data suitable for JSON-P ("script tag") messaging.

Also implemented multiple-paths for "jcopy" (similar to "jget"/ODBMGet()). An example ODBMCopy(paths,callback,format) is present in example.html (will move to mhttpd.js).

Added JSON encoding options:
- format="json-nokeys" will omit all KEY information except for "last_written"
- "json-nokeys-nolastwritten" will also omit "last_written"
- "json-nofollowlinks" will return ODB symlink KEYs instead of following them (ODBGet/ODBMGet always follows symlinks)
- "json-p" adds JSON-P encapsulation
All these JSON format options can be used at the same time, i.e. format="json-p-nofollowlinks"

To see how it all works, please look at examples/javascript1/example.html.

The new code seems to be functional enough, but it is still work in progress and there are a few problems:
- ODBMCopy() using the "xml" format returns gibberish (the MIDAS XML encoder has to be told to omit the <?xml> header)
- example.html does not actually parse any of the XML data, so we do not know if the XML encoding is okay
- JSON encoding has an extra layer of objects (variables.Variables.foo instead of variables.foo)
- ODBRpc() with JSON/JSON-P encoding not done yet.

mhttpd.cxx, example.html
svn rev 5364
K.O.
          Reply  31 May 2013, Konstantin Olchanski, Info, mhttpd JSON-P support 
> To see how it all works, please look at examples/javascript1/example.html.
> 
> - JSON encoding has an extra layer of objects (variables.Variables.foo instead of variables.foo)
>

This is now fixed. See updated example.html. Current encoding looks like this:

{
  "System" : {
    "Clients" : {
      "24885" : {
        "Name/key" : { "type" : 12, "item_size" : 32, "last_written" : 1370024816 },
        "Name" : "ODBEdit",
        "Host/key" : { "type" : 12, "item_size" : 256, "last_written" : 1370024816 },
        "Host" : "ladd03.triumf.ca",
        "Hardware type/key" : { "type" : 7, "last_written" : 1370024816 },
        "Hardware type" : 44,
        "Server Port/key" : { "type" : 7, "last_written" : 1370024816 },
        "Server Port" : 52539
      }
    },
    "Tmp" : {
...

odb.c, example.html
svn rev 5368
K.O.
    Reply  27 Sep 2013, Konstantin Olchanski, Info, ODB JSON support 
> odbedit can now save ODB in JSON-formatted files.
> 
> JSON encoding implementation follows specifications at:
> http://json.org/
> 
> The result passes validation by:
> http://jsonlint.com/
> 

A bug was reported in my JSON ODB encoder: NaN values are not encoded correctly. A quick review found this:

1) the authors of JSON smoked some bad mushrooms and specifically disallowed NaN and Inf values for floating point numbers: 
http://tools.ietf.org/html/rfc4627
2) most JSON encoders and decoders do reasonable and unreasonable things with NaN and Inf values. The worst ones encode them as zero. More bad 
mushrooms.

There is a quick survey at: http://lavag.org/topic/16217-cr-json-labview/?p=99058

<pre>
Some Javascript engines allow it since it is valid Javascript but not valid Json however there is no concensus.
cmj-JSON4Lua: raw tostring() output (invalid JSON).
dkjson: 'null' (like in the original JSON-implementation).
Fleece: NaN is 0.0000, freezes on +/-Inf.
jf-JSON: NaN is 'null', Inf is 1e+9999 (the encode_pretty function still outputs raw tostring()).
Lua-Yajl: NaN is -0, Inf is 1e+666.
mp-CJSON: raises invalid JSON error by default, but runtime configurable ('null' or Nan/Inf).
nm-luajsonlib: 'null' (like in the original JSON-implementation).
sb-Json: raw tostring() output (invalid JSON).
th-LuaJSON: JavaScript? constants: NaN is 'NaN', Inf is 'Infinity' (this is valid JavaScript?, but invalid JSON).
</pre>

For the MIDAS JSON encoder (and decoder) I have several choices:
a) encode NaN and Inf using the printf("%f") encoding (as strings, making it valid JSON)
b) encode NaN and Inf as strings using the Javascript special values: "NaN", "Infinity" and "-Infinity", see 
http://www.w3schools.com/jsref/jsref_positive_infinity.asp

I note that the Python JSON encoder does (b), see section 18.2.3.3 at http://docs.python.org/2/library/json.html

In either case, behaviour of the JSON decoder on the Javascript side needs to be tested. (Silent conversion to value of zero is not acceptable).

If anybody has a suggestion on this, please let me know.



P.S. If you do not know all about NaN, Inf, "-0" and other floating point funnies, please read:  https://www.ualberta.ca/~kbeach/phys420_580_2010/docs/ACM-Goldberg.pdf

P.P.S. If you ever used the type "float" or "double", used the "/" operator or the function "sqrt()" you also should read that reference.

K.O.
       Reply  09 Oct 2013, Konstantin Olchanski, Info, ODB JSON support 
> > odbedit can now save ODB in JSON-formatted files.
> A bug was reported in my JSON ODB encoder: NaN values are not encoded correctly.

Tested the browser-builtin JSON.stringify() function in google-chrome, firefox, safari, opera:
everybody encodes numeric values NaN and Inf as JSON value [null].

To me, this clearly demonstrates a severe defect in the JSON standard and in its Javascript implementation:
a) NaN, Inf and -Inf are valid, useful and commonly used numeric values defined by the IEEE754/854 standard (as opposed to the special value "-0", which is also defined by the standard, but is not nearly as useful)
b) they are all distinct numeric values, encoding them all into the same JSON value [null] is the same as encoding all even numbers into the JSON value [42].
c) on the decoding end, JSON value [null] is decoded into Javascript value [null], which works as 0 for numeric computation, so effectively NaN, Inf and -Inf are made equal to zero. A neat trick.

Note that (c) - NaN, Inf is same as 0 - eventually produces incorrect numerical results by breaking the IEEE754/854 standard specification that number+NaN->NaN, number+infinity->infinity, etc.

In MIDAS we have a requirement that results be numerically correct: if an ODB value is "infinity", the corresponding web page should not show "0".

In addition we have a requirement that JSON encoding should be lossess: i.e. ODB contents encoded by JSON should decode back into the same ODB contents.

To satisfy both requirements, I now encode NaN, Inf and -Inf as JSON string values "NaN", "Infinity" and "-Infinity". (Corresponding to the respective Javascript values).
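a sketch of this encoding rule (illustrative only, not the actual odb.c code):

#include <math.h>
#include <stdio.h>

/* encode a double as JSON; IEEE special values become strings */
const char* json_encode_double(double v, char* buf, int bufsize)
{
   if (isnan(v)) return "\"NaN\"";
   if (isinf(v)) return (v > 0) ? "\"Infinity\"" : "\"-Infinity\"";
   snprintf(buf, bufsize, "%.17g", v); /* enough digits to round-trip a double */
   return buf;
}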

Notes:
1) this is valid JSON
2) it survives decode/encode in the browser (ODBMCopy()/JSON.parse/modify some values/JSON.stringify/ODBMPaste() does not destroy these special values)
3) it is numerically correct for "NaN" values (Javascript [1+"NaN"] -> NaN)
4) it fails in an obvious way for Inf and -Inf values (Javascript [1+"Infinity"] is NaN instead of Infinity).

https://bitbucket.org/tmidas/midas/commits/82dd203cc95dacb6ec9c0a24bc97ffd45bb58427
K.O.
          Reply  17 Mar 2014, Konstantin Olchanski, Info, ODB JSON support 
> > > odbedit can now save ODB in JSON-formatted files.
> encode NaN, Inf and -Inf as JSON string values "NaN", "Infinity" and "-Infinity". (Corresponding to the respective Javascript values).

A new standard just came out - Oasis OData JSON format 4.0 - 
http://docs.oasis-open.org/odata/odata-json-format/v4.0/os/odata-json-format-v4.0-os.html

Section 7.1 reads:

> Values of types [...] Edm.Single, Edm.Double, and Edm.Decimal are represented as JSON numbers, except for NaN, INF, and –INF which are represented as strings.

This is consistent with what we do in MIDAS - encode special numbers as strings. For now I think we stay with Javascript-standard "Infinity", "-Infinity",
but if more standards start using "INF", "-INF", maybe we will switch. It is easy enough to support both encodings in the JSON parser and in the ODB decoder.

https://xkcd.com/927/
K.O.
             Reply  12 Apr 2022, Konstantin Olchanski, Info, ODB JSON support 
> > > > odbedit can now save ODB in JSON-formatted files.
> > encode NaN, Inf and -Inf as JSON string values "NaN", "Infinity" and "-Infinity". (Corresponding to the respective Javascript values).
> http://docs.oasis-open.org/odata/odata-json-format/v4.0/os/odata-json-format-v4.0-os.html
> > Values of types [...] Edm.Single, Edm.Double, and Edm.Decimal are represented as JSON numbers,
> except for NaN, INF, and –INF which are represented as strings "NaN", "INF" and "-INF".
> https://xkcd.com/927/

Per xkcd, there is a new json standard "json5". In addition to other things, numeric
values NaN, +Infinity and -Infinity are encoded as literals NaN, Infinity and -Infinity (without quotes):
https://spec.json5.org/#numbers

Good discussion of this mess here:
https://stackoverflow.com/questions/1423081/json-left-out-infinity-and-nan-json-status-in-ecmascript

K.O.
                Reply  13 Apr 2022, Stefan Ritt, Info, ODB JSON support 
> Per xkcd, there is a new json standard "json5". In addition to other things, numeric
> values NaN, +Infinity and -Infinity are encoded as literals NaN, Infinity and -Infinity (without quotes):
> https://spec.json5.org/#numbers

Just for curiosity: Is this implemented by the midas json library now?
                   Reply  13 Apr 2022, Konstantin Olchanski, Info, ODB JSON support 
> > Per xkcd, there is a new json standard "json5". In addition to other things, numeric
> > values NaN, +Infinity and -Infinity are encoded as literals NaN, Infinity and -Infinity (without quotes):
> > https://spec.json5.org/#numbers
> 
> Just for curiosity: Is this implemented by the midas json library now?

MIDAS encodes NaN, Infinity and -Infinity as javascript compatible "NaN", "Infinity" and "-Infinity",
this encoding is popular with other projects and allows correct transmission of these values
from ODB to javascript. The test code for this is on the MIDAS "Example" page, scroll down
to "Test nan and inf encoding".

I think this type of encoding, using strings to encode special values, is more in the spirit of json,
compared to other approaches such as adding special literals just for a few special cases,
leaving other special cases in the cold (ieee-754 specifies several different types of NaN;
you can encode them into different nan-strings, but not into the one nan-literal; one would need more
nan-literals, requiring a change to the standard and to every json parser).

As an editorial comment, it boggles my mind what university or kindergarten these people went to
who made the biggest number, the smallest number and the imaginary number (sqrt(-1))
all equal to zero (all encoded as literal null).

K.O.
Entry  31 Mar 2022, Stefan Ritt, Suggestion, Maximum ODB size 
Anybody some idea what the maximum ODB size can be? In the old days, the linux 
kernels had a severe limit on shared memory of usually 8MB, but in the age of 
64GB RAM being a standard, we should be able to grow bigger. Tried

odbinit -s 1024MB --cleanup

which went through without complain, even put that value in to .ODB_SIZE.TXT, but 
when I started odbedit doing "mem", I only see a size of 1MB. Probably somewhere 
deep inside we have a limit which prevents the user to create very large ODBs, 
but this should be mentioned more prominently in odbinit. Like "size too large, 
maximum allowd is xxx MB".

Stefan
    Reply  04 Apr 2022, Konstantin Olchanski, Suggestion, Maximum ODB size 
> Anybody some idea what the maximum ODB size can be?

It turns out the ODB size limit is hardwired in db_open_database() at 100 Mbytes.

I now committed an improved error message for this.

I confirm that "odbinit -s 100MB" works and creates ODB with 50 Mbyte data area and 50 
Mbyte key area.

> in the age of 64GB RAM being a standard, we should be able to grow bigger ...

I agree, I think we can safely bump the limit from 100 Mbytes to 1 Gbyte, maybe 1.5 or 
1.99 Gbytes. Above that we run into 32-bit/31-bit cleanliness problems.

And creating an extra large 1 GB ODB but using only a few megabytes will not waste any 
RAM, because the .ODB.SHM file is demand-paged and unused parts of the ODB will not be 
mapped into RAM. (It will waste disk space, the .ODB.SHM file will be 1 GByte in size).

However, 1 GByte (FPGA based) and 4-8 GByte (Raspberry Pi & co) machines are again
becoming popular and relevant for running MIDAS, and they have very slow "disk" 
subsystems, with NAND, SD and USB flash, so we should not go crazy here.

> odbinit -s 1024MB --cleanup

there is a bug in odbinit: if the initial odbinit fails, an ODB with the default size is created, 
and the original rejected ODB size is written to .ODB_SIZE.TXT (an inconsistency).

bitbucket bug 328

> [ how do I resize ODB ??? ]

we need odbresize. bitbucket bug 329.

K.O.
Entry  31 Mar 2022, Konstantin Olchanski, Bug Fix, "run stop" trouble in mlogger, fixed 
while debugging something else, I ran into a bit of trouble in mlogger.

I set the mlogger event limit to 100, and after reaching 100 events, mlogger
said "stopping run", but nothing happened, the run kept going.

it turns out mlogger tried stopping the run too soon, the run-start
transition did not finish yet and the error message about trying
to stop a run while another transition is in progress was missing.

(fixed - if another transition is in progress, we try again later)

it also turns out that cm_transition() checks if another transition
is in progress way too late, all the way in the transition thread,
where it cannot return an error to mlogger.

(fixed - first thing done in cm_transition() is this check).

while debugging this, I tested the ODB flags "/Logger/Async transitions"
and /Logger/Multithread transitions". It turns out only two transition
types still work from inside mlogger - multithread transition
and detached transition (via the mtransition helper).

the issue is the dead lock between mlogger and the frontends. while mlogger
is inside cm_transition(), it is not reading the SYSTEM buffer,
while at the same time frontends are writing into it. If the SYSTEM
buffer happens to be pretty full, we dead lock - frontends waiting
for free space in the SYSTEM buffer do not respond to RPCs, mlogger is not 
reading from the SYSTEM buffer and is stuck trying to issue the "run stop" RPC
to the frontends. (this dead lock is not forever, eventually the frontend
is killed by RPC timeout, mlogger survives and stops the run).

this is a well known problem and as solution, mlogger has been using the 
multithreaded transitions for years.

now I removed the ODB /Logger/Async transition and /Logger/Multithread 
transition flags; instead, there is now a flag /Logger/Detached transitions,
set to FALSE by default. Setting it to TRUE will cause mlogger to fork 
"mtransition STOP" and "mtransition START" for stopping and starting runs,
which is useful in case there is trouble with multithreading in mlogger.
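For example, to turn it on from the shell (a sketch; shell quoting may vary):

odbedit -c 'set "/Logger/Detached transitions" y'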

K.O.
Entry  30 Mar 2022, Konstantin Olchanski, Bug Fix, erroneous removal of odb clients, fixed 
commit https://bitbucket.org/tmidas/midas/commits/b1fe21445109774be3f059c2124727b414abf835
made on 2022-02-21 fixed a serious bug in ODB.

a multithread race condition against an incorrectly updated shared variable caused removal of 
random clients from ODB with error message:

My client index %d in ODB is invalid: out of range 0..%d. Maybe this client was removed by a 
timeout, see midas.log. Cannot continue, aborting...

the race is between db_open_database() in one program (executed when any midas program starts) and 
db_get_my_client_locked() in all running midas programs.

as long as no midas programs are started (db_open_database() is not executed), this bug does not 
happen.

if odbedit is executed very often, e.g. from a script, the probability of hitting this bug becomes 
quite high.

fixed now.

K.O.
Entry  29 Mar 2022, Konstantin Olchanski, Bug Fix, mdump can read lz4 and bz2 files now 
I converted mdump file i/o from older mdsupport library to newer midasio library 
and it can now read .mid, .mid.gz, .mid.lz4 and .mid.bz2 files. Output should be 
identical to what it printed before, if you see any differences, please report 
them here or on bitbucket. K.O.
Entry  29 Mar 2022, Hunter Lowe, Forum, Triggering without LAM signal - mcstd_libgpmc_camac driver 
Hello,

I have a question for anyone experienced with simple CAMAC systems.
 My understanding is that for a single-ADC system you can use a gate to generate a
 LAM signal for triggering on the ADC.
 The driver that I have, "mcstd_libgpmc_camac", has LAM "not implemented" though,
 so I'm not sure how I should trigger the DAQ. The frontend code that I have seems to use a TDC
 as trigger for the ADC via the "EQ_POLLED" equipment type setting. Should I simply plug the TDC into my
 system and use this as the trigger? Is it as simple as the TDC generating a signal via the gate and the ADC doing its job? 

Sorry if the question is super basic, I am just confused about how to trigger without a LAM signal.

Thank you :)

Hunter Lowe
UNBC Grad Physics
Entry  12 Dec 2021, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  dummy_fe.cpp
Dear all,

from 13 Feb 2020 to 21 Feb 2020 we had a discussion about how I try to create MIDAS events directly on an FPGA and 
then use DMA to hand the events over to MIDAS. In the thread I also explained how I do it in my MIDAS frontend. 

For testing the DAQ I created a dummy frontend which was emulating my FPGA (see attached file). The interesting code is 
in the function read_stream_thread, where I just fill an array according to the 32b BANKS which are 64b aligned (more or less
lines 306-369). And then I do:

    uint32_t * dma_buf_volatile;
    dma_buf_volatile = dma_buf_dummy;

    copy_n(&dma_buf_volatile[0], sizeof(dma_buf_dummy)/4, pdata);

    pdata+=sizeof(dma_buf_dummy);
    rb_increment_wp(rbh, sizeof(dma_buf_dummy)); // in byte length

to send the data to the buffer.

This summer (May - July) everything was working fine but today I did not get the data into MIDAS. 
I was hopping around a bit between commits and everything was at least working until: 3921016ce6d3444e6c647cbc7840e73816564c78.

Thanks,
Marius
    Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Writting MIDAS Events via FPGAs  
> today I did not get the data into MIDAS. 

Any error messages printed by the frontend? any error message in midas.log? core dumps? crashes?

I do not understand what you mean by "did not get the data into midas". You create events
and send them to a midas event buffer and you do not see them there? With mdump?
Do you see this both connected locally and connected remotely through the mserver?

BTW, I see you are using the mfe.c frontend. Event data handling in mfe.c frontends
is quite convoluted and impossible to straighten out. I recommend that you use
the tmfe c++ frontend instead. Event data handling is much simplified and is easier to debug
compared to the mfe.c frontend. There are examples in the midas repository and there are
tutorials for converting frontends from mfe.c to tmfe posted in this forum.

BTW, the commit you refer to only changed some html files, could not have affected
your data.

K.O.
       Reply  26 Jan 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
> Any error messages printed by the frontend? any error message in midas.log? core dumps? crashes? 
> I do not understand what you mean by "did not get the data into midas". You create events
> and send them to a midas event buffer and you do not see them there? With mdump?
> Do you see this both connected locally and connected remotely through the mserver?

I simply don't see the event counter counting up and I also don't see them using mdump. No logs, no dumps and no crashes - everything is quiet. I only tested it locally.
 
> BTW, I see you are using the mfe.c frontend. Event data handling in mfe.c frontends
> is quite convoluted and impossible to straighten out. I recommend that you use
> the tmfe c++ frontend instead. Event data handling is much simplified and is easier to debug
> compared to the mfe.c frontend. There is examples in the midas repository and there are
> tutorials for converting frontends from mfe.c to tmfe posted in this forum here.

I know the code I used is really old, that's why I was so surprised that it suddenly did not work. But I am on the way to changing it. Also Stefan gave me some comments on how to improve the code. But still, changing them did not really change the behavior. 

> BTW, the commit you refer to only changed some html files, could not have affected
> your data.

I just hopped around and the commit I sent was the first one which worked again. But it's of course not the one where the stuff broke. I did a bit of git-bisect and ended up with this commit as the first one where my frontend is not working anymore: 91582e4172d534bf9b10e661a423c399fd1a69f4

Cheers,
Marius
          Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Writting MIDAS Events via FPGAs  
> 
> > Any error messages printed by the frontend? any error message in midas.log? core dumps? crashes? 
> > I do not understand what you mean by "did not get the data into midas". You create events
> > and send them to a midas event buffer and you do not see them there? With mdump?
> > Do you see this both connected locally and connected remotely through the mserver?
> 
> I simply don't see the event counter counting up and I also don't see them using mdump. No logs, no dumps and no crashes - every is quite. I only tested it locally.
>

If you are connected locally (no mserver), I want to know the value returned by bm_send_event(). Simplest
is to edit mfe.c and, everywhere it calls bm_send_event() and rpc_send_event(), print the returned value.

It would be very interesting to see if bm_send_event() returns 1 (SUCCESS), but the event vanishes
without a trace.

Before you do that, try something simpler:

Run "mdump -s -d", it will print some event buffer internals.

Watch to see if any data pointers change when you send your events ("wp", "rp", etc).

If nothing changes at all, then we are not sending anything (the fault is in your code or in mfe.c).

If you see "wp" counting up, then we definitely write your events into the buffer and mdump & mlogger should see them.

But there is some funny logic for event_id and trigger_mask and it is worth checking their
values. For a good test, set event_id=1 and trigger_mask=0x1. There might be trouble if either is set to zero.

K.O.
             Reply  26 Jan 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
> If you are connected locally (no mserver), I want to know the value returned by bm_send_event(). Simplest
> if you edit mfe.c and everywhere it calls bm_send_event() and rpc_send_event(), print the returned value.
> 
> It would be very interesting to see if bm_send_event() returns 1 (SUCCESS), but the event vanishes
> without a trace.

I checked bm_send_event(rbh, (EVENT_HEADER*)(&pdata[0]), 0, 20); which gives me back 1. I also checked the status of rb_increment_wp which is also 1.

> Before you do that, try something simpler:
> Run "mdump -s -d", it will print some event buffer internals.
> Watch to see if any data pointers change when you send your events ("wp", "rp", etc).

"rp" & "wp" are not counting up. 

> But there is some funny logic for event_id and trigger_mask and it is worth checking their
> values. For a good test, set event_id=1 and trigger_mask=0x1. There might be trouble if either is set to zero.

Changing both to 0x1 did not change the behavior. 

Cheers,
Marius
    Reply  28 Jan 2022, Stefan Ritt, Bug Report, Writting MIDAS Events via FPGAs  dummy_fe.cpp
I finally got the dummy program working. There were several issues:

- event_buffer_size was defined as 10000 * 32 MB = 320 GB, exceeding the RAM of the computer

- SERIAL number starting with 1. Actually in midas, event serial numbers always started with zero, but this was wrong in the documentation at 
https://midas.triumf.ca/MidasWiki/index.php/Event_Structure, so I also fixed the documentation

- the event header time stamp must be seconds since 1.1.1970, and thus the function ss_time() should be used to set it

- calling set_equipment_status() for each event slows down the event collection considerably, since this function access the ODB each time

- dma_buf_dummy is defined inside the event loop, so it gets allocated and de-allocated on the stack for each event. Of course this might vanish 
when the real FPGA buffer is used.

- The line pdata+=sizeof(dma_buf_dummy); is wrong. pdata is a pointer to uint32_t, but the sizeof() operation returns the size of 
dma_buf_dummy in bytes. Therefore, pdata gets incremented by four times the size of dma_buf_dummy (see the corrected line after this list)

- Instead the call to std::this_thread::sleep_for(std::chrono::milliseconds(2000)); one can call the standard midas call ss_sleep(2000); which 
is a bit shorter

- Finally, sending many events to the ring buffer triggered a bug in the midas ring buffer functions which had been lingering there since 2007. I'm 
glad that this happened and it could now be fixed. Not sure if other experiments were affected in the last decade by that. This could have 
manifested itself in lost events or crashing front-ends. Anyhow, now it's fixed. You need to update midas to get the fix.
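The corrected increment from the bullet above would look like this (a sketch):

/* pointer arithmetic on uint32_t* advances in 4-byte units, so
   increment by the number of 32-bit words, not by the byte count */
pdata += sizeof(dma_buf_dummy) / sizeof(uint32_t);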

I attached a working version of the dummy program for your reference. Banks are different but the principle should become clear.

Stefan
       Reply  16 Feb 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
I just came back to this and started to use the dummy frontend.
Unfortunately, I have a problem during run cycles: 

Starting the frontend and starting a run works fine -> seeing events with mdump and also on the web GUI. 
But when I stop the run and try to start the next run the frontend is sending no events anymore.
It gets stuck at line 221 (if (status == DB_TIMEOUT)).
I tried to reduce the nEvents to 1 which helped in terms of DB_TIMEOUT but still I don't get any events after I did a stop / start cycle -> no events in mdump and no events counting up at the web GUI.
If I kill the frontend in the terminal (ctrl+c) and restart it, while the run is still running, it starts to send events again.

Cheers,
Marius
          Reply  03 Mar 2022, Stefan Ritt, Bug Report, Writting MIDAS Events via FPGAs  
> Starting the frontend and starting a run works fine -> seeing events with mdump and also on the web GUI. 
> But when I stop the run and try to start the next run the frontend is sending no events anymore.
> It get stuck at line 221 (if (status == DB_TIMEOUT)).
> I tried to reduce the nEvents to 1 which helped in terms of DB_TIMEOUT but still I don't get any events after I did a stop / start cycle -> no events in mdump and no events counting up at the web GUI.
> If I kill the frontend in the terminal (ctrl+c) and restart it, while the run is still running, it starts to send events again.

This problem has (likely) been fixed in the current version. Please pull develop and try again. It was a recursive call to the event collection routine which is only triggered if you send events faster than 
the logger can digest, so not many people see it.

Best,
Stefan
             Reply  07 Mar 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
> This problem has (likely) been fixed in the current version. Please pull develop and try again. Was a recursive call to the event collection routine which is only triggered if you send events faster than 
> the logger can digest, so not many people see it.

I just pulled the current version (d945fa9) but the problem as explained in 2347 stays the same.

Best,
Marius
                Reply  25 Mar 2022, Marius Koeppel, Bug Report, Writting MIDAS Events via FPGAs  
I finally found the problem why the readout stops after a run transition. 

In my dummy frontend the serial number was not reset to zero at run start. 
This leads to a mismatch of the serial number in the function receive_trigger_event of mfe.cxx:1247,
which then results in the function never finding a new event in any of the ring buffers, so nothing gets read out of the buffer.
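
For reference, a minimal sketch of the fix (g_serial is an illustrative name for the counter that the readout threads stamp into the event header):

#include <atomic>
#include "midas.h"

static std::atomic<uint32_t> g_serial{0};  // illustrative per-frontend counter

INT begin_of_run(INT run_number, char *error)
{
   // reset, so the first event of the new run carries serial number 0,
   // matching what receive_trigger_event() in mfe.cxx expects
   g_serial = 0;
   return SUCCESS;
}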

Nevertheless, it would be nice if the system told the user that there is a mismatch in the serial numbers (printing a warning / error etc.). 

Cheers,
Marius
Entry  23 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd bug fixed 
the mhttpd bug should be fixed now (branch feature/buffer_mutex).

simplest way to reproduce:

wget http://localhost:8080/
quickly ctrl-C it
wget http://localhost:8080/
inside mhttpd (by hook or crook) observe that the second wget got the data meant for the first wget.

if you cannot ctrl-C the first wget quickly enough, put a sleep somewhere in the worker thread (in 
mongoose_write(), I think).

this is what happens.

1st wget stops (by ctrl-C), socket is closed, mongoose frees its mg_connection object
(corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)

2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
but malloc() gives it back the same memory we just freed(), and the 1st wget's worker thread
nc pointer is no longer stale, but points to 2nd wget's connection.

so we think we are clever and we check the socket file descriptors. but the same thing
happens there, too. if 1st wget was file descriptor 7, it is closed (1st wget worker now has
a stale file handle), then reopened for the 2nd wget, per POSIX, we get back the same
file descriptor 7. 1st wget worker now has the file handle for the 2nd wget tcp socket and
the famous test/crash for "sending data to wrong socket" is defeated.

now, worker thread for the 1st wget wants to send a reply, it has a valid nc pointer (points to 2nd wget's
mg_connection object) and a valid file descriptor (points to 2nd wget's tcp socket),
reply meant for the 1st wget is successfully sent to the 2nd wget, 2nd wget finishes, its socket
is closed, mg_connection object is free'd. Now the worker thread for the 2nd wget has stale
connection info, but this is okay, mongoose does not find a matching connection, 2nd wget
worker thread reply goes nowhere, thread finishes silently (no memory leaks here, I checked).

so, connection for 2nd wget completely impersonates the closed connection of 1st wget (I guess I could
check the full socket address info, remote ip address, remote port number, etc, but...)

in practice, this bug does not happen often because modern browsers tend to keep tcp sockets open
for a very long time. (not sure about sundry web proxies, etc).

solution of course is very simple. match worker thread data to mongoose mg_connection objects
using our own connection sequential numbers, which are unique and very easy to keep track
of through the mongoose event handler. all this mess runs in the main thread,
so no locking trouble here, small blessing.
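
roughly like this (a sketch with illustrative names, not the actual mhttpd code):

#include <cstdint>
#include <map>

struct mg_connection;   // from mongoose.h

// all of this runs in the single mongoose/main thread
static uint32_t seqno_counter = 0;
static std::map<uint32_t, mg_connection *> open_conns;

// on MG_EV_ACCEPT:  open_conns[++seqno_counter] = nc;  remember seqno with nc
// on MG_EV_CLOSE:   open_conns.erase(seqno_of(nc));    seqno_of() is illustrative
//
// when a worker thread hands back a reply tagged with the seqno of the
// connection it was generated for:
//    auto it = open_conns.find(reply_seqno);
//    if (it == open_conns.end())
//       return;               // connection is gone, silently drop the reply
//    send_reply(it->second);  // guaranteed to be the original connection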

K.O.
    Reply  24 Mar 2022, Stefan Ritt, Bug Fix, mhttpd bug fixed 
> 1st wget stops (by ctrl-C), socket is closed, mongoose frees its mg_connection object
> (corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)
> 
> 2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
> but malloc() gives it back the same memory we just freed(), and the 1st wget's worker thread
> nc pointer is no longer stale, but points to 2nd wget's connection.

Why don't we CLEAR the memory (memset(object,0,sizeof(object))) before the free()? This way it cannot be 
mistakenly re-used by the next thread.

Stefan
       Reply  24 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd bug fixed 
> > 1st wget stops (by ctrl-C), socket is closed, mongoose frees its mg_connection object
> > (corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)
> > 
> > 2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
> > but malloc() gives it back the same memory we just freed(), and the 1st wget's worker thread
> > nc pointer is no longer stale, but points to 2nd wget's connection.
> 
> Why don't we CLEAR the memory (memset(object,0,sizeof(object))) before the free()? This way it cannot be 
> mistakenly re-used by the next thread.
> 

My description was unclear. I will try better now.

When http replies are generated by worker threads, matching of reply to mg_connection is done
by checking the address of the mg_connection object. (mongoose itself unhelpfully offers
to send the reply to every mg_connection, see the responder to mg_broadcast() messages).

This works for open/active connections, addresses of all mg_connections are unique.

But if connection is closed and a new connection is opened, the address is reused (by malloc()/free()
reusing memory blocks or by mongoose using a pool of mg_connection objects, does not matter).

So matching http reply to mg_connection using only address of mg_connection can match the wrong connection.

(the contents of the mg_connection object do not matter, only the address is used for matching, so memzero() of
the mg_connection object does not help).

I saw this during my testing - wrong data was sent to the wrong browser often enough - but did
not understand that the above problem was happening.

Because I was unable to reliably reproduce the problem, I could not debug it. I tried to add
a check for the tcp socket file descriptor number, in case there is a straight bug or multithread race
or simple memory corruption. This replaced "we sent wrong data to wrong browser, poisoned browser
cache, confused the user" with a crash. This "fix" seemed effective at the time.

Maybe I should mention browser cache poisoning again. What happened is html pages and rpc replies
were returned as responses to requests for things like CSS files; these bad responses are cached by the browser
pretty much forever, so all subsequent midas pages will look wrong (bad css!) forever, until the
user manually clears the browser cache. reload of the page did not help, restart of the browser did not help (I think).

So a very bad bug.

Unfortunately, the check for file descriptor was not effective because file descriptors are also
reused. And I did see wrong data returned by mhttpd, but even more rarely. And everybody (myself
included) complained about mhttpd crashes.

Now, matching of responses to connections is done by connection sequential/serial number,
which is a unique 32-bit counter. Mismatch of reply to connection should not happen again.

P.S. Latest version of the mongoose web server library does not help with this problem,
the example code for matching reply to connection in their multithread example looks bogus:
https://github.com/cesanta/mongoose/blob/master/examples/multi-threaded/main.c

K.O.
          Reply  24 Mar 2022, Stefan Ritt, Bug Fix, mhttpd bug fixed 
I see, now I understand.

As for the browser cache problem: This Chrome extension is your friend: 

https://chrome.google.com/webstore/detail/clear-cache/cppjkneekbjaeellbfkmgnhonkkjfpdn?hl=en

I use it every time I change the CSS or a JS file. Having the "Developer Tools" open in Chrome helps as well 
(cache is then turned off). Firefox has similar extensions.

Stefan
             Reply  24 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd bug fixed 
> As for the browser cache problem: This Chrome extension is your friend ...

for google chrome, it is easy, open the javascript debugger (left-click "inspect"),
the reload button becomes a left-click menu, one left-click option is "clear cache and reload".
(there is no button for "clear cookies and reload", re recent elog cookie problem).

but this does not help me personally. if midas web pages get confused, I will get confused, too,
and I will spend hours debugging mhttpd before thinking "hmm... maybe I should clear the browser cache!"

not sure about firefox, safari, microsoft edge and opera. if I ever need it, I google it.

K.O.
Entry  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
Dear all

We just started our beam time at ILL and just found yesterday that for certain 
settings of our detector the data is not saved into the .mid files. Running "mdump 
-l 10" online we see the data coming in as they should. Nevertheless, if we run 
"mdump -x runXXXXXX.mid" offline, the data file has no events and the banks are 
missing. Any ideas where the data could get lost?

Thanks in advance,
Ivo
    Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
> Dear all
> 
> We just started our beam time at ILL and just found yesterday that for certain 
> settings of our detector the data is not saved into the .mid files. Running "mdump 
> -l 10" online we see the data coming in as they should. Nevertheless, if we run 
> "mdump -x runXXXXXX.mid" offline, the data file has no events and the banks are 
> missing. Any ideas where the data could get lost?
> 
> Thanks in advance,
> Ivo

Have you checked 

/Logger/Channels/0/Settings/Event ID = -1
/Logger/Channels/0/Settings/Trigger mask = -1

If these settings are not -1, they filter the data stream for certain events and trigger 
masks.

Stefan
       Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> > Dear all
> > 
> > We just started our beam time at ILL and just found yesterday that for certain 
> > settings of our detector the data is not saved into the .mid files. Running "mdump 
> > -l 10" online we see the data coming in as they should. Nevertheless, if we run 
> > "mdump -x runXXXXXX.mid" offline, the data file has no events and the banks are 
> > missing. Any ideas where the data could get lost?
> > 
> > Thanks in advance,
> > Ivo
> 
> Have you checked 
> 
> /Logger/Channels/0/Settings/Event ID = -1
> /Logger/Channels/0/Settings/Trigger mask = -1
> 
> If these settings are not -1, they filter the data stream for certain events and trigger 
> masks.
> 
> Stefan

Good morning Stefan

Both set to -1. We only have one logging channel. If we run a sequence with a few runs and the 
same settings, sometimes data is in the .mid file and sometimes it is not.

Best,
Ivo 
          Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
> Both set to -1. We only have one logging channel. If we run a sequence with a few runs and the 
> same settings, sometimes data is in the .mid file and sometimes it is not.

Then I'm running out of ideas. Things I would check:

- Are the file sizes about the same? 

- When you dump the .mid file, do you see your bank names? 

This would tell you if the events are really missing or if mdump would just not find them.

But I guess without being able to debug the system at ILL I cannot be of any more help. You are the 
first one reporting such a problem, so it must have to do with your local setup.

Stefan
             Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> Then I'm running out of ideas. Things I would check:
> 
> - Are the file sizes about the same? 
> 
> - When you dump the .mid file, do you see your bank names? 
> 
> This would tell you if the events are really missing or if mdump would just not find them.
> 
> But I guess without being able to debug the system at ILL I cannot be of any more help. You are the 
> first one reporting such a problem, so it must have to do with your local setup.
> 
> Stefan

So I did a quick check. The file size is about the same (322K and 329K). When I dump the .mid I don't see 
the banks. It only prints two lines with "------ Event# 0 ------" and "------ Event# 1 ------" whereas for 
the file with data I get the two banks with all the data. Our online analyzer also fails to see the banks. 
Is there another way to check what is in the .mid file?

Best,
Ivo
                Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
> So I did a quick check. The file size is about the same (322K and 329K). When I dump the .mid I don't see 
> the banks. It only prints two lines with "------ Event# 0 ------" and "------ Event# 1 ------" whereas for 
> the file with data I get the two banks with all the data. Our online analyzer also fails to see the banks. 
> Is there another way to check what is in the .mid file?

with "dump" I meant a true object dump like "hexdump -C run000001.mid". I produced a file with ADC0 and TDC0 
banks (that's the example from the distribution under examples/experiment/frontend.cxx), and I get

....
00024220  01 00 00 00 41 44 43 30  04 00 08 00 eb 06 35 04  |....ADC0......5.|
00024230  31 09 4f 06 54 44 43 30  04 00 08 00 93 04 fb 07  |1.O.TDC0........|
00024240  5c 09 88 0b 01 00 00 00  01 00 00 00 2a 0b 31 5f  |\...........*.1_|
00024250  28 00 00 00 20 00 00 00  01 00 00 00 41 44 43 30  |(... .......ADC0|
00024260  04 00 08 00 c3 09 24 05  85 05 f3 06 54 44 43 30  |......$.....TDC0|
00024270  04 00 08 00 88 08 2d 03  3b 0d d6 02 01 00 00 00  |......-.;.......|
00024280  02 00 00 00 2a 0b 31 5f  28 00 00 00 20 00 00 00  |....*.1_(... ...|
00024290  01 00 00 00 41 44 43 30  04 00 08 00 a5 0a 69 09  |....ADC0......i.|

where you clearly see the ADC0 and TDC0 banks.

Stefan
                   Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> with "dump" I meant a true object dump like "hexdump -C run000001.mid". I produced a file with ADC0 and TDC0 
> banks (that's the example from the distribution under examples/experiment/frontend.cxx), and I get
> 
> ....
> 00024220  01 00 00 00 41 44 43 30  04 00 08 00 eb 06 35 04  |....ADC0......5.|
> 00024230  31 09 4f 06 54 44 43 30  04 00 08 00 93 04 fb 07  |1.O.TDC0........|
> 00024240  5c 09 88 0b 01 00 00 00  01 00 00 00 2a 0b 31 5f  |\...........*.1_|
> 00024250  28 00 00 00 20 00 00 00  01 00 00 00 41 44 43 30  |(... .......ADC0|
> 00024260  04 00 08 00 c3 09 24 05  85 05 f3 06 54 44 43 30  |......$.....TDC0|
> 00024270  04 00 08 00 88 08 2d 03  3b 0d d6 02 01 00 00 00  |......-.;.......|
> 00024280  02 00 00 00 2a 0b 31 5f  28 00 00 00 20 00 00 00  |....*.1_(... ...|
> 00024290  01 00 00 00 41 44 43 30  04 00 08 00 a5 0a 69 09  |....ADC0......i.|
> 
> where you clearly see the ADC0 and TDC0 banks.
> 
> Stefan

So at least I learned something new. I tried it with the hexdump and the banks are not present in the .mid file. I 
only have the ODB inside the file. The 7K difference in size is actually just about what I expect for the data 
(1792 x 4 bytes)

Best, Ivo
                      Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
Have you tried longer files? Maybe a few 100 MB or so. Maybe a buffer is not flushed correctly at the end of a run.
                         Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> Have you tried longer files? Maybe a few 100 MB or so. Maybe a buffer is not flushed correctly at the end of a run.

Yes, I did. This 7 KB of the data bank is about the limit. If we go only 1 KB higher it seems that we save all data. In 
our specific case, this is the number of time bins (256 pixels with 7 time bins results in data loss, with 8 time bins it 
seems to be okay, data type is DWORD). 

Of course, a workaround for us is to save at least 8 time bins and throw 7 of them away later on. Nevertheless, since we 
are only in the commissioning phase now this is okay, I would just like to avoid data loss in the data taking phase of the 
experiment, so knowing where the problem originates could help. 

I did another test with another FE running that produces a lot of data. The behavior is the same though. If the bank size 
is less than about 8 KB, the bank is not saved anymore. But this is probably the expected behavior anyway, since it is a 
different FE that produces the data. 

So if it is coming from the buffer, is there something I could change to test or solve the problem?

Best, Ivo
                            Reply  10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
I have to reproduce the problem to fix it. Why don't you go and modify midas/examples/experiment/frontend.cxx in such a way that 
it creates exactly the banks you have, just with random data. If you see the same problem, send me your frontend file so that I 
can reproduce it.
                               Reply  11 Aug 2020, Konstantin Olchanski, Bug Report, data missing in runXXXXXX.mid 
> I have to reproduce the problem to fix it. Why don't you go and modify midas/examples/experiment/frontend.cxx in such a way that 
> it creates exactly the banks you have, just with random data. If you see the same problem, send me your frontend file so that I 
> can reproduce it.

It would be good to pinpoint where the data is lost. This is the sequence:

frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk

To see if correct data arrives to the SYSTEM buffer, run:
mdump -z SYSTEM

To see if mlogger is receiving events from the SYSTEM buffer, run:
mlogger -v ### mlogger should report all events, history and data

To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).

I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
mlogger is misconfigured (but you already checked that).

If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).

K.O.
                                  Reply  11 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> It would be good to pinpoint where the data is lost. This is the sequence:
> 
> frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk
> 
> To see if correct data arrives to the SYSTEM buffer, run:
> mdump -z SYSTEM
> 
> To see if mlogger is receiving events from the SYSTEM buffer, run:
> mlogger -v ### mlogger should report all events, history and data
> 
> To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).
> 
> I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
> if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
> mlogger is misconfigured (but you already checked that).
> 
> If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
> to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).
> 
> K.O.

Good evening

I tried to reproduce the behavior in a very simple FE but it did not work out. The next thing for me would be to take the FE that is producing this behavior, replace all the device communication and data with dummies. If the problem is still there I would start to simplify as much as possible. 

Following the inputs of KO, I pin-pointed the data loss. The system buffer still gets the data but the mlogger does not write the data event. Then of course the data is also no longer present in the data file. Therefore, I checked the logger settings again, Event ID and Trigger Mask still -1. Nothing else, at least from my point of view, is misconfigured. Nevertheless, if it helps I can send my ODB settings. 

When doing the tests just before I found something else that probably can give a hint to the problem. The data is only lost if the time between two runs is long (a few seconds). As an example: If I run a sequence with a loop and after the FE stops the run the loop ends and the next run is started automatically, then only the first run has no data, which is the one after a longer time of no data taking. When I add a "WAIT Seconds 5" after the run before starting the next, no data is written to the disk for any run. I also found this once when adding a sleep(1) at the end of the FE readout function but back then did not think about it any further. 

Best, Ivo
                                     Reply  24 Mar 2022, Konstantin Olchanski, Bug Report, data missing in runXXXXXX.mid 
> > It would be good to pinpoint where the data is lost. This is the sequence:
> > 
> > frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk
> > 
> > To see if correct data arrives to the SYSTEM buffer, run:
> > mdump -z SYSTEM
> > 
> > To see if mlogger is receiving events from the SYSTEM buffer, run:
> > mlogger -v ### mlogger should report all events, history and data
> > 
> > To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).
> > 
> > I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
> > if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
> > mlogger is misconfigured (but you already checked that).
> > 
> > If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
> > to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).
> > 
> > K.O.
> 
> Good evening
> 
> I tried to reproduce the behavior in a very simple FE but it did not work out.
> The next thing for me would be to take the FE that is producing this behavior,
> replace all the device communication and data with dummies. If the problem is still
> there I would start to simplify as much as possible. 
> 
> Following the inputs of KO, I pin-pointed the data loss. The system buffer still
> gets the data but the mlogger does not write the data event. Then of course the data
> is also no longer present in the data file. Therefore, I checked the logger
> settings again, Event ID and Trigger Mask still -1. Nothing else, at least from my point of view,
> is misconfigured. Nevertheless, if it helps I can send my ODB settings. 
> 
> When doing the tests just before I found something else that probably
> can give a hint to the problem. The data is only lost if the time between
> two runs is long (a few seconds). As an example: If I run a sequence with a loop
> and after the FE stops the run the loop ends and the next run is started automatically,
> then only the first run has no data, which is the one after a longer time of
> no data taking. When I add a "WAIT Seconds 5" after the run before starting
> the next, no data is written to the disk for any run. I also found this
> once when adding a sleep(1) at the end of the FE readout function
> but back then did not think about it any further. 
> 

Looks like this problem fell into the covid crack.

As far as I know, MIDAS does not lose any events between bm_send_event() and the shared memory
buffer. It does not lose any events in the mlogger (unless the "event request" is misconfigured).
(there is lots of opportunity to lose events in complicated frontends).

If you have some evidence otherwise, I would very much like to hear about
it and I want to fix all problems that cause it.

In your previous report I was under the impression that you lose random events here and there,
but your latest report is about mlogger not writing anything at all.

Which case is it?

If you can definitely say that all your events make it to the SYSTEM buffer
but mlogger sometimes does not see some of them and sometimes does not see all of them,
we should look very closely at bm_receive_event() and mlogger itself.

In the case where mlogger is not seeing any events at all (output file is empty), as this is
happening, I would like to see the output of mdump (to confirm events are written to SYSTEM
buffer with correct event_id and trigger_mask) and the output of (say)
"manalyzer_test.exe --dump run01161.mid.lz4" on your output file.

If the output is very long, you can email it to me directly instead of posting it here.

K.O.
                                        Reply  24 Mar 2022, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid 
One idea: we should have a look at mlogger::close_channels(). There the SYSTEM buffer is emptied through the cm_yield() call. Instrumenting this with some debugging code will enlighten us. 

Another possible problem: If the frontend requested to be notified for a run stop AFTER the logger, then the problem might happen: Logger closes file, and THEN the frontend flushes events ending up in the SYSTEM buffer and being logged at the beginning of the next run. The mfe.cxx framework takes care of this by calling 

cm_register_transition(TR_STOP, 500);

while the mlogger does 

cm_register_transition(TR_STOP, tr_stop, 800);

and since 800 > 500 the logger will be called AFTER the frontend. If one uses a framework different from mfe.cxx, this could however be different.

Stefan
                                           Reply  24 Mar 2022, Konstantin Olchanski, Bug Report, data missing in runXXXXXX.mid 
> One idea: we should have a look at mlogger::close_channels().
> There the SYSTEM buffer is emptied through the cm_yield() call.
> Instrumenting this with some debugging code will enlighten us.

right. this would show up as "last few events are lost at the end of run".

but that code in the mlogger was not touched in years. if there is a problem there,
we would have seen it by now. most experiments check that the number
of events in the data file is the same as the number of triggers generated, both
numbers are shown on the midas status page.

> Another possible problem: If the frontend requested to be notified for a run stop AFTER the logger, then the problem might happen: Logger closes file, and THEN the frontend flushes events ending up in the SYSTEM buffer and being logged at the beginning of the next run. The mfe.cxx framework takes care of this by calling 
> cm_register_transition(TR_STOP, 500);

default sequence, both mfe.c frontend and c++ tmfe frontend:

start of run:
- mlogger first (configure history, open data file)
- frontends last
- (if any frontend fails, TR_STARTABORT is sent to mlogger to close the output file and "undo" the run start)

end of run:
- frontends first (must not send any events after processing the TR_STOP RPC call; inside the TR_STOP handler, bm_flush_cache() takes care of the write cache)
- mlogger last
- (if any frontend fails, failure is ignored, run stops regardless)

the order will only be wrong if they manually change it, and whatever order
they set, you see it on the midas transition page (and in "mtransition -v" and "odbedit stop now -v", etc).
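
for example, to deliberately run a client's TR_STOP handler after the mlogger (a sketch; tr_stop is the client's own handler, 900 > 800):

INT tr_stop(INT run_number, char *error);   /* the client's own handler */

/* register with a sequence number larger than mlogger's 800 */
cm_register_transition(TR_STOP, tr_stop, 900);

/* or change the sequence number of an already-registered handler */
cm_set_transition_sequence(TR_STOP, 900);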

K.O.
Entry  22 Mar 2022, Konstantin Olchanski, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
multithreaded frontends have an unusual event buffer corruption if the write 
cache is enabled. For a long time now I have had to disable the write cache on
all multithreaded frontends in alpha-g, as I was hitting this bug quite often.
(somehow I do not see this problem reported on bitbucket!)

last week I reworked the multithread locking of event buffers, in the hope
that this bug would turn up, but nope, all mutexes and locking look okay,
except for a number of unrelated problems (races against bm_close_buffer()
were the most troublesome to fix).

but finally found the trouble.

first, some background.

because multiprocess locking is expensive, frontends that generate
a large number of small events can use the write cache to reduce
this overhead. instead of locking the shared memory event buffer for
each event, events are accumulated in the write cache, and periodic
calls to bm_flush_buffer() flush them to shared memory. For best effect,
one should increase the size of the write cache until the lock rate is around
10/second.
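
a sketch of how a frontend can do this (the bm_set_cache_size() argument types may differ slightly between midas versions):

INT hBuf;
bm_open_buffer("SYSTEM", DEFAULT_BUFFER_SIZE, &hBuf);

/* 10 MiB write cache, no read cache; cached events are pushed to shared
   memory by the periodic bm_flush_cache() calls */
bm_set_cache_size(hBuf, 0, 10 * 1024 * 1024);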

it turns out the introduction of multithreading broke bm_flush_cache().

it does this:

- int ask_free = pbuf->wp; // how much data we have in the write cache now
- call bm_wait_for_free_space(ask_free); // ensure we have this much free shared 
memory space
- copy pbuf->wp worth of events to shared memory

looks okay at first sight. this is what happens to trigger the bug:

- int ask_free = pbuf->wp; // ok
- call bm_wait_for_free_space(ask_free); // ok, but if shared memory is full, it 
will go to sleep waiting for free space
- in the mean time, another thread calls bm_send_event(), this adds more data to 
the write cache, moves pbuf->wp
- bm_wait_for_free_space() eventually returns
- copy pbuf->wp worth of data to shared memory. KABOOM! shared memory corruption!

we just overwrote some unlucky event in shared memory: we only have "ask_free"
free bytes available, but pbuf->wp moved and now has more data,
and it does not fit, and there is no check against it.

of course in the single threaded world this bug did not exist, there was no 
other thread to call bm_send_event() while bm_flush_cache() is sleeping.

the obvious fix is to ask for more free space if cached data does not fit.
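
in pseudo-code, the fixed sequence is roughly this (a sketch with simplified arguments, not the literal midas code):

int ask_free;
do {
   ask_free = pbuf->wp;               // bytes currently in the write cache
   bm_wait_for_free_space(ask_free);  // may sleep; wp can grow meanwhile
} while (pbuf->wp > ask_free);        // more data was cached: ask again
// everything in the cache now fits into the reserved space,
// copy pbuf->wp bytes to shared memory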

this is now implemented on the branch feature/buffer_mutex. after a bit more 
testing I will merge it into develop.

so that's it?

not so fast. there was more going on. as described, the bug will only happen
when shared memory event buffer is full. (i.e. rarely or never). It turns
out the old version of thread locking code was defective and permitted
a race between bm_send_event() and bm_send_event() in another thread:

thread 1: while (1) { bm_send_event(very small event); }
thread 2:
-> bm_send_event(very big event)
-> no space in the cache for the very big event, call bm_flush_cache()
-> bm_flush_cache() asks bm_wait_for_free_space() to make space for cached data
-> this was done with write cache mutex released (mistake!)
-> at the same time bm_send_event(very small event) added 1 more small event to 
the cache
-> back in bm_flush_cache() write cache mutex is locked correctly, we copy 
cached data to shared memory and again KABOOM because we now have more data than 
we asked free space for.

So in the original implementation, corruption was possible even when the shared 
memory event buffer was pretty much empty.

The reworked locking code closed that loophole - bm_flush_cache() is now
called with write cache locked, and bm_send_event() from another thread
cannot confuse things, unless shared memory buffer is full and we go to
sleep inside bm_wait_for_free_space(). And this is now fixed, too.

K.O.
    Reply  22 Mar 2022, Stefan Ritt, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
Thanks Konstantin for your detailed description. 

I wonder why we never saw this problem at PSI. Here is the reason: In multi-threaded environments, we never call bm_send_event() directly 
from all threads (since in the old days nothing was thread safe in midas). Instead, we use a collector thread which gets all events via the 
rb_xxx functions from the individual readout threads. This is well integrated into the mfe.cxx framework. Look at examples/mtfe/mtfe.cxx. 
Each thread does (simplified):

while (true) {
  do {
     status = rb_get_wp(&pevent);
  } while (status == DB_TIMEOUT);

  bm_compose_event_threadsafe(pevent, ..., &serial_number);
  bk_init32(pevent+1);
  ... fill event ...
  bk_close(pevent)

  rb_increment_wp(sizeof(EVENT_HEADER) + pevent->data_size);
}

The framework now collects all these events in receive_trigger_event() which runs in the main thread:

  for (i=0 ; i<n_thread ; i++) {
     rb_get_rp(i, pevent);
     if (pevent->serial_number == prev_serial+1)
        break;
  }
  
  prev_serial = pevent->serial_number;
  rpc_send_event(pevent);
  rb_increment_rp(sizeof(EVENT_HEADER) + pevent->data_size);

This code ensures that all events are in the right sequence (before, the serial numbers were mixed up) and that all events are sent only 
from a single thread, so the write buffer can be used effectively without complicated multi-thread locks.

This solution has worked nicely at PSI for many years; maybe you should put some thought into using it in your tmfe framework in Alpha-g as well 
instead of struggling with all your locks.

Stefan
       Reply  23 Mar 2022, Konstantin Olchanski, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
I confirm, there is no problem in single-threaded programs, and
there is no problem if all bm_send_event() and bm_flush_cache() are called
from the same thread.

> ... instead of struggling with all your locks.

it is better to have midas fully thread safe. ODB has been so for a long time,
event buffer partially (except for this bug), now fully.

without that the problem still exists, because in many frontends,
bm_flush_buffer() is called from the main thread, and will race
against the "bm_send_event() thread". Of course you can do
everything on the main thread, but this opens you to RPC timeouts
during run transitions (if you sleep in bm_wait_for_free_space()).

also the SYSMSG buffer is subject to the same bug. cm_msg() is of course
safe to call from anywhere, but cm_msg_flush_buffer() and cm_periodic_tasks()
can be called from any thread, and they issue bm_send_event(SYSMSG),
and there will be mysterious crashes and SYSMSG corruptions, probably
only during message storms, but still!

K.O.
          Reply  24 Mar 2022, Stefan Ritt, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
> > ... instead of struggling with all your locks.
> 
> it is better to have midas fully thread safe. ODB has been so for a long time,
> event buffer partially (except for this bug), now fully.
> 
> without that the problem still exists, because in many frontends,
> bm_flush_buffer() is called from the main thread, and will race
> against the "bm_send_event() thread". Of course you can do
> everything on the main thread, but this opens you to RPC timeouts
> during run transitions (if you sleep in bm_wait_for_free_space()).

Just for the record: in the mfe.cxx framework both bm_send_event() and 
bm_flush_buffer() are called from the main thread, as can be seen in the 
midas/examples/mtfe/mtfe.cxx example.

But I agree that having all buffer operations thread safe is a clear benefit.

Stefan
    Reply  23 Mar 2022, Ivo Schulthess, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
Thanks for the investigation. Back in 2020, we had some issues of losing data between the system buffer and the logger writing them to disk (https://daq00.triumf.ca/elog-midas/Midas/1966). This was polled equipment but we had a multithreaded FE running at the same time. Could this be related to the same problem?

Best, Ivo
       Reply  24 Mar 2022, Konstantin Olchanski, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
> Thanks for the investigation. Back in 2020, we had some issues
> of losing data between the system buffer and the logger writing them
> to disk (https://daq00.triumf.ca/elog-midas/Midas/1966). This was polled equipment
> but we had a multithreaded FE running at the same time. Could this be related to the same problem?

I think we will have to follow up on your problem 1966 separately.

I think this bug cannot lose events. Writing events to the write cache has correct
locking, no loss here. writing write cache to shared memory has correct locking,
no loss there. the bug will cause the *next* event in the event buffer to be overwritten,
this will be detected by most programs as shared memory corruption and everybody
will quit. (mhttpd, mserver, odbedit will probably survive).

I guess there could be unlucky corruption that looks like nothing was corrupted,
but this will affect only a few events right at the shared memory read/write
pointer, it so happens that they are the oldest events in the buffer and likely
mlogger already wrote them to disk. mlogger read pointer will likely follow
the shared memory write pointer closely, well ahead of the shared memory
read pointer which always pointe to the older event and where this bug's corruption
will happen.

So no, I do not think this bug can cause event loss between frontend and mlogger.

K.O.
Entry  23 Mar 2022, Hunter Lowe, Forum, ODB has issue with example analyzer 
Trying to play with midas file but I get error:

[Analyzer,ERROR] [odb.cxx:845:db_validate_name,ERROR] Invalid name "/Analyzer/Tests/low_sum/Rate [Hz]" passed to db_create_key_wlocked: should not contain "["

I'm not sure what sets the name so I'm not sure how to fix this.

Thanks
Entry  15 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd ipv6 bind should be fixed now 
Something changed after my initial implementation of ipv6 in mhttpd
and listening to ipv6 http/https connections was broken.

It turns out, I do not need to listen to both ipv4 and ipv6 sockets,
it is sufficient to listen to just ipv6. ipv4 connections will also
magically work. see the linux kernel "bindv6only" sysctl setting:
https://sysctl-explorer.net/net/ipv6/bindv6only/

The specific bug in mhttpd was to bind to the ipv4 socket first; the subsequent bind() to the ipv6 socket 
fails with the error "Address already in use", which is silent, not reported by the mongoose library. 
For reasons unknown, this does not happen for bind() to "localhost" aka ipv6 "::1".

Apparently other web servers (apache, nginx) are/were also affected by this problem. 
https://chrisjean.com/fix-nginx-emerg-bind-to-80-failed-98-address-already-in-use/

The first fix was to bind to ipv6 first (success) and to ipv4 second (fails). The second fix,
committed to git, is to only listen to ipv6.
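
for reference, the mechanism in plain POSIX terms (a sketch, independent of the mongoose API):

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int open_listener(int port)
{
   int fd = socket(AF_INET6, SOCK_STREAM, 0);
   int off = 0;  /* be explicit instead of relying on the bindv6only sysctl */
   setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off));

   struct sockaddr_in6 addr = {};
   addr.sin6_family = AF_INET6;
   addr.sin6_addr   = in6addr_any;   /* "::"; ipv4 clients appear as ::ffff:a.b.c.d */
   addr.sin6_port   = htons(port);
   bind(fd, (struct sockaddr *)&addr, sizeof(addr));
   listen(fd, 5);
   return fd;
}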

This works both on MacOS and on Linux. Linux reports the listener socket is "tcp6", MacOS reports 
the listener socket as "tcp46":

4ed0:javascript1 olchansk$ netstat -an | grep 808 | grep LISTEN
tcp46      0      0  *.8081                 *.*                    LISTEN     
tcp6       0      0  ::1.8080               *.*                    LISTEN     
tcp4       0      0  127.0.0.1.8080         *.*                    LISTEN     
4ed0:javascript1 olchansk$ 

K.O.
    Reply  23 Mar 2022, Konstantin Olchanski, Bug Fix, mhttpd ipv6 bind should be fixed now 
> Something changed after my initial implementation of ipv6 in mhttpd
> and listening to ipv6 http/https connections was broken.

Reporting that mhttpd ipv6 works at CERN. The hostnames for ipv6 connections
come back as alphacpc05.ipv6.cern.ch instead of alphacpc05.cern.ch
so both are added to the http "insecure port" whitelist.

K.O.
Entry  10 Mar 2022, Gennaro Tortone, Bug Report, Python ODB watch 
Hi,

I have an issue with ODB watch on MIDAS Python library;

I wrote a simple frontend that reads/writes FPGA registers through 
ODB keys (simplified version at the link below):

https://gist.github.com/gtortone/cd035a9ac4ea7a78ea9cd931e80e2c75

Everything works fine, but there is a boolean array
in Settings (Enable ADC sampling) that I need to "toggle" 
(19 bits to 0 and 19 bits to 1). This operation is handled by
detailed_settings_changed_func, which writes the value of the
toggled bit to the FPGA.

The issue is that if I quickly toggle the boolean array by
odbedit:

set "/Equipment/odbtest/Settings/Enable ADC sampling[0-18]" 0
set "/Equipment/odbtest/Settings/Enable ADC sampling[0-18]" 1

I see in the Python script the following list of callbacks:

detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[0] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[1] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[2] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[3] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[4] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[5] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[6] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[7] - new value 0
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[8] - new value 1    ***
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[9] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[10] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[11] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[12] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[13] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[14] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[15] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[16] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[17] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[18] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[0] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[1] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[2] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[3] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[4] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[5] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[6] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[7] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[8] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[9] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[10] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[11] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[12] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[13] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[14] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[15] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[16] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[17] - new value 1
detailed_settings_changed_func: /Equipment/odbtest/Settings/Enable ADC sampling[18] - new value 1

It seems that the second write operation "overlaps" the first one...

The same behavior is not observed using a 'watch' in odbedit...

I can overcome this problem by using the value of the register as the ODB key to avoid 
the array of booleans... but I report this issue as a "possible" bug/limitation of the Python ODB watch;

Cheers,
Gennaro
    Reply  16 Mar 2022, Ben Smith, Bug Report, Python ODB watch 
> It seems that the second write operation "overlaps" the first one...

Hi Gennaro,

In principle the same issue can happen in C++ code, but it is much less likely as the callbacks get executed more quickly (partly due to C++/python in general, and partly because the python code does some extra work to make the interface more user-friendly). The C++ code at the end of this message adds a 100ms sleep to the callback and can result in output like this when you do quick edits of "Test[0-19]" in odbedit.

Element 1 is 0
Element 2 is 0
Element 3 is 0
Element 4 is 0
Element 5 is 0
Element 6 is 0
Element 7 is 1
Element 8 is 1
Element 9 is 1
etc...


I agree that this can be a really nasty source of bugs if you need to react to every change. I'll add a warning to the python docstrings, but I can't think of a way to make this more robust at the midas level - I think we'd need some sort of ODB "snapshot" system...

#include "midas.h"

void watch_fn(HNDLE hDB, HNDLE hKey, int index, void *info) {
  DWORD data = 0;
  INT buf_size = sizeof(data);
  db_get_data_index(hDB, hKey, &data, &buf_size, index, TID_DWORD);
  printf("Element %d is %u\n", index, data);
  ss_sleep(100);
}

int main() {
  HNDLE hDB, hClient, hTestKey;

  std::string host, expt;
  cm_get_environment(&host, &expt);
  cm_connect_experiment(host.c_str(), expt.c_str(), "test_odb", nullptr);
  cm_get_experiment_database(&hDB, &hClient);

  static const DWORD numValues = 20;
  DWORD data[numValues] = {};
  db_set_value(hDB, 0, "Test", data, sizeof(DWORD) * numValues, numValues, TID_DWORD);
  db_find_key(hDB, 0, "Test", &hTestKey);
  db_watch(hDB, hTestKey, watch_fn, nullptr);

  printf("Press any key to exit loop...\n");

  while (!ss_kbhit()) {
    cm_yield(1);
  }

  db_unwatch_all();
  db_delete_key(hDB, hTestKey, FALSE);
  cm_disconnect_experiment();

  return 0;
}
       Reply  21 Mar 2022, Stefan Ritt, Bug Report, Python ODB watch 
What you describe is a well-known problem with the ODB. At PSI we have similar issues. There are
two approaches to solve it:

1) Write values one-by-one to the ODB, but do not trigger a watch update. In the sequencer, this
can be achieved with the ODBSET command (see https://daq00.triumf.ca/MidasWiki/index.php/Sequencer 
and the last paragraph right of the ODBSET command). You use notify=0 for all set commands except
the last one where you use notify=1. On the C++ API, you can use db_set_data_index1() which has
this notify flag as the last parameter.

2) You add intelligence to your front-end. If you get a watch update, you do not apply it
directly to the hardware, but put it into a FIFO. Once you do not get any more updates for a certain
period (1 s is a good value), you empty the FIFO and apply all settings immediately (see the sketch below).

Both methods have been used at PSI successfully, although 1) is much easier to implement, especially
if you use the midas sequencer.
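
A minimal sketch of method 2) in C++, along the lines of the watch example earlier in this thread (apply_to_hardware() and the 1 s timeout are illustrative):

#include "midas.h"
#include <chrono>
#include <queue>

void apply_to_hardware(int index, DWORD value);   // illustrative

struct Update { int index; DWORD value; };
static std::queue<Update> fifo;
static std::chrono::steady_clock::time_point last_update;

void watch_fn(HNDLE hDB, HNDLE hKey, int index, void *info) {
   DWORD data = 0;
   INT size = sizeof(data);
   db_get_data_index(hDB, hKey, &data, &size, index, TID_DWORD);
   fifo.push({index, data});                      // queue, do not apply yet
   last_update = std::chrono::steady_clock::now();
}

void poll_updates() {  // call periodically from the main loop, next to cm_yield()
   if (fifo.empty())
      return;
   if (std::chrono::steady_clock::now() - last_update < std::chrono::seconds(1))
      return;                                     // still settling, keep waiting
   while (!fifo.empty()) {                        // quiet for 1 s: apply everything
      apply_to_hardware(fifo.front().index, fifo.front().value);
      fifo.pop();
   }
}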

Stefan
Entry  03 Mar 2022, Konstantin Olchanski, Info, manalyzer updated 
manalyzer was updated to latest version. mostly multi-threading improvements from 
Joseph and myself. K.O.
Entry  03 Mar 2022, Konstantin Olchanski, Info, zlib required, lz4 internal 
as of commit 8eb18e4ae9c57a8a802219b90d4dc218eb8fdefb, the gzip compression
library is required, not optional.

this fixes midas and manalyzer mis-build if the system gzip library
is accidentally not installed. (is there any situation where
gzip library is not installed on purpose?)

midas internal lz4 compression library was renamed to mlz4 to avoid collision
against system lz4 library (where present). lz4 files from midasio are now
used, lz4 files in midas/include and midas/src are removed.

I see that on recent versions of ubuntu we could switch to the system version 
of the lz4 library. however, on centos-7 systems it is usually not present
and it still is a supported and widely used platform, so we stay
with the midas-internal library for now.

K.O.
Entry  23 Feb 2022, Stefan Ritt, Info, Midas slow control event generation switched to 32-bit banks 
The midas slow control system class drivers automatically read their equipment and generate events containing midas banks. So far these have been 16-bit banks using bk_init(). But now more and more experiments use a large number of channels, so the 16-bit address space is exceeded. Until last week there was not even a check for this, leading to unpredictable crashes.

Therefore I switched the bank generation in the drivers generic.cxx, hv.cxx and multi.cxx to 32-bit banks via bk_init32(). This should be in principle transparent, since the midas bank functions automatically detect the bank type during reading. But I thought I'd let everybody know just in case.
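
For illustration, the only visible change is in the readout routine; the reading side needs no change since the bank type is auto-detected (the bank name and globals here are made up):

extern int   n_channels;    /* made-up globals for the example */
extern float voltage[];

INT read_hv_event(char *pevent, INT off)
{
   bk_init32(pevent);       /* was: bk_init(pevent) */

   float *pdata;
   bk_create(pevent, "HV00", TID_FLOAT, (void **)&pdata);
   for (int i = 0; i < n_channels; i++)
      *pdata++ = voltage[i];
   bk_close(pevent, pdata);

   return bk_size(pevent);
}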

Stefan
Entry  08 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue mslerror.PNG
Hi all,

I am having some issues getting the ODBINC command to work within the MIDAS 
sequencer. I am trying to increment one of the ODB values but it is returning a 
data-type size mismatch error (see attached image).

All the ODB variables are MIDAS data-type FLOAT and should all be 32-bit values, 
but for some reason MIDAS thinks they are 4-bit values. I have tried creating 
new ODB keys of type INT32, UINT32 and DOUBLE but they all give the same error.

If anybody has any suggestions I would really appreciate some help:)

Thanks



 
    Reply  08 Feb 2022, Konstantin Olchanski, Bug Fix, ODBINC/Sequencer Issue 
Please post the output of odbedit "ls -l" for /eq/ar.../variables. (you posted the 
variable name as an image, and I cannot cut-and-paste the odb path!). BTW data size 4 is 
correct, 4 bytes for INT32/UINT32/FLOAT. For DOUBLE it should be 8. For you it prints 32 
and this is wrong, we need to see the output of "ls -l".
K.O.
       Reply  09 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> Please post the output of odbedit "ls -l" for /eq/ar.../variables. (you posted the 
> variable name as an image, and I cannot cut-and-paste the odb path!). BTW data size 4 is 
> correct, 4 bytes for INT32/UINT32/FLOAT. For DOUBLE it should be 8. For you it prints 32 
> and this is wrong, we need to see the output of "ls -l".
> K.O.

Hi,

Thanks for getting back to me regarding this. The output of "ls -l" is:

[local:mu3eMSci:S]/>cd Equipment/ArduinoTestStation/Variables
[local:mu3eMSci:S]Variables>ls -l
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
_T_                             FLOAT   1     4     1h   0   RWD  20.93
_F_                             FLOAT   1     4     1h   0   RWD  12.8
_P_                             FLOAT   1     4     1h   0   RWD  56
_S_                             FLOAT   1     4     1h   0   RWD  5
_H_                             FLOAT   1     4     60h  0   RWD  44.74
_B_                             FLOAT   1     4     60h  0   RWD  18.54
_A_                             FLOAT   1     4     1h   0   RWD  14.41
_RH_                            FLOAT   1     4     1h   0   RWD  41.81
_AT_                            FLOAT   1     4     1h   0   RWD  20.46
SP                              INT16   1     2     1h   0   RWD  10

Many Thanks
Jago
          Reply  09 Feb 2022, Konstantin Olchanski, Bug Fix, ODBINC/Sequencer Issue 
> 
> [local:mu3eMSci:S]/>cd Equipment/ArduinoTestStation/Variables
> [local:mu3eMSci:S]Variables>ls -l
> Key name                        Type    #Val  Size  Last Opn Mode Value
> ---------------------------------------------------------------------------
> _T_                             FLOAT   1     4     1h   0   RWD  20.93
> _F_                             FLOAT   1     4     1h   0   RWD  12.8
> _P_                             FLOAT   1     4     1h   0   RWD  56
> _S_                             FLOAT   1     4     1h   0   RWD  5
> _H_                             FLOAT   1     4     60h  0   RWD  44.74
> _B_                             FLOAT   1     4     60h  0   RWD  18.54
> _A_                             FLOAT   1     4     1h   0   RWD  14.41
> _RH_                            FLOAT   1     4     1h   0   RWD  41.81
> _AT_                            FLOAT   1     4     1h   0   RWD  20.46
> SP                              INT16   1     2     1h   0   RWD  10
> 

This looks okay, so we still have no explanation for your error. Can you please post your sequencer 
script?

K.O.
             Reply  10 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
I tried following script:

ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10

LOOP 10
  WAIT seconds, 3
  ODBINC /Equipment/ArduinoTestStation/Variables/_S_
ENDLOOP

and it worked as expected. So I conclude the problem must be in your script. Probably a typo in 
the ODB path pointing to a 32-byte string instead of to a 4-byte float.

Stefan  
                Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> I tried following script:
> 
> ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10
> 
> LOOP 10
>   WAIT seconds, 3
>   ODBINC /Equipment/ArduinoTestStation/Variables/_S_
> ENDLOOP
> 
> and it worked as expected. So I conclude the problem must be in your script. Probably a typo in 
> the ODB path pointing to a 32-byte string instead to a 4-byte float.
> 
> Stefan  

Hi Stefan,

Cheers for the reply. I believe the syntax we are using is correct. I have tried copying the script 
you used above and it results in the same error as before. Perhaps something is going wrong 
elsewhere - I will take another look today.

Jago
                   Reply  14 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
Just post here a minimal script which produces the error, so that I can try myself.

... and make sure that you have the latest develop version of midas.

Stefan
                      Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> Just post here a minimal script which produces the error, so that I can try myself.
> 
> ... and make sure that you have the latest develop version of midas.
> 
> Stefan

Here is the simplest script which produces the error:

WAIT seconds, 3
ODBINC /Equipment/ArduinoTestStation/Variables/_S_

I noticed that "Jacob Thorne"  in the forum had the same issue as us in Novemeber last 
year. Indeed we have not installed any later versions of MIDAS since then so we will 
double check we have the latest version.

Jago
                         Reply  14 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
> I noticed that "Jacob Thorne"  in the forum had the same issue as us in Novemeber last 
> year. Indeed we have not installed any later versions of MIDAS since then so we will 
> double check we have the latest version.

As you see from my reply to Jacob, the bug has been fixed in midas since then, so just 
update.

Stefan
                            Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> > I noticed that "Jacob Thorne"  in the forum had the same issue as us in Novemeber last 
> > year. Indeed we have not installed any later versions of MIDAS since then so we will 
> > double check we have the latest version.
> 
> As you see from my reply to Jacob, the bug has been fixed in midas since then, so just 
> update.
> 
> Stefan

We have tried updating using both:

git submodule update --init --recursive

and:

git pull --recurse-submodules

But the error still persists. Is there another way to update which we are missing?

Cheers
Jago
                               Reply  15 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue 
> But the error still persists. Is there another way to update which we are missing?

The bug was definitively fixed in this modification:

https://bitbucket.org/tmidas/midas/commits/5f33f9f7f21bcaa474455ab72b15abc424bbebf2

You probably forgot to compile/install correctly after your pull. If you start "odbedit" and do 
a "ver", you see which git revision you are currently running. Make sure to get this output:

MIDAS version:      2.1
GIT revision:       Fri Feb 11 08:56:02 2022 +0100 - midas-2020-08-a-509-g585faa96 on branch 
develop
ODB version:        3


Stefan
                                  Reply  16 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> > But the error still persists. Is there another way to update which we are missing?
> 
> The bug was definitively fixed in this modification:
> 
> https://bitbucket.org/tmidas/midas/commits/5f33f9f7f21bcaa474455ab72b15abc424bbebf2
> 
> You probably forgot to compile/install correctly after your pull. If you start "odbedit" and do 
> a "ver", you see which git revision you are currently running. Make sure to get this output:
> 
> MIDAS version:      2.1
> GIT revision:       Fri Feb 11 08:56:02 2022 +0100 - midas-2020-08-a-509-g585faa96 on branch 
> develop
> ODB version:        3
> 
> 
> Stefan

We were having some problems compiling but have got it sorted now - thanks for your help:)

Jago
             Reply  14 Feb 2022, jago aartsen, Bug Fix, ODBINC/Sequencer Issue 
> > 
> > [local:mu3eMSci:S]/>cd Equipment/ArduinoTestStation/Variables
> > [local:mu3eMSci:S]Variables>ls -l
> > Key name                        Type    #Val  Size  Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > _T_                             FLOAT   1     4     1h   0   RWD  20.93
> > _F_                             FLOAT   1     4     1h   0   RWD  12.8
> > _P_                             FLOAT   1     4     1h   0   RWD  56
> > _S_                             FLOAT   1     4     1h   0   RWD  5
> > _H_                             FLOAT   1     4     60h  0   RWD  44.74
> > _B_                             FLOAT   1     4     60h  0   RWD  18.54
> > _A_                             FLOAT   1     4     1h   0   RWD  14.41
> > _RH_                            FLOAT   1     4     1h   0   RWD  41.81
> > _AT_                            FLOAT   1     4     1h   0   RWD  20.46
> > SP                              INT16   1     2     1h   0   RWD  10
> > 
> 
> This looks okey, so we still have no explanation for your error. Please post your sequencer 
> script?
> 
> K.O.

Hey, thanks for getting back to me

We are fairly confident the syntax is correct. Having tried the test script posted by Stefan:

> ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10
> 
> LOOP 10
>   WAIT seconds, 3
>   ODBINC 

The same error is returned:/

We will take another look today.

Jago
Entry  02 Dec 2021, Alexey Kalinin, Bug Report, some frontend kicked by cm_periodic_tasks 
Hello,
We have a small experiment with MIDAS based DAQ.
Status page shows :
ES	ESFrontend@192.168.0.37	207	0.2	0.000
Trigger06	Sample Frontend06@192.168.0.37	1.297M	0.3	0.000
Trigger01	Sample Frontend01@192.168.0.37	1.297M	0.3	0.000
Trigger16	Sample Frontend16@192.168.0.37	1.297M	0.3	0.000
Trigger38	Sample Frontend38@192.168.0.37	1.297M	0.3	0.000
Trigger37	Sample Frontend37@192.168.0.37	1.297M	0.3	0.000
Trigger03	Sample Frontend03@192.168.0.38	1.297M	0.3	0.000
Trigger07	Sample Frontend07@192.168.0.38	1.297M	0.3	0.000
Trigger04	Sample Frontend04@192.168.0.38	59898	0.0	0.000
Trigger08	Sample Frontend08@192.168.0.38	59898	0.0	0.000
Trigger17	Sample Frontend17@192.168.0.38	59898	0.0	0.000


And SYSTEM buffers page shows:
ESFrontend          1968     198           47520           0  0x00000000  0                             193 ms
Sample Frontend06   1332547  1330826       379729872       0  0x00000000  0                             1.1 sec
Sample Frontend16   1332542  1330839       361988208       0  0x00000000  0                             94 ms
Sample Frontend37   1332530  1330841       337798408       0  0x00000000  0                             1.1 sec
Sample Frontend01   1332543  1330829       467136688       0  0x00000000  0                             34 ms
Sample Frontend38   1332528  1330830       291453608       0  0x00000000  0                             1.1 sec
Sample Frontend04   63254    61467         20882584        0  0x00000000  0                             208 ms
Sample Frontend08   63262    61476         27904056        0  0x00000000  0                             205 ms
Sample Frontend17   63271    61473         20433840        0  0x00000000  0                             213 ms
Sample Frontend03   1332549  1330818       386821728       0  0x00000000  0                             82 ms
Sample Frontend07   1332554  1330821       462210896       0  0x00000000  0                             37 ms
Logger              968742   0w+9500418r   0w+2718405736r  0  0x00000000  0  GET_ALL Used 0 bytes 0.0%  303 ms
rootana             254561   0w+29856958r  0w+8718288352r  0  0x00000000  0                             762 ms


The problem is that eventually some of the frontends are closed with the message:
19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist

In the meantime the mserver is logging:
mserver started interactively
mserver will listen on TCP port 1175
double free or corruption (!prev)
double free or corruption (!prev)
free(): invalid next size (normal)
double free or corruption (!prev)


I can find some correlation between the number of events / event size produced by the 
frontend and the crashes: it fails when they become big enough.

The frontend scheme is like this:

poll event time set to 0;

poll_event{
// if buffer not transferred, return (continue cutting the main buffer)
// read main buffer from hardware
// buffer not transferred
}

read_event{
// cut the main buffer into subevents (cut one event from the main buffer); return
// if (last subevent) { buffer transferred; return }
}

What is strange to me is that 2 frontends (1 per remote PC) are causing this.

Also, I'm executing one frontend code with the -i # flag, setting the event id in 
frontend_init, and using the SYSTEM buffer for all.

Is there something I'm missing?
Thanks. 
A.
    Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, some frontend kicked by cm_periodic_tasks 
> The problem is that eventually some of the frontends are closed with the message:
> 19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
> 'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist

This message means what it says. A client was registered with the SYSMSG buffer and this 
client had pid 9789. At some point some other client (rootana, in this case) checked it and 
process pid 9789 was no longer running. (it then proceeded to remove the registration).

There are 2 possibilities:
- simplest: your frontend has crashed. best to debug this by running it inside gdb, wait for 
the crash.
- unlikely: reported pid is bogus, real pid of your frontend is different, the client 
registration in SYSMSG is corrupted. this would indicate massive corruption of midas shared 
memory buffers, not impossible if your frontend misbehaves and writes to random memory 
addresses. ODB has protection against this (normally turned off, easy to enable, set ODB 
"/experiment/protect odb" to yes), shared memory buffers do not have protection against this 
(should be added?).

Do this. When you start your frontend, write down its pid; when you see the crash message, 
confirm the pid number printed is the same. As an additional test, run your frontend inside gdb; 
after it crashes, you can print the stack trace, etc.

> 
> In the meantime the mserver is logging:
> mserver started interactively
> mserver will listen on TCP port 1175
> double free or corruption (!prev)
> double free or corruption (!prev)
> free(): invalid next size (normal)
> double free or corruption (!prev)
> 

Are these "double free" messages coming from the mserver or from your frontend? (i.e. you run 
them in different terminals, not all in the same terminal?).

If messages are coming from the mserver, this confirms possibility (1),
except that for frontends connected remotely, the pid is the pid of the mserver,
and what we see are crashes of mserver, not crashes of your frontend. These are much harder to 
debug.

You will need to enable core dumps (ODB /Experiment/Enable core dumps set to "y"),
confirm that core dumps work (i.e. "killall -SEGV mserver", observe core files are created
in the directory where you started the mserver), reproduce the crash, run "gdb mserver 
core.NNNN", run "bt" to print the stack trace, post the stack trace here (or email to me 
directly).

>
> I can find some correlation between the number of events / event size produced by the 
> frontend and the crashes: it fails when they become big enough.
> 

There is no limit on event rate in midas, you should not see any crash
regardless of what you do. (there is a limit on event size, because an event has
to fit inside an event buffer, and event buffer size is limited to 2 GB).

Obviously you hit a bug in mserver that makes it crash. Let's debug it.

One thing to try is set the write cache size to zero and see if your crash goes away. I see
some indication of something rotten in the event buffer code if write cache is enabled. This
is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
where odb settings have no effect depending on value of "equipment_common_overwrite").
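
For example, an odbxx one-off program can do it (a sketch only; "Trigger" stands for your
actual equipment name):

#include "midas.h"
#include "odbxx.h"

int main() {
   // connect to the experiment, then zero the write cache for one equipment
   cm_connect_experiment("", "", "set_cache", NULL);
   midas::odb common("/Equipment/Trigger/Common");
   common["Write Cache Size"] = 0;
   cm_disconnect_experiment();
   return 0;
}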

>
> The frontend scheme is like this:
> 

Best if you use the tmfe c++ frontend, event data handling is much simpler and we do not
have to debug the convoluted old code in mfe.c.

K.O.

>
> poll event time set to 0;
> 
> poll_event{
> // if buffer not transferred, return (continue cutting the main buffer)
> // read main buffer from hardware
> // buffer not transferred
> }
> 
> read_event{
> // cut the main buffer into subevents (cut one event from the main buffer); return
> // if (last subevent) { buffer transferred; return }
> }
> 
> What is strange to me is that 2 frontends (1 per remote PC) are causing this.
> 
> Also, I'm executing one frontend code with the -i # flag, setting the event id in 
> frontend_init, and using the SYSTEM buffer for all.
> 
> Is there something I'm missing?
> Thanks. 
> A.
       Reply  11 Feb 2022, Alexey Kalinin, Bug Report, some frontend kicked by cm_periodic_tasks 
Thanks for the answer.
As soon as I can (possibly in a month) I'll try the suggestion below:

> One thing to try is set the write cache size to zero and see if your crash goes away. I see
> some indication of something rotten in the event buffer code if write cache is enabled. This
> is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
> where odb settings have no effect depending on value of "equipment_common_overwrite").

I tried to change this ODB setting for one of the frontends via mhttpd/browser, and eventually it goes back 
to the default value (1000, as I remember). But this frontend has the minimum rate, 50 DWORDs per ~10 sec, and 
depending on the cache size its events appear in mdump only once every 31 events, but then all of them at once. So 
it's a different story, but maybe it has the same solution of playing with the Write Cache Size.


The double free messages come from the mserver terminal. 
All of the frontends are remote.
I can't exclude crashes of the frontend, but when I run ./frontend -i 1 (2, 3 etc.) that means I run 
one code for all, and only several of them cause the crash. Also, I found that the crash in the frontend happened while 
it was doing nothing with the collected data (last event reached and new data not ready yet), but while it was trying to 
watch for ODB changes. I mean it crashes inside the ODB watch loop (while {odb_changes(value in watchdog)}), and I don't 
know what else happened meanwhile with the cached buffer.

The future plan is to use the event builder for the frontends once the data/signals are perfectly reasonable, 
i.e. without broken events. For now I am somewhat worried that one of the frontends will skip one of the 
events inside its buffer.


Thanks for the pointers on where to dig.
A.

> > The problem is that eventually some of the frontends are closed with the message:
> > 19:22:31.834 2021/12/02 [rootana,INFO] Client 'Sample Frontend38' on buffer 
> > 'SYSMSG' removed by cm_periodic_tasks because process pid 9789 does not exist
> 
> This message means what it says. A client was registered with the SYSMSG buffer and this 
> client had pid 9789. At some point some other client (rootana, in this case) checked it and 
> process pid 9789 was no longer running. (it then proceeded to remove the registration).
> 
> There are 2 possibilities:
> - simplest: your frontend has crashed. best to debug this by running it inside gdb, wait for 
> the crash.
> - unlikely: reported pid is bogus, real pid of your frontend is different, the client 
> registration in SYSMSG is corrupted. this would indicate massive corruption of midas shared 
> memory buffers, not impossible if your frontend misbehaves and writes to random memory 
> addresses. ODB has protection against this (normally turned off, easy to enable, set ODB 
> "/experiment/protect odb" to yes), shared memory buffers do not have protection against this 
> (should be added?).
> 
> Do this. When you start your frontend, write down its pid; when you see the crash message, 
> confirm the pid number printed is the same. As an additional test, run your frontend inside gdb; 
> after it crashes, you can print the stack trace, etc.
> 
> > 
> > In the meantime the mserver is logging:
> > mserver started interactively
> > mserver will listen on TCP port 1175
> > double free or corruption (!prev)
> > double free or corruption (!prev)
> > free(): invalid next size (normal)
> > double free or corruption (!prev)
> > 
> 
> Are these "double free" messages coming from the mserver or from your frontend? (i.e. you run 
> them in different terminals, not all in the same terminal?).
> 
> If messages are coming from the mserver, this confirms possibility (1),
> except that for frontends connected remotely, the pid is the pid of the mserver,
> and what we see are crashes of mserver, not crashes of your frontend. These are much harder to 
> debug.
> 
> You will need to enable core dumps (ODB /Experiment/Enable core dumps set to "y"),
> confirm that core dumps work (i.e. "killall -SEGV mserver", observe core files are created
> in the directory where you started the mserver), reproduce the crash, run "gdb mserver 
> core.NNNN", run "bt" to print the stack trace, post the stack trace here (or email to me 
> directly).
> 
> >
> > I can find some correlation between the number of events / event size produced by the 
> > frontend and the crashes: it fails when they become big enough.
> > 
> 
> There is no limit on event rate in midas, you should not see any crash
> regardless of what you do. (there is a limit on event size, because an event has
> to fit inside an event buffer, and event buffer size is limited to 2 GB).
> 
> Obviously you hit a bug in mserver that makes it crash. Let's debug it.
> 
> One thing to try is set the write cache size to zero and see if your crash goes away. I see
> some indication of something rotten in the event buffer code if write cache is enabled. This
> is set in ODB "/Eq/XXX/Common/Write Cache Size", set it to zero. (beware recent confusion
> where odb settings have no effect depending on value of "equipment_common_overwrite").
> 
> >
> > The frontend scheme is like this:
> > 
> 
> Best if you use the tmfe c++ frontend, event data handling is much simpler and we do not
> have to debug the convoluted old code in mfe.c.
> 
> K.O.
> 
> >
> > poll event time set to 0;
> > 
> > poll_event{
> > // if buffer not transferred, return (continue cutting the main buffer)
> > // read main buffer from hardware
> > // buffer not transferred
> > }
> > 
> > read_event{
> > // cut the main buffer into subevents (cut one event from the main buffer); return
> > // if (last subevent) { buffer transferred; return }
> > }
> > 
> > What is strange to me is that 2 frontends (1 per remote PC) are causing this.
> > 
> > Also, I'm executing one frontend code with the -i # flag, setting the event id in 
> > frontend_init, and using the SYSTEM buffer for all.
> > 
> > Is there something I'm missing?
> > Thanks. 
> > A.
Entry  28 May 2021, Joseph McKenna, Bug Report, History plots deceiving users into thinking data is still logging flatline.pngflatline.png
I have been trying to fix this myself but my javascript isn't strong...

The 'new' history plot render fills in missing data with the last ODB value, even when this value is very old. elog:2180/1 shows this: the data logging stopped, but the history plot can fool users into thinking data is still being logged (the export button also generates CSVs with entries every 10 seconds). Grepping through the history files behind the scenes, I found only one match for an example variable from this plot, so it looks like there are no entries after March 24th (although I may be mistaken, I've not studied the history files' data structure in detail), i.e. this is an artifact from mhistory.js rather than from the mlogger... Have I missed something simple?

Would it be possible to not draw the line if there are no datapoints for a significant time? Or maybe render a dashed line that doesn't export to CSV?

Thanks in advance

Edit: I see certificate errors on this forum and I think they're preventing me from uploading an image, so I'm inlining it into the text here:
    Reply  28 May 2021, Stefan Ritt, Bug Report, History plots deceiving users into thinking data is still logging 

This is a known problem and I'm working on. See the discussion at: 

https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for

Stefan

       Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, History plots deceiving users into thinking data is still logging 
https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for

this problem is a blocker for the next midas release.

the best I can tell, current development version of midas writes history data incorrectly,
but I do not have time to look at it at this moment.

I recommend that people use the latest released version, midas-2020-12. (this is what we have on alphag and 
should have in alpha2).

midas-2020-12 uses mlogger from midas-2020-08.

If I cannot find time to figure out what is going on in the mlogger,
the next release may have to be done the same way (with mlogger from midas-2020-08).

K.O.
          Reply  10 Feb 2022, Stefan Ritt, Bug Report, History plots deceiving users into thinking data is still logging 
The problem was fixed in commit 825935dc in Oct. 2021 and the fix has been running fine at PSI since then. If TRIUMF people 
agree, we can close that issue and proceed.

Stefan
Entry  30 Sep 2021, Francesco Renga, Forum, OPC client within MIDAS 
Dear all,
     I need to integrate in my MIDAS project the communication with an OPC UA 
server. My plan is to develop an OPC UA client as a "device" in 
midas/drivers/device.

Two questions:

1) Is anybody aware of some similar effort for some other project, so that I can 
get some example?

2) What could be the more appropriate driver's class to be used? generic.cxx? 
multi.cxx?

Thank you for your help,
           Francesco
    Reply  10 Feb 2022, Francesco Renga, Forum, OPC client within MIDAS opc.cxxopc.h
Dear all,
        I finally succeeded in getting a working driver for the communication with an OPC 
UA server. It is based on the open62541 library and I use it in combination with the 
generic.h driver class. This is still a crude implementation, but let me post it here; 
maybe it can be useful to somebody else.

BTW, if there is somebody more skilled than me with OPC UA and MIDAS drivers, who is 
willing to give suggestions for improving the implementation, it would be extremely 
appreciated.

Best Regards,
      Francesco
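
For a flavour of what the driver does internally, the core read with open62541 looks
roughly like this (a sketch only; the endpoint URL and node id are placeholders, the real
code is in the attached opc.cxx):

#include <cstdio>
#include <open62541/client.h>
#include <open62541/client_config_default.h>
#include <open62541/client_highlevel.h>

int main() {
   UA_Client *client = UA_Client_new();
   UA_ClientConfig_setDefault(UA_Client_getConfig(client));
   // placeholder endpoint URL
   if (UA_Client_connect(client, "opc.tcp://localhost:4840") != UA_STATUSCODE_GOOD) {
      UA_Client_delete(client);
      return 1;
   }
   UA_Variant value;
   UA_Variant_init(&value);
   // placeholder node id "my.variable" in namespace 1
   UA_StatusCode s = UA_Client_readValueAttribute(
      client, UA_NODEID_STRING(1, (char *)"my.variable"), &value);
   if (s == UA_STATUSCODE_GOOD && UA_Variant_hasScalarType(&value, &UA_TYPES[UA_TYPES_FLOAT]))
      printf("value = %f\n", *(UA_Float *)value.data);
   UA_Variant_clear(&value);
   UA_Client_disconnect(client);
   UA_Client_delete(client);
   return 0;
}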



> Dear all,
>      I need to integrate in my MIDAS project the communication with an OPC UA 
> server. My plan is to develop an OPC UA client as a "device" in 
> midas/drivers/device.
> 
> Two questions:
> 
> 1) Is anybody aware of some similar effort for some other project, so that I can 
> get some example?
> 
> 2) What could be the more appropriate driver's class to be used? generic.cxx? 
> multi.cxx?
> 
> Thank you for your help,
>            Francesco
Entry  07 Feb 2022, Konstantin Olchanski, Forum, MidasWiki moved from ladd00 to daq00.triumf.ca and updated to MediaWiki 1.35 
MidasWiki moved from ladd00 (obsolete SL6) to daq00.triumf.ca (Ubuntu LTS 20.04) 
and updated from obsolete MediaWiki LTS 1.27.7 to MediaWiki LTS 1.35, supported 
until mid-2023, see https://www.mediawiki.org/wiki/Version_lifecycle

Old URL https://midas.triumf.ca and https://midas.triumf.ca/MidasWiki/...
redirect to new URL https://daq00.triumf.ca/MidasWiki/index.php/Main_Page

All old links and bookmarks should continue to work (via redirect).

To report problems with this MediaWiki instance and to request
any changes in configuration or installed extensions, please reply to this
message here.

K.O.
Entry  26 Jan 2022, Frederik Wauters, Forum, .gz files 
I adapted our analyzer to compile against the manalyzer included in the midas repo.

All our data files are .mid.gz, which now fail to process :(

frederik@frederik-ThinkPad-T550:~/new_daq/build/analyzer$ ./analyzer -e100 -s100 ../../run_backup_11783.mid.gz 
...
...n
Registered modules: 1
file[0]: ../../run_backup_11783.mid.gz
Setting up the analysis!
TMReadEvent: error: short read 0 instead of -1193512213

Which is in the TMEvent* TMReadEvent(TMReaderInterface* reader) class in the midasio.cxx file

Reading the unzipped files works. But we have always processed our .gz files directly; for the unzipping we would need ~2x the disk space.

Am I doing something wrong? I see that there is some activity on lz4 in the midasio repo, is gunzip next?
    Reply  26 Jan 2022, Konstantin Olchanski, Forum, .gz files 
> I adapted our analyzer to compile against the manalyzer included in the midas repo.
> TMReadEvent: error: short read 0 instead of -1193512213

I think this problem is fixed in the latest version of midasio and manalyzer, but this update
was not pulled into midas yet. (Canada is in the middle of a covid wave since December).

What happens is you do not have the gzip library installed on your computer and
your analyzer is built without support for gzip.

The fix is done the hard way, the gzip library is no longer optional, but required.

You do not say what linux you use, so I cannot give exact instructions, but for:
ubuntu: apt -y install libz-dev
centos7: installed by default
centos8: installed by default
debian11/raspbian: same as ubuntu

K.O.
       Reply  31 Jan 2022, Frederik Wauters, Forum, .gz files 
> > I adapted our analyzer to compile against the manalyzer included in the midas repo.
> > TMReadEvent: error: short read 0 instead of -1193512213
> 
> I think this problem is fixed in the latest version of midasio and manalyzer, but this update
> was not pulled into midas yet. (Canada is in the middle of a covid wave since December).
> 
> What happens is you do not have the gzip library installed on your computer and
> your analyzer is built without support for gzip.
> 
> The fix is done the hard way, the gzip library is no longer optional, but required.
> 
> You do not say what linux you use, so I cannot give exact instructions, but for:
> ubuntu: apt -y install libz-dev
> centos7: installed by default
> centos8: installed by default
> debian11/raspbian: same as ubuntu
> 
> K.O.

My libz under ubuntu

-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11") 
-- MIDAS: Found ZLIB version 1.2.11

I got both the manalyzer example and mine going with
* the latest midas dev
* + the latest manalyzer (cf6c233)
* + almost latest midasio (568a617; otherwise I get the linking error below)

./libmidas.a(midasio.cxx.o): In function `Lz4Error(int)':
midasio.cxx:(.text+0x359): undefined reference to `MLZ4F_getErrorName(unsigned long)'

So this works, I will assume that in the near future this all will come together in the standard midas release.

thanks  
Entry  29 Jan 2022, Isaac Labrie Boulay, Forum, MIDAS and GRIF-16 digitizer (Standalone Mode). 
Hi all,

I was sent a version of the frontend for the TIGRESS Detector lab setup so that 
I can test detectors using a GRIF-16 digitizer in standalone mode.

I followed the GRIF-16 wiki (https://grsi.wiki.triumf.ca/wiki/GRIF-16#One-
level_operation) to setup the GRIF-16 through the webpage. The digitized data is 
supposed to come into my UDP port 8800 but it is never retrieved in the 
frontend.

Here's the readout scheme:
// readout sequence ...
// poll_event() true (if still have data in buffer or testmsg() true)
// -> read_trigger_event() -> read_grifc_event() - re-buffers into midas events
//                         -> grifc_eventread()  - returns single grif fragment
//                         -> grifc_dataread()   - returns single net-pkt 


Here's poll_event():
INT poll_event(INT source, INT count, BOOL test)
{
   int i, have_data=0;

   for(i=0; i<count; i++){
      if( data_available ){ break; }
      have_data = ( testmsg(data_socket, 0) > 0 );
      if( have_data && !test ){ break; }
   }
   return( (have_data || data_available) && !test );
}

This being said, testmsg() always reports no data available, and "data_available" is only set 
to TRUE when there's leftover data after a GRIF-C readout (I'm obviously not 
using a GRIF-C).

I know that when the GRIF-16 is in standalone mode, MIDAS does not change the GRIF-16's 
settings based on the ODB; it has to be done through the GRIF-16 webpage. Is the 
user frontend code even responsible for the GRIF-16 data readout in standalone 
mode? If not, could it just be that my UDP offloader is incorrectly set up?

Here are its current settings:

SETTINGS/UDP
- Offloader: ON
- Dst IP: my IP
- Dst Port: 8800 (DATA_PORT)

SETTINGS/MIDAS
- Use MIDAS: OFF
- MIDAS Hostname: my hostname
- MIDAS IP: same as Dst IP from UDP settings
- Dst Port: 8080 (I'm assuming that this is the mhttpd port)

Again, the frontend runs but I get 0 events. What might I be missing?
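
One thing I can still do is check whether the digitizer packets reach the machine at all,
with a minimal standalone UDP listener on port 8800, independent of the frontend (a quick
sketch):

#include <cstdio>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main() {
   int sock = socket(AF_INET, SOCK_DGRAM, 0);
   sockaddr_in addr{};
   addr.sin_family = AF_INET;
   addr.sin_addr.s_addr = htonl(INADDR_ANY);
   addr.sin_port = htons(8800);   // DATA_PORT as configured above
   if (bind(sock, (sockaddr *)&addr, sizeof(addr)) < 0) {
      perror("bind");
      return 1;
   }
   char buf[65536];
   for (int i = 0; i < 10; i++) {   // print the sizes of the first 10 packets
      ssize_t n = recv(sock, buf, sizeof(buf), 0);
      printf("received %zd bytes\n", n);
   }
   close(sock);
   return 0;
}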

Thanks for helping me out!

Isaac
Entry  22 Oct 2021, Francesco Renga, Forum, mhttpd error 
Dear all,
      I am trying to make the MIDAS web server for my DAQ project accessible from other machines. In the ODB, I activated the necessary flags:

[local:CYGNUS_RD:S]/WebServer>ls
Enable localhost port           y
localhost port                  8080
localhost port passwords        n
Enable insecure port            y
insecure port                   8081
insecure port passwords         y
insecure port host list         y
Enable https port               y
https port                      8443
https port passwords            y
https port host list            n
Host list
                                localhost
Enable IPv6                     y
Proxy                           
mime.types

Following the instructions on the Wiki I enabled the SSL support. When running mhttpd, I get these messages:

Mongoose web server will use HTTP Digest authentication with realm "CYGNUS_RD" and password file "/home/cygno/DAQ/online/htpasswd.txt"
Mongoose web server will use the hostlist, connections will be accepted only from: localhost
Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
Mongoose web server listening on http address "[::1]:8080", passwords OFF, hostlist OFF
Mongoose web server listening on http address "8081", passwords enabled, hostlist enabled
[mhttpd,ERROR] [mhttpd.cxx:19166:mongoose_listen,ERROR] Cannot mg_bind address "[::0]:8081"
Mongoose web server will use https certificate file "/home/cygno/DAQ/online/ssl_cert.pem"
Mongoose web server listening on https address "8443", passwords enabled, hostlist OFF
[mhttpd,ERROR] [mhttpd.cxx:19166:mongoose_listen,ERROR] Cannot mg_bind address "[::0]:8443"

and the server is not accessible from other machines. Any suggestion to solve or better investigate this problem?

Thank you very much,
         Francesco
    Reply  22 Oct 2021, Stefan Ritt, Forum, mhttpd error 
> Enable IPv6                     y

Probably the IPv6 problem, see here elog:2269

I asked to turn off IPv6 by default, or at least mention this in the documentation,
but unfortunately nothing happened.

Stefan
       Reply  25 Oct 2021, Francesco Renga, Forum, mhttpd error 
It worked, thank you very much!

Francesco

> > Enable IPv6                     y
> 
> Probably the IPv6 problem, see here elog:2269
> 
> I asked to turn off IPv6 by default, or at least mention this in the documentation,
> but unfortunately nothing happened.
> 
> Stefan
       Reply  26 Jan 2022, Konstantin Olchanski, Forum, mhttpd error 
> > Enable IPv6                     y
> 
> Probably the IPv6 problem, see here elog:2269
> 
> I asked to turn off IPv6 by default, or at least mention this in the documentation,
> but unfortunately nothing happened.

But the IPv4 and IPv6 code is completely separate; if the IPv6 bind fails, IPv4 should still 
work.

This is all very strange.

It does not help that the OP does not say in which way things do not work,
"the server is not accessible from other machines" is not an error message
reported by any browser, and we do not know what URL he is using
to access mhttpd - http: or https:

Also, he is enabling the "insecure" port 8081. The documentation
is pretty clear: either use the secure https port or the insecure port,
but not both at the same time.

In any case, I see current versions of mongoose have removed support
for password files, so all this stuff will likely be reworked,
and in the end mhttpd will only listen on localhost ports. To make it "accessible
to other machines", one will have to use the apache https proxy. (or mtpcproxy from 
midas).

K.O.
Entry  29 Oct 2021, Kushal Kapoor, Bug Report, Unknown Error 319 from client Screenshot_2021-10-26_114015.png
I’m trying to run MIDAS using a frontend code/client named “fetiglab”. The run stops 
after 2-3 seconds with an error saying “Unknown error 319 from client “fetiglab” on 
localhost”.

The frontend code compiled without any errors and MIDAS reads the frontend 
successfully; this error only appears when I start a new run in MIDAS. Here are a few 
more details from the terminal:

11:46:32 [fetiglab,ERROR] [odb.cxx:11268:db_get_record,ERROR] struct size 
mismatch for "/" (expected size: 1, size in ODB: 41920)

11:46:32 [Logger,INFO] Deleting previous file 
"/home/rcmp/online3/run00621_000.root"

11:46:32 [ODBEdit,ERROR] [midas.cxx:5073:cm_transition,ERROR] transition START 
aborted: client "fetiglab" returned status 319

11:46:32 [ODBEdit,ERROR] [midas.cxx:5246:cm_transition,ERROR] Could not start a 
run: cm_transition() status 319, message 'Unknown error 319 from client 
'fetiglab' on host "localhost"'

TR_STARTABORT transition: cleanup after failure to start a run

I’ve also enclosed a screenshot for the same, any suggestions would be highly 
appreciated. thanks
    Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Unknown Error 319 from client 
> I’m trying to run MIDAS using a frontend code/client named “fetiglab”. Run stops 
> after 2/3sec with an error saying “Unknown error 319 from client “fetiglab” on 
> localhost.

actually, the run never starts.

> 11:46:32 [fetiglab,ERROR] [odb.cxx:11268:db_get_record,ERROR] struct size 
> mismatch for "/" (expected size: 1, size in ODB: 41920)

this is the error that causes run start to fail. for reasons unknown
your frontend is trying to do a db_get_record() from "/" (ODB root top directory).

if this is an mfe.c frontend, I do not think I have ever seen it do something
like this.

so, a puzzle.

K.O.
Entry  01 Dec 2021, Lars Martin, Bug Report, Off-by-one in sequencer documentation 
The documentation for the sequencer loop says:

<quote>
LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
inside the loop via $name.
</quote>

In fact the loop variable runs from 1...n, as can be seen by running this exciting 
sequencer code:

COMMENT "Figuring out MSL"

LOOP n,4
  MESSAGE $n,1
ENDLOOP
    Reply  02 Dec 2021, Stefan Ritt, Bug Report, Off-by-one in sequencer documentation 
> The documentation for the sequencer loop says:
> 
> <quote>
> LOOP [name ,] n ... ENDLOOP	To execute a loop n times. For infinite loops, "infinite" 
> can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed 
> inside the loop via $name.
> </quote>
> 
> In fact the loop variable runs from 1...n, as can be seen by running this exciting 
> sequencer code:
> 
> COMMENT "Figuring out MSL"
> 
> LOOP n,4
>   MESSAGE $n,1
> ENDLOOP

Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.

Stefan
       Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Off-by-one in sequencer documentation 
> > LOOP n,4
> >   MESSAGE $n,1
> > ENDLOOP
> 
> Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.

Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.

K.O.
          Reply  26 Jan 2022, Stefan Ritt, Bug Report, Off-by-one in sequencer documentation 
> Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.

   for (i=1 ; i<=10 ; i++);     ;-)
             Reply  26 Jan 2022, Konstantin Olchanski, Bug Report, Off-by-one in sequencer documentation 
> > Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.
> 
>    for (i=1 ; i<=10 ; i++);     ;-)

Similar code made big news just recently: (scroll down to the example main() program)

https://blog.qualys.com/vulnerabilities-threat-research/2022/01/25/pwnkit-local-privilege-escalation-vulnerability-discovered-in-polkits-pkexec-cve-2021-4034

I forget if the FORTRAN rules were "loop once" or "never loop" or if it was different
between Fortran-4, fortran-77, DEC extensions and IBM extension, or if it was a compiler switch.

We should check that we do something reasonable with such loops to zero:

LOOP n,0
   MESSAGE $n,1
ENDLOOP

P.S. Yup. "man g77" option "-fonetrip".

K.O.
Entry  09 Nov 2021, Francesco Renga, Forum, Issue in data writing speed 
Dear all,
       I've a frontend writing quite a big bunch of data into a MIDAS bank (16-bit output from a 4 MP photo camera). 
I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a short time, 
but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
explain this behavior? Is there any way to improve it?

Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
to improve it, it would be really appreciated.

Thank you very much,
          Francesco



  const char* pSrc = (const char*)bufframe.buf;

  for(int y = 0; y < bufframe.height; y++ ){

    //Copy one row
    const unsigned short* pDst = (const unsigned short*)pSrc;

    //go through the row
    for(int x = 0; x < bufframe.width; x++ ){

      WORD tmpData = *pDst++; 

      *pdata++ = tmpData;

    }

    pSrc += bufframe.rowbytes;

  }
 
    Reply  10 Nov 2021, Stefan Ritt, Forum, Issue in data writing speed 
Midas uses various buffers (in the frontend, at the server side before the SYSTEM buffer, the SYSTEM buffer itself, and in the 
logger before writing to disk). All these buffers are in RAM and have fast access, so you can fill them pretty quickly. When
they are full, the logger writes to disk, which is slower. So I believe at 2 Hz your disk can keep up with your writing 
speed, but at 4 Hz (4 MP x 2 bytes x 4 Hz = 32 MB/sec) your disk starts slowing down the writing process. Now 32 MB/s is pretty slow for
a disk, so I presume you have turned compression on, which takes quite some time.

To verify this, disable logging. Then disable compression and keep logging. Then report back here again.

> Dear all,
>        I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
> I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
> writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
> into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a short time, 
> but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
> the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
> of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
> explain this behavior? Is there any way to improve it?
> 
> Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
> to improve it, it would be really appreciated.
> 
> Thank you very much,
>           Francesco
> 
> 
> 
>   const char* pSrc = (const char*)bufframe.buf;
> 
>   for(int y = 0; y < bufframe.height; y++ ){
> 
>     //Copy one row
>     const unsigned short* pDst = (const unsigned short*)pSrc;
> 
>     //go through the row
>     for(int x = 0; x < bufframe.width; x++ ){
> 
>       WORD tmpData = *pDst++; 
> 
>       *pdata++ = tmpData;
> 
>     }
> 
>     pSrc += bufframe.rowbytes;
> 
>   }
>  
    Reply  26 Jan 2022, Konstantin Olchanski, Forum, Issue in data writing speed 
Francesco, when you say "writing an event is slow", do you mean it in the frontend
or in the output data file?

Stefan is quite right about the data file, it can take seconds between generating
an event in the frontend and seeing it written to the data file. (if compression
buffers are too big, an event can sit there forever, until pushed out by next events
or by run stop).

But maybe you see this on the frontend side.

What you are looking at is "real time" performance of the frontend and of the linux kernel.

The mfe.c frontend has many problems with real time performance, it can stall and take a long
time between calls to read_event(), for many reasons.

There are ways around that, but it is simpler to switch to the tmfe c++ frontend
that was designed for good real time performance.

In the tmfe frontend, if you use the polled equipment and enable the poll thread,
your frontend will be limited only by the linux kernel real time performance (i.e.
on a single-core CPU, other programs will delay execution of your frontend
and you will see it as long delays (usec, millisec) between calls to your read_event()).

Next limit to real time performance (common to mfe.c and tmfe frontends) is the writing
of event data to the midas shared event buffer. One has to lock the shared memory semaphore
and this has to wait until other users of the event buffer finish their reading
or writing and unlock it. Arbitrary amount of time (usec, millisec, sec) can pass.
(there are also problems with "fairness" of the linux semaphores, a different story, again).

Making things more interesting, midas event buffers implement a write cache (default size 100 kbytes);
events smaller than the cache are quickly accumulated (no need to lock the shared memory semaphore),
then flushed to shared memory when the cache is full. This is done to reduce the number
of shared memory semaphore locks per event, in the case of a very high rate of very small events.

Solution to all this is to use 2 threads: read the data from hardware in one thread and write the data to midas
in a different thread. Between the threads would be an event fifo (circular buffer in mfe.c,
std::deque<EVENT> in tmfe c++ frontends).
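
A minimal sketch of this two-thread scheme (read_from_hardware() and write_event_to_midas()
are placeholders for the real readout code and for bm_send_event() & co):

#include <atomic>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

static std::deque<std::vector<char>> gFifo;
static std::mutex gFifoMutex;
static std::condition_variable gFifoCv;
static std::atomic<bool> gRunning{true};

static std::vector<char> read_from_hardware() {   // placeholder for the real readout
   std::this_thread::sleep_for(std::chrono::milliseconds(10));
   return std::vector<char>(1024);
}

static void write_event_to_midas(const std::vector<char> &e) {   // placeholder for bm_send_event()
   (void)e;   // the real call locks the shared memory semaphore and may block
}

static void readout_thread() {
   while (gRunning) {
      std::vector<char> event = read_from_hardware();
      {
         std::lock_guard<std::mutex> lock(gFifoMutex);
         gFifo.push_back(std::move(event));
      }
      gFifoCv.notify_one();
   }
}

static void writer_thread() {
   while (gRunning) {
      std::unique_lock<std::mutex> lock(gFifoMutex);
      gFifoCv.wait(lock, [] { return !gFifo.empty() || !gRunning; });
      while (!gFifo.empty()) {
         std::vector<char> event = std::move(gFifo.front());
         gFifo.pop_front();
         lock.unlock();                 // do not hold the fifo lock while writing
         write_event_to_midas(event);
         lock.lock();
      }
   }
}

int main() {
   std::thread r(readout_thread), w(writer_thread);
   std::this_thread::sleep_for(std::chrono::seconds(1));
   gRunning = false;
   gFifoCv.notify_all();
   r.join();
   w.join();
   return 0;
}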

For remote connected frontends, things are a bit different. Event data is written directly
into the TCP socket and as long as socket buffers are big enough, there is no real-time delays,
unless SYSTEM buffer is very congested and mserver does not read the TCP socket quickly enough.
So depending on event size, data rate and tcp socket buffer size, the extra 2nd thread
may not be necessary and poll thread real time performance may be good enough.

I hope this clarifies the situation somewhat.

K.O.

> Dear all,
>        I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera). 
> I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz) 
> writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back 
> into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a short time, 
> but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and 
> the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout 
> of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you 
> explain this behavior? Is there any way to improve it?
> 
> Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion 
> to improve it, it would be really appreciated.
> 
> Thank you very much,
>           Francesco
> 
> 
> 
>   const char* pSrc = (const char*)bufframe.buf;
> 
>   for(int y = 0; y < bufframe.height; y++ ){
> 
>     //Copy one row
>     const unsigned short* pDst = (const unsigned short*)pSrc;
> 
>     //go through the row
>     for(int x = 0; x < bufframe.width; x++ ){
> 
>       WORD tmpData = *pDst++; 
> 
>       *pdata++ = tmpData;
> 
>     }
> 
>     pSrc += bufframe.rowbytes;
> 
>   }
>  
       Reply  26 Jan 2022, Konstantin Olchanski, Forum, Issue in data writing speed 
> Francesco, when you say "writing an event is slow", do you mean it in the frontend
> or in the output data file?

Another explanation just occurred to me. We do not know your event size and we do not
know the size of your SYSTEM buffer. But if you have an unlucky combination,
this can happen:

Consider event size is 6 Mbytes, buffer size is 8 Mbytes, enough space for only 1 event.

The first event is written quickly (the buffer is empty).
The second event will be delayed: there is not enough free space in the buffer, and we have
to wait for mlogger to finish reading the first event.

Same thing happens if event size is 3 Mbytes: the first 2 events will write quickly,
writing the 3rd event will be delayed until mlogger does its thing.

The mlogger reads the SYSTEM buffer "fast" and "quickly", but it can be delayed
for a number of reasons, e.g. handling a history event, a delay writing to disk,
a delay writing to network connected storage, etc.

In general, it is best to size the SYSTEM buffer to hold about 1 second worth
of data (of average size, average rate). If your event size is 4 Mbytes, and you
record them at 10/sec, SYSTEM buffer should be at least 40 Mbytes big. (this is
set in ODB /Experiment/Buffer Sizes). (MIDAS event buffer size is limited to 2 GBytes).
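
For example, with odbxx (a sketch; the new size only takes effect after the buffer is
recreated, i.e. after all clients have stopped):

#include "midas.h"
#include "odbxx.h"

int main() {
   cm_connect_experiment("", "", "set_bufsize", NULL);
   midas::odb e("/Experiment/Buffer Sizes");
   e["SYSTEM"] = 40 * 1024 * 1024;   // 40 Mbytes, as in the numbers above
   cm_disconnect_experiment();
   return 0;
}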

K.O.
Entry  09 Nov 2021, Hunter Lowe, Forum, MityCAMAC Login  
Hello all,

I've recently acquired a MityCAMAC system that was built at TRIUMF and I'm 
having issues accessing it over ethernet.

The system: Ubuntu VM inside Windows 10 machine.

I've tried reconfiguring the network settings for the VM but nmap and arp/ip 
commands have yielded me no results in finding the crate controller. 

I was getting help from Pierre Amaudruz but I think he is now busy for some 
time. I have the mac address of the crate controller and its name. The 
controller seems to initialize fine inside of the CAMAC crate. The windows side 
of the workstation also tells me that an unknown network is in fact connected.

I suspect I either need to do something with an ssh key (which I thought we 
accomplished but maybe not), or perhaps the domain name in the controller needs 
to be changed. 

If anybody has experience working with MityARM I would appreciate any advice I 
could get.

Best,
Hunter Lowe
UNBC Graduate Physics
    Reply  11 Nov 2021, Thomas Lindner, Forum, MityCAMAC Login  
Hi Hunter,

This sounds like a TRIUMF-specific problem, not a MIDAS problem. Please email me directly 
and we can try to solve it.

Thomas Lindner
TRIUMF DAQ


> Hello all,
> 
> I've recently acquired a MityCAMAC system that was built at TRIUMF and I'm 
> having issues accessing it over ethernet.
> 
> The system: Ubuntu VM inside Windows 10 machine.
> 
> I've tried reconfiguring the network settings for the VM but nmap and arp/ip 
> commands have yielded me no results in finding the crate controller. 
> 
> I was getting help from Pierre Amaudruz but I think he is now busy for some 
> time. I have the mac address of the crate controller and its name. The 
> controller seems to initialize fine inside of the CAMAC crate. The windows side 
> of the workstation also tells me that an unknown network is in fact connected.
> 
> I suspect I either need to do something with an ssh key (which I thought we 
> accomplished but maybe not), or perhaps the domain name in the controller needs 
> to be changed. 
> 
> If anybody has experience working with MityARM I would appreciate any advice I 
> could get.
> 
> Best,
> Hunter Lowe
> UNBC Graduate Physics
       Reply  26 Jan 2022, Konstantin Olchanski, Info, MityCAMAC Login  
For those curious about CAMAC controllers, this one was built around 2014 to 
replace the aging CAMAC A1/A2 controllers (parallel and serial) in the TRIUMF 
cyclotron controls system (around 50 CAMAC crates). It implements the main
and the auxiliary controller mode (single width and double width modules).

The design predates Altera Cyclone-5 SoC and has separate
ARM processor (TI 335x) and Cyclone-4 FPGA connected by GPMC bus.

ARM processor boots Linux kernel and CentOS-7 userland from an SD card,
FPGA boots from its own EPCS flash.

A user program running on the ARM processor (i.e. a MIDAS frontend)
initiates CAMAC operations, the FPGA executes them. Quite simple.
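
From the user side it looks like any other midas CAMAC interface, i.e. the mcstd.h API
(a sketch; the crate/station/subaddress/function numbers are made up):

#include <cstdio>
#include "midas.h"
#include "mcstd.h"

int main() {
   if (cam_init() != SUCCESS) {   // attach to the CAMAC interface
      printf("cannot initialize CAMAC\n");
      return 1;
   }
   WORD data;
   cam16i(0, 5, 0, 0, &data);     // crate 0, station N=5, A=0, F=0: 16-bit read
   printf("read 0x%04x\n", data);
   cam_exit();
   return 0;
}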

K.O.
Entry  16 Dec 2021, Zaher Salman, Forum, Device driver for modbus 
Dear all, does anyone have an example of a device driver using modbus or modbus tcp to communicate with a device and is willing to share it? Thanks.
    Reply  26 Jan 2022, Konstantin Olchanski, Forum, Device driver for modbus 
> Dear all, does anyone have an example of a device driver using modbus or modbus tcp to communicate with a device and is willing to share it? Thanks.

I have not seen any modbus devices recently, so all my code and examples are quite old.

Basic modbus/tcp communication driver is in the midas repo:

daq00:midas$ find . | grep -i modbus
./drivers/divers/ModbusTcp.cxx
./drivers/divers/ModbusTcp.h
daq00:midas$ 

This driver worked for communication with a modbus PLC (T2K/ND280/TPC experiment in Japan).

An example program to use this driver and test modbus communication is here:
https://bitbucket.org/expalpha/agdaq/src/master/src/modbus.cxx

Because in the end we do not have any modbus devices in any recent experiment,
I do not have any example of using this driver in a midas frontend. Sorry.
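
For what it is worth, the modbus/tcp wire format is simple enough to test by hand with a
raw TCP socket, without any driver at all. A minimal "read holding registers" request
(a sketch; the PLC IP address, unit id, register address and count are made up):

#include <cstdio>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main() {
   int sock = socket(AF_INET, SOCK_STREAM, 0);
   sockaddr_in addr{};
   addr.sin_family = AF_INET;
   addr.sin_port = htons(502);                 // standard modbus/tcp port
   inet_pton(AF_INET, "192.168.1.10", &addr.sin_addr);
   if (connect(sock, (sockaddr *)&addr, sizeof(addr)) < 0) { perror("connect"); return 1; }

   // MBAP header (7 bytes) + PDU (5 bytes)
   unsigned char req[12] = {
      0x00, 0x01,   // transaction id
      0x00, 0x00,   // protocol id (always 0)
      0x00, 0x06,   // remaining length: unit id + PDU
      0x01,         // unit id
      0x03,         // function: read holding registers
      0x00, 0x00,   // start address 0
      0x00, 0x02    // read 2 registers
   };
   write(sock, req, sizeof(req));

   unsigned char resp[256];
   ssize_t n = read(sock, resp, sizeof(resp));
   printf("got %zd bytes\n", n);
   close(sock);
   return 0;
}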

K.O.
Entry  19 Nov 2021, Jacob Thorne, Forum, Sequencer error with ODB Inc 
Hi,

I am having problems with the midas sequencer, here is my code:

1  COMMENT "Example to move a Standa stage"
2  RUNDESCRIPTION "Example movement sequence - each run is one position of a single stage"
3  
4  PARAM numRuns
5  PARAM sequenceNumber
6  PARAM RunNum
7  
8  PARAM positionT2
9  PARAM deltapositionT2
10 
11 ODBSet "/Runinfo/Run number", $RunNum
12 ODBSet "/Runinfo/Sequence number", $sequenceNumber
13 
14 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Type of Measurement", 2
15 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Number of Time Bins", 10
16 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Number of Sweeps", 1
17 ODBSet "/Equipment/Neutron Detector/Settings/Detector/Dwell Time", 100000
18 
19 ODBSet "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position", $positionT2
20 
21 LOOP $numRuns
22  WAIT ODBvalue, "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Ready", ==, 1
23  TRANSITION START
24  WAIT ODBvalue, "/Equipment/Neutron Detector/Statistics/Events sent", >=, 1
25  WAIT ODBvalue, "/Runinfo/State", ==, 1
26  WAIT ODBvalue, "/Runinfo/Transition in progress", ==, 0
27  TRANSITION STOP
28  ODBInc "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position", $deltapositionT2
29 
30 ENDLOOP
31 
32 ODBSet "/Runinfo/Sequence number", 0

The issue comes with line 28, the ODBInc does not work, regardless of what number I put I get the following error:

[Sequencer,ERROR] [odb.cxx:7046:db_set_data_index1,ERROR] "/Equipment/MTSC/Settings/Devices/Stage 2 Translation/Device Driver/Set Position" invalid element data size 32, expected 4

I don't see why this should happen, the format is correct and the number that I input is an int.

Sorry if this is a basic question.

Jacob
    Reply  02 Dec 2021, Stefan Ritt, Forum, Sequencer error with ODB Inc 
Thanks for reporting that bug. Indeed there was a problem in the sequencer code which I fixed now. Please try the updated develop branch.

Stefan
Entry  29 Oct 2021, Frederik Wauters, Bug Report, midas::odb::iterator + operator 
I have a 16-element array odb key

{"FIR Energy", {
            {"Energy Gap Value", std::array<uint32_t,16>(10) },

I can get the maximum of this array like this:


uint32_t max_value = *std::max_element(values.begin(),values.end());

but when I need the maximum of a sub range

uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);

I get

/home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
  584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
      |                                                ~~~~~~~~~~~~~~^~
      |                                                            |  |
      |                                                            |  int
      |               

As the + operator is overloaded for midas::odb::iterator, I expected this to work.

(and yes, I can find the max element by accessing the elements one by one)
    Reply  29 Oct 2021, Frederik Wauters, Bug Report, midas::odb::iterator + operator | work around 
ok, so retrieving as a std::array (as it was defined) does not work

    std::array<uint32_t,16> avalues = settings["FIR Energy"]["Energy Gap Value"];

but retrieving it as a std::vector does, and then I have a standard C++ iterator which I can use with the std algorithms:

    std::vector<uint32_t> values = settings["FIR Energy"]["Energy Gap Value"];

> I have a 16-element array odb key
> 
> {"FIR Energy", {
>             {"Energy Gap Value", std::array<uint32_t,16>(10) },
> 
> I can get the maximum of this array like 
> 
> 
> uint32_t max_value = *std::max_element(values.begin(),values.end());
> 
> but when I need the maximum of a sub range
> 
> uint32_t max_value = *std::max_element(values.begin(),values.begin()+4);
> 
> I get
> 
> /home/labor/new_daq/frontends/SIS3316Module.cpp:584:62: error: no match for ‘operator+’ (operand types are ‘midas::odb::iterator’ and ‘int’)
>   584 |   max_value = *std::max_element(values.begin(),values.begin()+4);
>       |                                                ~~~~~~~~~~~~~~^~
>       |                                                            |  |
>       |                                                            |  int
>       |               
> 
> As the + operator is overloaded for midas::odb::iterator, I expected this to work.
> 
> (and yes, I can find the max element by accessing the elements one by one)
Entry  25 Oct 2021, Francesco Renga, Forum, Logger crash 
Hello,
     I'm experiencing crashes of the mlogger program on the time scale of a couple 
of days. The only messages from MIDAS are:

05:34:47.336 2021/10/24 [mhttpd,INFO] Client 'Logger' (PID 14281) on database 
'ODB' removed by db_cleanup called by cm_periodic_tasks (idle 10.2s,TO 10s)
05:34:47.335 2021/10/24 [mhttpd,INFO] Client 'Logger' on buffer 'SYSMSG' removed 
by cm_periodic_tasks (idle 10.2s, timeout 10s)

Any suggestion on how to further investigate this issue?

Thank you very much,
         Francesco
    Reply  25 Oct 2021, Stefan Ritt, Forum, Logger crash 
The short term solution would be to increase the logger timeout in the ODB under

/Programs/Logger/Watchdog timeout

and set it to 60000 (one minute). But that is curing just the symptoms. It would be 
interesting to understand the cause of this error. Probably the logger takes more than 10 
seconds to start or stop the run. The reason could be that the history has grown too big (what 
we have right now in MEG II), or some disk problems. But that needs detailed debugging on 
the logger side.

Stefan
Entry  14 Oct 2021, Amy Roberts, Suggestion, Adding (or improving discoverability) of TID for odbset 
Creating an ODB key requires users to know the Type IDs that are defined in 
https://bitbucket.org/tmidas/midas/src/develop/include/midas.h starting at line 320.

I can't find any information on the Midas Wiki about these values or how to find 
them.

Am I missing something obvious?  Is there a way to improve how to find these values?  
Or is this not the best way to interact with the ODB?
    Reply  15 Oct 2021, Stefan Ritt, Suggestion, Adding (or improving discoverability) of TID for odbset 
> Creating an ODB key requires users to know the Type IDs that are defined in 
> https://bitbucket.org/tmidas/midas/src/develop/include/midas.h starting at line 320.
> 
> I can't find any information on the Midas Wiki about these values or how to find 
> them.
> 
> Am I missing something obvious?  Is there a way to improve how to find these values?  
> Or is this not the best way to interact with the ODB?

Well, you found them in midas.h, so where is the problem?

If you want a more detailed description, just look in the midas documentation (RTFM):

https://midas.triumf.ca/MidasWiki/index.php/Midas_Data_Types

If you want a more modern interface to the ODB without these data types, look here:

https://midas.triumf.ca/MidasWiki/index.php/Odbxx
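
With odbxx the data types are deduced from the C++ initializers, so you never touch a TID
explicitly (a sketch; the path and values are just examples):

#include "midas.h"
#include "odbxx.h"

int main() {
   cm_connect_experiment("", "", "odbxx_demo", NULL);
   midas::odb o = {
      {"Int32 Key", 42},          // integer type deduced, no TID constant needed
      {"Float Key", 3.1f},        // float type deduced
      {"String Key", "hello"}     // string type deduced
   };
   o.connect("/Test/Settings");   // creates the keys if they do not exist
   cm_disconnect_experiment();
   return 0;
}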

Best regards,
Stefan
Entry  11 Oct 2021, Konstantin Olchanski, Forum, midas forum updated, moved 
The midas forum software (elogd) was updated to the latest version and moved from our old server 
(ladd00.triumf.ca) to our new server (daq00.triumf.ca).

The following URLs should work:

https://daq00.triumf.ca/elog-midas/Midas/ (new URL)
https://midas.triumf.ca/elog/Midas/ (old URL, redirects to daq00)
https://midas.triumf.ca/forum (link from midas wiki)

The configuration on the old server ladd00.triumf.ca is quite tangled between
several virtual hosts and several DNS CNAMEs. I think I got all the redirects
correct and all old URLs and links in old emails & etc still work.

If you see something wrong, please reply to this message here or email me directly.

K.O.
Entry  11 Oct 2021, Konstantin Olchanski, Forum, test 
test, no email. K.O.
    Reply  11 Oct 2021, Konstantin Olchanski, Forum, test 
> test, no email. K.O.

test reply, no email. K.O.
       Reply  11 Oct 2021, Konstantin Olchanski, Forum, test image.png
> > test, no email. K.O.
> 
> test reply, no email. K.O.

test attachment, no email. K.O.
          Reply  11 Oct 2021, Konstantin Olchanski, Forum, test 
> > > test, no email. K.O.
> > 
> > test reply, no email. K.O.
> 
> test attachment, no email. K.O.

test email. K.O.
Entry  11 Oct 2021, Stefan Ritt, Info, Modification in the history logging system 
A requested change in the history logging system has been made today. Previously, history values were
logged with a maximum frequency (usually once per second) but also with a minimum frequency, meaning
that values were logged for example every 60 seconds, even if they did not change. This causes a problem:
if a frontend which produces variables to be logged is inactive or has crashed, one cannot distinguish between
a crashed or inactive frontend program and a history value which simply did not change much over time.
The history system was designed from the beginning in a way that values are only logged when they actually
change. This design pattern had been broken since about spring 2021; see for example this issue:

https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for

Today I modified the history code to fix this issue. History logging is now controlled by the value of 
common/Log history in the following way:

* Common/Log history = 0 means no history logging
* Common/Log history = 1 means log whenever the value changes in the ODB
* Common/Log history = N means log whenever the value changes in the ODB and 
  the previous write was more than N seconds ago

So most experiments should be happy with 0 or 1. Only experiments which have fluctuating values due to noisy 
sensors might benefit from a value larger than 1 to limit the history logging. Anyhow, this is not the preferred 
way to limit history logging. This should be done by the front-end limiting the updates to the ODB. Most of the 
midas slow control drivers have a "threshold" value: only if the input changes by more than the threshold is it 
written to the ODB. This allows a per-channel "dead band" and not a per-event limit on history logging, 
as 'Log history' would do. In addition, the threshold reduces the write accesses to the ODB, although that is
only important for very large experiments.
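
The logic of such a dead band is trivial; a sketch of the pattern (not the actual driver code):

#include <cmath>
#include <cstdio>

// Update the ODB (and hence the history) only when the value has moved
// by more than the threshold since the last written value.
static bool should_update(double value, double &last_written, double threshold) {
   if (std::fabs(value - last_written) > threshold) {
      last_written = value;
      return true;    // caller writes the new value to the ODB
   }
   return false;      // below threshold: no ODB write, no history entry
}

int main() {
   double last = 0;
   for (double v : {0.01, 0.04, 0.20, 0.22, 0.45})
      printf("%.2f -> %s\n", v, should_update(v, last, 0.1) ? "write" : "skip");
   return 0;
}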

Stefan
Entry  29 Sep 2021, Richard Longland, Bug Report, nstall clash between MIDAS 2020-08 and mscb 
Thank you, Stefan.

I found these instructions under
1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)

I do see reference to updating the submodules under the TRIUMF install 
instructions 
(https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Install_MIDAS) 
although perhaps it can be clarified.

Cheers,
Richard
    Reply  29 Sep 2021, Stefan Ritt, Bug Report, nstall clash between MIDAS 2020-08 and mscb 
> Thank you, Stefan.
> 
> I found these instructions under
> 1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
> 2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)
> 
> I do see reference to updating the submodules under the TRIUMF install 
> instructions 
> (https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Install_MIDAS) 
> although perhaps it can be clarified.
> 
> Cheers,
> Richard

Hi Richard,

I updated the documentation at

https://midas.triumf.ca/MidasWiki/index.php/Changelog#Updating_midas

by putting the submodule update command everywhere.

Best,
Stefan
Entry  28 Sep 2021, Richard Longland, Bug Report, Install clash between MIDAS 2020-08 and mscb 
All,

I am performing a fresh install of MIDAS on an Ubuntu linux box. I follow the 
usual installation procedure:

1) git clone https://bitbucket.org/tmidas/midas --recursive
2) cd midas
3) git checkout release/midas-2020-08
4) mkdir build
5) cd build
6) cmake ..
7) make

Step 3 warns me that 
"warning: unable to rmdir 'manalyzer': Directory not empty" and 
"warning: unable to rmdir 'midasio': Directory not empty"

Step 7 fails.
Compilation fails with an mhttpd error related to mscb:
mhttpd.cxx:8224:59: error: too few arguments to function 'int mscb_ping(int, 
short unsigned int, int, int)'
 8224 |             status = mscb_ping(fd, (unsigned short) ind, 1);


I was able to get around this by rolling mscb back to some old version (commit 
74468dd), but am extremely nervous about mix-and-matching the code this way.

Any advice would be greatly appreciated.

Cheers,
Richard 
    Reply  28 Sep 2021, Stefan Ritt, Bug Report, Install clash between MIDAS 2020-08 and mscb 
> 1) git clone https://bitbucket.org/tmidas/midas --recursive
> 2) cd midas
> 3) git checkout release/midas-2020-08
> 4) mkdir build
> 5) cd build
> 6) cmake ..
> 7) make

When you do step 3), you get

~/tmp/midas$ git checkout release/midas-2020-08
warning: unable to rmdir 'manalyzer': Directory not empty
warning: unable to rmdir 'midasio': Directory not empty
M	mjson
M	mscb
M	mvodb
M	mxml

The 'M' in front of the submodules like mscb tells you that you
have an older version of midas (namely midas-2020-08) but the 
*current* submodules, which won't match. So you also have to roll 
back the submodules with:

3.5) git submodule update --recursive

This fetches those versions of the submodules which match the
midas version 2020-08. See here for details: 

https://git-scm.com/book/en/v2/Git-Tools-Submodules

From where did you get the command

git checkout release/xxxx ???

If you tell me the location of that documentation, I will take
care that it will be amended with the command

git submodule update --recursive

Best,
Stefan
Entry  19 Sep 2021, Stefan Ritt, Bug Fix, Chat working again Screenshot_2021-09-19_at_21.27.19_.png
Not sure how many people are using it, but the Chat facility in midas had been broken 
for some time and got fixed again today.

Just for your information: Chat can be used like WhatsApp & Co, and connects all 
people who access a midas experiment through their browser. It's good to 
communicate between shift crew members located at different places. One advantage 
is that the chat messages can get 'spoken' by the text-to-speech engine of your 
browser, so it can be used to "wake up" shifters. Can be configured through the 
"Config" page.

Stefan
Entry  06 Sep 2021, Andreas Suter, Forum, mhttpd crash 
midas version used: midas-2019-05-cxx-1461-g906be8b

Every couple of days/weeks I find the following error message related to mhttpd in the systemd log:

[mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!

with various socket numbers of course.

Can anybody give me a hint as to what is going wrong here?

The bad thing about the crash is that it sometimes leads to a "chain reaction" killing multiple midas frontends, which essentially stops the experiment.

Help would be very much appreciated!

Andreas 
    Reply  06 Sep 2021, Konstantin Olchanski, Forum, mhttpd crash 
> [mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!
> Can anybody give me a hint as to what is going wrong here?
> The bad thing about the crash is that it sometimes leads to a "chain reaction" killing multiple midas frontends, which essentially stops the experiment.

This is my code. I am the culprit. I had a bit of discussion about this with Stefan.

Bottom line is that something is rotten in the multithreading code inside mhttpd and, under unknown conditions,
it sends the wrong data into the wrong socket. This causes midas web pages to be really confused (RPC replies
processed as a CSS file, HTML code processed as RPC replies, a mess). This wrong data is cached by the browser,
so restarting mhttpd does not fix the web pages. So, a mess.

I find this impossible to replicate, and so cannot debug it or fix it. The best I was able to do
is to add a check of the socket numbers, and thankfully it catches the condition before the web browser caches
become poisoned. So, broken web pages are replaced by an mhttpd crash.
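For the record, the guard is conceptually something like this (a sketch only, not the actual mhttpd code; the socket variables are made-up names, cm_msg() is the regular midas message call):

```
// conceptual sketch of the "wrong socket" check (not the actual mhttpd code)
if (response_socket != request_socket) {
   cm_msg(MERROR, __FILE__, __LINE__, "on_work_complete",
          "Should not send response to request from socket %d to socket %d, abort!",
          request_socket, response_socket);
   abort(); // produces the slow core dump discussed in points 2) and 3) below
}
```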

This situation reinforces my opinion that multi-threading and C++ classes "do not mix" (like H2 and O2 do not mix).
If you write a multithreaded C++ program and it works, good for you, if there is a malfunction, good luck with it,
C++ just does not have any built-in support for debugging typical multithreading problems. I think others have come
to the same conclusion and invented all these new "safe" programming languages, like Rust and Go.

Back to your troubles.

1) If you see a way to replicate this crash, or some way to reliably cause
the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
and I wish to fix this problem very much.

2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
are useless for debugging the present problem. There is just no way to examine the state of each
thread and of each http request using gdb by hand.

3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
other MIDAS programs to stop, time out and die. You can try to fix this by disabling core dumps (set "enable core dumps"
to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
the "wrong socket" check, but most likely you will not like the result).

4) run mhttpd inside a script: "while (1) { start mhttpd; sleep 1 sec; rinse, repeat; }" (run mhttpd without "-D", yes?)

In other news, the mongoose web server library has a new version available; they again changed their
multithreading scheme (I think it is an improvement). If I update mhttpd to this new version, it is very
likely the code with the "wrong socket" bug will be deleted. (with new bugs added to replace old bugs, of course).

K.O.
       Reply  07 Sep 2021, Andreas Suter, Forum, mhttpd crash 
Dear Konstantin,

thanks for the prompt response, this helps a lot!

> 1) If you see a way to replicate this crash, or some way to reliably cause
> the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
> and I wish to fix this problem very much.

I wish I could! This happens only 3-4 times per year, so it is close to impossible to trigger.

> 2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
> are useless for debugging the present problem. There is just no way to examine the state of each
> thread and of each http request using gdb by hand.
> 
> 3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
> other MIDAS programs to stop, time out and die. You can try to fix this by disabling core dumps (set "enable core dumps"
> to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
> the "wrong socket" check, but most likely you will not like the result).
> 

I have now changed abort() to exit() on the production machine. Perhaps this should be the default?

Andreas
          Reply  17 Sep 2021, Stefan Ritt, Forum, mhttpd crash mhttpdScreenshot_2021-09-17_at_21.11.15_.png
To limit the impact of the numerous crashes of mhttpd, I installed the monit tool at MEG at PSI 
(https://en.wikipedia.org/wiki/Monit). It monitors mhttpd, and if it cannot connect to it for a certain
time, it kills the process and restarts it. This covers endless loops, simple crashes (caused by the
known multi-threading issue in mongoose), and also cases where mhttpd develops a memory leak and becomes
unresponsive. 

To configure monit for mhttpd, first install the package, make sure the daemon gets started automatically
after reboot (typically "systemctl enable monit"), and put the attached file into

/etc/monit.d/mhttpd

You have to adjust the <path-to-midas> according to your midas installation, and probably also the port
under which mhttpd is listening (8082 in my case). Put 

set daemon 10

into /etc/monitrc if you want monit to check mhttpd every 10 seconds (default is 30 seconds). Then, every
10 seconds monit requests "midas.css" from mhttpd, and if it cannot obtain it after 30 seconds, it kills
mhttpd and restarts it.
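For reference, the file looks conceptually like this (a hypothetical sketch only; the actual file is attached to this entry, and <path-to-midas>, the process pattern and the port must be adapted to your installation):

check process mhttpd matching "mhttpd"
  start program = "/bin/bash -c '<path-to-midas>/bin/mhttpd -D'"
  stop program = "/usr/bin/killall mhttpd"
  if failed host localhost port 8082 protocol http request "/midas.css"
     with timeout 30 seconds then restart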

Loading long history plots that take more than 30 seconds should probably not be an issue since mhttpd is 
multi-threaded, but I haven't tested this in detail.

Attached below is a typical status page produced by monit, which has its own built-in web server (normally
listening at port 2812, accessible only from localhost by default).

I hope this helps some of you.

Stefan
Entry  24 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
I am updating the history plots. Main changes:

- the old history display code should again be easily usable (use the "open in old history display" checkbox)
- the history plot editor has an "edit in ODB" button that takes us to the plot definition in ODB (sometimes it is 
easier to edit things in the ODB editor)
- error in history plot editor that created "formula" entry of incorrect size should be fixed
- "reorder" (and "delete entry") functions in the history plot editor should work again (plus added explanation text)
- "factor" and "offset" restored in the history plot editor
- added the long desired "voffset" to simplify plot scaling and positioning
- (factor, offset and voffset do not yet work in the new history plots, TBI ASAP)
- history plot editor and generate_hist_graph() now use the same code to read plot definitions from ODB. There should 
be no more confusion about content of history plot entries in ODB and what each entry is supposed to do.

These changes have been precipitated by our inability to plot the high-voltage voltage and current on the same plot,
see bug https://bitbucket.org/tmidas/midas/issues/308/history-plot-formula-cannot-be-used-to

Voltage is in the range 0..1000 (volts) and current is in the range 0..50 and 0..0.100, autoscaling on voltage
makes the currents invisible at the zero line. In the past, we used the "factor" setting to scale
the graphs so we can see both voltage and currents at the same time (currents scaled up by factor 25 and 600,
as example).

The new "formula" feature was supposed to replace (and improve upon) the "factor" and "offset". But if I use
the formula "x*25", suddenly the plot is telling us that current values are not 50 uA, but 1250 uA (50*25),
and this is just wrong. We do not want to scale the micro-amps, we want to better position the plot on the graph,
like the old "factor" and "offset" allowed us to do.

So the idea is to use this computation:

y_position_on_plot = offset + factor*(formula(history_value) - voffset)

- "formula" is to transform history values into physical values (i.e. pressure meter reports bars, but we want atm, or 
voltmeter is reading in discrete units of 0.125V, we want to see volts)
- "factor" and "offset" is to position the graphs on the plot for best visual presentation of data
- I also added the much desired "voffset"; you only know it is needed if you have a non-zero "offset" and you need 
to change the "factor" - surprise, "offset" has to be changed, too, and good luck recalculating it correctly in one 
try.
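In code form, the intended transformation is simply (a sketch with the names used above; formula() stands for the per-variable formula):

```
// sketch: formula() converts the raw history value to physical units;
// factor, offset and voffset only position the curve on the plot
double plot_y(double history_value, double factor, double offset, double voffset)
{
   double physics_value = formula(history_value); // e.g. bars -> atm
   return offset + factor * (physics_value - voffset);
}
```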

The way to use this stuff:
- adjust "voffset" to bring the graph to around y=0
- increase the "factor" to zoom-in on features and stuff
- adjust "offset" to move the graph up and down relative to all the other graphs on the plot
- now one can zoom in and out as needed by changing the "factor" and the plot will stay roughly in the right place 
without having to readjust the offsets.

K.O.
    Reply  24 Jun 2021, Stefan Ritt, Bug Fix, changes in history plots 
I disagree with the proposed change to scale the HV current for a "nice" display. If values are scaled, the axis should be 
scaled in the same way. Otherwise people might read the current from the plot, look at the axis, and again get the wrong 
value (the factor of 25x you mention). Sure you can hover with the cursor over the graph, and see the right value, but think 
of taking a screen shot, putting this into a publication, and getting complaints from the reviewer.

The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a 
new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For 
the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.

Stefan
       Reply  25 Jun 2021, Marco Francesconi, Bug Fix, changes in history plots 
We are using the new history formula as a quick way to convert signals from sensors to actual physical values (for example Voltage->Temperature, Voltage->relative humidity 
...), so it is great that the shown voltage is the calculated one.

I would like to add a point to this discussion.
In our collaboration people attach images of history plots to elogs, meeting presentation and/or physical logbooks.
The proposed scaling formula may work fine online using the cursors, but, once an image is created, I do not understand how it is possible to extract the value for a scaled 
variable.
Suppose you see a graph in a presentation with a current increase by some PSU and the current was scaled to be in the same plot of the voltage.
Looking at the delta in the image, how can you judge the current increase without any axis/grid to refer to?

So I support Stefan's proposal for a secondary axis, as long as it is clear which value belongs to which axis.
Maybe marking the channels in the description or using different line styles/thickness?

Best,
Marco
          Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
I will have to post an example of a scaled plot. I figure everybody forgot what they look like.

K.O.


> We are using the new history formula as a quick way to convert signals from sensors to actual physical values (for example Voltage->Temperature, Voltage->relative humidity 
> ...), so it is great that the shown voltage is the calculated one.
> 
> I would like to add a point to this discussion.
> In our collaboration people attach images of history plots to elogs, meeting presentation and/or physical logbooks.
> The proposed scaling formula may work fine online using the cursors, but, once an image is created, I do not understand how it is possible to extract the value for a scaled 
> variable.
> Suppose you see a graph in a presentation with a current increase by some PSU and the current was scaled to be in the same plot of the voltage.
> Looking at the delta in the image, how can you judge the current increase without any axis/grid to refer to?
> 
> So I support Stefan's proposal for a secondary axis, as long as it is clear which value belongs to which axis.
> Maybe marking the channels in the description or using different line styles/thickness?
> 
> Best,
> Marco
       Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> I disagree ...

I am happy with disagreement and differences of opinions. Zest of life, driver of progress and improvements, etc.

I am even more happy with solutions to problems. The current problem is that the offset and factor feature
of history plots has been removed without much discussion.

I stress, we have been using this feature to run experiments for the last 20 years.

I do not understand objections to it being restored. If you do not want to use it, do not use it.

K.O.

> with the proposed change to scale the HV current for a "nice" display. If values are scaled, the axis should be 
> scaled in the same way. Otherwise people might read the current from the plot, look at the axis, and again get the wrong 
> value (the factor of 25x you mention). Sure you can hover with the cursor over the graph, and see the right value, but think 
> of taking a screen shot, putting this into a publication, and getting complaints from the reviewer.
> 
> The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a 
> new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For 
> the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.
> 
> Stefan
          Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> > The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a 
> > new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For 
> > the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.

In the past, we have done some useful plots with maybe 10 variables plotted
at the same time with different scaling and positioning on the graph.

Having 2 vertical axes may be useful for the specific case of plotting high voltages,
but not in the general case.

Actually, just 2 vertical axes will not work to plot high voltages in ALPHA-g, because
we have anode currents on the scale 0..0.1 uA and cathode currents on the scale 50..60 uA.

K.O.
    Reply  25 Jun 2021, Stefan Ritt, Bug Fix, changes in history plots 
A general warning: With the recent history changes implemented in the develop branch, starting from a fresh ODB and editing 
any history panel, one gets tons of errors and debug output from mhttpd:

MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Log axis" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
Load from ODB History/Display/Default/Trigger rate: hist plot: 2 variables
timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, show_run_markers: 1, show_values: 1, 
show_fill: 1
var[0] event [System][Trigger per sec.] formula [], colour [#00AAFF] label [] factor 1.000000 offset 0.000000 voffset 
0.000000 order 10
var[1] event [System][Trigger kB per sec.] formula [], colour [#FF9000] label [] factor 1.000000 offset 0.000000 voffset 
0.000000 order 20



This has to be fixed by the original author. I strongly recommend making such modifications on a separate branch so as not to 
break running experiments.

Stefan
       Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> A general warning: With the recent history changes implemented in the develop branch, starting from a fresh ODB and editing 
> any history panel, one gets tons of errors and debug output from mhttpd: ...

This is the reason most projects have separate development and production branches.

I recommend everybody to use the released tagged versions of midas for production.

> I strongly recommend making such modifications on a separate branch so as not to 
> break running experiments.

Is there something that does not work anymore? Did I break something? The debug messages I am still
tuning.

K.O.


> 
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Log axis" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> Load from ODB History/Display/Default/Trigger rate: hist plot: 2 variables
> timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, show_run_markers: 1, show_values: 1, 
> show_fill: 1
> var[0] event [System][Trigger per sec.] formula [], colour [#00AAFF] label [] factor 1.000000 offset 0.000000 voffset 
> 0.000000 order 10
> var[1] event [System][Trigger kB per sec.] formula [], colour [#FF9000] label [] factor 1.000000 offset 0.000000 voffset 
> 0.000000 order 20
> 
> 
> 
> This has to be fixed by the original author. I strongly recommend making such modifications on a separate branch so as not to 
> break running experiments.
> 
> Stefan
    Reply  30 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> I am updating the history plots.
> So the idea is to use this computation:
> y_position_on_plot = offset + factor*(formula(history_value) - voffset)

Stefan and I did some brainstorming on Zoom. Writing it down the way I remember it.

- we distilled the gist of the problem - the numerical values we show in the plot labels and in hover-over-the-graph
are before formula is applied or after the formula is applied?

- I suggested a universal solution using a double formula: use formula1 for one case;
  use formula2 for the other case;
  use formula1 for "physics calibration", use formula2 for factor and offset for composite plots:
     numeric_value = formula1(history_value)
     plotted_value = formula2(numeric_value)

- we agree that this is way too complicated, difficult to explain and difficult to coherently present in the history editor

- Stefan suggested a simple solution, a checkbox labeled "show raw value" next to each history variable. by default, the 
value after the formula is plotted and displayed. if checked, the raw value (before the formula) is displayed, and the 
value after the formula is plotted. (so this works the same as the factor and offset on the old history plots).

- if "show raw value" is enabled, the numerical values shown will be inconsistent against the labels on the vertical axis. 
Our solution it to turn the axis labels off. (for composite plots, like oscillator frequency in Hz vs oscillator 
temperature in degC, both scaled to see their correlation, the vertical axis is unit-less "arbitrary units", of course)

- to simplify migration of old history plots that use custom factor and offset settings, we think in the direction of 
automatically moving them to the "formula". (factor=2, offset=10 automatically populates formula with "2*x+10", "show raw 
value" checked/enabled). Thus we can avoid implementing factor and offset in the new history code (an unwelcome 
complication).

- I think this covers all the use cases I have seen in the past, so we will move in this direction.
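A two-line sketch of the agreed behaviour (illustrative names only):

```
// sketch: the transformed value is always what gets plotted; "show raw value"
// only selects which number appears in labels and hover-over text
double plotted   = formula(history_value);
double displayed = show_raw_value ? history_value : plotted;
```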

K.O.
       Reply  14 Jul 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
Moving in the direction of this proposal. History plot editor is updated according to it. Remaining missing piece is the "show 
raw value" buttons and code behind them.

Changes:

- "show factor and offset" moved to the top of the page, "off" by default
- factor and offset (if not zero) are automatically migrated to the formula field (if it is empty), one needs to save the panel 
for this to take effect.

K.O.


> > I am updating the history plots.
> > So the idea is to use this computation:
> > y_position_on_plot = offset + factor*(formula(history_value) - voffset)
> 
> Stefan and I did some brainstorming on Zoom. Writing it down the way I remember it.
> 
> - we distilled the gist of the problem - the numerical values we show in the plot labels and in hover-over-the-graph
> are before formula is applied or after the formula is applied?
> 
> - I suggested a universal solution using a double formula: use formula1 for one case;
>   use formula2 for the other case;
>   use formula1 for "physics calibration", use formula2 for factor and offset for composite plots:
>      numeric_value = formula1(history_value)
>      plotted_value = formula2(numeric_value)
> 
> - we agree that this is way too complicated, difficult to explain and difficult to coherently present in the history editor
> 
> - Stefan suggested a simple solution, a checkbox labeled "show raw value" next to each history variable. by default, the 
> value after the formula is plotted and displayed. if checked, the raw value (before the formula) is displayed, and the 
> value after the formula is plotted. (so this works the same as the factor and offset on the old history plots).
> 
> - if "show raw value" is enabled, the numerical values shown will be inconsistent against the labels on the vertical axis. 
> Our solution it to turn the axis labels off. (for composite plots, like oscillator frequency in Hz vs oscillator 
> temperature in degC, both scaled to see their correlation, the vertical axis is unit-less "arbitrary units", of course)
> 
> - to simplify migration of old history plots that use custom factor and offset settings, we think in the direction of 
> automatically moving them to the "formula". (factor=2, offset=10 automatically populates formula with "2*x+10", "show raw 
> value" checked/enabled). Thus we can avoid implementing factor and offset in the new history code (an unwelcome 
> complication).
> 
> - I think this covers all the use cases I have seen in the past, so we will move in this direction.
> 
> K.O.
          Reply  14 Jul 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> Moving in the direction of this proposal. Remaining missing piece is the "show 
> raw value" buttons and code behind them.

added "show raw value" button, updated on-page instructions.

I think this is the final layout of the history panel editor, conversion
to html+javascript will be done "as is". If you have suggestions to improve
the layout (add/remove/move things around, etc), please shout out (on the elog
here or by direct email to me).

I am thinking in the direction of changing the control flow of the history editor:

- midas "history" manu button click redirects to
- current history panel selection (with checkbox to open old history plots), click on "new plot" button redirects to
- new page for creating new plots. this will present a list of all history variables; clicking on a variable name creates a new history 
panel containing just this one variable and redirects to it.

In other words, to see the history for any history variable:
- click on "history" menu button
- click on "new"
- click on desired history variable
- see this history plot

From here, click on the "wheel" button to open the existing history panel editor and add any additional variables, change settings, 
etc.

In the history panel editor, I am thinking in the direction of replacing the existing drop-down selection of history variables (not 
very workable for large experiments) with an overlay dialog to show all history variables, with checkboxes to select them, basically 
the same history variable select page as described above. Not sure yet how this will work visually.

K.O.
             Reply  24 Aug 2021, Stefan Ritt, Bug Fix, changes in history plots 
One addition I would be in favour of is to remove the "Order" and replace it with drag&drop handles, because this is what people are more 
used to today. Only the old guys like us remember the /etc/init.d/xx_yy scheme where one uses an integer number in the file name to 
determine an order. 

See for example: https://jsbin.com/hijetos/edit?js,output

But instead of relying on a foreign library, I would rather implement that myself, since I need the same thing later for the to-be-
implemented ODB editor (next year? next lockdown?)

Stefan
Entry  19 Aug 2021, Konstantin Olchanski, Bug Report, select() FD_SETSIZE overrun 
I am looking at the mlogger in the ALPHA anti-hydrogen experiment at CERN. It is 
mysteriously misbehaving during run start and stop.

The problem turns out to be with the select() system call.

The corresponding FD_SET(), FD_ISSET() & co operate on an array of fixed size 
FD_SETSIZE, value 1024, in my case. But the socket number is 1409, so we overrun 
the FD_SET() array. Ouch.

I see that all uses of select() in midas have no protection against this.

(we should probably move away from select() to newer poll() or whatever it is)
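To illustrate the difference (a minimal sketch): FD_SET() blindly indexes a fixed-size bitmap and silently corrupts memory for fd >= FD_SETSIZE, while poll() takes an explicit array and has no such limit:

```
#include <poll.h>
#include <sys/select.h>

// sketch: select() users must guard against fd >= FD_SETSIZE, poll() need not
int wait_readable(int fd, int timeout_ms)
{
   if (fd < FD_SETSIZE) {
      fd_set readfds;
      FD_ZERO(&readfds);
      FD_SET(fd, &readfds); // only safe for fd 0..FD_SETSIZE-1
      struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };
      return select(fd + 1, &readfds, NULL, NULL, &tv);
   }
   struct pollfd pfd = { fd, POLLIN, 0 };
   return poll(&pfd, 1, timeout_ms); // no FD_SETSIZE restriction
}
```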

Why does mlogger open so many file descriptors? The usual, scaling problems in the 
history. The old midas history does not reuse file descriptors, so opens the same 
3 history files (.hst, .idx, etc) for each history event. The new FILE history 
opens just one file per history event. But if the number of events is bigger than 
1024, we run into same trouble.

(BTW, the system limit on file descriptors is 4096 on the affected machine, 1024 
on some other machines, see "limit" or "ulimit -a").

K.O.
    Reply  20 Aug 2021, Stefan Ritt, Bug Report, select() FD_SETSIZE overrun 
> I am looking at the mlogger in the ALPHA anti-hydrogen experiment at CERN. It is 
> mysteriously misbehaving during run start and stop.
> 
> The problem turns out to be with the select() system call.
> 
> The corresponding FD_SET(), FD_ISSET() & co operate on an array of fixed size 
> FD_SETSIZE, value 1024, in my case. But the socket number is 1409, so we overrun 
> the FD_SET() array. Ouch.
> 
> I see that all uses of select() in midas have no protection against this.
> 
> (we should probably move away from select() to newer poll() or whatever it is)
> 
> Why does mlogger open so many file descriptors? The usual, scaling problems in the 
> history. The old midas history does not reuse file descriptors, so opens the same 
> 3 history files (.hst, .idx, etc) for each history event. The new FILE history 
> opens just one file per history event. But if the number of events is bigger than 
> 1024, we run into same trouble.
> 
> (BTW, the system limit on file descriptors is 4096 on the affected machine, 1024 
> on some other machines, see "limit" or "ulimit -a").
> 
> K.O.

I cannot imagine that you have more than 1024 different events in ALPHA. That wouldn't 
fit on your status page. 

I have some other suspicion: The logger opens a history file on access, then closes it 
again after writing to it. In the old days we had a case where we had a return from the 
write function BEFORE the file had been closed. This is kind of a memory leak, but with 
file descriptors. After some time of course you run out of file descriptors and crash. 
Now that bug has been fixed many years ago, but it sounds to me like there is another 
"fd leak" somewhere. You should add some debugging in the history code to print the 
file descriptors when you open a file and when you leave that routine. The leak could 
however also be somewhere else, like writing to the message file, ODB dump, ...

The right thing of course would be to rewrite everything with std::ofstream, which 
automatically closes the file when the object goes out of scope.
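A sketch of that approach (write_history_record() is a made-up name for illustration):

```
#include <fstream>
#include <string>

// sketch: the std::ofstream closes itself when it goes out of scope,
// so an early return cannot leak a file descriptor
void write_history_record(const std::string &filename, const std::string &data)
{
   std::ofstream f(filename, std::ios::binary | std::ios::app);
   if (!f)
      return; // nothing was opened, nothing can leak
   f.write(data.data(), data.size());
} // f is closed here automatically
```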

Stefan
Entry  12 May 2021, Mathieu Guigue, Bug Report, mhttpd WebServer ODBTree initialization 
Hi,

Using midas version 12-2020,  I am trying to run mhttpd from within a docker container using docker-compose.
Starting from an empty ODB, I simply run `mhttpd` and this is the output I have:
midas_hatfe_1  | <Warning> Starting mhttpd...
midas_hatfe_1  | [mhttpd,INFO] ODB subtree /Runinfo corrected successfully
midas_hatfe_1  | MVOdb::SetMidasStatus: Error: MIDAS db_find_key() at ODB path "/WebServer/Host list" returned status 312
midas_hatfe_1  | Mongoose web server will not use password protection
midas_hatfe_1  | Mongoose web server will not use the hostlist, connections from anywhere will be accepted
midas_hatfe_1  | Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
midas_hatfe_1  | [mhttpd,ERROR] [mhttpd.cxx:19160:mongoose_listen,ERROR] Cannot mg_bind address "[::1]:8080"

According to the documentation, the WebServer tree should be created automatically when starting mhttpd; but it seems it is not, as mhttpd doesn't find the entry "/WebServer/Host list".
If I create it by hand (using "create STRING /WebServer/Host list"), I still get the error message that mhttpd didn't bind properly to the local port 8080.
I am not sure what is wrong, as mhttpd is working perfectly well in this exact container for midas 03-2020.

Any idea what difference makes it impossible to run in these containers anymore?

Thanks very much for your help.
Cheers
Mathieu
    Reply  12 May 2021, Ben Smith, Bug Report, mhttpd WebServer ODBTree initialization 
> midas_hatfe_1  | Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
> midas_hatfe_1  | [mhttpd,ERROR] [mhttpd.cxx:19160:mongoose_listen,ERROR] Cannot mg_bind address "[::1]:8080"

It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.
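For example, from the command line (a sketch assuming the usual odbedit "set" syntax):

odbedit -c 'set "/WebServer/Enable IPv6" n'

Then restart mhttpd.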
       Reply  12 May 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization 
> It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.

We have had this issue several times already. This info should be put into the documentation at a prominent location.

Stefan
          Reply  13 May 2021, Mathieu Guigue, Bug Report, mhttpd WebServer ODBTree initialization 
> > It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.
> 
> We had this issue already several times. This info should be put into the documentation at a prominent location.
> 
> Stefan

Thanks a lot, this solved my issue!
             Reply  14 May 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization 
> Thanks a lot, this solved my issue!

... or we should turn IPv6 off by default, since not many people use this right now.
                Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, mhttpd WebServer ODBTree initialization 
> > Thanks a lot, this solved my issue!
> 
> ... or we should turn IPv6 off by default, since not many people use this right now.

IPv6 certainly works and is used at CERN.

But I am not sure why people see this message. I do not see it on any machines at 
TRIUMF, even those with IPv6 turned off.

K.O.
                   Reply  05 Aug 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization 
Well, we all see it here at PSI, so this is enough reason to turn this off by default. Shall 
I do it?
Entry  04 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
Hi,

if I check out midas and try to configure it with 

cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas

I do get the error messages:

  Target "midas" INTERFACE_INCLUDE_DIRECTORIES property contains path:

    "<path>/tmidas/midas/include"

  which is prefixed in the source directory.

Is the cmake setup not relocatable? This is new and was working until recently:

MIDAS version:      2.1
GIT revision:       Thu May 27 12:56:06 2021 +0000 - midas-2020-08-a-295-gfd314ca8-dirty on branch HEAD
ODB version:        3
    Reply  04 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas

good timing, I am working on cmake for manalyzer and rootana and I have not tested
the install prefix business.

now I know to test it for all 3 packages.

I will also change find_package(Midas) slightly (see my other message here);
I hope you can confirm that I do not break it for you.

K.O.
    Reply  04 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> Is the cmake setup not relocatable? This is new and was working until recently:

Indeed. Not relocatable. This is because we do not install the header files.

When you use the CMAKE_INSTALL_PREFIX, you get MIDAS "installed" in:

prefix/lib
prefix/bin
$MIDASSYS/include <-- this is the source tree and so not "relocatable"!

Before, this was kludged and cmake did not complain about it.

Now I changed cmake to handle the include path "the cmake way", and now it knows to complain about it.

I am not sure how to fix this: we have a conflict between:

- our normal way of using midas (include $MIDASSYS/include, link $MIDASSYS/lib, run $MIDASSYS/bin)
- the cmake way (packages *must be installed* or else! but I do like install(EXPORT)!)
- and your way (midas include files are in $MIDASSYS/include, everything else is in your special location)

I think your case is strange. I am curious why you want midas libraries to be in prefix/lib instead of in 
$MIDASSYS/lib (in the source tree), but are happy with header files remaining in the source tree.

K.O.
       Reply  04 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > Is the cmake setup not relocatable? This is new and was working until recently:
> 
> Indeed. Not relocatable. This is because we do not install the header files.
> 
> When you use the CMAKE_INSTALL_PREFIX, you get MIDAS "installed" in:
> 
> prefix/lib
> prefix/bin
> $MIDASSYS/include <-- this is the source tree and so not "relocatable"!
> 
> Before, this was kludged and cmake did not complain about it.
> 
> Now I changed cmake to handle the include path "the cmake way", and now it knows to complain about it.
> 
> I am not sure how to fix this: we have a conflict between:
> 
> - our normal way of using midas (include $MIDASSYS/include, link $MIDASSYS/lib, run $MIDASSYS/bin)
> - the cmake way (packages *must be installed* or else! but I do like install(EXPORT)!)
> - and your way (midas include files are in $MIDASSYS/include, everything else is in your special location)
> 
> I think your case is strange. I am curious why you want midas libraries to be in prefix/lib instead of in 
> $MIDASSYS/lib (in the source tree), but are happy with header files remaining in the source tree.
> 
> K.O.

We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
If I think of all the other packages I am working with, e.g. ROOT, the includes are also installed under CMAKE_INSTALL_PREFIX. 
Up until recently there was no issue to work with CMAKE_INSTALL_PREFIX, accepting that the includes stay under 
$MIDASSYS/include, even though this is not quite the standard way, but no problem here. Anyway, since CMAKE_INSTALL_PREFIX 
is a standard option from cmake, I think things should not "break" if you want to use it.

A.S.
          Reply  08 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > Is the cmake setup not relocatable? This is new and was working until recently:
> > Not relocatable. This is because we do not install the header files.
> 
> We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 

hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
mean by this? you install midas in a secret location to prevent somebody from linking to it?

> If I think of all the other packages I am working with, e.g. ROOT, the includes are also installed under CMAKE_INSTALL_PREFIX.

cmake and other frameworks tend to be like procrustean beds (https://en.wikipedia.org/wiki/Procrustes),
pre-cmake packages never quite fit perfectly, and either the legs or the heads get cut off. post-cmake packages
are constructed to fit the bed, whether it makes sense or not.

given how this situation is known since antiquity, I doubt we will solve it today here.

(I exercise my freedom of speech rights to state that I object to being put into
such situations. And I would like to have it clear that I hate cmake (ask me why)).

>
> Up until recently there was no issue to work with CMAKE_INSTALL_PREFIX, accepting that the includes stay under 
> $MIDASSYS/include, even though this is not quite the standard way, but no problem here.
>

I think a solution would be to add install rules for include files. There will be a bit of trouble,
normal include path is $MIDASSYS/include,$MIDASSYS/mxml,$MIDASSYS/mjson,etc, after installing
it will be $CMAKE_INSTALL_PREFIX/include (all header files from different git submodules all
dumped into one directory). I do not know what problems will show up from that.

I think if midas is used as a subproject of a bigger project, this is pretty much required
(and I have seen big experiments, like STAR and ND280, do this type of stuff with CMT,
another horror and the historical precursor of cmake)

The problem is that we do not have any super-project like this here, so I cannot ever
be sure that I have done everything correctly. cmake itself can be helpful, like
in the current situation where it told us about a problem. but I will never trust
cmake completely, I see cmake do crazy and unreasonable things way too often.

One solution would be for you or somebody else to contribute such a cmake super-project,
that would build midas as a subproject, install it with a CMAKE_INSTALL_PREFIX and
try to link some trivial frontend or analyzer to check that everything is installed
correctly. It would become an example for "how to use midas as a subproject").
Ideally, it should be usable in a bitbucket automatic build (assuming bitbucket
has correct versions of cmake, which it does not half the time).

P.S. I already spent half-a-week tinkering with cmake rules, only to discover
that I broke a kludge that allows you to do something strange (if I have it right,
the CMAKE_INSTALL_PREFIX code is your contribution). This does not encourage
me to tinker with cmake even more. Who knows what other
kludge I will bump into. (oh, yes, I know, I already bumped into the nonsense
find_package(Midas) implementation).

K.O.
             Reply  09 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > Not relocatable. This is because we do not install the header files.
> > 
> > We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
> 
> hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> mean by this? you install midas in a secret location to prevent somebody from linking to it?
> 

This was a wrong wording from my side. We do not want the users to have write access to the midas installation libs and bins.
I have submitted the pull request which should resolve this without interfering with your usage.
Hope this will resolve the issue.
                Reply  10 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > > Not relocatable. This is because we do not install the header files.
> > > 
> > > We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
> > 
> > hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> > mean by this? you install midas in a secret location to prevent somebody from linking to it?
> > 
> 
> This was a wrong wording from my side. We do not want the users to have write access to the midas installation libs and bins.
> I have submitted the pull request which should resolve this without interfering with your usage.
> Hope this will resolve the issue.

Excellent. I think it is good to have midas "install" in a sane manner.

But I still struggle to understand what you do. Presumably you can "install" midas
in the "midas account", which is not writable by the experiment and user accounts.
Then it does not matter if you "install" it in its build directory (like we do)
or in some other location (like you do now).

This does not work of course if you only have one account, so do you build midas
as root? or install it as root?

I do ask because in the current computing world, doing things as root requires
a certain amount of trust, which may not be there anymore; see the recent "supply chain" attacks
against python packages, the SolarWinds hack, the malicious linux kernel patches from UMN, etc.

Personally, I do not want to answer questions "is midas safe to run as root?",
"can I trust the midas install scripts to run as root?" and certainly I do not want to hear
about "I installed midas and 100 other packages as root and got hacked 7 days later".

(and running midas as root was never safe. neither mhttpd nor mserver will pass
a security audit).

Anyhow, looks like I will look at cmake again next week. Right now I have a major
breakthrough in the ALPHA-g experiment, my big 96-port Juniper switch suddenly
has working ethernet flow control and I can record data at 600 Mbytes/sec without
any UDP packet loss. Above that, my event builder explodes. I want to fix it and get
it up to 1000 Mbytes/sec, the limit of my 10gige network link. (In this system I do not
have the disk subsystem to record data at this rate, but I have built 8-disk ZFS arrays
that would sink it, no problem). And the day has come when I ran out of CPU cores.
The UDP packet receivers are multithreaded, the event builder is multithreaded and I am using
all 4 of the available cores (intel cpu). As soon as I can get a rackmounted AMD Ryzen
or Threadripper machine, we will likely upgrade. (need at least one more CPU core to run
the online analyzer!). Exciting.

K.O.
                   Reply  10 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > > > Not relocatable. This is because we do not install the header files.
> > > > 
> > > > We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
> > > 
> > > hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> > > mean by this? you install midas in a secret location to prevent somebody from linking to it?
> > > 
> > 
> > This was a wrong wording from my side. We do not want the users to have write access to the midas installation libs and bins.
> > I have submitted the pull request which should resolve this without interfering with your usage.
> > Hope this will resolve the issue.
> 
> Excellent. I think it is good to have midas "install" in a sane manner.
> 
> But I still struggle to understand what you do. Presumably you can "install" midas
> in the "midas account", which is not writable by the experiment and user accounts.
> Then it does not matter if you "install" it in its build directory (like we do)
> or in some other location (like you do now).
> 
> This does not work of course if you only have one account, so do you build midas
> as root? or install it as root?
> 

We work the following way: there is a production Midas under, let's say, /usr/local/midas (make install as sudo/root). This is for the running experiment. Since we are doing muSR, we 
have experiments on a daily basis, rather than months and years as is the case for a particle physics experiment. Still, we would like to test updates and new features of Midas on 
the same machine. For this we use the repo directly. If we are happy with the new features and fixes, we again do a 'make install' and hence freeze a specific snapshot for 
production. Of course we could use various local copies of the Midas repo, but over the last years this approach was very convenient and productive. Hope this explains a bit better 
why we want to work with CMAKE_INSTALL_PREFIX.

AS
                      Reply  11 Jul 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
big thanks to Andreas S. for getting most of this figured out. I now understand
much better how cmake installs things and how it generates config files, both
find_package(midas) style and install(export) style.

with the latest updates, CMAKE_INSTALL_PREFIX should work correctly. I now understand how it works,
how to use it and how to test it, it should not break again.

for posterity, my comments to Andreas's pull request:

thank you for providing this code, it was very helpful. at the end I implemented things slightly differently. It took me a while to understand that I have to provide 2 “install” modes: for your case, I need to 
“install” the header files and everything works “the cmake way”; for our normal case, we use include files in-place and have to include all the git submodules in the include path. I am quite happy with the 
result. K.O.

K.O.
                         Reply  02 Aug 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
Dear Konstantin,

I have tried your adapted version. You already did quite a job, which is more consistent than what I was suggesting.
Yet, I still have a problem (git sha2 2d3872dfd31) when starting on a clean system (i.e. no midas present yet): 
Without CMAKE_INSTALL_PREFIX set, everything is fine. 
However, when setting CMAKE_INSTALL_PREFIX, I get the following error message on the build level (cmake --build ./ -- VERBOSE=1) from the manalyzer:

[ 32%] Building CXX object manalyzer/CMakeFiles/manalyzer.dir/manalyzer.cxx.o
cd /home/l_musr_tst/Tmp/midas/build/manalyzer && /usr/bin/c++  -DHAVE_FTPLIB -DHAVE_MIDAS -DHAVE_ROOT_HTTP -DHAVE_THTTP_SERVER -DHAVE_TMFE -DHAVE_ZLIB -D_LARGEFILE64_SOURCE -I/home/l_musr_tst/Tmp/midas/manalyzer -I/usr/local/root/include  -O2 -g -Wall -Wformat=2 -Wno-format-nonliteral -Wno-strict-aliasing -Wuninitialized -Wno-unused-function -std=c++11 -pipe -fsigned-char -pthread -DHAVE_ROOT -std=gnu++11 -o CMakeFiles/manalyzer.dir/manalyzer.cxx.o -c /home/l_musr_tst/Tmp/midas/manalyzer/manalyzer.cxx
In file included from /home/l_musr_tst/Tmp/midas/manalyzer/manalyzer.cxx:14:0:
/home/l_musr_tst/Tmp/midas/manalyzer/manalyzer.h:13:21: fatal error: midasio.h: No such file or directory
 #include "midasio.h"
                     ^
compilation terminated.

Obviously, some include paths are still missing. I tried quickly to see if an easy fix is possible, but I failed.

Question: is it possible to use manalyzer without midas? I am asking since the MIDAS_FOUND flag is confusing me.

> big thanks to Andreas S. for getting most of this figured out. I now understand
> much better how cmake installs things and how it generates config files, both
> find_package(midas) style and install(export) style.
> 
> with the latest updates, CMAKE_INSTALL_PREFIX should work correctly. I now understand how it works,
> how to use it and how to test it, it should not break again.
> 
> for posterity, my commends to Andreas's pull request:
> 
> thank you for providing this code, it was very helpful. at the end I implemented things slightly differently. It took me a while to understand that I have to provide 2 “install” modes: for your case, I need to 
> “install” the header files and everything works “the cmake way”; for our normal case, we use include files in-place and have to include all the git submodules in the include path. I am quite happy with the 
> result. K.O.
> 
> K.O.
Entry  31 Jul 2021, Peter Kunz, Bug Report, ss_shm_name: unsupported shared memory type, bye! 
I ran into a problem trying to compile the latest MIDAS version on a Fedora 
system.

mhttpd and odbedit return:
ss_shm_name: unsupported shared memory type, bye!

check_shm_type: preferred POSIXv4_SHM got SYSV_SHM

The check returns SYSV_SHM which doesn't seem to be supported in ss_shm_name.

Is there an easy solution for this?

Thanks.
Entry  09 Jul 2021, Konstantin Olchanski, Bug Report, cmake question 
cmake check and mate in 1 move. please help.

the midas cmake file has a typo in the ROOT_CXX_FLAGS, I fixed it and now I am dead in the 
water, need help from cmake experts and pushers.

On Ubuntu:
ROOT_CXX_FLAGS has -std=c++14
midas cmake defines -std=gnu++11 (never mind that I asked for c++11, not "c++11 with GNU 
extensions")

the two compiler flags collide and the build explodes; as best I can tell, c++11 prevails 
and ROOT header files blow up because they expect c++14.

if I remove the midas cmake request for c++11, -std=gnu++11 is gone, there is no conflict 
with ROOT C++14 request and the build works just fine.

but now it explodes on CentOS-7 because by default, c++11 is not enabled. (include <mutex> 
blows up).

what a mess.

K.O.
    Reply  13 Jul 2021, Konstantin Olchanski, Bug Report, cmake question 
> cmake check and mate in 1 move. please help.
> -std=c++11 and -std=c++14 collision...

I have a solution implemented for this, I am not happy with it, Stefan is not happy with it. See 
discussion: https://bitbucket.org/tmidas/midas/commits/50a15aa70a4fe3927764605e8964b55a3bb1732b

K.O.
       Reply  14 Jul 2021, Konstantin Olchanski, Bug Report, cmake question 
> > cmake check and mate in 1 move. please help.
> > -std=c++11 and -std=c++14 collision...
> 
> I have a solution implemented for this, I am not happy with it, Stefan is not happy with it. See 
> discussion: https://bitbucket.org/tmidas/midas/commits/50a15aa70a4fe3927764605e8964b55a3bb1732b
>

I figured it out, solution is to use:

target_compile_features(midas PUBLIC cxx_std_11)

this is how it works:

- centos-7 (g++ has c++11 off by default): -std=gnu++11 is added automatically (not -std=c++11, but 
probably correct, as some c++11 functions were available as gnu extensions)
- ubuntu-20.04 LTS without ROOT: nothing added (I guess correct, g++ has c++11 enabled by default)
- ubuntu-20.04 LTS with -std=c++14 from ROOT: nothing added, c++14 as requested by ROOT is in effect.
- macos without ROOT: -std=gnu++11 is added automatically
- macos with -std=c++11 from ROOT: ditto, so both -std=c++11 and -std=gnu++11 are present in this order, 
wrong-ish, but works.

and good luck figuring this out just from cmake documentation:
https://cmake.org/cmake/help/latest/command/target_compile_features.html

K.O.
Entry  10 Aug 2020, Mathieu Guigue, Info, MidasConfig.cmake usage 
As the Midas software is installed using CMake, it can be easily integrated into 
other CMake projects using the MidasConfig.cmake file produced during the Midas 
installation.

This file points to the locations of the includes and libraries of Midas using three 
variables:
- MIDAS_INCLUDE_DIRS
- MIDAS_LIBRARY_DIRS
- MIDAS_LIBRARIES

Then the CMakeLists file of the new project can use the CMake find_package 
functionalities like:
```
find_package (Midas REQUIRED)
if (MIDAS_FOUND)
    MESSAGE(STATUS "Found midas: libraries ${MIDAS_LIBRARIES}")
    pbuilder_add_ext_libraries (${MIDAS_LIBRARIES})
else (MIDAS_FOUND)
    message(FATAL "Unable to find midas")
endif (MIDAS_FOUND)
include_directories (${MIDAS_INCLUDE_DIR})
```
pbuilder_add_ext_libraries is a CMake macro that automatically adds the 
libraries to the project: this macro can be found here: 
https://github.com/project8/scarab/blob/master/cmake/PackageBuilder.cmake
If such a macro doesn't exist, the linkage to each executable/library can be done 
similarly to https://midas.triumf.ca/elog/Midas/1964 using: 

```
target_link_libraries(crfe ${MIDAS_LIBRARIES} ${LIBS})
```

The current version of the MidasConfig.cmake is minimal and could, for example, include a version 
number: this would allow one to define e.g. a minimal version of Midas needed by the new project.
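If a version number gets exported, the requirement could then be stated directly in find_package (hypothetical, assuming MidasConfig.cmake sets a version):

```
find_package (Midas 2.1 REQUIRED)
```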
    Reply  28 May 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
How does "find_package (Midas REQUIRED)" find the location of MIDAS?

The best I can tell from the current code, the package config files are installed
inside $MIDASSYS somewhere and I see "find_package(Midas)" never finds them (indeed,
find_package() does not know about $MIDASSYS, so it has to use telepathy or something).

Does anybody actually use "find_package(midas)", does it actually work for anybody?

Also it appears that "the cmake way" of importing packages is to use
the install(EXPORT) method.

In this scheme, the user package does this:

include(${MIDASSYS}/lib/midas-targets.cmake)
target_link_libraries(myprogram PUBLIC midas)

this causes all the midas include directories (including mxml, etc)
and dependency libraries (-lutil, -lpthread, etc) to be automatically
added to "myprogram" compilation and linking.

of course MIDAS has to generate a sensible targets export file,
working on it now.

K.O.
       Reply  28 May 2021, Marius Koeppel, Info, MidasConfig.cmake usage 
> Does anybody actually use "find_package(midas)", does it actually work for anybody?

What we do is to include midas as a submodule and then we call find_package:

    add_subdirectory(midas)
    list(APPEND CMAKE_PREFIX_PATH ${CMAKE_CURRENT_SOURCE_DIR}/midas)
    find_package(Midas REQUIRED)

For us it works fine like this but we kind of always compile Midas fresh and don't use a version on our system (keeping the newest version). 

Without the find_package the build does not work for us.
          Reply  28 May 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> > Does anybody actually use "find_package(midas)", does it actually work for anybody?
> 
> What we do is to include midas as a submodule and than we call find_package:
> 
>     add_subdirectory(midas)
>     list(APPEND CMAKE_PREFIX_PATH ${CMAKE_CURRENT_SOURCE_DIR}/midas)
>     find_package(Midas REQUIRED)
> 
> For us it works fine like this but we kind of always compile Midas fresh and don't use a version on our system (keeping the newest version). 
> 
> Without the find_package the build does not work for us.

Ok, I see. I now think that for us, this "find_package" business is an unnecessary complication:

since one has to know where midas is in order to add it to CMAKE_PREFIX_PATH,
one might as well import the midas targets directly by include(.../midas/lib/midas-targets.cmake).

From what I see now, the cmake file is much simplified by converting
it from "find_package(midas)" style MIDAS_INCLUDES & co to the more cmake-ish
target_link_libraries(myexe midas) - all the compiler switches, include paths,
dependent libraries and gunk are handled by cmake automatically.

I am not touching the "find_package(midas)" business, so it should continue to work, then.

K.O.
             Reply  31 May 2021, Stefan Ritt, Info, MidasConfig.cmake usage 
MidasConfig.cmake might at some point get included in the standard Cmake installation (or some add-on). It will then reside in the Cmake system path 
and you don't have to explicitly know where this is. Just the find_package(Midas) will then be enough. 

Even if it's not there, the find_package() is the "traditional" way CMake discovers external packages and users are used to that (like ROOT does the 
same). In comparison, your "midas-targets.cmake" way of doing things, although it certainly works fine, is not the "standard" way, but a midas-
specific solution that other people have to learn separately.

Stefan
                Reply  02 Jun 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> MidasConfig.cmake might at some point get included in the standard Cmake installation (or some add-on). It will then reside in the Cmake system path 
> and you don't have to explicitly know where this is. Just the find_package(Midas) will then be enough.

Hi, Stefan, can you say more about this? If MidasConfig.cmake is part of the cmake distribution,
(did I understand you right here?) and is installed into a system-wide directory,
how can it know to use midas from /home/agmini/packages/midas or from /home/olchansk/git/midas?

Certainly we do not do a system-wide install of midas (into /usr/local/bin or whatever) because
typically different experiments running on the same computer use different versions of midas.

For ROOT, it looks as if for find_package(ROOT) to work, one has to add $ROOTSYS to the Cmake package
search path. This is what we do in our cmake build.
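
For the record, it looks something like this (a sketch using the standard CMAKE_PREFIX_PATH mechanism, not our exact cmake file):

# tell cmake where to look for the ROOT package config files
list(APPEND CMAKE_PREFIX_PATH $ENV{ROOTSYS})
find_package(ROOT REQUIRED)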

As for find_package() vs install(EXPORT), we may have the same situation as with my "make cmake",
where my one line solution is no good for people who prefer to type 3 lines of commands.

Specifically, the install(EXPORT) method defines the "midas" target which brings with it
all it's dependent include paths, libraries and compile flags. So to link midas you need
two lines:

include(.../midas/lib/midas-targets.cmake)
target_link_libraries(myexe midas)
target_link_libraries(myfrontend mfe)

whereas find_package() defines a bunch of variables (the best I can tell) and one has
to add them to the include paths and library paths and compile flags "by hand".

I do not know how find_package() handles the separate libmidas, libmfe and librmana. (and
the separate libmanalyzer and libmanalyzer_main).

K.O.
                   Reply  04 Jun 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> find_package(Midas)

I am testing find_package(Midas). There is a number of problems:

1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".

This seems to be an incomplete list of the libraries built by midas (rmana is missing).

This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):

- we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
- midas-c-compat is for building python interfaces, not for linking midas programs
- mfe contains a main() function, it will collide with the user main() function

So I think this should be changed to just "midas", and the midas linking dependency
libraries (-lutil, -lrt, -lpthread) should also be added to this list.

Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)

2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories

Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.

Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.

I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
this second method seems to be much simpler, everything is exported automatically into one file,
and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).

So how much time should I spend in fixing find_package(Midas) to make it generally usable?

- include path is incomplete
- library list is nonsense
- compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
- dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)

K.O.
                      Reply  04 Jun 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> > find_package(Midas)
> 
> So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> 
> - include path is incomplete
> - library list is nonsense
> - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> 

I think I give up on find_package(Midas). It seems like a lot of work to straighten
all this out, when install(EXPORT) does it all automatically and is easier to use
for building user frontends and analyzers.

K.O.
                      Reply  20 Jun 2021, Lukas Gerritzen, Suggestion, MidasConfig.cmake usage 
I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us 
problem 1 is less of an issue, because we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas", 
not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include), 
especially because ${MIDASSYS} is not set in cmake. 

I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to 
${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)

Cheers
Lukas
 
> > find_package(Midas)
> 
> I am testing find_package(Midas). There is a number of problems:
> 
> 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> 
> This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> 
> This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> 
> - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> - midas-c-compat is for building python interfaces, not for linking midas programs
> - mfe contains a main() function, it will collide with the user main() function
> 
> So I think this should be changed to just "midas" and midas linking dependancy
> libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> 
> Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> 
> 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> 
> Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> 
> Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> 
> I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> this second method seems to be much simpler, everything is exported automatically into one file,
> and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> 
> So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> 
> - include path is incomplete
> - library list is nonsense
> - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> 
> K.O.
                         Reply  20 Jun 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage 
> I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us 
> problem 1 is less of an issue, becaues we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas", 
> not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include), 
> especially because ${MIDASSYS} is not set in cmake.

So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.

Problem still remains with required auxiliary libraries for linking "-lmidas". Sometimes you
need "-lutil" and "-lrt" and "-lpthread", sometimes not. Some way to pass this information
automatically would be nice.

Problem still remains that I cannot do these changes because I have no test harness
for any of this. Would be great if you could contribute this and post the documentation
blurb that we can paste into the midas wiki documentation.

And I still do not understand why we have to do all this work when cmake "install(EXPORT)"
already does all of this automatically. What am I missing?

K.O.

> 
> I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to 
> ${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)
> 
> Cheers
> Lukas
>  
> > > find_package(Midas)
> > 
> > I am testing find_package(Midas). There is a number of problems:
> > 
> > 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> > 
> > This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> > 
> > This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> > 
> > - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> > - midas-c-compat is for building python interfaces, not for linking midas programs
> > - mfe contains a main() function, it will collide with the user main() function
> > 
> > So I think this should be changed to just "midas" and midas linking dependancy
> > libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> > 
> > Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> > 
> > 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> > 
> > Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> > 
> > Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> > of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> > 
> > I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> > this second method seems to be much simpler, everything is exported automatically into one file,
> > and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> > 
> > So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> > 
> > - include path is incomplete
> > - library list is nonsense
> > - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> > - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> > 
> > K.O.
                            Reply  22 Jun 2021, Lukas Gerritzen, Suggestion, MidasConfig.cmake usage 
> So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.

A more moderate option would be to remove mfe from ${MIDAS_LIBRARIES}, but as far as I understand mfe is not the only problem, so nuking might be the 
better option after all. In addition, setting ${MIDASSYS} in MidasConfig.cmake would probably improve compatibility.

>Sometimes you need "-lutil" and "-lrt" and "-lpthread", sometimes not. 
>Some way to pass this information automatically would be nice.

I do not properly understand when you need this and when not, but can't this be communicated with the PUBLIC keyword of target_link_libraries()? If 
we can use PUBLIC for -lutil, -lrt and -lpthread, I can write something, test it here and create a pull request.
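
Something along these lines is what I have in mind (a sketch, not tested against the actual midas CMakeLists):

# inside the midas build: PUBLIC makes these part of the usage
# requirements of the midas target
target_link_libraries(midas PUBLIC util rt pthread)

# a user program linking midas then inherits -lutil -lrt -lpthread
# (plus any PUBLIC include paths and flags) automatically
target_link_libraries(myexe midas)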

> And I still do not understand why we have to do all this work when cmake "install(EXPORT)"
> already does all of this automatically. What am I missing?

Does this not require midas to be built every time you import it? I know, it's a bit the "billions of flies can't be wrong" argument, but I've never seen 
any package that uses install(EXPORT) over find_package().

> > I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us 
> > problem 1 is less of an issue, becaues we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas", 
> > not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include), 
> > especially because ${MIDASSYS} is not set in cmake.
> 
> So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> 
> Problem still remains with required auxiliary libraries for linking "-lmidas". Sometimes you
> need "-lutil" and "-lrt" and "-lpthread", sometimes not. Some way to pass this information
> automatically would be nice.
> 
> Problem still remains that I cannot do these changes because I have no test harness
> for any of this. Would be great if you could contribute this and post the documentation
> blurb that we can paste into the midas wiki documentation.
> 
> And I still do not understand why we have to do all this work when cmake "install(EXPORT)"
> already does all of this automatically. What am I missing?
> 
> K.O.
> 
> > 
> > I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to 
> > ${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)
> > 
> > Cheers
> > Lukas
> >  
> > > > find_package(Midas)
> > > 
> > > I am testing find_package(Midas). There is a number of problems:
> > > 
> > > 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> > > 
> > > This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> > > 
> > > This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> > > 
> > > - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> > > - midas-c-compat is for building python interfaces, not for linking midas programs
> > > - mfe contains a main() function, it will collide with the user main() function
> > > 
> > > So I think this should be changed to just "midas" and midas linking dependancy
> > > libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> > > 
> > > Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> > > 
> > > 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> > > 
> > > Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> > > 
> > > Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> > > of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> > > 
> > > I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> > > this second method seems to be much simpler, everything is exported automatically into one file,
> > > and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> > > 
> > > So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> > > 
> > > - include path is incomplete
> > > - library list is nonsense
> > > - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> > > - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> > > 
> > > K.O.
                               Reply  24 Jun 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage 
> > So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> A more moderate option ...

For the record, I did not disappear. I have a very short time window
to complete commissioning the alpha-g daq (now that the network
and the event builder are cooperating). To add to the fun, our high voltage
power supply turned into a pumpkin, so plotting voltages and currents 
on the same history plot at the same time (like we used to be able to do)
went up in priority. 

K.O.
                                  Reply  11 Jul 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage 
> > > So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> > A more moderate option ...
> 
> For the record, I did not disappear. I have a very short time window
> to complete commissioning the alpha-g daq (now that the network
> and the event builder are cooperating). To add to the fun, our high voltage
> power supply turned into a pumpkin, so plotting voltages and currents 
> on the same history plot at the same time (like we used to be able to do)
> went up in priority. 
> 

in the latest update, find_package(midas) should work correctly, the include path is right, 
the library list is right.

please test.

I find that the cmake install(export) method is simpler on the user side (just one line of 
code) and is easier to support on the midas side (config file is auto-generated).

I request that proponents of the find_package(midas) method contribute the documentation and 
example on how to use it. (see my other message).

K.O.
    Reply  13 Jul 2021, Stefan Ritt, Info, MidasConfig.cmake usage 
Thanks for the contribution of MidasConfig.cmake. May I kindly ask for one extension:

Many of our frontends require inclusion of some midas-supplied drivers and libraries 
residing under

$MIDASSYS/drivers/class/
$MIDASSYS/drivers/device
$MIDASSYS/mscb/src/
$MIDASSYS/src/mfe.cxx

I guess this can be easily added by defining a MIDAS_SOURCES in MidasConfig.cmake, so 
that I can do things like:

add_executable(my_fe
  myfe.cxx
  ${MIDAS_SOURCES}/src/mfe.cxx
  ${MIDAS_SOURCES}/drivers/class/hv.cxx
  ...)

Does this make sense or is there a more elegant way for that?

Stefan
       Reply  13 Jul 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> $MIDASSYS/drivers/class/
> $MIDASSYS/drivers/device
> $MIDASSYS/mscb/src/
> $MIDASSYS/src/mfe.cxx
> 
> I guess this can be easily added by defining a MIDAS_SOURCES in MidasConfig.cmake, so 
> that I can do things like:
> 
> add_executable(my_fe
>   myfe.cxx
>   ${MIDAS_SOURCES}/src/mfe.cxx
>   ${MIDAS_SOURCES}/drivers/class/hv.cxx
>   ...)

1) remove ${MIDAS_SOURCES}/src/mfe.cxx from "add_executable", add "mfe" to 
target_link_libraries() as in examples/experiment/frontend:

add_executable(frontend frontend.cxx)
target_link_libraries(frontend mfe midas)

2) ${MIDAS_SOURCES}/drivers/class/hv.cxx surely is ${MIDASSYS}/drivers/...

If MIDAS is built with non-default CMAKE_INSTALL_PREFIX, "drivers" and co are not 
available, as we do not "install" them. Where MIDASSYS should point in this case is
anybody's guess. To run MIDAS, $MIDASSYS/resources is needed, but we do not install
them either, so they are not available under CMAKE_INSTALL_PREFIX and setting
MIDASSYS to the same place as CMAKE_INSTALL_PREFIX would not work.

I still think this whole business of installing into a non-default CMAKE_INSTALL_PREFIX
location has not been thought through well enough. Too much thinking about how cmake works
and not enough thinking about how MIDAS works and how MIDAS is used. Good example
of "my tool is a hammer, everything else must have the shape of a nail".

K.O.
Entry  11 Jul 2021, Konstantin Olchanski, Info, midas cmake update 
I reworked the midas cmake files:
- install via CMAKE_INSTALL_PREFIX should work correctly now:
- installed are bin, lib and include - everything needed to build against the midas library
- if built without CMAKE_INSTALL_PREFIX, a special mode "MIDAS_NO_INSTALL_INCLUDE_FILES" is activated, and the include path 
contains all the subdirectories needed for compilation
- -I$MIDASSYS/include and -L$MIDASSYS/lib -lmidas work in both cases
- to "use" midas, I recommend: include($ENV{MIDASSYS}/lib/midas-targets.cmake)
- config files generated for find_package(midas) now have correct information (a manually constructed subset of information 
automatically exported by cmake's install(export))
- people who want to use "find_package(midas)" will have to contribute documentation on how to use it (explain the magic used to 
find the "right midas" in /usr/local/midas or in /midas or in ~/packages/midas or in ~/packages/new-midas) and contribute an 
example superproject that shows how to use it and that can be run from the bitbucket automatic build. (we cannot insure features 
that are not part of the automatic build against breakage).

On my side, here is an example of using include($ENV{MIDASSYS}/lib/midas-targets.cmake). I posted this before, it is used in 
midas/examples/experiment and I will ask Ben to include it into the midas wiki documentation.

Below is the complete cmake file for building the alpha-g event builder and main control frontend. When presented like this, I 
have to agree that cmake does provide positive value to the user. (the jury is still out whether it balances out against the 
negative value in the extra work to "just support find_package(midas) already!").

#
# CMakeLists.txt for alpha-g frontends
#

cmake_minimum_required(VERSION 3.12)
project(agdaq_frontends)

include($ENV{MIDASSYS}/lib/midas-targets.cmake)

add_compile_options("-O2")
add_compile_options("-g")
#add_compile_options("-std=c++11")
add_compile_options(-Wall -Wformat=2 -Wno-format-nonliteral -Wno-strict-aliasing -Wuninitialized -Wno-unused-function)
add_compile_options("-DTMFE_REV0")
add_compile_options("-DOS_LINUX")

add_executable(feevb feevb.cxx TsSync.cxx)
target_link_libraries(feevb midas)

add_executable(fectrl fectrl.cxx GrifComm.cxx EsperComm.cxx JsonTo.cxx KOtcp.cxx $ENV{MIDASSYS}/src/tmfe_rev0.cxx)
target_link_libraries(fectrl midas)

#end
Entry  09 Jul 2021, Konstantin Olchanski, Info, cannot push to bitbucket 
the day has arrived when I cannot git push to bitbucket. cloud computing rules!

I have never seen this error before and I do not think we have any hooks installed,
so it must be some bitbucket stuff. their status page says some kind of maintenance
is happening, but the promised error message is "repository is read only" or something 
similar.

I hope this clears out automatically. I am updating all the cmake crud and I have no idea 
which changes I already pushed and which I did not, so no idea if anything will work for 
people who pull from midas until this problem is cleared out.

daq00:mvodb$ git push
X11 forwarding request failed on channel 0
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 12 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 247 bytes | 247.00 KiB/s, done.
Total 2 (delta 1), reused 0 (delta 0)
remote: null value in column "attempts" violates not-null constraint
remote: DETAIL:  Failing row contains (13586899, 2021-07-10 01:13:28.812076+00, 1970-01-01 
00:00:00+00, 1970-01-01 00:00:00+00, 65975727, null).
To bitbucket.org:tmidas/mvodb.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'git@bitbucket.org:tmidas/mvodb.git'
daq00:mvodb$

K.O.
Entry  08 Jul 2021, Francesco Renga, Forum, Problem with python file reader 
Dear experts,
       while trying to read a MIDAS file from a python script, I get the error below at the very first event. Any hint?

Thank you very much,
            Francesco

  File "/home/cygno/DAQ/offline/file_reader.py", line 9, in <module>
    for event in mfile:
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 159, in __next__
    ev = self.read_next_event()
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 264, in read_next_event
    return self.read_this_event_body()
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 307, in read_this_event_body
    self.event.unpack_body(body_data, 0, self.use_numpy)
  File "/home/cygno/DAQ/python/midas/event.py", line 648, in unpack_body
    bank.fill_header_from_bytes(bank_header_data, self.is_bank_32(), self.is_bank_data_64bit_aligned())
  File "/home/cygno/DAQ/python/midas/event.py", line 298, in fill_header_from_bytes
    self.name = "".join(x.decode('ascii') for x in unpacked[:4])
  File "/home/cygno/DAQ/python/midas/event.py", line 298, in <genexpr>
    self.name = "".join(x.decode('ascii') for x in unpacked[:4])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 0: ordinal not in range(128)
    Reply  09 Jul 2021, Ben Smith, Forum, Problem with python file reader 
Hi Francesco,

Can you send me an example file to look at please? Either attached to the elog or sent directly to bsmith@triumf.ca

Thanks,
Ben
Entry  29 Jun 2021, Lukas Gerritzen, Bug Report, modbcheckbox behaves erroneous with UINT32 variables 
For boolean and INT32 variables, modbcheckbox works as expected. You click, it 
sets the variable to true or 1, the checkbox stays checked until you click again 
and it is set back to 0.

For UINT32 variables, you can turn the variable "on", but the checkbox visually 
becomes unchecked immediately. Clicking again does not set the variable to 
0/false and the tick visually appears for a fraction of a second, but vanishes 
again.
    Reply  30 Jun 2021, Stefan Ritt, Bug Report, modbcheckbox behaves erroneous with UINT32 variables 
> For boolean and INT32 variables, modbcheckbox works as expected. You click, it 
> sets the variable to true or 1, the checkbox stays checked until you click again 
> and it's being set back to 0.
> 
> For UINT32 variables, you can turn the variable "on", but the checkbox visually 
> becomes unchecked immediately. Clicking again does not set the variable to 
> 0/false and the tick visually appears for a fraction of a second, but vanishes 
> again.

Thanks for reporting that bug. Fixed in

https://bitbucket.org/tmidas/midas/commits/4ef26bdc5a32716efe8e8f0e9ce328bafad6a7bf

Stefan
       Reply  30 Jun 2021, Lukas Gerritzen, Bug Report, modbcheckbox behaves erroneous with UINT32 variables 
Thanks for the quick fix.
Entry  28 Jun 2021, Marco Francesconi, Suggestion, ODB Load in Sequencer 
Hi all,
for my experiment we ended up with the need of changing a lot of parameters (~9000 values) in the ODB at once from the sequencer.
The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated branch.
Let me know if you think this is a good approach.

Marco F
    Reply  28 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
> Hi all,
> for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> Let me know if you think this is a good approach.
> 
> Marco F

How can people judge your modification if they cannot see it? Why don't you make a pull request, so it can be properly reviewed.

Stefan
    Reply  28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer 
> Hi all,
> for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> Let me know if you think this is a good approach.
> 

Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
should be named "ODBLoadJSON" to be clear about this.

(JSON is preferred over .odb and .xml for many reasons (ask me))

K.O.
       Reply  28 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
> > Hi all,
> > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > Let me know if you think this is a good approach.
> > 
> 
> Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> should be named "ODBLoadJSON" to be clear about this.
> 
> (JSON is preferred over .odb and .xml for many reasons (ask me))

What if some experiment keeps some files in .xml format (ask me!). The routine should check for the extension and support all three formats.

Stefan
          Reply  28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer 
> > > Hi all,
> > > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > > Let me know if you think this is a good approach.
> > > 
> > 
> > Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> > should be named "ODBLoadJSON" to be clear about this.
> > 
> > (JSON is preferred over .odb and .xml for many reasons (ask me))
> 
> What if some experiment keep some files in .xml format (ask me!). The routine should check for the extension and support all three formats.
> 

Yes, hard to tell without seeing his full proposal, including the code. If it is loading from a file,
sure, we look at the file extension; I think the existing code would already do this and support all 3 formats.

But if he wants to load ODB data from a text literal or from a string,
we might as well stick to json. I guess we could support the other formats, but I do not see anybody
using anything other than json for new code like this.

ODBPasteJSON("/foo/bar/baz", '{"var1":1, "var2":"somestr"}');

K.O.
             Reply  28 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
> > > > Hi all,
> > > > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > > > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > > > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > > > Let me know if you think this is a good approach.
> > > > 
> > > 
> > > Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> > > should be named "ODBLoadJSON" to be clear about this.
> > > 
> > > (JSON is preferred over .odb and .xml for many reasons (ask me))
> > 
> > What if some experiment keep some files in .xml format (ask me!). The routine should check for the extension and support all three formats.
> > 
> 
> Yes, hard to tell without seeing his full proposal, including the code. If it is load from file,
> sure we look at the file extension, I think the existing code already would do this and support all 3 formats.
> 
> But if he wants to load ODB data from a text literal or from a string,
> we might as well stick to json. I guess we could support the other formats, but I do not see anybody
> using anything other than json for new code like this.
> 
> ODBPasteJSON("/foo/bar/baz", '{"var1":1, "var2":"somestr"}');

I agree that if one would paste a string to the ODB, then JSON would be best.

But at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.

Stefan
                Reply  28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer 
> ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.

same here, lots of historical .odb and .xml files.

I think the .odb and .xml support is here to stay. Best I remember, the latest thing I fixed in both
was support for unlimited string length (and removal of associated buffer overflows). Right now,
I am not sure if both are UTF-8 clean and if they properly escape all control characters,
something to fix as we go or as we bump into problems.

K.O.
                   Reply  28 Jun 2021, Marco Francesconi, Suggestion, ODB Load in Sequencer 
My idea was to collect some feedback instead of blindly submitting code for a pull request.

Currently I'm just calling db_load() with a given file, so it only supports the .odb format.
It is pretty easy to extend to json by calling db_load_json() depending on the file extension.
I do not see a similar call for the .xml format, maybe I can study tomorrow how it is implemented in odbedit and port it to the sequencer.
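
To make the idea concrete, the extension dispatch itself is trivial; here is a self-contained sketch (the real sequencer code would call db_load(), db_load_json() or an XML loader instead of returning an enum):

#include <iostream>
#include <string>

enum class OdbFormat { Odb, Json, Xml, Unknown };

// map an ODB dump file to its format by file extension
OdbFormat odb_format_from_filename(const std::string& path)
{
   auto ends_with = [&](const std::string& ext) {
      return path.size() >= ext.size() &&
             path.compare(path.size() - ext.size(), ext.size(), ext) == 0;
   };
   if (ends_with(".odb"))  return OdbFormat::Odb;
   if (ends_with(".json")) return OdbFormat::Json;
   if (ends_with(".xml"))  return OdbFormat::Xml;
   return OdbFormat::Unknown;
}

int main()
{
   std::cout << (odb_format_from_filename("setup.json") == OdbFormat::Json)
             << std::endl; // prints 1
}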

I guess that the ODBPasteJSON can be a solution as well but I find it a bit too technical.
Anyway it is easy to implement just by calling db_paste_json(), I will keep this in mind.

I'll try to sort this out and make a commit soon.
Best,

Marco



> > ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.
> 
> same here, lots of historical .odb and .xml files.
> 
> I think the .odb and .xml support is here to stay. Best I remember, latest things I fixed in both
> was support for unlimited string length (and removal of associated buffer overflows). Right now,
> I am not sure if both are UTF-8 clean and if they properly escape all control characters,
> something to fix as we go or as we bump into problems.
> 
> K.O.
                      Reply  29 Jun 2021, Marco Francesconi, Suggestion, ODB Load in Sequencer 
I just submitted a pull request for this feature, I did quite a lot of testing and it looks good to me.
Let me know if something is not clear.

I'll take care of adding the relevant information to the wiki once it is merged.
Best,

Marco


> My idea was to collect some feedback instead of blindly submitting code for a pull request.
> 
> Currently I'm just calling db_load() with a given file, so it is only supporting .odb formatting.
> It is pretty easy to extend to json by calling the db_load_json() depending on the file extension.
> I do not see a similar call for the .xml format, maybe I can study tomorrow how it is implemented in odbedit and port it to the sequencer.
> 
> I guess that the ODBPasteJSON can be a solution as well but I find it a bit too technical.
> Anyway it is easy to implement just by calling db_paste_json(), I will keep this in mind.
> 
> I'll try to sort this out and make a commit soon.
> Best,
> 
> Marco
> 
> 
> 
> > > ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.
> > 
> > same here, lots of historical .odb and .xml files.
> > 
> > I think the .odb and .xml support is here to stay. Best I remember, latest things I fixed in both
> > was support for unlimited string length (and removal of associated buffer overflows). Right now,
> > I am not sure if both are UTF-8 clean and if they properly escape all control characters,
> > something to fix as we go or as we bump into problems.
> > 
> > K.O.
                         Reply  30 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
I quickly checked the pull request and could not find any obvious problem, so I merged it.
Entry  18 Jun 2021, Konstantin Olchanski, Bug Report, my html modbvalue thing is not working? 
I have a web page and I try to use modbvalue, but nothing happens. The best I can tell, I follow the documentation 
(https://midas.triumf.ca/MidasWiki/index.php/Custom_Page#modbvalue).

<td id=setv0><div class="modbvalue" data-odb-path="/Equipment/CAEN_hvps01/Settings/VSET[0]" data-odb-editable="1">(ch0)</div></td>

I suppose I could add debug logging to the javascript framework for modbvalue to find out why it is not seeing
my tag or what it does not like about my web page.

But how would a non-expert user (or an expert user in a hurry) would debug this?

Should the modbvalue framework log more error messages to the javascript console ("I am ignoring your modbvalue entry because...")?

Should it have a debug mode where it reports to the javascript console all the tags it scanned, all the tags it found, etc
to give me some clue why it does not find my modbvalue tag?

Right now I am not even sure if this framework is activated, perhaps I did something wrong in how I load the page
and the modbvalue framework is not loaded. The documentation gives some magic incantations but does not explain
where and how this framework is loaded and activated. (But I do not see any differences between my page and
the example in the documentation. Except that I do not load control.js, I do not need all the thermometer bars, etc.
If I do load it, still my modbvalue does not work).

K.O.
    Reply  25 Jun 2021, Stefan Ritt, Bug Report, my html modbvalue thing is not working? 
Can you post your complete page here so that I can have a look?

Stefan
Entry  21 Jun 2021, Lars Martin, Bug Report, ELog documentation inconsistency 
The documentation for the Elog ODB tree here:
https://midas.triumf.ca/MidasWiki/index.php//Elog_ODB_tree#Url

says:

The Built-in elog will ignore this key.

If using an Built-in Elog, this key must NOT be present.

I assume this is an artifact from amending the documentation, but it's unclear if 
the key has to be removed or not. I.e. if the key exists and is empty, will the 
built-in elog work? In what way will it break?
Entry  17 Jun 2021, Joseph McKenna, Info, Add support for rtsp camera streams in mlogger (history_image.cxx) unnamed.png
mlogger (history_image) now supports rtsp cameras; in ALPHA we have 
acquired several new network-connected cameras. Unfortunately they don't 
have a way of just capturing a single frame using libcurl.


========================================
Motivation to link to OpenCV libraries
========================================

After looking at the ffmpeg libraries, it seemed non-trivial to use them to 
listen to an rtsp stream and write a series of jpgs.

OpenCV became an obvious choice (it is itself linked to ffmpeg and 
gstreamer): it's a popular, multiplatform, open source library that's easy to 
use. It is available in the default package managers in centos 7 and ubuntu 
(and is installed by default on lxplus).

========================================
How it works:
========================================

The framework laid out in history_image.cxx is great. A separate thread is 
dedicated for each camera. This is continued with the rtsp support, using 
the same periodicity:

if (ss_time() >= o["Last fetch"] + o["Period"]) {
An rtsp camera is detected by its URL: if the URL starts with ‘rtsp://’ it is 
obviously using the rtsp protocol and the cv::VideoCapture object is 
created (line 147).

If the connection fails, it will continue to retry, but only sends an error 
message on the first 10 attempts (line 150). This counter is reset on 
successful connection.
If MIDAS has been built without OpenCV, mlogger will send an error message 
that OpenCV is required if a rtsp URL is given (line 166)
The VideoCapture "stays live" and will grab frames from the camera based on 
the sleep, saving to file based on the Period set in the ODB.

If the VideoCapture object is unable to grab a frame, it will release() the 
camera, send an error message to MIDAS, then destroy itself and create a 
new version (this destroy-and-create fully resets the connection to a 
camera, required if it's on flaky wifi).
If the VideoCapture gets an empty frame, it also follows the same reset 
steps.
If the VideoCapture fills a cv::Mat object successfully, the image is 
saved to disk in the same way as the curl tools.
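
For reference, the grab/reset cycle looks roughly like this (a simplified 
sketch, not the actual history_image.cxx code; the file name and error 
reporting are placeholders):

#include <cstdio>
#include <string>
#include <opencv2/opencv.hpp>

void poll_rtsp_camera(const std::string& url)
{
   cv::VideoCapture cap(url);    // connect to the rtsp stream
   int failed_connects = 0;
   while (true) {
      if (!cap.isOpened()) {
         if (++failed_connects <= 10)   // complain only 10 times
            fprintf(stderr, "cannot connect to %s\n", url.c_str());
         cap.release();                 // full reset of the connection
         cap.open(url);
         continue;
      }
      failed_connects = 0;
      cv::Mat frame;
      if (!cap.read(frame) || frame.empty()) {
         cap.release();                 // same reset path on a bad frame
         cap.open(url);
         continue;
      }
      cv::imwrite("camera.jpg", frame); // save like the curl path does
      // the real code sleeps here until the next ODB "Period" tick
   }
}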

========================================
Concerns for the future:
========================================

VideoCapture is decoding the video stream in the background, allowing us to 
grab frames at will. This is nice as we can be pretty agnostic to the video 
format in the stream (I tested with h264 from a TP-LINK TAPO C100), but the 
CPU usage is not negligible.

I noticed that this used ~2% of the CPU time on an intel i7-4770 CPU; given 
enough cameras this is considerable. In ALPHA, I have been testing with 10 
cameras:
elog:2220/1 

My suggestion / request would be to move the camera management out of 
mlogger and into a new program (mcamera?), so that users can choose to 
offload the CPU load to another system (I understand OpenCV will also use 
GPU decoders if available, which can lighten the CPU load).
    Reply  18 Jun 2021, Konstantin Olchanski, Info, Add support for rtsp camera streams in mlogger (history_image.cxx) 
> mlogger (history_image) now supports rtsp cameras

my goodness, we will drive the video surveillance industry out of business.

> My suggestion / request would be to move the camera management out of 
> mlogger and into a new program (mcamera?), so that users can choose to off 
> load the CPU load to another system (I understand the OpenCV will use GPU 
> decoders if available also, which can also lighten the CPU load).

every 2 years I itch to separate mlogger into two parts - data logger
and history logger.

but then I remember that the "I" in MIDAS stands for "integrated",
and "M" stands for "maximum" and I say, "nah..."

(I guess we are not maximum integrated enough to have mhttpd, mserver
and mlogger to be one monolithic executable).

There is also a line of thinking that mlogger should remain single-threaded
for maximum reliability and ease of debugging. So if we keep adding multithreaded
stuff to it, perhaps it should be split-apart after all. (anything that makes
the size of mlogger.cxx smaller is a good thing, imo).

K.O.
Entry  15 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved! 
I am sure everybody else has 10gige and 40gige networks and are sending terabytes of data before breakfast.

Myself, I only have one computer with a 10gige network link and a sufficient number of daq boards to fill
it with data. Here is my success story of getting all this data through MIDAS.

This is the anti-matter experiment ALPHA-g now under final assembly at CERN. The main particle detector is a long but 
thin cylindrical TPC. It surrounds the magnetic bottle (particle trap) where we make and study anti-hydrogen. There are 
64 daq boards to read the TPC cathode pads and 8 daq boards to read the anode wires and to form the trigger. Each daq 
board can produce data at 80-90 Mbytes/sec (1gige links). Data is sent as UDP packets (no jumbo frames). Altera FPGA 
firmware was done here at TRIUMF by Bryerton Shaw, Chris Pearson, Yair Lynn and myself.

Network interconnect is a 96-port Juniper switch with a 10gige uplink to the main daq computer (quad core Intel(R) 
Xeon(R) CPU E3-1245 v6 @ 3.70GHz, 64 GBytes of DDR4 memory).

MIDAS data path is: UDP packet receiver frontend -> event builder -> mlogger -> disk -> lazylogger -> CERN EOS cloud 
storage.

First chore was to get all the UDP packets into the main computer. "U" in UDP stands for "unreliable", and at first, UDP 
packets were disappearing pretty much anywhere they could. To fix this, in order:

- reading from the udp socket must be done in a dedicated thread (in the midas context, pauses to write statistics or 
check alarms result in lost udp packets)
- udp socket buffer has to be very big (see the sketch after this list)
- maximum queue sizes must be enabled in the 10gige NIC
- ethernet flow control must be enabled on the 10gige link
- ethernet flow control must be enabled in the switch (to my surprise many switches do not have working end-to-end 
ethernet flow control and lose UDP packets, ask me about this. our big juniper switch balked at first, but I got it 
working eventually).
- ethernet flow control must be enabled on the 1gige links to each daq module
- ethernet flow control must be enabled in the FPGA firmware (it's a checkbox in qsys)
- FPGA firmware internally must have working back pressure and flow control (avalon and axi buses)
- ideally, this back-pressure should feed back to the trigger. ALPHA-g does not have this (it does not need it).
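
For illustration, the "very big udp socket buffer" item boils down to a 
setsockopt() call like this (a sketch; the buffer size is a tuning parameter, 
and the kernel caps it at net.core.rmem_max, which usually has to be raised 
via sysctl as well):

#include <cstdio>
#include <cstring>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int open_udp_socket(int port)
{
   int fd = socket(AF_INET, SOCK_DGRAM, 0);
   if (fd < 0) { perror("socket"); return -1; }

   int bufsize = 64*1024*1024; // ask for a 64 MiB kernel receive buffer
   setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));

   sockaddr_in addr;
   memset(&addr, 0, sizeof(addr));
   addr.sin_family = AF_INET;
   addr.sin_addr.s_addr = htonl(INADDR_ANY);
   addr.sin_port = htons(port);
   if (bind(fd, (sockaddr*)&addr, sizeof(addr)) < 0) {
      perror("bind");
      close(fd);
      return -1;
   }
   return fd;
}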

Next chore was to multithread the UDP receiver frontend and to multithread the event builder. Stock single-threaded 
programs quickly max out with 100% CPU use and reach nowhere near 10gige data speeds.

Naive multithreading, with two threads, reader (read UDP packet, lock a mutex, put it into a deque, unlock, repeat) and 
sender (lock a mutex, get a packet from the deque, unlock, bm_send_event(), repeat) spends all its time locking and 
unlocking the mutex and goes nowhere fast (with 1500 byte packets, about 600 kHz of lock/unlock at 10gige speed).

So one has to do everything in batches: reader thread: accumulate 1000 udp packets in an std::vector, lock the mutex, 
dump this batch into a deque, unlock, repeat; sender thread: lock mutex, get 1000 packets from the deque, unlock, stuff 
the 1000 packets into 1 midas event, bm_send_event(), repeat.
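
In sketch form, the batched hand-off looks like this (simplified: Packet is 
a placeholder, and a real implementation would block on a condition variable 
instead of spinning):

#include <deque>
#include <mutex>
#include <vector>

struct Packet { char data[1500]; int len; };

std::mutex gMutex;
std::deque<std::vector<Packet>> gQueue; // whole batches, not single packets

void reader_loop()
{
   while (true) {
      std::vector<Packet> batch;
      batch.reserve(1000);
      while (batch.size() < 1000) {
         Packet p{}; // ... read one udp packet into p ...
         batch.push_back(p);
      }
      std::lock_guard<std::mutex> lock(gMutex); // one lock per 1000 packets
      gQueue.push_back(std::move(batch));
   }
}

void sender_loop()
{
   while (true) {
      std::vector<Packet> batch;
      {
         std::lock_guard<std::mutex> lock(gMutex);
         if (gQueue.empty())
            continue; // real code would wait here instead of spinning
         batch = std::move(gQueue.front());
         gQueue.pop_front();
      }
      // ... stuff the whole batch into one midas event, bm_send_event() ...
   }
}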

It takes me 5 of these multithreaded udp reader frontends to keep up with a 10gige link without dropping any UDP packets. 
My first implementation chewed up 500% CPU, that's all of it, there are only 4 CPU cores available, leaving nothing
for the event builder (and mlogger, and ...)

I had to:
a) switch from plain socket read() to socket recvmmsg() - 100000 udp packets per syscall vs 1 packet per syscall, and
b) switch from plain bm_send_event() to bm_send_event_sg() - using a scatter-gather list to avoid a memcpy() of each udp 
packet into one big midas event.
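
For reference, a minimal recvmmsg() batch read looks like this (a sketch 
with illustrative batch and packet sizes, not the actual frontend code):

#include <cstdio>
#include <vector>
#include <sys/socket.h>
#include <sys/uio.h>

// receive up to kBatch datagrams with a single syscall
int read_udp_batch(int fd, std::vector<std::vector<char>>& out)
{
   const unsigned kBatch = 1024;
   const unsigned kMaxPacket = 1500;
   static std::vector<char> bufs(kBatch * kMaxPacket);
   static mmsghdr msgs[kBatch];
   static iovec iovs[kBatch];
   for (unsigned i = 0; i < kBatch; i++) {
      iovs[i].iov_base = &bufs[i * kMaxPacket];
      iovs[i].iov_len  = kMaxPacket;
      msgs[i].msg_hdr  = {};
      msgs[i].msg_hdr.msg_iov    = &iovs[i];
      msgs[i].msg_hdr.msg_iovlen = 1;
   }
   int n = recvmmsg(fd, msgs, kBatch, MSG_WAITFORONE, nullptr);
   if (n < 0) { perror("recvmmsg"); return -1; }
   for (int i = 0; i < n; i++)
      out.emplace_back(&bufs[i * kMaxPacket],
                       &bufs[i * kMaxPacket] + msgs[i].msg_len);
   return n; // number of packets received in this one syscall
}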

Next is the event builder.

The event builder needs to read data from the 5 midas event buffers (one buffer per udp reader frontend, each midas event 
contains 1000 udp packets as individual data banks), examine trigger timestamps inside each udp packet, collect udp 
packets with matching timestamps into a physics event, bm_send_event() it to the SYSTEM buffer. rinse and repeat.

Initial single threaded implementation maxed out at about 100-200 Mbytes/sec with 100% busy CPU.

After trying several threading schemes, the final implementation has these threads:
- 5 threads to read the 5 event buffers, these threads also examine the udp packets, extract timestamps, etc
- 1 thread to sort udp packets by timestamp and to collect them into physics events
- 1 thread to bm_send_event() physics events to the SYSTEM buffer
- main thread and rpc handler thread (tmfe frontend)

(Again, to reduce lock contention, all data is passed between threads in large batches)

This got me up to about 800 Mbytes/sec. To get more, I had to switch the event builder from old plain bm_send_event() to 
the scatter-gather bm_send_event_sg(), *and* I had to reduce CPU use by other programs, see steps (a) and (b) above.

So, at the end, success, full 10gige data rate from daq boards to the MIDAS SYSTEM buffer.

(But wait, what about the mlogger? In this experiment, we do not have a disk storage array to sink this
much data. But it is an already-solved problem. On the data storage machines I built for GRIFFIN - 8 SATA NAS HDDs using 
raidz2 ZFS - the stock MIDAS mlogger can easily sink 1000 Mbytes/sec from SYSTEM buffer to disk).

Lessons learned:

- do not use UDP. dealing with packet loss will cost you a fortune in headache medicines and hair restorations.
- use jumbo frames. difference in per-packet overhead between 1500 byte and 9000 byte packets is almost a factor of 10.
- everything has to be done in bulk to reduce per-packet overheads. recvmmsg(), batched queue push/pop, etc
- avoid memory allocations (I had a per-packet std::string, replaced it with char[5])
- avoid memcpy(), use writev(), bm_send_event_sg() & co

K.O.

P.S. Let's count the number of data copies in this system:

x udp reader frontend:
- ethernet NIC DMA into linux network buffers
- recvmmsg() memcpy() from linux network buffer to my memory
- bm_send_event_sg() memcpy() from my memory to the MIDAS shared memory event buffer

x event builder:
- bm_receive_event() memcpy() from MIDAS shared memory event buffer to my event buffer
- my memcpy() from my event buffer to my per-udp-packet buffers
- bm_send_event_sg() memcpy() from my per-udp-packet buffers to the MIDAS shared memory event buffer (SYSTEM)

x mlogger:
- bm_receive_event() memcpy() from MIDAS SYSTEM buffer
- memcpy() in the LZ4 data compressor
- write() syscall memcpy() to linux system disk buffer
- SATA interface DMA from linux system disk buffer to disk.

Would a monolithic massively multithreaded daq application be more efficient?
("udp receiver + event builder + logger"). Yes, about 4 memcpy() out of about 10 will go away.

Would I be able to write such a monolithic daq application?

I think not. Already, at 10gige data rates, for all practical purposes, it is impossible
to debug most problems, especially subtle trouble in multithreading (race conditions)
and in memory allocations. At best, I can sprinkle assert()s and look at core dumps.

So the good old divide-and-conquer approach is still required, MIDAS still rules.

K.O.
    Reply  15 Jun 2021, Stefan Ritt, Info, 1000 Mbytes/sec through midas achieved! frontend.cxx
In MEG II we also kind of achieved this rate. Marco F. will post an entry soon to describe the details. There is only one thing 
I want to mention, which is our network switch. Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade 
switch. We have "rack switches", which are collector switches for each rack receiving up to 10 x 1GBit inputs, and outputting 1 x 
10 GBit to an "aggregation switch", which collects all 10 GBit lines from the rack switches and forwards them with (currently) a single 
10 GBit line. For the rack switch we use a 

MikroTik CRS354-48G-4S+2Q+RM 54 port

and for the aggregation switch

MikroTik CRS326-24S-2Q+RM 26 Port

both cost in the order of 500 US$. We were astonished that they don't lose UDP packets when all inputs send a packet at the 
same time, and they have to pipe them to the single output one after the other, but apparently the switches have enough buffers 
(which is usually NOT written in the data sheets). 

To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is 
completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the 
details.

Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your 
bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend 
attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk. I got

Event size: 7 MB

No logging: 900 events/s = 6.7 GBytes/s

Logging with LZ4 compression: 155 events/s = 1.2 GBytes/s

Logging without compression: 170 events/s = 1.3 GBytes/s

So with this simple approach I already got more than 1 GByte/s of "dummy data" through midas, indicating that the buffer 
management is not so bad. I did use the plain mfe.c frontend framework, no bm_send_event_sg() (but mfe.c uses rpc_send_event() which is an 
optimized version of bm_send_event()).

Best,
Stefan
       Reply  16 Jun 2021, Marco Francesconi, Info, 1000 Mbytes/sec through midas achieved! 
As reported by Stefan, in MEG II we have very similar ethernet throughputs.
In total, we have 34 crates each with 32 DRS4 digitiser chips and a single 1 Gbps readout link through a Xilinx Zynq SoC.
The data arrives in push mode without any external intervention, the only throttling being an optional prescaling on the trigger rate.
We discovered the hard way that 1 Gbps throughput on Zynq is not trivial at all: the embedded ethernet MAC does not support jumbo frames (always read the fine print in the manuals!) and the embedded Linux ethernet stack seems to struggle when we go beyond 250 Mbps of UDP traffic.

Anyhow, even with the reduced speed, the maximum throughput at network input is around 8.5 Gbps which passes through the Mikrotik switches mentioned by Stefan.
We had very bad experiences in the past with similar price-point switches, observing huge packet drops when the instantaneous switching capacity cannot cope with the traffic, but so far we are happy with the Mikrotik ones.

On the receiver side, we have the DAQ server with an Intel E5-2630 v4 CPU and a 10 Gbit connection to the network using an Intel X710 Network card.
In the past, we also used a "cheap" 10 Gbit card from Tehuti, but the driver performance was so bad that it could not digest more than 5 Gbps of data.

The current frontend is based on the mfe.c scheme for historical reasons (the very first version dates back to 2015).
We opted for a monolithic multithreaded solution so we can reuse the underlying DAQ code for other experiments which may not have the complete Midas backend.
Just to mention them: one is the FOOT experiment (which afaik uses an adapted version of the ATLAS DAQ) and the other is the LOLX experiment (for which we will soon ship a small 32-channel Midas-based system to Canada).
A major modification to Konstantin's scheme is that we need to calibrate all WFMs online so that software zero suppression can be applied to reduce the final data size (that part is still to be implemented).
This requirement results in additional resource usage to parse the UDP content into floats and calibrate them.
Currently, we have 7 packet collector threads to digest the full packet flow (using recvmmsg), followed by an event building stage that uses 4 threads and 3 other threads for WFM calibration.
We have progressive packet numbers on each packet generated by the hardware and a set of flags marking the start and end of the event; by combining the packet-number difference between the start and end of the event with the total number of packets received for that event, it is really easy to tell whether packet drops are happening.
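In code, the completeness check is essentially this arithmetic (a sketch; the names are made up):

#include <cstdint>
#include <cstdio>

// decide whether UDP packets were dropped for one event: flags mark the
// first and last packet, so the expected count follows from the numbering
bool event_complete(uint32_t first_packet_number, uint32_t last_packet_number,
                    uint32_t received_packets)
{
   uint32_t expected = last_packet_number - first_packet_number + 1;
   if (received_packets != expected)
      printf("dropped %u UDP packets\n", expected - received_packets);
   return received_packets == expected;
}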

All the thread infrastructure was tested and we could digest the complete throughput. We still have to finalise the full 10 Gbit connection to Midas because the final system was installed only recently (April).
We are using the EQ_USER flag to push events into mfe.c buffers with up to 4 threads, but I observed that above ~1.5 Gbps rb_get_wp() almost always returns DB_TIMEOUT and I'm forced to drop the event.
This conflicts with the measurements reported by Stefan (we were discussing this yesterday), so we are still investigating the possible cause.

It is difficult to report three years of development in a single Elog entry; I hope I put all the relevant points here.
It looks to me that we opted for very complementary approaches for high throughput ethernet with Midas, and I think there are still a lot of details that could be worth reporting.
In case someone organises some kind of "virtual workshop" on this, I'm willing to participate.
Best,

Marco


          Reply  18 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved! 
> ... MEG II ... 34 crates each with 32 DRS4 digitiser chips and a single 1 Gbps readout link through a Xilinx Zynq SoC.
>
> Zynq ... embedded ethernet MAC does not support jumbo frames (always read the fine print in the manuals!)
> and the embedded Linux ethernet stack seems to struggle when we go beyond 250 Mbps of UDP traffic.

that's an ouch. we use the altera ethernet mac, and jumbo frames are supported, but the firmware data path
was originally written assuming 1500-byte packets and it is too much work to rewrite it for jumbo frames.

we send the data directly from the FPGA fabric to the ethernet, there is an avalon/axi bus multiplexer
to split the ethernet packets to the NIOS slow control CPU. not sure if such scheme is possible
for SoC FPGAs with embedded ARM CPUs.

and yes, a 1 GHz ARM CPU will not do 10gige. You can see it yourself: measure your memcpy() speed. Where a
typical PC will have dual-channel 128-bit wide memory (and the famous-for-its-low-latency
Intel memory controller), an ARM SoC will have at best 64-bit wide memory (some boards are only 32-bit wide!),
with DDR3 (not DDR4) severely under-clocked (i.e. DDR3-900, etc). This is why the new Apple ARM chips
are so interesting - can the Apple ARM memory controller beat the Intel x86 memory controller?

> On the receiver side, we have the DAQ server with an Intel E5-2630 v4 CPU

that's the right gear for the job. quad-channel memory with nominal "Max Memory Bandwidth 68.3 GB/s",
10 CPU cores. My benchmark of memcpy() for the much older dual-channel-memory i7-4820 with DDR3-1600 DIMMs
is 20 Gbytes/sec. waiting for an ARM CPU with similar specs.

> and a 10 Gbit connection to the network using an Intel X710 Network card.
> In the past, we used also a "cheap" 10 Gbit card from Tehuti but the driver performance was so bad that it could not digest more than 5 Gbps of data.

yup, same here. use Intel ethernet exclusively, even for 1gige links.

> A major modification to Konstantin's scheme is that we need to calibrate all WFMs online so that software zero suppression

I implemented hardware zero suppression in the FPGA code. I think 1 GHz ARM CPU does not have the oomph for this.

> rb_get_wp() almost always returns DB_TIMEOUT

replace rb_xxx() with std::deque<std::vector<char>> (protected by a mutex, of course). lots of stuff in the mfe.c frontend
is obsolete in the same way. check out the newer tmfe frontends (tmfe.md, tmfe.h and tmfe examples).
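
for example, something like this minimal sketch:

#include <deque>
#include <vector>
#include <mutex>

// a mutex-protected event queue to replace the rb_xxx() ring buffer
static std::mutex gQueueMutex;
static std::deque<std::vector<char>> gQueue;

void push_event(std::vector<char>&& e)
{
   std::lock_guard<std::mutex> lock(gQueueMutex);
   gQueue.push_back(std::move(e));
}

bool pop_event(std::vector<char>& e)
{
   std::lock_guard<std::mutex> lock(gQueueMutex);
   if (gQueue.empty())
      return false;
   e = std::move(gQueue.front());
   gQueue.pop_front();
   return true;
}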

> It is difficult to report three years of development in a single Elog

but quite successful at it. big thanks for your write-up. I think our info is quite useful for the next people.

K.O.
       Reply  18 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved! 
> In MEG II we also kind of achieved this rate.
>
> Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade switch.

Right. We built this DAQ system about 3 years ago and the cheap Chinese switches arrived
on the market about 1 year after we purchased the big 96 port Juniper switch. Bad timing/good timing.

Actually I have a very nice 24-port 1gige switch ($2000 about 3 years ago), I could have
used 4 of them in parallel, but they were discontinued and replaced with a $5000 switch
(+$3000 for a 10gige uplink). I think I got the very last cheap one.

But not all Chinese switches are equal. We have a Ubiquiti 10gige switch, and it does
not have working end-to-end ethernet flow control (yikes!).

BTW, for this project we could not use just any cheap switch: we need 64 fiber SFP ports
for connecting the on-TPC electronics. This narrows the market significantly, and it does
not match the industry standard port counts of 8-16-24-48-96.

> MikroTik CRS354-48G-4S+2Q+RM 54 port
> MikroTik CRS326-24S-2Q+RM 26 Port

We have a hard time buying this stuff in Vancouver BC, Canada. Most of our regular suppliers
are US based and there is still a technology trade war going on between the US and China.
I guess we could buy direct on alibaba, were it not for the risk of scammers, scalpers and iffy shipping.

> both cost on the order of 500 US$

it tells you how much we overpay for US based stuff. not surprising, with how Cisco & co can afford
to buy sports arenas, etc.

> We were astonished that they don't lose UDP packets when all inputs send a packet at the 
> same time and the switch has to pipe them to the single output one after the other,
> but apparently the switches have enough buffers.

You probably see ethernet flow control in action. Look at the counters for ethernet pause frames
in your daq boards and in your main computer.

> (which is usually NOT written in the data sheets).

True. When I looked into this, I found a paper by somebody in Berkeley on a special
technique to measure the size of such buffers.

(The big Juniper switch has only 8 Mbytes of buffer. The current wisdom for backbone networks
is to have as little buffering as possible).

> To avoid UDP packet loss across several events, we do traffic shaping by arming the trigger only when the previous event 
> has been completely received by the frontend. This eliminates the need for flow control and other complicated methods. 
> Marco can tell you the details.

We do not do this (very bad!). When each trigger arrives, all 64+8 DAQ boards send a train of UDP packets
at maximum line speed (64+8 at 1 gige), all funneled into one 10 gige link ((64+8)/10 oversubscription).

Before we got ethernet flow control to work properly, we had to throttle all the 1gige links by about 60%
to get any complete events at all. This would not have been acceptable for physics data taking.

> Another interesting aspect: while we get the data into the frontend, we have problems getting it through midas. Your 
> bm_send_event_sg() may be a good approach which we should try. To benchmark out-of-the-box midas, I ran the dummy frontend 
> attached on my MacBook Pro (2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk).

A dummy frontend is not very representative, because the limitation is memory bandwidth
and CPU load, and a real ethernet receiver has quite a bit of both (interrupt processing,
DMA into memory, implicit memcpy() inside the socket read()).

For example, typical memcpy() speeds are between 22 and 10 Gbytes/sec for current
generation CPUs and DRAM. This translates into a total budget of between 22 and 10 memcpy()
operations at 10gige speed. Subtract from this 1 memcpy() to DMA data from ethernet into memory
and 1 memcpy() to DMA data from memory to storage. Subtract from this 2 implicit
memcpy() for read() in the frontend and write() in mlogger (the Linux sendfile() syscall
was invented to cut them out). Subtract from this 1 memcpy() for instruction and incidental
data fetch (no interesting program fits into cache). Subtract from this the memory bandwidth
for running the rest of linux (systemd, ssh, cron jobs, NFS, etc). Hardly anything is
left when all is said and done. (Found it: the alphagdaq memcpy() runs at 14 Gbytes/sec,
so a total budget of 14 memcpy() operations at 10gige speed.)
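
to measure your own memcpy() speed, something as simple as this sketch gives a rough number
(a serious benchmark would repeat the copy many times and defeat the caches, but this gives
the right order of magnitude):

#include <chrono>
#include <cstdio>
#include <cstring>
#include <vector>

int main()
{
   const size_t n = 1024*1024*1024; // copy 1 GByte
   std::vector<char> src(n, 1), dst(n);
   auto t0 = std::chrono::steady_clock::now();
   memcpy(dst.data(), src.data(), n);
   std::chrono::duration<double> dt = std::chrono::steady_clock::now() - t0;
   printf("memcpy: %.1f Gbytes/sec\n", n/dt.count()/1e9);
   return 0;
}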

And the event builder eats up 2 CPU cores to process the UDP packets at 10gige rate,
there is quite a bit of CPU-expensive data unpacking, inspection and processing
going on that cannot be cut out. (alphagdaq has 4 cores, "8 threads").

K.O.

P.S. Waiting for rack-mounted machines with AMD "X" series processors... K.O.
Entry  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
There have been times in ALPHA that an alarm is triggered and the shift crew 
are unclear who to contact if they aren't trained to fix the specific 
failure mode.

I wish to add the property 'Users responsible' to the ODB for Alarms and 
Programs.

I have drafted what this might look like in a new pull request:
https://bitbucket.org/tmidas/midas/pull-requests/22/add-users-responsible-field-for-specific

It requires changing several data structures. I think I have found all 
instances of the definitions, so the ODB should 'repair' any of the old 
structures, adding in the users responsible.

If 'Users responsible' is set, MIDAS messages append them after the message 
in brackets '()'. If used in conjunction with the MIDAS messenger 
(mmessenger), the users responsible can be 'tagged' directly.

I.e., for Slack, simply set the 'users responsible' to <@UserID|Nickname>, 
for Mattermost '@username', and for Discord '<@userid>'. Note that Discord 
doesn't allow you to tag by username, only by numeric userid.



I have expanded the char array in 'al_trigger_class' to handle the potentially 
longer MIDAS messages. Perhaps, since I'm touching these lines, I should 
change these temporary containers to std::string (lines 383 and 386 of 
alarm.cxx)?

I have tested this quite a bit for my system, but I am not sure how I can test 
mjsonrpc.
    Reply  28 May 2021, Stefan Ritt, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
I think this is a good idea and I support it. We have a similar problem in MEG and 
we solved it with external (bash) scripts called in case of alarms. One feature 
we have there is that for some alarms, several people want to get notified, so 
people can "subscribe" to certain alarms. The subscriptions are now handled inside 
Slack, which I like better, but maybe it would be good to have more than one "user 
responsible". If one person is sleeping/traveling, it's good to have a 
substitute. Can you make an array out of that? Or a comma-separated list?

Best,
Stefan
       Reply  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> I think this is a good idea and I support it. We have a similar problem in MEG and 
> we solved that with external (bash) scripts called in case of alarms. One feature 
> there we have is that for some alarms, several people want to get notified. So 
> people can "subscribe" to certain alarms. The subscription are now handled inside 
> Slack which I like better, but maybe it would be good to have more than one "user 
> responsible". Like if one person is sleeping/traveling, it's good to have a 
> substitute. Can you make an array out of that? Or a comma-separated list?
> 
> Best,
> Stefan

Presently there are 256 characters in the 'users responsible' field, so you can just 
list many users (separated by spaces, commas, or nothing at all). Discord, Slack and Mattermost 
don't care, they just parse the user tags.

I can still make this an array and pass a std::vector<std::string> into 
al_trigger_class function?
          Reply  28 May 2021, Stefan Ritt, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> I can still make this an array and pass a std::vector<std::string> into 
> al_trigger_class function?

Maybe 256 chars are enough at the moment. If other people complain in the future, we can 
re-visit.

Stefan
             Reply  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> > I can still make this an array and pass a std::vector<std::string> into 
> > al_trigger_class function?
> 
> Maybe 256 chars are enough at the moment. If other people complain in the future, we can 
> re-visit.
> 
> Stefan

Thinking about it, an array of maybe 80 characters would give enough space for a name, a tag 
and a phone number. Do I need to budget memory very strictly? Would 32 entries of 80 
characters be too much? 
                Reply  28 May 2021, Stefan Ritt, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> > > I can still make this an array and pass a std::vector<std::string> into 
> > > al_trigger_class function?
> > 
> > Maybe 256 chars are enough at the moment. If other people complain in the future, we can 
> > re-visit.
> > 
> > Stefan
> 
> Thinking about it, an array of maybe 80 characters would give enough space for a name, a tag 
> and a phone number. Do I need to budget memory very strictly? Would 32 entries of 80 
> characters be too much? 

On that level memory is cheap.

Stefan
                   Reply  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 

I've updated the branch / pull request to use an array of 10 entries (80 chars each). 32 felt a 
little overkill when I saw it on screen, but I'm absolutely happy to set it to any number you 
recommend.

The array gets flattened out when an alarm is triggered; currently the formatting produces

AlarmClass : AlarmMessage (Flattened List Of Users Responsible Array With Space Separators)

If experiments want to use Discord / Slack / Mattermost tags and/or add phone numbers, that 
should fit in 80 characters.
                      Reply  31 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
This list of responsible users being attached to alarm message strings will be great for the 
mmessenger; however, it is perhaps going to generate very long messages for the speaker programs 
(web interface and mlxspeaker):

AlarmClass : AlarmMessage (ResponsibleUser1 ResponsibleUser2 ResponsibleUser3 ResponsibleUser4 
... ResponsibleUserN)

especially if people put in user tags or emergency contact details...

Should we add a keyword or character that the programs creating audio can parse to silence 
the list of responsible users? I'd be tempted to use a single character, but there is a risk 
users might have that in a custom alarm message. Maybe something unusual like the 'bel' 
character? Or '|'? 

Perhaps use the string 'Responsible:' or 'Users:' to trim the Users Responsible list out of 
the message string?

AlarmClass : AlarmMessage Responsible:(ResponsibleUser1 ResponsibleUser2 ResponsibleUser3 
ResponsibleUser4 ... ResponsibleUserN)

AlarmClass : AlarmMessage Users:(ResponsibleUser1 ResponsibleUser2 ResponsibleUser3 
ResponsibleUser4 ... ResponsibleUserN)
                         Reply  02 Jun 2021, Konstantin Olchanski, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> This list of responsible being attached to alarm message strings ...

This is a great idea. But I think we do not need to artificially limit ourselves
to fixed string and array lengths.

The code in alarm.cxx should be changed to use std::string and std::vector<std::string> (the STRING_LIST 
#define), and db_get_record() should be replaced with individual ODB reads (that's what it does behind 
the scenes anyway, but in a non-type-safe and non-size-safe way).
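
For the string array, the replacement could look something like this (a sketch; it assumes the 
db_get_value_string() helper and a hypothetical hKeyClass handle for the alarm class):

// read "Users responsible" one element at a time instead of via db_get_record();
// db_get_value_string() returns strings of whatever length is stored in ODB
std::vector<std::string> users;
for (int i = 0; ; i++) {
   std::string s;
   if (db_get_value_string(hDB, hKeyClass, "Users responsible", i, &s, FALSE) != DB_SUCCESS)
      break;
   users.push_back(s);
}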

I think the web page code will work correctly, it does not care about string lengths.

K.O.
                            Reply  09 Jun 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> > This list of responsible being attached to alarm message strings ...
> 
> This is a great idea. But I think we do not need to artificially limit ourselves
> to string and array lengths.
> 
> The code in alarm.c should be changes to use std::string and std::vector<std::string> (STRING_LIST 
> #define), db_get_record() should be replaced with individual ODB reads (that's what it does behind 
> the scenes, but in a non-type and -size safe way).
> 
> I think the web page code will work correctly, it does not care about string lengths.
> 
> K.O.

Auto-growing lists are an excellent plan. I am making decent progress and should have something to 
report soon.
Entry  05 Apr 2021, Konstantin Olchanski, Info, blog - convert mfe frontend to tmfe c++ framework 
notes from converting ALPHA-g chronobox frontend fechrono to tmfe c++ framework.

the chronobox device is a timestamp/low resolution tdc/scaler/generic TTL and ECL io
mainboard with an altera DE10_NANO plugin board. it has a cyclone-5 FPGA SOC running Raspbian linux.
FPGA communication is done by avalon-bus memory mapped registers, main data readout
is PIO from an FPGA 32-bit wide FIFO (no DMA yet).

- login to main computer (daq16)
- cd packages
- git clone https://bitbucket.org/tmidas/midas midas-develop
- cd midas-develop
- make mini ### creates linux-x86_64/{bin,lib}
- ssh agdaq@cb02 ### private network
- cd ~/packages/midas-develop
- make mini ### creates linux-armv7l/{bin,lib}
- cd ~/online/chronobox_software
- cat fechrono.cxx ~/packages/midas-develop/progs/tmfe_example_everything.cxx > fechrono_tmfe.cxx
- edit fechrono_tmfe.cxx:

- rename "FeEverything" to "FeChrono"
- copy contents of frontend_init() to HandleFrontendInit()
- copy contents of frontend_exit() to HandleFrontendExit()
- replace get_frontend_index() with fFe->fFeIndex
- replace "return SUCCESS" with return TMFeOk()
- replace "return !SUCCESS" with return TMFeErrorMessage("boo!!!")
- this frontend has 3 indexed equipments, copy EqEverything 3 times, rename EqEverything to EqCbHist, EqCbms, EqCbFlow
- copy contents of begin_of_run() to EqCbHist::HandleBeginRun()
- copy contents of end_of_run() to EqCbHist::HandleEndRun()
- pause_run(), resume_run() are empty, delete all HandlePauseRun() and all HandleResumeRun()
- frontend_loop() is empty, delete
- poll_event() and interrupt_configure() are empty, delete
- delete all HandleStartAbortRun(), delete all calls to RegisterTransitionStartAbort();
- examine equipment[]:
- "cbhist%02d" - periodic, copy contents of read_cbhist() to EqCbHist::HandlePeriodic()
- "cbms%02d" - polled, copy contents of read_cbms_fifo() to EqCbms::HandlePollRead()
- "cbflow%02d" - periodic, copy contents of read_flow() to EqCbFlow::HandlePeriodic()
- delete unused HandlePoll(), HandlePollRead() and HandlePeriodic()
- replace bk_init32() with "size_t event_size = 100*1024; char* event = (char*)malloc(event_size); ComposeEvent(event, 
event_size); BkInit(event, event_size);"
- replace bk_create(pevent) with BkOpen(event)
- replace bk_close(pevent, ...) with BkClose(event, ...)
- replace "return bk_size(pevent)" with "EqSendEvent(event); free(event);"
- remove unused example SendData()
- if the linker complains about references to "hDB", add "HNDLE hDB" in global scope, add "hDB = fMfe->fDB"
- replace set_equipment_status() with EqSetStatus()
- move equipment configuration from the equipment[] array to the equipment constructors
- remove unused HandleRpc()
- remove unused HandleBeginRun() and unused HandleEndRun()
- remove all example code from HandleInit(), breakup frontend_init() code into per-equipment HandleInit() functions
- EqCbms::HandlePoll() replace all example code with "return true"
- if desired, replace ODB functions from utils.cxx with MVOdb RI(), RD(), etc
- if desired, replace cm_msg() with Msg() and delete "const char* frontend_name"
- update FeChrono() constructor:
      FeSetName("fechrono%02d");
      FeAddEquipment(new EqCbHist("cbhist%02d", __FILE__));
      FeAddEquipment(new EqCbms("cbms%02d", __FILE__));
      FeAddEquipment(new EqCbFlow("cbflow%02d", __FILE__));
- build:
g++ -std=c++11 -Wall -Wuninitialized -g -Ialtera -Dsoc_cv_av -I/home/agdaq/packages/midas-develop/include -
I/home/agdaq/packages/midas-develop/mvodb -c fechrono_tmfe.cxx
g++ -o fechrono_tmfe.exe -std=c++11 -Wall -Wuninitialized -g -Ialtera -Dsoc_cv_av -I/home/agdaq/packages/midas-develop/include 
-I/home/agdaq/packages/midas-develop/mvodb fechrono_tmfe.o utils.o cb.o /home/agdaq/packages/midas-develop/linux-
armv7l/lib/libmidas.a -lm -lz -lutil -lnsl -lpthread -lrt
- run:
- bombs on bm_set_cache_size(), reduce default cache size, old mserver cannot deal with the new default size, set 
fEqConfWriteCacheSize = 100*1024;
- run:
- prints too many messages, comment out print "HandlePollRead!"
- run:
- good now!

success, was not too bad.

also:
- replace gHaveRun with fMfe->fStateRunning
- replace gRunNumber with fMfe->fRunNumber

see tmfe.md section "variables provided by the framework"
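
for reference, after these replacements the bank handling in a periodic handler ends up looking 
roughly like this (a sketch; the BkOpen() argument list is from memory, consult tmfe.h for the 
exact signatures):

void EqCbHist::HandlePeriodic()
{
   size_t event_size = 100*1024;
   char* event = (char*)malloc(event_size);
   ComposeEvent(event, event_size);
   BkInit(event, event_size);
   uint32_t* pdata = (uint32_t*)BkOpen(event, "CBH0", TID_UINT32);
   // ... read the FPGA registers into pdata here ...
   BkClose(event, pdata);
   EqSendEvent(event);
   free(event);
}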

K.O.
    Reply  05 Apr 2021, Konstantin Olchanski, Info, blog - convert mfe frontend to tmfe c++ framework 
Result is here:
https://bitbucket.org/expalpha/chronobox_software/src/master/fechrono_tmfe.cxx

Original code is in fechrono.cxx. Not super pretty, but representative of most mfe-based frontends
we see around here. A good example of why the old mfe.c structure no longer works so well for us.

After conversion to tmfe, we do not win a beauty contest yet, but the path for further
clean up and refactoring into better c++ is quite clear. (And it is very obvious where
the missing "event object" wants to be here)

K.O.
       Reply  15 Jun 2021, Konstantin Olchanski, Info, blog - convert tmfe_rev0 event builder to develop-branch tmfe c++ framework 
Now we are converting the alpha-g event builder from rev0 tmfe (midas-2020-xx) to the new tmfe c++
framework in midas-develop. Earlier, I followed the steps outlined in this blog
to convert this event builder from the mfe.c framework to rev0 tmfe.

- get latest midas-develop
- examine progs/tmfe_example_everything.cxx
- open feevb.cxx
- comment-out existing main() function
- from tmfe_example_everything.cxx, copy class FeEverything and main() to the bottom of feevb.cxx
- comment-out old main()
- make sure we include the correct #include "tmfe.h"
- rename example frontend class FeEverything to FeEvb
- rename feevb's "rpc handler" and "periodic handler" class EvbEq to EqEvb
- update class declaration and constructor of EqEvb from EqEverything in example_everything: EqEvb extends TMFeEquipment, 
EqEvb constructor calls constructor of base class (c++ bogosity), keep the bits of the example that initialize the 
equipment "common"
- in EqEvb, remove data members fMfe and fEq: fMfe is now inherited from the base class, fEq is now "this"
- in FeEvb constructor, wire-in the EqEvb constructor: FeSetName("feevb") and FeAddEquipment(new EqEvb("EVB",__FILE__))
- migrate function names:
- fEq->SendEvent() with EqSendEvent()
- fEq->SetStatus() with EqSetStatus()
- fEq->ZeroStatistics() with EqZeroStatistics() -- can be removed, taken care of in the framework
- fEq->WriteStatistics() with EqWriteStatistics() -- can be removed, taken care of in the framework
- (my feevb.o now compiles, but will not work, yet, keep going:)
- EqEvb - update prototypes of all HandleFoo() methods per example_everything.cxx or per tmfe.h: otherwise the framework 
will not call them. c++ compiler will not warn about this!
- migrate old main():
- restore initialization of "common" and other things done in the old main():
- TMFeCommon was merged into TMFeEquipment, move common->Foo = ... to the EqEvb constructor, consult tmfe.h and tmfe.md 
for current variable names.
- consider adding "fEqConfReadConfigFromOdb = false;" (see tmfe.md)
- if EqEvb has a method Init() called from the old main(), change its name to HandleInit() with the correct arguments.
- split EqEvb constructor: leave initialization of "common" in the constructor, move all functions, etc into HandleInit()
- move fMfe->SetTransitionSequenceFoo() calls to HandleFrontendInit()
- move fMfe->DeregisterTransition{Pause,Resume}() to HandleFrontendInit()
- old main should be empty now
- remove linking tmfe_rev0.o from feevb Makefile, now it builds!
- try to run it!
- it works!
- done.

K.O.
Entry  02 Jun 2021, Konstantin Olchanski, Info, bitbucket build truncated 
I truncated the bitbucket build to only build on ubuntu LTS 20.04.

Somehow all the other build targets - centos-7, centos-8, ubuntu-18 - have
an obsolete version of cmake. I do not know where the bitbucket os images
get these obsolete versions of cmake - my centos-7 and centos-8 have much
more recent versions of cmake.

If somebody has time to figure it out, please go at it. I would very much like
to have the centos-7 and centos-8 builds restored (with ROOT), and also
to have an ubuntu LTS 20.04 build with ROOT. (For me, debugging bitbucket
builds is extremely time consuming.)

Right now many midas cmake files require cmake 3.12 (released in late 2018).

I do not know why we require that particular version of cmake (I took the number
from the tutorials I used).

I do not know what the actual version of cmake is that MIDAS (and ROOTANA) 
require/depend on.

I wish there were a tool that would look at a cmake file, examine all the 
features it uses and report the lowest version of cmake that supports them.

K.O.
Entry  12 May 2021, Pierre Gorel, Bug Report, History formula not correctly managed 
OS: OSX 10.14.6 Mojave
MIDAS: Downloaded from repo on April 2021.

I have a slow control frontend doing the command/readout of an MPOD HV/LV. Since I am reading out currents that are in nA (after updating snmp), I wanted to multiply the number by 1e9.

I noticed the new "Formula" field (introduced in 2019 it seems) instead of the "Factor/Offset" I was used to. None of my entries seems to be accepted (after hitting save, when coming back the field is empty).

Looking in the ODB in "/History/Display/MPOD/HV (Current)/", the field "Formula" is a string of size 32 (even though I have multiple plots in that display). I noticed that the fields "Factor" and "Offset" still exist and they are arrays of the correct size. However, changing the values does not seem to do anything.

Deleting "Formula" by hand and creating a new field as an array of string (of correct length) seems to do the trick: the formula is displayed in the History display config, and correctly used.
    Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, History formula not correctly managed 
> OS: OSX 10.14.6 Mojave
> MIDAS: Downloaded from repo on April 2021.
> 
> I have a slow control frontend doing the command/readout of an MPOD HV/LV. Since I am reading out currents that are in nA (after updating snmp), I wanted to multiply the number by 1e9.
> 
> I noticed the new "Formula" field (introduced in 2019 it seems) instead of the "Factor/Offset" I was used to. None of my entries seems to be accepted (after hitting save, when coming back the field is empty).
> 
> Looking in the ODB in "/History/Display/MPOD/HV (Current)/", the field "Formula" is a string of size 32 (even though I have multiple plots in that display). I noticed that the fields "Factor" and "Offset" still exist and they are arrays of the correct size. However, changing the values does not seem to do anything.
> 
> Deleting "Formula" by hand and creating a new field as an array of strings (of the correct length) seems to do the trick: the formula is displayed in the history display config and correctly used.

I see this, too. The problem is that the history plot code must be compatible with both
the old scheme (factor/offset) and the new scheme (formula), but something goes wrong somewhere.

https://bitbucket.org/tmidas/midas/issues/307/history-plot-config-incorrect-in-odb

Why?

- new code cannot do "3 year" plots, old code has no problem with them
- old experiments (alpha1, etc) have only the old-style history plot definitions,
and both the old and new plotting code should be able to show them (there is nobody
to convert this old stuff to the "new way", but we still want to be able to look at it!)

K.O.
Entry  24 May 2021, Mathieu Guigue, Bug Report, Bug "is of type" 
Hi,

I am running a simple FE executable that is supposed to define a PRAW DWORD bank.
The issue is that, right after the start of the run, the logger crashes without messages.
Then the FE reports this error, which is rather confusing.
```
12:59:29.140 2021/05/24 [feTestDatastruct,ERROR] [odb.cxx:6986:db_set_data1,ERROR] "/Equipment/Trigger/Variables/PRAW" is of type UINT32, not UINT32
```
    Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, Bug "is of type" 
> Hi,
> 
> I am running a simple FE executable that is supposed to define a PRAW DWORD bank.
> The issue is that, right after the start of the run, the logger crashes without messages.
> Then the FE reports this error, which is rather confusing.
> ```
> 12:59:29.140 2021/05/24 [feTestDatastruct,ERROR] [odb.cxx:6986:db_set_data1,ERROR] "/Equipment/Trigger/Variables/PRAW" is of type UINT32, not UINT32
> ```

I think this is fixed in the latest midas. There was a typo in this message: the same tid was printed twice,
with the result you report ("is of type UINT32, not UINT32") instead of "is of type UINT32, not <the actual type>".

This fixes the message; after that you have to manually fix the data type mismatch in ODB (delete the old key, I guess).
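
The bug pattern was roughly this (a sketch, not the literal odb.cxx source):

// buggy: the same tid is passed twice, so the message always reads "is of type X, not X"
cm_msg(MERROR, __FILE__, __LINE__, "db_set_data1",
       "\"%s\" is of type %s, not %s", path, rpc_tid_name(tid), rpc_tid_name(tid));

// fixed: compare the type of the existing ODB key with the requested type
cm_msg(MERROR, __FILE__, __LINE__, "db_set_data1",
       "\"%s\" is of type %s, not %s", path, rpc_tid_name(pkey->type), rpc_tid_name(tid));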

K.O.
Entry  26 May 2021, Marco Chiappini, Info, label ordering in history plot 
Dear all,
is there any way to order the labels in the history plot legend? In the old 
system there was the “order” column in the config panel, but I can not find it 
in the new system. Thanks in advance for the support.

Best regards,
Marco Chiappini
    Reply  02 Jun 2021, Konstantin Olchanski, Info, label ordering in history plot 
> is there any way to order the labels in the history plot legend? In the old 
> system there was the “order” column in the config panel, but I can not find it 
> in the new system. Thanks in advance for the support.

correct, for reasons unknown, the function to reorder and to delete individual 
entries was removed from the history panel editor.

K.O.
       Reply  02 Jun 2021, Konstantin Olchanski, Info, label ordering in history plot 
> > is there any way to order the labels in the history plot legend? In the old 
> > system there was the “order” column in the config panel, but I can not find it 
> > in the new system. Thanks in advance for the support.
> 
> correct, for reasons unknown, the function to reorder and to delete individual 
> entries was removed from the history panel editor.
> 
> K.O.

https://bitbucket.org/tmidas/midas/issues/284/history-panel-editor-reordering-of

K.O.
Entry  27 May 2021, Lukas Gerritzen, Bug Report, Wrong location for mysql.h on our Linux systems 
Hi,
with the recent fix of the CMakeLists.txt, it seems like another bug surfaced. 
In midas/progs/mlogger.cxx:48/49, the mysql header files are included without a 
prefix. However, mysql.h and mysqld_error.h are in a subdirectory, so for our 
systems, the lines should be
  48 #include <mysql/mysql.h>
  49 #include <mysql/mysqld_error.h>
This is the case with MariaDB 10.5.5 on OpenSuse Leap 15.2, MariaDB 10.5.5 on 
Fedora Workstation 34 and MySQL 5.5.60 on Raspbian 10.

If this problem occurs for other Linux/MySQL versions as well, it should be 
fixed in mlogger.cxx and midas/src/history_schema.cxx.
If this problem only occurs on some distributions or MySQL versions, it needs 
some more differentiation than #ifdef OS_UNIX. 

Also, this somehow seems familiar; wasn't there such a problem in the past?
    Reply  27 May 2021, Nick Hastings, Bug Report, Wrong location for mysql.h on our Linux systems 
Hi,

> with the recent fix of the CMakeLists.txt, it seems like another bug surfaced. 
> In midas/progs/mlogger.cxx:48/49, the mysql header files are included without a 
> prefix. However, mysql.h and mysqld_error.h are in a subdirectory, so for our 
> systems, the lines should be
>   48 #include <mysql/mysql.h>
>   49 #include <mysql/mysqld_error.h>
> This is the case with MariaDB 10.5.5 on OpenSuse Leap 15.2, MariaDB 10.5.5 on 
> Fedora Workstation 34 and MySQL 5.5.60 on Raspbian 10.
> 
> If this problem occurs for other Linux/MySQL versions as well, it should be 
> fixed in mlogger.cxx and midas/src/history_schema.cxx.
> If this problem only occurs on some distributions or MySQL versions, it needs 
> some more differentiation than #ifdef OS_UNIX.

What does "mariadb_config --cflags" or "mysql_config --cflags" return on 
these systems? For mariadb 10.3.27 on Debian 10 it returns both paths:

% mariadb_config --cflags
-I/usr/include/mariadb -I/usr/include/mariadb/mysql

Note also that mysql.h and mysqld_error.h reside in /usr/include/mariadb *not* 
/usr/include/mariadb/mysql so using "#include <mysql/mysql.h>" would not work.

On CentOS 7 with mariadb  5.5.68:

%  mysql_config --include
-I/usr/include/mysql
% ls -l /usr/include/mysql/mysql*.h
-rw-r--r--. 1 root root 38516 May  6  2020 /usr/include/mysql/mysql.h
-r--r--r--. 1 root root 76949 Oct  2  2020 /usr/include/mysql/mysqld_ername.h
-r--r--r--. 1 root root 28805 Oct  2  2020 /usr/include/mysql/mysqld_error.h
-rw-r--r--. 1 root root 24717 May  6  2020 /usr/include/mysql/mysql_com.h
-rw-r--r--. 1 root root  1167 May  6  2020 /usr/include/mysql/mysql_embed.h
-rw-r--r--. 1 root root  2143 May  6  2020 /usr/include/mysql/mysql_time.h
-r--r--r--. 1 root root   938 Oct  2  2020 /usr/include/mysql/mysql_version.h

So this seems to be the correct setup for both Debian and RHEL. If this is to 
be worked around in Midas I would think it would be better to do it at the 
cmake level than by putting another #ifdef in the code.

Cheers,

Nick.
       Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, Wrong location for mysql.h on our Linux systems 
> % mariadb_config --cflags
> -I/usr/include/mariadb -I/usr/include/mariadb/mysql

I get similar, both .../include and .../include/mysql are in my include path,
so both #include "mysql/mysql.h" and #include "mysql.h" work.

I added a message to cmake to report the MySQL CFLAGS and libraries, so next time
this is a problem, we can see what happened from the cmake output:

4ed0:midas olchansk$ make cmake | grep MySQL
...
-- MIDAS: Found MySQL version 10.4.16
-- MIDAS: MySQL CFLAGS: -I/opt/local/include/mariadb-10.4/mysql;-I/opt/local/include/mariadb-10.4/mysql/mysql and libs: -L/opt/local/lib/mariadb-10.4/mysql/ -lmariadb

K.O.
Entry  27 May 2021, Joseph McKenna, Info, MIDAS Messenger - A program to send MIDAS messages to Discord, Slack and or Mattermost 

I have created a simple program that parses the message buffer in MIDAS and 
sends notifications by webhook to Discord, Slack and/or Mattermost.

Active pull request can be found here:

https://bitbucket.org/tmidas/midas/pull-requests/21


It's written in Python and CMake will install it in bin (if the Python3 binary 
is found by cmake). The only dependency outside of the MIDAS Python library is 
'requests'; full documentation is in mmessenger.md.
    Reply  28 May 2021, Joseph McKenna, Info, MIDAS Messenger - A program to forward MIDAS messages to Discord, Slack and or Mattermost merged 
A simple program to forward MIDAS messages to Discord, Slack and or Mattermost

(Python 3 required)

Pull request accepted! Documentation can be found on the wiki

https://midas.triumf.ca/MidasWiki/index.php/Mmessenger
       Reply  02 Jun 2021, Konstantin Olchanski, Info, MIDAS Messenger - A program to forward MIDAS messages to Discord, Slack and or Mattermost merged 
> A simple program to forward MIDAS messages to Discord, Slack and or Mattermost
> 
> (Python 3 required)
> 
> Pull request accepted! Documentation can be found on the wiki
> 
> https://midas.triumf.ca/MidasWiki/index.php/Mmessenger

This sounds like a very useful and welcome addition to MIDAS.

But from the documentation provided, I have no clue how to activate it.

Perhaps it would help if you could write up the basic steps on how to go about it, i.e.
- go to discord
- push these buttons
- cut and paste this thingy from the web page to ODB

K.O.
Entry  19 May 2021, Francesco Renga, Suggestion, MYSQL logger 
Dear all,
      I'm trying to use the logging on a mysql DB. Following the instructions on 
the Wiki, I recompiled MIDAS after installing mysql, and cmake with NEED_MYSQL=1 
can find it:

-- MIDAS: Found MySQL version 8.0.23

Then, I compiled my frontend (cmake with no options + make) and ran it, but in the 
ODB I cannot find the tree for mySQL. I have only:

Logger/Runlog/ASCII

while I would expect also:

Logger/Runlog/SQL

What could be missing? Maybe I should add something in the CMakeLists file or run 
cmake with some option?

Thank you,
      Francesco
    Reply  21 May 2021, Francesco Renga, Suggestion, MYSQL logger 
I solved this: it was a failed "make clean" before recompiling. Now it works.

Sorry for the noise.

Francesco

Entry  19 May 2021, Konstantin Olchanski, Info, update of event buffer code 
a big update to the event buffer code was merged today.

two important bug fixes:

- a logic error in bm_receive_event() (actually bm_fill_read_cache_locked()) 
caused use of an uninitialized variable to increment the read pointer and crash 
with error "read pointer points to an invalid event"
- a missing bm_unlock() in bm_flush_cache() caused double-locking of the event buffer, 
which caused a hang and a subsequent crash via the watchdog timeout.

several improvements:

- bm_receive_event_vec(std::vector<char>) with automatic memory allocation: one 
does not need to worry about providing a large enough event buffer to receive event 
data. For local connections MAX_EVENT_SIZE is no longer used; for remote 
connections, a buffer of MAX_EVENT_SIZE is allocated automatically. This is a 
limitation of the MIDAS RPC layer (it does not know how to allocate memory to 
receive arbitrarily large data).

(MAX_EVENT_SIZE is now only used in bm_receive_event_rpc()).

- rpc_send_event_sg() - a thread-safe method to send events to the mserver. It 
takes an array of scatter-gather buffers, so a midas event does not have to be 
in one contiguous buffer.

- bm_send_event_sg() - same for local connections.

- on top of bm_send_event_sg() we now have bm_send_event_vec(std::vector<char>) 
and bm_send_event_vec(std::vector<vector<char>>). now we can move forward with 
implementing a new "event object" (the TMEvent event object from midasio.h 
already works with these new methods).
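
Typical usage of the new vector methods looks roughly like this (a sketch; check midas.h for 
the exact signatures):

// receive: the vector is resized automatically, no MAX_EVENT_SIZE guesswork
std::vector<char> event;
int status = bm_receive_event_vec(hBufEvent, &event, BM_NO_WAIT);

// send: the event data lives in a std::vector
status = bm_send_event_vec(hBufEvent, event, BM_WAIT);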

- remote connected bm_send_event() & co now always send events to the mserver 
using the event socket. (before, bm_send_event() used RPC_BM_SEND_EVENT and 
suffered from the RPC layer encoding/decoding overhead. mfe.c used 
rpc_send_event() for remote connections)

- bm_send_event(), bm_receive_event() & co now take a timeout value (in 
milliseconds) instead of an async_flag. The old async_flag values BM_WAIT and 
BM_NO_WAIT continue working as expected (wait forever and do not wait at all, 
respectively).

- following improvements are only for remote connections:

- in the case of event buffer congestion (event readers are slow, event buffers 
are close to 100% full), the bm_flush_cache() RPC will no longer time out due to 
the mserver being stuck waiting for free buffer space. (The RPC is called with a 1000 
msec timeout; the infinite loop waiting for the flush is done on the frontend side, so the 
RPC timeout will never fire.)

- in the case of event buffer congestion, ODB RPCs will no longer time out 
(previously the mserver was stuck waiting for free buffer space and did not process 
any RPCs).

- at the end of a run, the last few events could be stuck in the event socket. Now, 
frontends can flush it using bm_flush_cache(0,BM_WAIT) (use zero for the buffer 
handle). A correct run transition should stop the trigger, stop generating new 
events, call bm_flush_cache(0,BM_WAIT), call bm_flush_cache("SYSTEM",BM_WAIT) 
and return success. (The TMFE frontend already does this.) Note that 
bm_flush_cache(...,BM_WAIT) can be stuck for a very long time waiting for the event 
buffers to empty out, so a run transition RPC timeout is still possible.
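
A correct end-of-run handler therefore looks roughly like this (a sketch; stop_trigger() 
stands in for the experiment-specific trigger control and hBufSystem for the handle of the 
"SYSTEM" buffer obtained from bm_open_buffer()):

INT end_of_run(INT run_number, char *error)
{
   stop_trigger();                      // no new events are generated after this
   bm_flush_cache(0, BM_WAIT);          // flush the event socket (buffer handle 0)
   bm_flush_cache(hBufSystem, BM_WAIT); // flush the write cache of the "SYSTEM" buffer
   return SUCCESS;
}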

K.O.
Entry  07 May 2021, Zaher Salman, Bug Report, modbselect trigget hotlink 
It seems that a modbselect triggers a "change" in an ODB key which has a hot link. This happens on load (or whenever the custom page is reloaded); otherwise it behaves as expected, i.e. no change unless the modbselect is actually changed. Is this the intended behaviour? Can this be modified?
    Reply  10 May 2021, Stefan Ritt, Bug Report, modbselect trigget hotlink 
Thanks for reporting that bug, I fixed it in the last commit.

Stefan
Entry  06 May 2021, Ben Smith, Info, New feature in odbxx that works like db_check_record() 
For those unfamiliar, odbxx is the interface that looks like a C++ map, but automatically syncs with the ODB - https://midas.triumf.ca/MidasWiki/index.php/Odbxx.

I've added a new feature that is similar to the existing odb::connect() function, but works like the old db_check_record(). The new odb::connect_and_fix_structure() function:
- keeps the existing value of any keys that are in both the ODB and your code
- creates any keys that are in your code but not yet in the ODB
- deletes any keys that are in the ODB but not your code
- updates the order of keys in the ODB to match your code

This will hopefully make it easier to automate ODB structure changes when you add/remove keys from a frontend.
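
For example, a frontend could declare the settings structure it expects and then sync it like 
this (a sketch; the ODB path and key names are made up):

#include "odbxx.h"

midas::odb settings = {
   {"Enabled", true},
   {"Threshold mV", 123.4f}
};

// keeps existing values, creates missing keys, deletes stale ones,
// and re-orders the ODB keys to match the code
settings.connect_and_fix_structure("/Equipment/MyFE/Settings");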

The new feature is currently in the develop branch, and should be included in the next release.
Entry  16 Feb 2021, Ruslan Podviianiuk, Forum, m is not defined error m_is_not_defined.png
Hello,

I see this mhttpd error when starting an MSL script: 
Uncaught (in promise) ReferenceError: m is not defined
at mhttpd_message (VM2848 mhttpd.js:2304)
at VM2848 mhttpd.js:2122

As far as I can see it does not affect the MSL script, but it shows a ReferenceError in 
the Midas sequencer (see picture).

Could you please point me to how to fix this error?

Thanks.
Ruslan
    Reply  25 Feb 2021, Konstantin Olchanski, Forum, m is not defined error 
> I see this mhttpd error when starting an MSL script: 
> Uncaught (in promise) ReferenceError: m is not defined
> at mhttpd_message (VM2848 mhttpd.js:2304)
> at VM2848 mhttpd.js:2122

your line numbers do not line up with my copy of mhttpd.js. what version of midas 
do you run?

please give me the output of the odbedit "ver" command (GIT revision, looks like this: 
GIT revision:       Wed Feb 3 11:47:02 2021 -0800 - midas-2020-08-a-84-g78d18b1c on 
branch feature/midas-2020-12).

same info is in the midas "help" page (GIT revision).

to decipher the git revision string:

midas-2020-08-a-84-g78d18b1c means:
is commit 78d18b1c
which is 84 commits after git tag midas-2020-08-a

"on branch feature/midas-2020-12" confirms that I have the midas-2020-12 pre-
release version without having to do all the decoding above.

if you also have "-dirty" it means you changed something in the source code 
 and warranty is voided. (just joking! we can debug even modified midas source 
code)

K.O.
       Reply  05 May 2021, Zaher Salman, Forum, m is not defined error 
We had the same issue here, which comes from mhttpd.js line 2395 in the current git version. This seems to happen mostly when an alarm is triggered or when there is an error message.

Anyway, the easiest solution for us was to define m at the beginning of the mhttpd_message function,

let m;

and replace line 2395 with

if (m !== undefined) {


          Reply  06 May 2021, Stefan Ritt, Forum, m is not defined error 
Thanks for reporting and pointing to the right location.

I fixed and committed it.

Best,
Stefan
Entry  09 Apr 2021, Lars Martin, Suggestion, Time zone selection for web page 
The new history as well as the clock in the web page header show the local time 
of the user's computer running the browser.
Would it be possible to make it either always use the time zone of the Midas 
server, or make it selectable from the config page?
It's not ideal trying to relate error messages from the midas.log to history 
plots if the time stamps don't match.
    Reply  14 Apr 2021, Stefan Ritt, Suggestion, Time zone selection for web page Screenshot_2021-04-14_at_16.54.12_.png
> The new history as well as the clock in the web page header show the local time 
> of the user's computer running the browser.
> Would it be possible to make it either always use the time zone of the Midas 
> server, or make it selectable from the config page?
> It's not ideal trying to relate error messages from the midas.log to history 
> plots if the time stamps don't match.

I implemented a new row in the config page to select the time zone. 

"Local": Time zone where the browser runs
"Server": Time zone where the midas server runs (you have to update mhttpd for that)
"UTC+X": Any other time zone

The setting affects both the status header and the history display.

I spent quite some time with "named" time zones like "PST" "EST" "CEST", but the 
support for that is not that great in JavaScript, so I decided to go with simple 
UTC+X. Hope that's ok.

Please give it a try and let me know if it's working for you.

Best,
Stefan
       Reply  29 Apr 2021, Pierre-Andre Amaudruz, Suggestion, Time zone selection for web page 
> > The new history as well as the clock in the web page header show the local time 
> > of the user's computer running the browser.
> > Would it be possible to make it either always use the time zone of the Midas 
> > server, or make it selectable from the config page?
> > It's not ideal trying to relate error messages from the midas.log to history 
> > plots if the time stamps don't match.
> 
> I implemented a new row in the config page to select the time zone. 
> 
> "Local": Time zone where the browser runs
> "Server": Time zone where the midas server runs (you have to update mhttpd for that)
> "UTC+X": Any other time zone
> 
> The setting affects both the status header and the history display.
> 
> I spent quite some time with "named" time zones like "PST" "EST" "CEST", but the 
> support for that is not that great in JavaScript, so I decided to go with simple 
> UTC+X. Hope that's ok.
> 
> Please give it a try and let me know if it's working for you.
> 
> Best,
> Stefan

Hi Stefan,

This is great, the UTC+x is perfect, thank you.
PAA
Entry  10 Mar 2021, Zaher Salman, Suggestion, embed modbvalue in SVG 
Is it possible to embed modbvalue in an SVG for use within a custom page?

thanks.
    Reply  10 Mar 2021, Stefan Ritt, Suggestion, embed modbvalue in SVG 
You can't really embed it, but you can overlay it. You tag the SVG with a 
"relative" position and then move the modbvalue with an "absolute" position over 
it:

<svg style="position:relative" width="400" height="100">
  <rect width="300" height="100" style="fill:rgb(255,0,0);stroke-width:3;stroke:rgb(0,0,0)" />
  <div class="modbvalue" style="position:absolute;top:50px;left:50px" data-odb-path="/Runinfo/Run number"></div>
</svg>
       Reply  26 Apr 2021, Zaher Salman, Suggestion, embed modbvalue in SVG 
I found a way to embed a modbvalue into an SVG:

<text x="100" y="100" font-size="30rem">
Run=<tspan class="modbvalue" data-odb-path="/Runinfo/Run number"></tspan>
</text>

This seems to behave better than the suggestion below.

> You can't really embed it, but you can overlay it. You tag the SVG with a 
> "relative" position and then move the modbvalue with an "absolute" position over 
> it:
> 
> <svg style="position:relative" width="400" height="100">
>   <rect width="300" height="100" style="fill:rgb(255,0,0);stroke-width:3;stroke:rgb(0,0,0)" />
>   <div class="modbvalue" style="position:absolute;top:50px;left:50px" data-odb-path="/Runinfo/Run number"></div>
> </svg>
Entry  25 Mar 2021, Lars Martin, Bug Report, Minor bug: Change all time axes together doesn't work with +- buttons 
Version: release/midas-2020-12

In the new history display, the checkbox "Change all time axes together" works 
well with the mouse-based zoom, but does not apply to the +- buttons.
    Reply  14 Apr 2021, Stefan Ritt, Bug Report, Minor bug: Change all time axes together doesn't work with +- buttons 
> Version: release/midas-2020-12
> 
> In the new history display, the checkbox "Change all time axes together" works 
> well with the mouse-based zoom, but does not apply to the +- buttons.

Fixed in current commit.

Stefan
Entry  23 Mar 2021, Lars Martin, Bug Report, Time shift in history CSV export Cooling-MoxaCalib-20212118-190450-20212119-102151.pngScreenshot_from_2021-03-23_12-29-21.png
Version: release/midas-2020-12

I'm exporting the history data shown in elog:2132/1 to CSV, but when I look at the 
CSV data, the step no longer occurs at the same time in both data sets (elog:2132/2)
    Reply  23 Mar 2021, Lars Martin, Bug Report, Time shift in history CSV export 
History is from two separate equipments/frontends, but both have "Log history" set to 1.
       Reply  23 Mar 2021, Lars Martin, Bug Report, Time shift in history CSV export 
Tried with export of two different time ranges, and the shift appears to remain the same, 
about 4040 rows.
          Reply  24 Mar 2021, Stefan Ritt, Bug Report, Time shift in history CSV export 
I confirm there is a problem. If variables are from the same equipment, they have the same 
time stamps, like

t1 v1(t1) v2(t1)
t2 v1(t2) v2(t2)
t3 v1(t3) v2(t3)

however, when they are from different equipments, they have different time stamps

t1 v1(t1)
t2    v2(t2)
t3 v1(t3)
t4    v2(t4)

The bug in the current code is that all variables use the time stamps of the first variable, 
which is wrong in the case of different equipments, like

t1 v1(t1) v2(*t2*)
t3 v1(t3) v2(*t4*)

So I can change the code, but I'm not sure what would be the best way. The easiest would be to 
export one array per variable, like

t1 v1(t1)
t2 v1(t2)
...
t3 v2(t3)
t4 v2(t4)
...

Putting that into a single array would leave gaps, like

t1 v1(t1) [gap]
t2 [gap]  v2(t2)
t3 v1(t3) [gap]
t4 [gap]  v2(t4)

plus this is programmatically more complicated, since I have to merge two arrays. So which 
export format would you prefer?

Stefan
             Reply  24 Mar 2021, Lars Martin, Bug Report, Time shift in history CSV export 
I think from my perspective the separate files are fine. I personally don't really like the format 
with the gaps, so I don't see an advantage in putting in the extra work.
I'm surprised the shift is this big, though; it was more than a whole hour in my case. Is it the 
time difference between when the frontends were started?
                Reply  14 Apr 2021, Stefan Ritt, Bug Report, Time shift in history CSV export 
I finally found some time to fix this issue in the latest commit. Please update and check if it's 
working for you.

Stefan
Entry  03 Apr 2020, Stefan Ritt, Info, Change of TID_xxx data types 
We had a request for a 64-bit integer data type to be included in MIDAS banks.
Since 64-bit integers are on some systems "long" and on other systems "long long",
I decided to create the two new data types

TID_INT64
TID_UINT64

which follows the standard C++ tradition more closely:

https://en.cppreference.com/w/cpp/types/integer

To be consistent, I renamed the old types:

TID_BYTE       -> TID_UINT8
TID_SBYTE      -> TID_INT8
TID_WORD       -> TID_UINT16
TID_SHORT      -> TID_INT16
TID_DWORD      -> TID_UINT32
TID_INT        -> TID_INT32

I left the old definitions in midas.h, so old code will still compile fine and be binary
compatible. But if you write new code, the new types are recommended.
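
For example, a frontend can now fill a 64-bit bank like this (a sketch):

// write one 64-bit word into a bank of the new type
uint64_t *p;
bk_create(pevent, "TS64", TID_UINT64, (void **)&p);
*p++ = 0x0123456789abcdefULL; // e.g. a 64-bit timestamp
bk_close(pevent, p);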

If you save the ODB in ASCII format, the new types are used as strings as well, like

[/Experiment]
ODB timeout = INT32 : 10000

but the old types are still understood when you load an old ODB file.

I hope I didn't break anything, please report if you have any issue.

Stefan
    Reply  30 Mar 2021, Konstantin Olchanski, Info, INT64/UINT64/QWORD not permitted in ODB and history... Change of TID_xxx data types 
> We had a request for a 64-bit integer data type to be included in MIDAS banks.
> Since 64-bit integers are on some systems "long" and on other systems "long long",
> I decided to create the two new data types
> 
> TID_INT64
> TID_UINT64
> 

These 64-bit data types do not work with ODB and they do not work with the MIDAS history.

As of commits on 30 March 2021, mlogger will refuse to write them to the history and 
db_create_key() will refuse to create them in ODB.

Why these limitations:

a1) all reading of history is done using the "double" data type, IEEE-754 double precision 
floating point numbers have around 53 bits of precision and are too small to represent all 
possible values of 64-bit integers.
a2) SQL, SQLite and FILE history know nothing about reading and writing 64-bit integer data 
types (this should be easy to fix, as long as MySQL/MariaDB and PostgresQL support it)

b1) in ODB, odbedit and mhttd web pages do not display INT64/UINT64/QWORD data
b2) ODB save and restore from odb, xml and json formats most likely does not work for these 
data types

Fixing all this is possible, with a medium amount of work. As long as somebody needs it. 
Display of INT64/UINT64/QWORD on history plots will probably forever be truncated to 
"double" precision.

K.O.
       Reply  14 Apr 2021, Stefan Ritt, Info, INT64/UINT64/QWORD not permitted in ODB and history... Change of TID_xxx data types 
> These 64-bit data types do not work with ODB and they do not work with the MIDAS history.

They were never meant to work with the history. They were primarily implemented to put large 64-
bit data words into midas banks. We did not yet have a request to put these values into the ODB. 
Once such a request comes, we can address this.

Stefan
    Reply  04 Apr 2021, Konstantin Olchanski, Info, Change of TID_xxx data types 
> 
> To be consistent, I renamed the old types:
> 
> TID_DWORD      -> TID_UINT32
> TID_INT        -> TID_INT32
> 

this created an incompatibility with old XML save files:
old versions of midas cannot load new XML save files,
because the variable type names have changed, i.e. from "INT" to "INT32".

it would have been better if XML save files kept using the old names.

now packages that read midas XML files also need updating.

specifically, in ROOTANA:
- the old TVirtualOdb/XmlOdb.cxx (no longer used, deleted),
- mvodb/mxmlodb.cxx

K.O.
Entry  12 Apr 2021, Isaac Labrie Boulay, Forum, Client gets immediately removed when using a script button. logicCtrl.cppstart_daq.PNG
Hi all,

I'm running into a curious problem when I try to run a program using my custom 
script button. I have been using a script button to start my DAQ, this button 
has always worked. It starts by exporting an absolute path to scripts and then 
runs scripts, my frontend, my analyzer, and mlogger relative to this path.

I recently added a line of code to run a new script "logic_controller". If I run 
the script_daq from my terminal (./start_daq), mhttpd accepts the client and the 
program works as intended. But, if I use the script button, the logic_controller 
program is immediately deleted by MIDAS. It can be seen appearing in the status 
page clients list and then immediately gets deleted. This is a client that runs 
on the local experiment host.

What might be the issue? What is the difference between running the script 
through the terminal as opposed to running it through the mhttpd button?

I have added a picture of my simple script and the logic_controller code.

Any help would be greatly appreciated.

Cheers.

Isaac
    Reply  12 Apr 2021, Ben Smith, Forum, Client gets immediately removed when using a script button. 
> if I use the script button, the logic_controller program is immediately deleted by MIDAS.

This is indeed very curious, and I can't reproduce it on my test experiment. Can you redirect stdout and stderr from the logic_controller program into a file, to see how far the program gets? If it gets to the while loop at the end, then it would be useful to add some debug statements to see what condition causes it to exit the loop.

Are there any relevant messages in the midas message log about the program being killed? What's the value of "/Programs/logic_controller/Watchdog timeout"? 
       Reply  12 Apr 2021, Isaac Labrie Boulay, Forum, Client gets immediately removed when using a script button. debug_logic_controller.txt
> > if I use the script button, the logic_controller program is immediately deleted by MIDAS.
> 
> This is indeed very curious, and I can't reproduce it on my test experiment. Can you redirect stdout and stderr from the logic_controller program into a file, to see how far the program gets? If it gets to the while loop at the end, then it would be useful to add some debug statements to see what condition causes it to exit the loop.

I have redirected stdout and stderr into a text file and I have attached it to this entry. From what the stdout says, it seems that the lambda
function gets called 4 times before the program disconnects from the experiment. Somehow the status must become SS_ABORT or RPC_SHUTDOWN.

> Are there any relevant messages in the midas message log about the program being killed? What's the value of "/Programs/logic_controller/Watchdog timeout"? 

There are no interesting messages in the midas.log and "/Programs/logic_controller/Watchdog timeout" is 10000 when I run the command from the terminal window.
What happens when you run it on your test experiment?

I'll try some more debugging.

Thanks for helping me out! Cheers.

Isaac
          Reply  12 Apr 2021, Ben Smith, Forum, Client gets immediately removed when using a script button. 
I think it would be useful to find the minimal example that exhibits this behaviour.

What happens if your logic controller code is simply the 17 lines below? What happens if you create another script button that only starts the logic controller, not any of the other programs? etc. Gradually re-add features until you hit the problem (or scream in horror if it breaks with 17 lines of C++ and a 1 line shell script).



#include "midas.h"
#include "stdio.h"

int main() {
   cm_connect_experiment("", "", "logic_controller", NULL);

   do {
     int status = cm_yield(100);
     printf("cm_yield returned %d\n", status);
     if (status == SS_ABORT || status == RPC_SHUTDOWN)
       break;
   } while (!ss_kbhit());

   cm_disconnect_experiment();

   return 0;
}
             Reply  13 Apr 2021, Isaac Labrie Boulay, Forum, Client gets immediately removed when using a script button. 
> I think it would be useful to find the minimal example that exhibits this behaviour.
> 
> What happens if your logic controller code is simply the 17 lines below? What happens if you create another script button that only starts the logic controller, not any of the other programs? etc. Gradually re-add features until you hit the problem (or scream in horror if it breaks with 17 lines of C++ and a 1 line shell script).
> 

Hi Ben,

I have followed your suggestions and the program still stops immediately. My status as returned from "cm_yield(100)" is always 412 (SS_TIMEOUT) which is fine. 
The issue is that, when run with the script button, the do-while loop stops immediately because the !ss_kbhit() always evaluates to FALSE.

My temporary solution has been to let the loop run forever :)

Let me know what you think. Thanks again!

Isaac

> 
> 
> #include "midas.h"
> #include "stdio.h"
> 
> int main() {
>    cm_connect_experiment("", "", "logic_controller", NULL);
> 
>    do {
>      int status = cm_yield(100);
>      printf("cm_yield returned %d\n", status);
>      if (status == SS_ABORT || status == RPC_SHUTDOWN)
>        break;
>    } while (!ss_kbhit());
> 
>    cm_disconnect_experiment();
> 
>    return 0;
> }
                Reply  13 Apr 2021, Stefan Ritt, Forum, Client gets immediately removed when using a script button. 
> I have followed your suggestions and the program still stops immediately. My status as returned from "cm_yield(100)" is always 412 (SS_TIMEOUT) which is fine. 
> The issue is that, when run with the script button, the do-while loop stops immediately because the !ss_kbhit() always evaluates to FALSE.
> 
> My temporary solution has been to let the loop run forever :)

Ahh, could be that ss_kbhit() misbehaves if there is no keyboard, meaning the program is started in the background from a script. 
We never had the issue before, since all "standard" midas programs like mlogger, mhttpd etc. also use ss_kbhit() and they 
can be started in the background via the "-D" flag, but maybe stdin is then handled differently. 

So just remove the ss_kbhit(), but keep the break, so that you can stop your program via the web page, like

#include "midas.h"
#include "stdio.h"

int main() {
  cm_connect_experiment("", "", "logic_controller", NULL);

  do {
    int status = cm_yield(100);
    printf("cm_yield returned %d\n", status);
    if (status == SS_ABORT || status == RPC_SHUTDOWN)
      break;
  } while (TRUE);

  cm_disconnect_experiment();

  return 0;
}
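
If you want to keep keyboard handling for interactive use, a possible variant (assuming
the problem really is that stdin is not a terminal when launched from a script button)
is to poll the keyboard only when a TTY is present:

#include "midas.h"
#include <unistd.h>   /* isatty(), STDIN_FILENO */

int main() {
  cm_connect_experiment("", "", "logic_controller", NULL);

  bool have_tty = isatty(STDIN_FILENO);   /* false when started from a script */

  do {
    int status = cm_yield(100);
    if (status == SS_ABORT || status == RPC_SHUTDOWN)
      break;
  } while (!have_tty || !ss_kbhit());     /* only poll the keyboard if one exists */

  cm_disconnect_experiment();

  return 0;
}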
Entry  04 Apr 2021, Konstantin Olchanski, Info, bk_init32a data format 
On April 4th, 2020, Stefan added a new data format that fixes the well known problem with alternating banks being 
misaligned against 64-bit addresses. (cannot find announcement on this forum. midas commit 
https://bitbucket.org/tmidas/midas/commits/541732ea265edba63f18367c7c9b8c02abbfc96e)

This brings the number of midas data formats to 3:

bk_init: bank_header_flags set to 0x0000001 (BANK_FORMAT_VERSION)
bk_init32: bank_header_flags set to 0x0000011 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT)
bk_init32a: bank_header_flags set to 0x0000031 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT | BANK_FORMAT_64BIT_ALIGNED)
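
To check which format a given event uses, one can test the flags in the bank header,
something like this (a sketch using the flag names above; BANK_HEADER is from midas.h):

#include "midas.h"
#include <cstdio>

void print_bank_format(const void *pevent)
{
   const BANK_HEADER *bh = (const BANK_HEADER *)pevent;
   if (bh->flags & BANK_FORMAT_64BIT_ALIGNED)
      printf("bk_init32a format\n");
   else if (bh->flags & BANK_FORMAT_32BIT)
      printf("bk_init32 format\n");
   else
      printf("bk_init (16-bit banks) format\n");
}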

TMEvent (midasio and manalyzer) support for "bk_init32a" format added today (commit 
https://bitbucket.org/tmidas/midasio/commits/61b7f07bc412ea45ed974bead8b6f1a9f2f90868)

TMidasEvent (rootana) support for "bk_init32a" format added today (commit 
https://bitbucket.org/tmidas/rootana/commits/3f43e6d30daf3323106a707f6a7ca2c8efb8859f)

ROOTANA should be able to handle bk_init32a() data now.

TMFE MIDAS c++ frontend switched from bk_init32() to bk_init32a() format (midas commit 
https://bitbucket.org/tmidas/midas/commits/982c9c2f8b1e329891a782bcc061d4c819266fcc)

K.O.
    Reply  13 Apr 2021, Konstantin Olchanski, Info, bk_init32a data format 
Until commit a4043ceacdf241a2a98aeca5edf40613a6c0f575 today, mdump mostly did not work with bank32a data.
K.O.


> On April 4th, 2020, Stefan added a new data format that fixes the well known problem with alternating banks being 
> misaligned against 64-bit addresses. (cannot find announcement on this forum. midas commit 
> https://bitbucket.org/tmidas/midas/commits/541732ea265edba63f18367c7c9b8c02abbfc96e)
> 
> This brings the number of midas data formats to 3:
> 
> bk_init: bank_header_flags set to 0x0000001 (BANK_FORMAT_VERSION)
> bk_init32: bank_header_flags set to 0x0000011 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT)
> bk_init32a: bank_header_flags set to 0x0000031 (BANK_FORMAT_VERSION | BANK_FORMAT_32BIT | BANK_FORMAT_64BIT_ALIGNED)
> 
> TMEvent (midasio and manalyzer) support for "bk_init32a" format added today (commit 
> https://bitbucket.org/tmidas/midasio/commits/61b7f07bc412ea45ed974bead8b6f1a9f2f90868)
> 
> TMidasEvent (rootana) support for "bk_init32a" format added today (commit 
> https://bitbucket.org/tmidas/rootana/commits/3f43e6d30daf3323106a707f6a7ca2c8efb8859f)
> 
> ROOTANA should be able to handle bk_init32a() data now.
> 
> TMFE MIDAS c++ frontend switched from bk_init32() to bk_init32a() format (midas commit 
> https://bitbucket.org/tmidas/midas/commits/982c9c2f8b1e329891a782bcc061d4c819266fcc)
> 
> K.O.
Entry  22 Sep 2020, Frederik Wauters, Forum, INT INT32 in experim.h 
For my analyzer I generate the experim.h file from the odb.

Before midas commit 13c3b2b this generates structs with INT data types. compiles fine with my analysis code (using the old mana.cpp)

newer midas versions generate INT32, ... types. I get a 

‘INT32’ does not name a type   

although I include midas.h 

how to fix this?
    Reply  22 Sep 2020, Konstantin Olchanski, Forum, INT INT32 in experim.h 
> For my analyzer I generate the experim.h file from the odb.
> 
> Before midas commit 13c3b2b this generates structs with INT data types. compiles fine with my analysis code (using the old mana.cpp)
> 
> newer midas versions generate INT32, ... types. I get a 
> 
> ‘INT32’ does not name a type   
> 
> although I include midas.h 
> 
> how to fix this?

You could run experim.h through "sed" to replace the "wrong" data types with the correct data types.

You can also #define the "wrong" data types before doing #include experim.h.
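
The #define workaround might look like this (a sketch; the mapping to stdint.h types is
my assumption of what the generated experim.h expects):

/* define the fixed-width names before pulling in experim.h,
   for midas versions where midas.h does not provide them */
#include <stdint.h>
#define UINT8  uint8_t
#define INT8   int8_t
#define UINT16 uint16_t
#define INT16  int16_t
#define UINT32 uint32_t
#define INT32  int32_t
#define UINT64 uint64_t
#define INT64  int64_t
#include "experim.h"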

I put your bug report into our bug tracker, but for myself I am very busy
with the alpha-g experiment and cannot promise to fix this quickly.

https://bitbucket.org/tmidas/midas/issues/289/int32-types-in-experimh

Here is an example to substitute things using "sed" (it can also do "in-place" editing, "man sed" and google sed examples)
sed "sZshm_unlink(.*)Zshm_unlink(SHM)Zg"

K.O.
       Reply  09 Mar 2021, Andreas Suter, Forum, INT INT32 in experim.h 
> > For my analyzer I generate the experim.h file from the odb.
This issue is still open. Shouldn't midas.h provide the 'new' data types as typedefs like  

typedef int INT32;

etc. Of course you would need to deal with all the supported targets and wrap it accordingly.

A.S.

> > 
> > Before midas commit 13c3b2b this generates structs with INT data types. compiles fine with my analysis code (using the old mana.cpp)
> > 
> > newer midas versions generate INT32, ... types. I get a 
> > 
> > ‘INT32’ does not name a type   
> > 
> > although I include midas.h 
> > 
> > how to fix this?
> 
> You could run experim.h through "sed" to replace the "wrong" data types with the correct data types.
> 
> You can also #define the "wrong" data types before doing #include experim.h.
> 
> I put your bug report into our bug tracker, but for myself I am very busy
> with the alpha-g experiment and cannot promise to fix this quickly.
> 
> https://bitbucket.org/tmidas/midas/issues/289/int32-types-in-experimh
> 
> Here is an example to substitute things using "sed" (it can also do "in-place" editing, "man sed" and google sed examples)
> sed "sZshm_unlink(.*)Zshm_unlink(SHM)Zg"
> 
> K.O.
          Reply  10 Mar 2021, Stefan Ritt, Forum, INT INT32 in experim.h 
Ok, I added

/* define integer types with explicit widths */
#ifndef NO_INT_TYPES_DEFINE
typedef unsigned char      UINT8;
typedef char               INT8;
typedef unsigned short     UINT16;
typedef short              INT16;
typedef unsigned int       UINT32;
typedef int                INT32;
typedef unsigned long long UINT64;
typedef long long          INT64;
#endif

to cover all new types. If there is a collision with user-defined types, compile your program with -DNO_INT_TYPES_DEFINE to remove the 
above definitions. I hope there are no other conflicts.

Stefan
             Reply  15 Mar 2021, Frederik Wauters, Forum, INT INT32 in experim.h 
works!

> Ok, I added
> 
> /* define integer types with explicit widths */
> #ifndef NO_INT_TYPES_DEFINE
> typedef unsigned char      UINT8;
> typedef char               INT8;
> typedef unsigned short     UINT16;
> typedef short              INT16;
> typedef unsigned int       UINT32;
> typedef int                INT32;
> typedef unsigned long long UINT64;
> typedef long long          INT64;
> #endif
> 
> to cover all new types. If there is a collision with user defined types, compile your program with -DNO_INT_TYPES_DEFINE and you remove the 
> above definition. I hope there are no other conflicts.
> 
> Stefan
                Reply  30 Mar 2021, Konstantin Olchanski, Forum, INT INT32 in experim.h 
> > 
> > /* define integer types with explicit widths */
> > #ifndef NO_INT_TYPES_DEFINE
> > typedef unsigned char      UINT8;
> > typedef char               INT8;
> > typedef unsigned short     UINT16;
> > typedef short              INT16;
> > typedef unsigned int       UINT32;
> > typedef int                INT32;
> > typedef unsigned long long UINT64;
> > typedef long long          INT64;
> > #endif
> > 

NIH at work. In C and C++ the standard fixed bit length data types are available
in #include <stdint.h> as uint8_t, uint16_t, uint32_t, uint64_t & co.

BTW, the definition of UINT32 as "unsigned int" is technically incorrect; on 16-bit machines
"int" is 16-bit wide, and on some 64-bit machines "int" is 64-bit wide.

K.O.
Entry  05 Mar 2021, Svetlana Chesnevskaya, Bug Report, New MIDAS old frontend incompatibility error.log
Hello!

Could you help me solve the problem of compatibility between our frontend (created in 2017) and the fresh MIDAS? The old MIDAS (2017) worked well, then we did not use it.
While compiling the frontend, I get a lot of warnings and a few compilation errors.

Any help will be greatly appreciated.

Thanks in advance.
With the best regards,
Svetlana
Entry  01 Mar 2021, Marius Koeppel, Forum, Using JSROOT.openFile with Midas 
Hi everyone,

I am currently trying to access a ROOT file produced by manalyzer by calling JSROOT.openFile("MIDAS_DOMAIN/outputRUN.root"). I can download the ROOT file via MIDAS_DOMAIN/outputRUN.root, but using JSROOT.openFile results in a 501 error, 
since the requested feature is not provided. Using a simple API and uploading outputRUN.root there worked fine (when the run finished). 

Is there a way to use JSROOT.openFile with the currently analyzed ROOT file in Midas (so during the run)? I know that one can access histograms of the THttpServer via JSON, but I need to get the full ROOT tree.

Cheers,
Marius
    Reply  03 Mar 2021, Konstantin Olchanski, Forum, Using JSROOT.openFile with Midas 
> 
> I am currently trying to access a ROOT file produced by manalyzer by calling JSROOT.openFile("MIDAS_DOMAIN/outputRUN.root"). I can download the ROOT file via MIDAS_DOMAIN/outputRUN.root, but using JSROOT.openFile results in a 501 error, 
> since the requested feature is not provided. Using a simple API and uploading outputRUN.root there worked fine (when the run finished). 
> 
> Is there a way to use JSROOT.openFile with the currently analyzed ROOT file in Midas (so during the run)? I know that one can access histograms of the THttpServer via JSON, but I need to get the full ROOT tree.
> 

Good questions. Right now in manalyzer I do not do anything more than starting the ROOT web server (so whatever
they support should work) and providing two "standard" locations: one in the output file for histograms
and other permanent output, and one in memory for transient objects, such as waveform plots, etc.

At some point I would like to provide a function to "get" TAFlowEvent objects so you can do things
like event displays in javascript. But I need a c++ to json serializer and standard c++ does not have it.
So I will have to use the clang serializer (also used by ROOT) and it will take me a few days
to figure it out.

Back to "openFile".

If you figure out the missing bits that need to be added to our code,
please post them here or submit them as a pull request or a bug report in bitbucket.

Also it would be good if you can provide a code example of "openFile" working elsewhere
but not with manalyzer, if I can run it, maybe I can figure out what's missing. But lacking
some example code, there is nothing for me to hack at.

K.O.
       Reply  04 Mar 2021, Marius Koeppel, Forum, Using JSROOT.openFile with Midas test.htmlexample_root.root
Thank you for the answer :)

> At some point I would like to provide a function to "get" TAFlowEvent objects so you can do things
> like event displays in javascript. But I need a c++ to json serializer and standard c++ does not have it.
> So I will have to use the clang serializer (also used by ROOT) and it will take me a few days
> to figure it out.

That sounds exactly like what I was searching for, because I wanted to create an interface between rootana and 
an event display built in JavaScript. Since nothing I tried really worked with the current rootana,
I now have a "solution": I use the MIDAS python client, read events directly from the MIDAS buffer, and provide
the events in a JSON format with the python flask API. Since the rendering of the event display is the bottleneck 
and I only need a few events to display, this solution worked really well for me. Maybe having such a JSON API for 
the event buffer in MIDAS directly would also work for most event display applications or other simple JavaScript 
applications (my opinion).

> If you figure out the missing bits that need to be added to our code,
> please post them here or submit them as a pull request or a bug report in bitbucket.

One of the problems I had was the CORS domain of the THttpServer. In manalyzer:1894 you do 
sprintf(str, "http:127.0.0.1:%d", httpPort); but there are additional options for the THttpServer (like "?cors=DOMAIN").
So maybe a command-line flag for manalyzer to pass such options would be nice. I will create a pull request for that later.

> Also it would be good if you can provide a code example of "openFile" working elsewhere
> but not with manalyzer, if I can run it, maybe I can figure out what's missing. But lacking
> some example code, there is nothing for me to hack at.

The problem is that it was not even running on a simple THttpServer using interactive ROOT:

    serv = new THttpServer("http:8088?cors=*");
    TFile *_file0 = TFile::Open("example_root.root");
    serv->Register("File", _file0);

So I tried just saving the file in the $MIDAS_DIR and tried to use mserver with JSROOT.openFile. I attached the html file and 
a test root file.

Cheers,
Marius
          Reply  04 Mar 2021, Konstantin Olchanski, Forum, Using JSROOT.openFile with Midas 
well, if this is something in ROOT, perhaps you can pursue it with the ROOT crowd,
they are quite friendly.

on my side, if all you need is to pull event data banks, this is easy to add
in mhttpd.

the jsonrpc request will look something like this:

get_event {
"buffer":"system",
"get_type":"GET_LATEST", (or whatever bm_receive_event() can do)
"include_banks":["AAAA","BBBB"],
"exclude_banks":["CCCC","DDDD"]
}

and return something like this:

event {
"header":{"event_id":1,...},
"banks":{
"AAAA":[1,2,3,4],
"BBBB":NULL (you asked for it, so you always get it, but it is NULL if bank does not exist)
}
}

would this work for what you are doing?

(this is not good enough if data has to be pre-digested by c++ analysis in rootana)
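
server-side, the core of such a "get_event" request could be a sketch like this
(buffer name, GET_RECENT sampling and error handling are all simplified here,
and the JSON encoding of the banks is omitted):

#include "midas.h"

// fetch one recent event from a midas event buffer into "buf",
// returns the event size or -1 on failure
int get_latest_event(const char *buffer_name, char *buf, INT bufsize)
{
   HNDLE hbuf;
   INT request_id;

   INT status = bm_open_buffer(buffer_name, DEFAULT_BUFFER_SIZE, &hbuf);
   if (status != BM_SUCCESS && status != BM_CREATED)
      return -1;

   bm_request_event(hbuf, EVENTID_ALL, TRIGGER_ALL, GET_RECENT, &request_id, NULL);

   INT size = bufsize;
   status = bm_receive_event(hbuf, buf, &size, BM_NO_WAIT);

   bm_close_buffer(hbuf);
   return (status == BM_SUCCESS) ? size : -1;
}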

K.O.
             Reply  04 Mar 2021, Stefan Ritt, Forum, Using JSROOT.openFile with Midas 
I also need midas events going back to the browser for single event display, so put +1 for me.

Please also consider using JavaScript typed arrays instead of JSON. For large midas banks, typed 
arrays are 5-10 times faster than JSON encoding/decoding.

Best,
Stefan
             Reply  04 Mar 2021, Marius Koeppel, Forum, Using JSROOT.openFile with Midas 
> would this work for what you are doing?

Yes, having such a function would be perfect for the applications I have a the moment.

> (this is not good enough if data has to be pre-digested by c++ analysis in rootana)

Also agree, if one wants to have a more sophisticated applications it is definitely needed to preprocess the data.

Cheers,
Marius
Entry  25 Feb 2021, Isaac Labrie Boulay, Bug Report, Undefined client causing issues in transition. error_message.PNGundefined_client.PNG
Hi all,

I'm currently experiencing an issue during run transitions. It comes in the form 
of an alert saying "TypeError: Cannot read property 'length' of undefined" 
whenever I'm in the "transition" window on mhttpd. I have attached an image of 
what the transition window looks like when this happens. 

By the looks of it and by peering at the lines in transition.html where the 
error occurs, it's pretty obvious that there is some strange undefined client 
that the web page tries to access.

I don't know how to find what this client is. Is there a way to see it in the 
ODB? 

The issue happens in show_client() of transition.html (called by callback()). 
Here's the trace:

Uncaught (in promise) TypeError: Cannot read property 'length' of undefined
    at show_client (?cmd=Transition:227)
    at callback (?cmd=Transition:420)
    at ?cmd=Transition:430

Any help would be very appreciated!

Thanks so much.

Isaac
    Reply  25 Feb 2021, Konstantin Olchanski, Bug Report, Undefined client causing issues in transition. 
Clearly something goes wrong with the STARTABORT transition. Actually from your 
screenshot, it is not clear why the STARTABORT transition was initiated.

Usually it is called after some client fails the "start run" transition to inform 
other clients that the run did not start after all. (mlogger uses this to close the 
output file, etc).

But in the screenshot, we do not see any client fail the transition (only rootana1 
was called, and it returned "green").

So, a puzzle. One possibility is that the transition code gets so confused
that it does not record correct transition data to ODB, and then the web page
gets even more confused.

One way to see what happens, is to run the odbedit command "start now -v".

Can you try that? And attach all its output here?

K.O.
       Reply  26 Feb 2021, Isaac Labrie Boulay, Bug Report, Undefined client causing issues in transition. start_now_-v_(1).PNGstop.PNG
> Clearly something goes wrong with the STARTABORT transition. Actually from your 
> sceenshot, it is not clear why the STARTABORT transition was initiated.
> 
> Usually it is called after some client fails the "start run" transition to inform 
> other clients that the run did not start after all. (mlogger uses this to close the 
> output file, etc).
> 
> But in the screenshot, we do not see any client fail the transition (only rootana1 
> was called, and it returned "green").
> 
> So, a puzzle. One possibility is that the transition code gets so confused
> that it does not record correct transition data to ODB, then the web page
> gets even more confused.
> 
> One way to see what happens, is to run the odbedit command "start now -v".
> 
> Can you try that? And attach all its output here?
> 
> K.O.

Thanks for getting back to me right away. I've attached two screenshots. The first one 
is the output after running "start now -v" (everything seemed to work nicely there), the 
second output is after using odbedit to stop the run with "stop". Notice that the DAQ 
never stops because it gets stuck in between transitions (You can see the run status 
being "stopping run" with the cancel transition button).

Thanks.

Isaac
          Reply  26 Feb 2021, Konstantin Olchanski, Bug Report, Undefined client causing issues in transition. 
So there is no error on run start anymore? To debug the stuck run stop, please use "stop -v" 
to see where it got stuck. You can also play with the RPC timeouts (the connect timeout and 
the response timeout) to make it get "unstuck" quicker. Definitely it should not be stuck 
forever; it should time out at a maximum of "rpc timeout * number of clients". K.O.
             Reply  26 Feb 2021, Isaac Labrie Boulay, Bug Report, Undefined client causing issues in transition. 
> So there is no error on run start anymore? To debug the stuck run stop, please use "stop -v" 
> to see where it got stuck. You can also play with the RPC timeouts (the connect timeout and 
> the response timeout), to make it get "unstuck" quicker. Definitely it should not be stuck 
> forever, it should timeout at maximum of "rpc timeout * number of clients". K.O.

You're right, it does not stay stuck forever; it eventually gets unstuck. I forgot to mention this. 
I will try to play with these timeout parameters. It does not get stuck if I run my DAQ using the 
odbedit commands (start/stop). I don't know if this is relevant information that could help us 
identify the problem.

Thanks for all your help as always!

Isaac
                Reply  03 Mar 2021, Konstantin Olchanski, Bug Report, Undefined client causing issues in transition. 
> It does not get stuck if I run my DAQ using the odbedit commands (start/stop).

Interesting. Run start/stop from odbedit works but from mhttpd gets stuck.

I think they do not run the transition quite the same way. mhttpd uses the multithreaded transition.

So we can debug this using the "mtransition" program. Try:

- mtransition -v -d 1 START/STOP -- this should be same as odbedit
- mtransition -m -v -d 1 START/STOP -- this should be same as mhttpd

The "-v" and "-d 1" flags should cause lots of output, for failed transitions,
cut-and-paste it all into this elog here, should give us plenty of meat to debug.

K.O.
Entry  02 Mar 2021, Konstantin Olchanski, Info, shortest possible sleep 
since I am implementing a polled equipment, I was curious what is the smallest possible sleep time on current computers.

in current UNIX, there are 2 system calls available for sleeping: select() (with microsecond granularity) and nanosleep() (with nanosecond granularity).

So I wrote a little test program to check it out (progs/test_sleep).
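
The select()-based sleep is essentially this (a sketch of the technique, not the actual
progs/test_sleep code):

#include <sys/select.h>

// sleep for "usec" microseconds using select() with no file descriptors
void sleep_usec(long usec)
{
   struct timeval tv;
   tv.tv_sec  = usec / 1000000;
   tv.tv_usec = usec % 1000000;
   select(0, NULL, NULL, NULL, &tv);
}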

First, Linux result using select(). Typical run on AMD 3700X CPU (4.1 GHz turbo boost) with Ubuntu LTS 20, linux kernel 5.8:

daq13:midas$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1003368.855 usec actual, 100336.885 usec actual per loop, oversleep 336.885 usec, 0.3%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1008512.020 usec actual, 10085.120 usec actual per loop, oversleep 85.120 usec, 0.9%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1062137.842 usec actual, 1062.138 usec actual per loop, oversleep 62.138 usec, 6.2%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1528650.999 usec actual, 152.865 usec actual per loop, oversleep 52.865 usec, 52.9%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  6250898.123 usec actual, 62.510 usec actual per loop, oversleep 52.510 usec, 525.1%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total, 54056918.144 usec actual, 54.057 usec actual per loop, oversleep 53.057 usec, 5305.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,   210875.988 usec actual, 0.211 usec actual per loop, oversleep 0.111 usec, 110.9%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total,   204804.897 usec actual, 0.205 usec actual per loop, oversleep 0.195 usec, 1948.0%
daq13:midas$ 

How to read this:

The first line is 10 sleeps of 100 ms, for a total of 1 sec. This actually sleeps for a bit longer;
the average over-sleep is 300 usec, which out of 100 ms is 0.3%.

The next few lines use progressively shorter sleeps of 10 ms, 1 ms and 0.1 ms. The over-sleep is consistently around 50-60 usec,
which I conclude to be the linux sleep granularity.

The last two lines try to sleep for 0.1 usec and 0.01 usec, resulting in a zero-time sleep of select(),
so we just measure the average time cost of a linux syscall, around 200 ns on this machine.

Going to different machines:

Intel E-2236 (4.8 GHz turbo boost), Ubuntu LTS 20, linux kernel 5.8: over-sleep is 60 usec, zero-sleep is 400 ns.
Intel E-2226G (same, see arc.intel.com), CentOS-7, linux kernel 3.10: over-sleep is 60 usec, zero-sleep is 600 ns.
VME processor (2 GHz Intel T7400), Ubuntu 20, linux kernel 5.8: over-sleep is 60 usec, zero-sleep is 1700 ns.

This is pretty consistent, select() over-sleep is 60 usec on all hardware, zero-sleep tracks CPU GHz ratings.

Next, MacOS result, MacBookAir2020, MacOS 10.15.7, CPU 1.2 GHz i7-1060G7:

4ed0:midas olchansk$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1031108.856 usec actual, 103110.886 usec actual per loop, oversleep 3110.886 usec, 3.1%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1091104.984 usec actual, 10911.050 usec actual per loop, oversleep 911.050 usec, 9.1%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1270800.829 usec actual, 1270.801 usec actual per loop, oversleep 270.801 usec, 27.1%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1370345.116 usec actual, 137.035 usec actual per loop, oversleep 37.035 usec, 37.0%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  1706473.112 usec actual, 17.065 usec actual per loop, oversleep 7.065 usec, 70.6%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total,  5150341.034 usec actual, 5.150 usec actual per loop, oversleep 4.150 usec, 415.0%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,   595654.011 usec actual, 0.596 usec actual per loop, oversleep 0.496 usec, 495.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total,   591560.125 usec actual, 0.592 usec actual per loop, oversleep 0.582 usec, 5815.6%
4ed0:midas olchansk$ 

things are quite different here: the OS is a Mach microkernel with an oldish FreeBSD UNIX single-server (from NeXTSTEP),
so the sleep granularity is different, better than on linux. zero-sleep still measures the syscall time, 600 ns on this machine.

Next we measure the same using the nansleep() syscall.

daq13:midas$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1004133.940 usec actual, 100413.394 usec actual per loop, oversleep 413.394 usec, 0.4%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1046117.067 usec actual, 10461.171 usec actual per loop, oversleep 461.171 usec, 4.6%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1096894.979 usec actual, 1096.895 usec actual per loop, oversleep 96.895 usec, 9.7%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1526744.843 usec actual, 152.674 usec actual per loop, oversleep 52.674 usec, 52.7%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  6250154.018 usec actual, 62.502 usec actual per loop, oversleep 52.502 usec, 525.0%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total, 53344123.125 usec actual, 53.344 usec actual per loop, oversleep 52.344 usec, 5234.4%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total, 52641665.936 usec actual, 52.642 usec actual per loop, oversleep 52.542 usec, 52541.7%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total, 52637501.001 usec actual, 52.638 usec actual per loop, oversleep 52.628 usec, 526275.0%
daq13:midas$ 

Here everything is simple: sleeps longer than 1000 usec work the same as with select(), and sleeps shorter than 100 usec sleep for 52 usec, regardless of what 
we ask for.

MacOS does no better: long sleeps are the same as with select(), and sleeps of 1 usec or less sleep for too long. no improvement over select().

4ed0:midas olchansk$ ./bin/test_sleep 
sleep      10 loops, 0.100000 sec per loop, 1.000000 sec total,  1023327.827 usec actual, 102332.783 usec actual per loop, oversleep 2332.783 usec, 2.3%
sleep     100 loops, 0.010000 sec per loop, 1.000000 sec total,  1130330.086 usec actual, 11303.301 usec actual per loop, oversleep 1303.301 usec, 13.0%
sleep    1000 loops, 0.001000 sec per loop, 1.000000 sec total,  1333846.807 usec actual, 1333.847 usec actual per loop, oversleep 333.847 usec, 33.4%
sleep   10000 loops, 0.000100 sec per loop, 1.000000 sec total,  1402330.160 usec actual, 140.233 usec actual per loop, oversleep 40.233 usec, 40.2%
sleep   99999 loops, 0.000010 sec per loop, 0.999990 sec total,  2034706.831 usec actual, 20.347 usec actual per loop, oversleep 10.347 usec, 103.5%
sleep 1000000 loops, 0.000001 sec per loop, 1.000000 sec total,  6646192.074 usec actual, 6.646 usec actual per loop, oversleep 5.646 usec, 564.6%
sleep 1000000 loops, 0.000000 sec per loop, 0.100000 sec total,  7556284.189 usec actual, 7.556 usec actual per loop, oversleep 7.456 usec, 7456.3%
sleep 1000000 loops, 0.000000 sec per loop, 0.010000 sec total, 15720005.035 usec actual, 15.720 usec actual per loop, oversleep 15.710 usec, 157100.1%
4ed0:midas olchansk$ 

On Linux, strace tells us that the actual syscall behind nanosleep() is this:
clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=0, tv_nsec=10000}, 0x7fffc159e200) = 0

Let's try it directly... result is the same.
Let's try it with CLOCK_MONOTONIC... result is the same.
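
Calling it directly is straightforward (a sketch):

#include <time.h>

// sleep for "nsec" nanoseconds via the syscall seen in the strace output above
void sleep_nsec(long nsec)
{
   struct timespec ts;
   ts.tv_sec  = nsec / 1000000000L;
   ts.tv_nsec = nsec % 1000000000L;
   clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL);
}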

The man page of clock_nanosleep() specifies that this syscall always suspends the calling thread,
so what we see here is the Linux scheduler tick size.

Bottom line.

On current linux, the shortest sleep is around 100 usec for both select() and nanosleep().
On MacOS, the shortest sleep is down to 5 usec using select(), but I cannot tell if the CPU sleeps or busy-loops.

select() is still the best syscall for sleeping.

K.O.
    Reply  02 Mar 2021, Stefan Ritt, Info, shortest possible sleep 
Why do you need that? Periodic equipment typically runs every ten seconds or so, meaning one can do this easily in a scheduler.

For polled equipment, you don't want to sleep at all, because if you sleep, you might miss an event. That's why I put my poll in mfe.c into a for() loop. No 
sleep, maximum polling rate. I just double checked on my macbook air. 

- If poll is always false (no event available), the loop executes 50M times in 100ms (calibrated during startup of the frontend). That means one iteration 
takes 2ns (!). So if an event occurs, the readout is started with a 2ns overhead. No sleep can beat that. In a real world application, one of course has to add 
the VME access or so to poll for the event.

- If poll is always true, the framework generates about 700k events each second (returning just a few bytes of event data).

So if one adds any sleep here, things can only get worse, so I don't see the point of that. Of course polling eats one CPU core at 100%, but these days every 
CPU has more than one core, even my 800 MHz Xilinx embedded ARM CPU (Zynq).

Best,
Stefan
       Reply  03 Mar 2021, Konstantin Olchanski, Info, shortest possible sleep 
> Why do you need that?

UNIX/POSIX advertises functions for sleeping in microseconds and nanoseconds,
for sure it is interesting to know what they actually do and what happens
when you ask them to sleep for 1 microsecond or 1 nanosecond.

To sleep or not to sleep that is a question.

But if I do decide to sleep, and I call the sleep function, I want to know what actually happens.

Now I do and I share it with all.

On current Linux, the shortest sleep is around 60 usec: select() with a sleep
shorter than that will not sleep at all, while nanosleep() will always sleep for
at least that minimum amount.

P.S. For fans of interrupts ("because they are fast"), sleeping while waiting for an interrupt
probably has the same latency/granularity as above (60 usec), so if I drive a DMA engine
and I expect the DMA transfer to complete under 60 usec, I should use a busy loop
to poll the "DMA done" bit instead of going to sleep and waiting for the DMA interrupt.

K.O.
Entry  25 Feb 2021, Lars Martin, Bug Report, tmfe_main.cxx missing include <signal.h> 
The most recent commit (b43aef648c2f8a7e710a327d0b322751ae44afea) throws this 
compiler error:
src/tmfe_main.cxx:39:11: error: 'SIGPIPE' was not declared in this scope
    signal(SIGPIPE, SIG_IGN);

It's fixed by adding #include <signal.h> to that file.
    Reply  25 Feb 2021, Konstantin Olchanski, Bug Report, tmfe_main.cxx missing include <signal.h> 
> The most recent commit (b43aef648c2f8a7e710a327d0b322751ae44afea) throws this 
> compiler error:
> src/tmfe_main.cxx:39:11: error: 'SIGPIPE' was not declared in this scope
>     signal(SIGPIPE, SIG_IGN);
> 
> It's fixed by adding #include <signal.h> to that file.

"but it works just fine on my mac!"

anyhow, thank you for reporting this problem, it is already fixed. the bitbucket auto-
build also caught it. I also boogered up "make remoteonly", also fixed now.

BTW, for production use I recommend midas from the "release" branches, unless one 
needs a bug fix or new feature from the development branch.

K.O.
       Reply  26 Feb 2021, Lars Martin, Bug Report, tmfe_main.cxx missing include <signal.h> 
> BTW, for production use I recommend midas from the "release" branches, unless one 
> needs a bug fix or new feature from the development branch.

Fair point. I would suggest adding that recommendation to the wiki instructions; otherwise I 
forget that step.
Entry  10 Feb 2021, Isaac Labrie Boulay, Forum, Javascript error during run transitions. 
Hi all,

I am encountering a Javascript error (TypeError: client.error is undefined) when 
I transition between run states. Does anybody have an idea of what my problem 
might be? I have pasted an example of what MIDAS logs during such sequences.

Thanks for all the help!

Isaac


09:24:08.611 2021/02/10 [mhttpd,INFO] Executing script 
"~/ANIS_20210106/scripts/start_daq.sh" from ODB "/Script/Start DAQ"

09:24:13.833 2021/02/10 [Logger,LOG] Program Logger on host localhost started

09:24:28.598 2021/02/10 [fevme,LOG] Program fevme on host localhost started

09:24:33.951 2021/02/10 [mhttpd,INFO] Run #234 started

09:26:30.970 2021/02/10 [mhttpd,ERROR] [midas.cxx:4260:cm_transition_call,ERROR] 
Client "Logger" transition 2 aborted while waiting for client "fevme": 
"/Runinfo/Transition in progress" was cleared

09:26:31.015 2021/02/10 [mhttpd,ERROR] [midas.cxx:5120:cm_transition,ERROR] 
transition STOP aborted: "/Runinfo/Transition in progress" was cleared

09:27:27.270 2021/02/10 [mhttpd,ERROR] 
[system.cxx:4937:ss_recv_net_command,ERROR] timeout receiving network command 
header

09:27:27.270 2021/02/10 [mhttpd,ERROR] [midas.cxx:12262:rpc_client_call,ERROR] 
call to "fevme" on "localhost" RPC "rc_transition": timeout waiting for reply
    Reply  10 Feb 2021, Konstantin Olchanski, Forum, Javascript error during run transitions. 
> I am encountering a Javascript error (TypeError: client.error is undefined) when 
> I transition between run states. Does anybody have an idea of what my problem 
> might be? I have pasted an example of what MIDAS logs during such sequences.


Not enough information. Can you do this:

a) for the javascript error, if you get it every time, open the javascript debugger 
and capture the stack trace? or at least the file name, function name and line number 
where the javascript exception is thrown?

b) for the run start failure, start the run from odbedit "start now -v" or from 
"mtransition -v -d 1 START" (or "stop" as the case may be). capture the output, email 
to me directly or put in this elog here.

K.O.


> 
> Thanks for all the help!
> 
> Isaac
> 
> 
> 09:24:08.611 2021/02/10 [mhttpd,INFO] Executing script 
> "~/ANIS_20210106/scripts/start_daq.sh" from ODB "/Script/Start DAQ"
> 
> 09:24:13.833 2021/02/10 [Logger,LOG] Program Logger on host localhost started
> 
> 09:24:28.598 2021/02/10 [fevme,LOG] Program fevme on host localhost started
> 
> 09:24:33.951 2021/02/10 [mhttpd,INFO] Run #234 started
> 
> 09:26:30.970 2021/02/10 [mhttpd,ERROR] [midas.cxx:4260:cm_transition_call,ERROR] 
> Client "Logger" transition 2 aborted while waiting for client "fevme": 
> "/Runinfo/Transition in progress" was cleared
> 
> 09:26:31.015 2021/02/10 [mhttpd,ERROR] [midas.cxx:5120:cm_transition,ERROR] 
> transition STOP aborted: "/Runinfo/Transition in progress" was cleared
> 
> 09:27:27.270 2021/02/10 [mhttpd,ERROR] 
> [system.cxx:4937:ss_recv_net_command,ERROR] timeout receiving network command 
> header
> 
> 09:27:27.270 2021/02/10 [mhttpd,ERROR] [midas.cxx:12262:rpc_client_call,ERROR] 
> call to "fevme" on "localhost" RPC "rc_transition": timeout waiting for reply
       Reply  11 Feb 2021, Isaac Labrie Boulay, Forum, Javascript error during run transitions. start_now_-v.PNGCall_Stack_for_JavaScript_Error.PNG
> > I am encountering a Javascript error (TypeError: client.error is undefined) when 
> > I transition between run states. Does anybody have an idea of what my problem 
> > might be? I have pasted an example of what MIDAS logs during such sequences.
> 
> 
> Not enough information. Can you do this:
> 
> a) for the javascript error, if you get it every time, open the javascript debugger 
> and capture the stack trace? or at least the file name, function name and line number 
> where the javascript exception is thrown?

I've attached a screenshot of the call stack showing the file names and line numbers.

> b) for the run start failure, start the run from odbedit "start now -v" or from 
> "mtransition -v -d 1 START" (or "stop" as the case may be). capture the output, email 
> to me directly or put in this elog here.

I have also attached a screen capture of the output.

Thanks for your help as always.

Isaac

> K.O.
> 
> 
> > 
> > Thanks for all the help!
> > 
> > Isaac
> > 
> > 
> > 09:24:08.611 2021/02/10 [mhttpd,INFO] Executing script 
> > "~/ANIS_20210106/scripts/start_daq.sh" from ODB "/Script/Start DAQ"
> > 
> > 09:24:13.833 2021/02/10 [Logger,LOG] Program Logger on host localhost started
> > 
> > 09:24:28.598 2021/02/10 [fevme,LOG] Program fevme on host localhost started
> > 
> > 09:24:33.951 2021/02/10 [mhttpd,INFO] Run #234 started
> > 
> > 09:26:30.970 2021/02/10 [mhttpd,ERROR] [midas.cxx:4260:cm_transition_call,ERROR] 
> > Client "Logger" transition 2 aborted while waiting for client "fevme": 
> > "/Runinfo/Transition in progress" was cleared
> > 
> > 09:26:31.015 2021/02/10 [mhttpd,ERROR] [midas.cxx:5120:cm_transition,ERROR] 
> > transition STOP aborted: "/Runinfo/Transition in progress" was cleared
> > 
> > 09:27:27.270 2021/02/10 [mhttpd,ERROR] 
> > [system.cxx:4937:ss_recv_net_command,ERROR] timeout receiving network command 
> > header
> > 
> > 09:27:27.270 2021/02/10 [mhttpd,ERROR] [midas.cxx:12262:rpc_client_call,ERROR] 
> > call to "fevme" on "localhost" RPC "rc_transition": timeout waiting for reply
          Reply  25 Feb 2021, Konstantin Olchanski, Forum, Javascript error during run transitions. 
> 
> I have also attached a screen capture of the output.
> 

so the error is gone?

K.O.
Entry  18 Feb 2021, Pintaudi Giorgio, Bug Report, Unexpected end-of-file 
Hello!
Sometimes when I mess around with the history plots I get the following error:

[mhttpd,ERROR] [history.cxx:97:xread,ERROR] Error: Unexpected end-of-file when 
reading file "/home/wagasci-ana/Data/online/210219.hst"

I have tried the following without success:

- Remove the MIDAS history files
- Restart mhttpd and mlogger

I do not know what triggers the error, but when it triggers, the above message is 
printed hundreds of times a second, completely spamming the message log.

It happened again today after I set the label of a frontend too long, making 
mlogger crash. After fixing the label length, the above message appeared and it 
does not seem to go away.
    Reply  18 Feb 2021, Pintaudi Giorgio, Bug Report, Unexpected end-of-file Screenshot_from_2021-02-19_15-41-23.png
It appears that the issue is triggered by a nonexistent Event and Variable, as shown 
in the attached picture. This issue can arise when restoring the ODB from a 
previous version or importing ODB values from other MIDAS instances.
It might be useful if the error message were clearer about the source of the 
problem.

> Hello!
> Sometimes when I mess around with the history plots I get the following error:
> 
> [mhttpd,ERROR] [history.cxx:97:xread,ERROR] Error: Unexpected end-of-file when 
> reading file "/home/wagasci-ana/Data/online/210219.hst"
> 
> I have tried the following without success:
> 
> - Remove the MIDAS history files
> - Restart mhttpd and mlogger
> 
> I do not know what triggers the error but when it triggers the above message is 
> printed hundres of times a second, completely spamming the message log.
> 
> It happened again today after I set the label of a frontend too long making 
> mlogger crash. After fixing the label length, the above message appeared and it 
> does not seem to go away.
       Reply  25 Feb 2021, Konstantin Olchanski, Bug Report, Unexpected end-of-file 
> > [mhttpd,ERROR] [history.cxx:97:xread,ERROR] Error: Unexpected end-of-file when 
> > reading file "/home/wagasci-ana/Data/online/210219.hst"

I am puzzled. We can try a few things:

a) look inside the "bad" hst file, maybe we can see something. run "mhdump -L 
/home/wagasci-ana/Data/online/210219.hst". If there is anything wrong with the file, it 
will be probably at the end. You can also try to run it without "-L".

b) switch from "midas" history (.hst files) to "FILE" history (mh*.dat files), the 
"FILE" history code is newer and the file format is more robust, with luck it may 
survive whatever trouble is happening in your experiment. This is controlled in ODB 
/Logger/History/XXX/Active (set to "y/n").

c) the output of "mlogger -v" may give us some clue, it usually complains if something 
is not right with definitions of history data.

K.O.
Entry  25 Feb 2021, Lars Martin, Forum, TMFePollHandlerInterface timing 
Am I right in thinking that the TMFE HandlePoll function is called once per 
PollMidas()? And what is the difference to HandleRead()?
    Reply  25 Feb 2021, Konstantin Olchanski, Forum, TMFePollHandlerInterface timing 
> Am I right in thinking that the TMFE HandlePoll function is calle once per 
> PollMidas()? And what is the difference to HandleRead()?

Actually, polled equipment is not implemented yet in TMFE; as you noted, the 
internal scheduler needs to be reworked.

Anyhow, I think with modern c++ and with threads, both "periodic" and "polled" 
equipments are not strictly necessary.

Periodic equipment is effectively this:

in a thread:
while (1) {
do stuff, read data, send events
sleep
}

Polled equipment is effectively this:

in a thread:
while (1) {
if (poll()) { read data, send events }
else { sleep for a little bit }
}

Example of such code is the "bulk" equipment in progs/fetest.cxx.
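
A more concrete sketch of the polled pattern in modern c++ (poll_hardware() and
read_and_send_event() are hypothetical placeholders for the user's readout code):

#include <atomic>
#include <chrono>
#include <thread>

bool poll_hardware();        // hypothetical: returns true when data is ready
void read_and_send_event();  // hypothetical: reads the data, sends a midas event

std::atomic<bool> run_active{true};

void poll_thread()
{
   while (run_active) {
      if (poll_hardware()) {
         read_and_send_event();
      } else {
         // sleep for a little bit to avoid burning a CPU core
         std::this_thread::sleep_for(std::chrono::microseconds(100));
      }
   }
}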

But to implement the same in a single threaded environment (eliminates
problems with data locking, race conditions, etc) and to provide additional
structure to the user code, the plan is to implement polled equipment in TMFE
frontends. (periodic equipment is already implemented).

K.O.
Entry  24 Feb 2021, Zaher Salman, Bug Report, history reload 
I have a history that is embedded in a custom page using

<div class="mjshistory" data-group="SampleCryo" data-panel="SampleTemp" data-scale="30m" style="'+size+' position: relative;left: 640px;top: -205px;"></div>

this works fine when I load the page but seems to cause a timeout when reloading (F5) the page. It used to work fine last year but since a midas update this year it does not work. 

When I manually stop the script after firefox reports that it is slowing down the browser, I get the following in the console:

Script terminated by timeout at:
binarySearch@http://xxx.psi.ch:8081/mhistory.js:1051:11
MhistoryGraph.prototype.findMinMax@http://xxx.psi.ch:8081/mhistory.js:1583:28
MhistoryGraph.prototype.loadInitialData/<@http://xxx.psi.ch:8081/mhistory.js:780:15

any ideas what may be causing this?

thanks.

Another hint to the problem: the custom page is accessible via 
http://xxx.psi.ch:8081/?cmd=custom&page=SampleCryo&
and once loaded the address changes to the following, where A and B change values as time passes (I guess B-A = 30m):
http://xxx.psi.ch:8081/?cmd=custom&page=SampleCryo&&A=1614173369&B=1614175169
    Reply  25 Feb 2021, Stefan Ritt, Bug Report, history reload 
I have to reproduce the problem. Can you please send me the full link by direct email. As you know, I'm also at PSI.

Stefan
Entry  12 Feb 2021, Konstantin Olchanski, Bug Report, mlogger history snafu 
there is a problem with mlogger between commits xxx (17 Nov 2020) and a762bb8 (12 feb 2021). because of 
confusion between seconds and milliseconds, FILE (mhf*.dat files) and SQL history are recording with 
incorrect timestamps.

- traditional MIDAS history (*.hst files) does not have this problem (because of a buglet)
- midas-2020-12 release does not have this problem (it has mlogger from midas-2020-08 release)

there are some additional changes in mlogger that we are sorting out, when ready, we will make a new 
release of midas.

K.O.
Entry  10 Feb 2021, Konstantin Olchanski, Release, midas-2020-12-a 
midas-2020-12-a is here, see https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12

notable change from previous midas releases:

Use of ODB "Common" by mfe.c frontends has changed. The new preferred behaviour
is for the values defined in the equipment structure in the source code
to always overwrite the values in ODB /Equipment/Foo/Common, except for the value
of "Common/enabled" (equipment_common_overwrite set to TRUE).

All mfe.c frontends will need to be modified for this change:

- for old behaviour (use ODB "Common"), add: BOOL equipment_common_overwrite = false;
- for new behaviour (use equipment values in the source code), add: BOOL equipment_common_overwrite = true;

The TMFE C++ frontend does not implement this change yet, it uses all "Common" values from ODB
and there is no way to overwrite things like the MIDAS event buffer name from C++ code. This may
change with the next version.

notable updates since midas-2020-08:

- new ODB variable /Experiment/Enable sound can be used to globally prevent mhttpd from playing sounds.
- Lazylogger now supports writing data over SFTP.
- odbvalue elements on custom pages now support an onload() callback as well as onchange(). Most elements now also 
support a data-validate callback.
- custom pages can now tie a select drop-down box to an ODB value using modbselect.
- ability to choose whether the code or the current ODB values take precedence for the "Common" settings of an 
equipment when starting a frontend. See elog thread 2014 for more details, and the "Upgrade guide" below for 
instructions.
- minor improvements to mdump program - support for 64-bit data types and ability to load larger events if needed.
- minor improvements to History plots and Buffers webpage.
- bug fixes

plans for next development: major update of mlogger to simplify channel 
configuration in odb, improvements to mhttpd multithreading, new history plot 
configuration page, more c++ification.

To obtain this release, either checkout the top of branch release/midas-2020-12 
(recommended) or checkout the tag midas-2020-12-a.

K.O.
Entry  25 Jan 2021, Thomas Lindner, Suggestion, mhttpd browser caching 
I have a more subtle point about the new ODB key for using an external elog I mentioned in [1].  I was very confused after changing the ODB "External Elog" because mhttpd still wasn't using my external elog URL.  I started trying to debug mhttpd.cxx, but found a lot of bits of mhttpd didn't seem to be getting called.  I eventually realized that my browser had been caching the responses for some (though not all) of the MIDAS navigation buttons.  Clearing my browser cache fixed the problem and allowed me to use the MIDAS button to the external ELOG.  This caching happens on my macbook for both Firefox 84.0.2 and Safari 13.1.

Many of the requests to mhttpd end up going to send_fp(), where we explicitly set the cache time to 24 hours.

   // send HTTP cache control headers                                                                                               
   time_t now = time(NULL);
   now += (int) (3600 * 24);
   struct tm* gmt = gmtime(&now);
   const char* format = "%A, %d-%b-%y %H:%M:%S GMT";
   char str[256];
   strftime(str, sizeof(str), format, gmt);
   r->rsprintf("Expires: %s\r\n", str);

Some other MIDAS buttons don't seem to be cached by the browser; for instance the response for the 'OldHistory' button doesn't get cached.

Should we remove the cache instruction for at least some of the buttons?  At least for the elog button where we want the link direction to get switched by an ODB key the caching seems a bad idea.
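
For reference, disabling caching for a given response would amount to replacing the "Expires" block with standard no-cache headers, something like this (a sketch in the same rsprintf style, not current mhttpd code):

   // tell the browser not to cache this response at all
   r->rsprintf("Cache-Control: no-cache, no-store, must-revalidate\r\n");
   r->rsprintf("Pragma: no-cache\r\n");
   r->rsprintf("Expires: 0\r\n");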

[1] https://midas.triumf.ca/elog/Midas/2078
    Reply  25 Jan 2021, Stefan Ritt, Suggestion, mhttpd browser caching 
Let me first explain a bit why caching is there. Once we had the case that someone from 
TRIUMF opened a midas custom page at T2K. It took about one minute (!) to load the page. 

When we looked at it, we found that the custom page pulled about 100 items with individual
HTTP requests from Japan, each taking about one second for the roundtrip. Then we redesigned
the custom page communication so that many ODB entries could be retrieved in one operation,
which improved the loading time from 100s to about 2s.

With the buttons we will have to make the same compromise. If we do not cache anything,
loading the midas status page over the Pacific takes many seconds. If we cache all, any
change on the midas side will not be reflected on the web page. So there is a compromise
to be made. I thought I designed it such that the side menu is cached locally, but when
the user presses "reload", then the full menu is fetched from the server. Of course one
has to remember this, so changing the ELOG URL or other things on the menu require a
reload (or wait a certain time for the cache to expire). So try again and see if that's working
for you. If not, I can look at it again and check if there is any bug.

If we go the route of disabling the cache, we had better try this against T2K and see what we get before
we commit ourselves to it. Last time TRIUMF people were complaining a lot about long
load times.

Best,
Stefan
       Reply  25 Jan 2021, Thomas Lindner, Suggestion, mhttpd browser caching 
I tried reloading the pages.  If I reloaded the actual elog page 

https://server.triumf.ca/?cmd=Elog

then it bypassed the cache and got the correct updated page from mhttpd.

However, when I reloaded the status page

https://server.triumf.ca/?cmd=Status

and then clicked the Elog button, I just got the cached (old) page.  Admittedly, reloading the status page doesn't make much sense (once I thought about it), but it is what I tried first (I'm good at modelling unexpected user behaviour); so there is some risk that a user will reload the wrong page and be stuck not getting the external elog page (until the 24-hour cache expires).

Anyway, I will update the documentation to note that you need to reload the elog page after changing this variable.  That's probably an adequate solution.

I certainly don't suggest getting rid of caching entirely.  I was trying to think whether there was a set of pages where it would make sense to disable the cache (like the elog page).  But maybe that will just cause more problems.


> Let me first explain a bit why caching is there. Once we had the case that someone from 
> TRIUMF opened a midas custom page at T2K. It took about one minute (!) to load the page. 
> 
> When we looked at it, we found that the custom page pulled about 100 items with individual
> HTTP requests from Japan, each taking about one second for the roundtrip. Then we redesigned
> the custom page communication so that many ODB entries could be retrieved in one operation,
> which improved the loading time from 100s to about 2s.
> 
> With the buttons we will have to make the same compromise. If we do not cache anything,
> loading the midas status page over the Pacific takes many seconds. If we cache all, any
> change on the midas side will not be reflected on the web page. So there is a compromise
> to be made. I thought I designed it such that the side menu is cached locally, but when
> the user presses "reload", then the full menu is fetched from the server. Of course one
> has to remember this, so changing the ELOG URL or other things on the menu require a
> reload (or wait a certain time for the cache to expire). So try again if that's working
> for you. If not, I can visit it again and check if there is any bug.
> 
> If we go the route to disable the cache, better try this to T2K and see what you get before
> we commit ourselves to that. Last time TRIUMF people were complaining a lot about long
> load times.
> 
> Best,
> Stefan
    Reply  08 Feb 2021, Konstantin Olchanski, Suggestion, mhttpd browser caching 
>    r->rsprintf("Expires: %s\r\n", str);

As best I can tell, none of this works in current browsers. With google-chrome,
I see it cache pretty much everything regardless of "expires", "no cache", etc.,
and anything else I tried.

Things like shift-<reload>, etc. used to work to refresh the cache, but not anymore.

So, I too, see confusing side-effects of caching, where I change something in ODB,
but "nothing happens". Then I scratch my head for 30 minutes until I remember
to open the javascript debugger where shift-<reload> (or is it ctrl-<reload>) actually works.

It seems that the only reliable way to bypass the browser cache is to add
a tag with a random number to the URL ("&ts=currenttime").

This is for HTTP GET requests. HTTP POST does not seem to be cached, so I do not worry
about this nonsense for json-rpc requests.

Perhaps we should do this random number trick for all user actions. User can
press buttons only so fast, we should be able to sustain the rate. Anything
loaded automatically or from a timer, we should allow caching.

BTW, things like midas.js are also cached, and it is common to see problems
after updating midas, where status.html is newly loaded, but midas.js is an old
stale version from cache.

Messy.

K.O.
       Reply  08 Feb 2021, Stefan Ritt, Suggestion, mhttpd browser caching 
> It seems that the only reliable way to bypass the browser cache is to add
> a tag with a random number to the URL ("&ts=currenttime").

Indeed that's the only reliable way to avoid caching across browsers. An alternative is

("&r=" + Math.random())

to add a random number.


> BTW, things like midas.js are also cached, and it is common to see problems
> after updating midas, where status.html is newly loaded, but midas.js is an old
> stale version from cache.

Reloading a JavaScript file NOT from the cache is really tricky these days. I added a
special Google Chrome extension to clear my browser cache, which works reliably:

https://chrome.google.com/webstore/detail/clear-cache/cppjkneekbjaeellbfkmgnhonkkjfpdn

Stefan
Entry  13 Jan 2021, Isaac Labrie Boulay, Forum, poll_event() is very slow. 
Hi all,

I'm currently trying to see if I can speed up polling in a frontend I'm testing. 
Currently, it seems like I can't get 'lam's to happen faster than 120 times/second. 
There must be a way to make this faster. From what I understand, changing the poll 
time (500 ms by default) won't affect the frequency of polling, just the 'lam' 
period.

Any suggestions?

Thanks for your help!

Isaac

Hi,

What is the actual readout time and event size?
Do you have multiple equipments, and if so, of what type?

PAA
    Reply  13 Jan 2021, Konstantin Olchanski, Forum, poll_event() is very slow. 
> 
> I'm currently trying to see if I can speed up polling in a frontend I'm testing. 
> Currently it seems like I can't get 'lam's to happen faster than 120 times/second. 
> There must be a way to make this faster. From what I understand, changing the poll 
> time (500ms by default) won't affect the frequency of polling just the 'lam' 
> period.
> 
> Any suggestions?
> 

You could switch from the traditional midas mfe.c frontend to the C++ TMFE frontend,
where all this "lam" and "poll" business is removed.

At the moment, there are two example programs using the C++ TMFE frontend,
single-threaded (progs/fetest_tmfe.cxx) and multithreaded (progs/fetest_tmfe_thread.cxx).

K.O.
       Reply  15 Jan 2021, Isaac Labrie Boulay, Forum, poll_event() is very slow. 
> > 
> > I'm currently trying to see if I can speed up polling in a frontend I'm testing. 
> > Currently it seems like I can't get 'lam's to happen faster than 120 times/second. 
> > There must be a way to make this faster. From what I understand, changing the poll 
> > time (500ms by default) won't affect the frequency of polling just the 'lam' 
> > period.
> > 
> > Any suggestions?
> > 
> 
> You could switch from the traditional midas mfe.c frontend to the C++ TMFE frontend,
> where all this "lam" and "poll" business is removed.
> 
> At the moment, there are two example programs using the C++ TMFE frontend,
> single threaded (progs/fetest_tmfe.cxx) and multithreaed (progs/fetest_tmfe_thread.cxx).
> 
> K.O.

Ok. I did not know that there was a C++ OOD frontend example in MIDAS. I'll take a look at 
it. Is there any documentation on how it works?

Thanks for the support!

Isaac
    Reply  13 Jan 2021, Stefan Ritt, Forum, poll_event() is very slow. 
Something must be wrong on your side. If you take the example frontend under

midas/examples/experiment/frontend.cxx

and let it run to produce dummy events, you get about 90 Hz. This is because we have a

  ss_sleep(10);

in the read_trigger_event() routine to throttle things down. If you remove that sleep, 
you get an event rate of about 500'000 Hz. So the framework is really quick.

Probably your routine which looks for a 'lam' takes really long and should be fixed.
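
For reference, here is roughly what a fast poll_event() looks like. This is only a sketch: 
module_data_ready() is a hypothetical stand-in for your hardware's data-ready check, which 
should take on the order of a microsecond:

INT poll_event(INT source, INT count, BOOL test)
{
   for (INT i = 0; i < count; i++) {
      BOOL lam = module_data_ready();   // e.g. one status register read
      if (lam && !test)
         return TRUE;                   // mfe.c then calls the readout routine
   }
   return FALSE;
}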

Stefan
       Reply  14 Jan 2021, Pintaudi Giorgio, Forum, poll_event() is very slow. 
> Something must be wrong on your side. If you take the example frontend under
> 
> midas/examples/experiment/frontend.cxx
> 
> and let it run to produce dummy events, you get about 90 Hz. This is because we have a
> 
>   ss_sleep(10);
> 
> in the read_trigger_event() routine to throttle things down. If you remove that sleep, 
> you get an event rate of about 500'000 Hz. So the framework is really quick.
> 
> Probably your routine which looks for a 'lam' takes really long and should be fixed.
> 
> Stefan

Sorry if I am going off-topic, but since the ss_sleep function was mentioned here, I 
would like to take the chance to report an issue that I am having.

In all my slow control frontends, the CPU usage for each frontend is close to 100%. This 
means that each frontend is monopolizing a single core. When I did some profiling, I 
noticed that 99% of the time is spent inside the ss_sleep function. Now, I would expect 
the ss_sleep function to require very little CPU, if any at all.

So my two questions are:
Is this a bug or a feature?
Would you be able to check/reproduce this behavior, or do you need additional info from my 
side?
       Reply  14 Jan 2021, Isaac Labrie Boulay, Forum, poll_event() is very slow. 
> Something must be wrong on your side. If you take the example frontend under
> 
> midas/examples/experiment/frontend.cxx
> 
> and let it run to produce dummy events, you get about 90 Hz. This is because we have a
> 
>   ss_sleep(10);
> 
> in the read_trigger_event() routine to throttle things down. If you remove that sleep, 
> you get an event rate of about 500'000 Hz. So the framework is really quick.
> 
> Probably your routine which looks for a 'lam' takes really long and should be fixed.
> 
> Stefan

Hi Stefan,

I should mention that I was using midas/examples/Triumf/c++/fevme.cxx. I was trying to see 
the max speed so I had the 'lam' always = 1 with nothing else to add overhead in the 
poll_event(). I was getting <200 Hz. I am assuming that this is a bug. There is no 
ss_sleep() in that function.

Thanks for your quick response!

Isaac
          Reply  08 Feb 2021, Konstantin Olchanski, Forum, poll_event() is very slow. 
> I should mention that I was using midas/examples/Triumf/c++/fevme.cxx

this is correct, the fevme frontend is written to do 100% CPU-busy polling.

there are several reasons for this:
- on our VME processors, we have 2-core CPUs: the 1st core can poll the VME bus while the 2nd core runs 
mfe.c and the ethernet transmitter.
- interrupts are expensive to use (in latency and in cpu use) because the kernel handler has to call 
the user handler, return back, etc.
- sub-millisecond sleep used to be expensive and unreliable (on 1-2 GHz "core 1" and "core 2" 
CPUs running SL6- and SL7-era linux). As I understand it, current linux and current 3+ GHz CPUs can 
do reliable microsecond sleep.
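
If you want to check what sleep granularity your machine actually delivers, a standalone test 
like this (plain C++, not midas code) gives a quick answer:

#include <chrono>
#include <cstdio>
#include <thread>

int main()
{
   for (int i = 0; i < 5; i++) {
      auto t0 = std::chrono::steady_clock::now();
      std::this_thread::sleep_for(std::chrono::microseconds(100));
      auto t1 = std::chrono::steady_clock::now();
      std::printf("requested 100 us, slept %.1f us\n",
                  std::chrono::duration<double, std::micro>(t1 - t0).count());
   }
   return 0;
}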

K.O.
Entry  21 Jan 2021, Thomas Lindner, Info, Using external ELOG with newer mhttpd 
A warning, in case others have the same problem I had.

In the past you could configure mhttpd so that the 'Elog' button would redirect to an external ELOG server; to do this you only needed to create and set the ODB variable '/Elog/URL' to the URL of your external ELOG server.

But with the newer MIDAS you need to set two ODB variables:

* "/Elog/URL" needs to be set to the URL of the external ELOG.
* "/Elog/External Elog" needs to be set to 'y'

I hadn't noticed this and was confused why my Elog button wasn't working after upgrading MIDAS.
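
For reference, the same two keys can also be set from a program with db_set_value(). A minimal 
sketch, assuming midas.h is included (the URL is a placeholder and error checking is omitted):

   HNDLE hDB;
   cm_get_experiment_database(&hDB, NULL);
   BOOL external = TRUE;
   db_set_value(hDB, 0, "/Elog/URL", "https://elog.example.org/MyExp/", 256, 1, TID_STRING);
   db_set_value(hDB, 0, "/Elog/External Elog", &external, sizeof(external), 1, TID_BOOL);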

MIDAS documentation was updated to reflect this change:

https://midas.triumf.ca/MidasWiki/index.php/Electronic_Logbook_(ELOG)
Entry  09 Dec 2020, Frederik Wauters, Forum, history and variables confusion 
I have a fe, with 2 "equipments" (2 different types of LV supplies).

Equipment/../Setting has a "Names" key, with the actual channel names (ch1, ch2, ...) of the devices.

Equipment/../Variables has channel states, voltage, etc. 

I also write a separate midas bank for each supply.

When I turn the "Log History" on, 2 things happen which cause troubles:
1. It writes the data of both bank to both the /Equipment/(Device1/Device2)/Variables .
2. I have e.g. 4 channels. In the banks I write current and voltages, so 8 numbers. When I turn on the logger I get an "Array size mismatch" between names and the midas bank size.

The only way around this is to disable the history logging in the equipment, and set "virtual" History events?
    Reply  09 Dec 2020, Stefan Ritt, Forum, history and variables confusion 
First, the writing of banks is completely independent of the history system. Banks go to the log file only, 
while the history is only linked to the "Variables" section in the ODB.

Second, it's advisable to group similar equipment into one. Like if you have five power supplies powering
an experiment, you don't want to have five equipments Supply1, Supply2, ..., but only one equipment
"Power Supplies". In the frontend belonging to that equipment, you define a DEVICE_DRIVER list with
one entry for each power supply. If you interact with an mscb device, there are some helper functions
which simplify the definition of the equipment and which I can send you privately. So your device
driver looks a bit like the one attached.

If you cannot do that and absolutely want separate equipments, please post a complete ODB subtree of your
settings, and I can try to reproduce your problem.

Stefan

======================

DEVICE_DRIVER power_driver[] = {
   {"Power Supply 1", mscbdev, 0, NULL, DF_INPUT | DF_MULTITHREAD},
   {"Power Supply 2", mscbdev, 0, NULL, DF_INPUT | DF_MULTITHREAD},
   {"Power Supply 3", mscbdev, 0, NULL, DF_INPUT | DF_MULTITHREAD},
   {""}
};

...

INT frontend_init()
{
   mscb_define("mscbxxx.psi.ch", "Power Supplies", "Power Supply 1", power_driver, 1, 0, "Output 1", 0.1);
   mscb_define("mscbxxx.psi.ch", "Power Supplies", "Power Supply 1", power_driver, 1, 1, "Output 2", 0.1);
   ...
}

/*-- Function to define MSCB variables in a convenient way ---------*/

void mscb_define(const char *submaster, const char *equipment, const char *devname, 
                 DEVICE_DRIVER *driver, int address, unsigned char var_index, 
                 const char *name, double threshold)
{
   int i, dev_index, chn_index, chn_total;
   char str[256];
   float f_threshold;
   HNDLE hDB;

   cm_get_experiment_database(&hDB, NULL);

   if (submaster && submaster[0]) {
      sprintf(str, "/Equipment/%s/Settings/Devices/%s/Device", equipment, devname);
      db_set_value(hDB, 0, str, submaster, 32, 1, TID_STRING);
      sprintf(str, "/Equipment/%s/Settings/Devices/%s/Pwd", equipment, devname);
      db_set_value(hDB, 0, str, "meg", 32, 1, TID_STRING);
   }

   /* find device in device driver */
   for (dev_index=0 ; driver[dev_index].name[0] ; dev_index++)
      if (equal_ustring(driver[dev_index].name, devname))
         break;

   if (!driver[dev_index].name[0]) {
      cm_msg(MERROR, "mscb_define", "Device \"%s\" not present in device driver list", devname);
      return;
   }

   /* count total number of channels */
   for (i=chn_total=0 ; i<=dev_index ; i++)
      if (((driver[dev_index].flags & DF_INPUT) > 0 && (driver[i].flags & DF_INPUT)) ||
          ((driver[dev_index].flags & DF_OUTPUT) > 0 && (driver[i].flags & DF_OUTPUT)))
         chn_total += driver[i].channels;

   chn_index = driver[dev_index].channels;
   sprintf(str, "/Equipment/%s/Settings/Devices/%s/MSCB Address", equipment, devname);
   db_set_value_index(hDB, 0, str, &address, sizeof(int), chn_index, TID_INT, TRUE);
   sprintf(str, "/Equipment/%s/Settings/Devices/%s/MSCB Index", equipment, devname);
   db_set_value_index(hDB, 0, str, &var_index, sizeof(char), chn_index, TID_BYTE, TRUE);

   if (threshold != -1 && (driver[dev_index].flags & DF_INPUT) > 0) {
     sprintf(str, "/Equipment/%s/Settings/Update Threshold", equipment);
     f_threshold = (float) threshold;
     db_set_value_index(hDB, 0, str, &f_threshold, sizeof(float), chn_total, TID_FLOAT, TRUE);
   }

   if (name && name[0]) {
      sprintf(str, "/Equipment/%s/Settings/Names %s", equipment, devname);
      db_set_value_index(hDB, 0, str, name, 32, chn_total, TID_STRING, TRUE);
   }

   /* increment number of channels for this driver */
   driver[dev_index].channels++;
}
       Reply  10 Dec 2020, Frederik Wauters, Forum, history and variables confusion genesys.odb
I wanted to have a c++ style driver, e.g. an instance of a "PowerSupply" class. This was not compatible with the list of DEVICE_DRIVER structs, which needs a C function entry point with variable arguments. 

Anyway, I attach my odb. I believe the issue stands regardless of the specific design choice here. Setting the History Log flag copies the banks created to the "Variables" of every equipment initialized, leading to a mismatch between the names array and the variables. This can be solved by using Virtual history events instead of FE history events, but the flag in the Equipment is confusing.  

Bank creation in readout function:

for(const auto& d: drivers)
{
...
...
  bk_create(pevent,bk_name, TID_FLOAT, (void **)&pdata);
...			
  std::vector<float> voltage = d->GetVoltage();
  std::vector<float> current = d->GetCurrent();
  for (std::size_t iChannel = 0; iChannel < voltage.size(); iChannel++)
  {
    *pdata++ = voltage.at(iChannel);
    *pdata++ = current.at(iChannel);
  }
  bk_close(pevent, pdata);
}
          Reply  11 Dec 2020, Frederik Wauters, Forum, history and variables confusion 
1. ok, so calling the same readout function from different equipments is just a bad idea, my bad, no blame on Midas for writing data from both banks to both odb trees ...

2. One needs the same number of bank entries as the size of settings/names[]. Otherwise the "History Log" flag does not work. So just don't use "names" but "channel names" or something. 

" Second, it's advisable to group similar equipment into one. Like if you have five power supplies powering
and experiment, you don't want to have five equipments Supply1, Supply2, ..., but only one equipment
"Power Supplies".  "

It would be nice if this also worked with c++ style drivers, i.e. an instance of a class. I don't know how one would give an entry point to the "DEVICE_DRIVER" struct then.



> I wanted to have a c++ style driver, e.g. a instance of a "PowerSupply" class. This was not compatible with the list of DEVICE_DRIVER structs, with needs a C function entry point with variable arguments. 
> 
> Anyways, I attach my odb. I believe the issue stands regardless of the specific design choice here. Setting the History Log flag copies the banks created to the "Variables" of every equipments initialized, leading to a mismatch between the names array, and the variables. Can be solved by not using FE history events, but Virtual, but the flag in the Equipment is confusing.  
> 
> Bank creation in readout function:
> 
> for(const auto& d: drivers)
> {
> ...
> ...
>   bk_create(pevent,bk_name, TID_FLOAT, (void **)&pdata);
> ...			
>   std::vector<float> voltage = d->GetVoltage();
>   std::vector<float> current = d->GetCurrent();
>   for(channels)
>   {
>     *pdata++ = voltage.at(iChannel);
>     *pdata++ = current.at(iChannel);
>   }
>   bk_close(pevent, pdata);
> }
             Reply  15 Dec 2020, Konstantin Olchanski, Forum, history and variables confusion 
I think you are facing several problems:

a) mlogger does not clearly explain what history names will be used for which entries
in /eq/xxx/variables. "mlogger -v" almost does it, but we also need
"mlogger -v -n" to "show what you will do, but do not do it yet".

b) mfe.c and the device class driver structure are very dated and try to "do c++ in c". If it works for you,
certainly use it, but if it confuses you (as it confuses me), it probably only takes a few lines of c++
to replace the whole thing (minus the actual device drivers, which are the meat of it).

I think today you face a choice:
- invest some time to understand the old device driver framework
- invest some time to do it "by hand" in c++, write your own device drivers (or use 3rd party drivers or snarf the "c" drivers from midas). Use the TMFE C++ frontend class if you go this route.

I would estimate that both choices are about the same amount of work.

K.O.


> 1. ok, so calling the same readout functions from different equipments is just a bad idea, my bad, no blame for Midas to write data from both bank to both odb trees ...
> 
> 2. One needs the same amount of bank entries as the size of settings/names[] . Otherwise the "History Log" flag does not work. So just don`t us "names" but "channel names" or something. 
> 
> " Second, it's advisable to group similar equipment into one. Like if you have five power supplies powering
> and experiment, you don't want to have five equipments Supply1, Supply2, ..., but only one equipment
> "Power Supplies".  "
> 
> It would be nice if this also works with c++ style drivers, i.e. a instance of a class. I don`t now how one would give an entry point to the "DEVICE_DRIVER" struct then.
> 
> 
> 
> > I wanted to have a c++ style driver, e.g. a instance of a "PowerSupply" class. This was not compatible with the list of DEVICE_DRIVER structs, with needs a C function entry point with variable arguments. 
> > 
> > Anyways, I attach my odb. I believe the issue stands regardless of the specific design choice here. Setting the History Log flag copies the banks created to the "Variables" of every equipments initialized, leading to a mismatch between the names array, and the variables. Can be solved by not using FE history events, but Virtual, but the flag in the Equipment is confusing.  
> > 
> > Bank creation in readout function:
> > 
> > for(const auto& d: drivers)
> > {
> > ...
> > ...
> >   bk_create(pevent,bk_name, TID_FLOAT, (void **)&pdata);
> > ...			
> >   std::vector<float> voltage = d->GetVoltage();
> >   std::vector<float> current = d->GetCurrent();
> >   for(channels)
> >   {
> >     *pdata++ = voltage.at(iChannel);
> >     *pdata++ = current.at(iChannel);
> >   }
> >   bk_close(pevent, pdata);
> > }
                Reply  08 Jan 2021, Stefan Ritt, Forum, history and variables confusion 
We kind of agreed to rewrite the slow control system in C++. Each device will have its own driver derived from a common base class implementing the general communication. The reason we need a "system" and not just a "hand-written" driver is that we want:

- glue many device drivers together for a single equipment
- have a dedicated readout thread for every device, in order not to block other devices
- have a common error reporting scheme working with several threads
- being able to disable/enable individual devices without changing the history system each time
- having a common naming scheme for all devices (like "enforce" /Equipment/<name>/Settings/Names xxx) which is needed by the history system
- ...

Will see when we have time for that.
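
Just to illustrate the idea, a very rough sketch of what such a base class could look like 
(all names here are hypothetical, this is not an existing MIDAS API; derived classes would 
call Stop() before destruction):

#include <atomic>
#include <chrono>
#include <string>
#include <thread>
#include <vector>

class SCDevice {   // common base class: communication plus a per-device thread
public:
   virtual ~SCDevice() {}
   virtual bool Connect(const std::string& address) = 0;
   virtual std::vector<float> Read() = 0;   // one readout cycle

   void Start() {   // dedicated readout thread, one per device
      fRunning = true;
      fThread = std::thread([this] {
         while (fRunning) {
            Read();
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
         }
      });
   }
   void Stop() {    // call before destruction
      fRunning = false;
      if (fThread.joinable())
         fThread.join();
   }

private:
   std::atomic<bool> fRunning{false};
   std::thread fThread;
};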

Stefan
Entry  06 Jan 2021, Isaac Labrie Boulay, Info, Recovering a corrupted ODB using odbinit. 
Hi all,

I am currently trying to recover my corrupted ODB using odbinit and I am still 
getting issues after doing 'odbinit --cleanup' and trying to reload the saved 
ODB (last.json). Here is the output:


************************************************
(odbinit cleanup) Note* the ERROR in system.cxx
************************************************


[caendaq@cu332 ANIS]$ odbinit --cleanup
Checking environment... experiment name is "ANIS", remote hostname is ""
Checking command line... experiment "ANIS", cleanup 1, dry_run 0, create_exptab 0, create_env 0
Checking MIDASSYS....../home/caendaq/packages/midas
Checking exptab... experiments defined in exptab file "/home/caendaq/ANIS/exptab":
0: "ANIS" <-- selected experiment

Checking exptab... selected experiment "ANIS", experiment directory "/home/caendaq/ANIS/"

Checking experiment directory "/home/caendaq/ANIS/"
Found existing ODB save file: "/home/caendaq/ANIS/.ODB.SHM"

Checking shared memory...
Deleting old ODB shared memory...
[system.cxx:1052:ss_shm_delete,ERROR] shm_unlink(/1001_ANIS_ODB__home_caendaq_ANIS_) errno 2 (No such file or directory)
Good: no ODB shared memory
Deleting old ODB semaphore...
Deleting old ODB semaphore... create status 1, delete status 1
Preserving old ODB save file "/home/caendaq/ANIS/.ODB.SHM" to "/home/caendaq/ANIS/.ODB.SHM.1609951022"

Checking ODB size...
Requested ODB size is 0 bytes (0.00B)
ODB size file is "/home/caendaq/ANIS//.ODB_SIZE.TXT"
Saved ODB size from "/home/caendaq/ANIS//.ODB_SIZE.TXT" is 1048576 bytes (1.05MB)
We will initialize ODB for experiment "ANIS" on host "" with size 1048576 bytes (1.05MB)

Creating ODB...
Creating ODB... db_open_database() status 302
Saving ODB...
Saving ODB... db_close_database() status 1
Connecting to experiment...

Connected to ODB for experiment "ANIS" on host "" with size 1048576 bytes (1.05MB)
Checking experiment name... status 1, found "ANIS"
Disconnecting from experiment...

Done
 

****************************************
(Loading the last copy of my ODB)
*************************************



[caendaq@cu332 data]$ odbedit
[local:ANIS:S]/>load last.json
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
 


**********************************************
(Now trying to run my frontend and analyzer)
*********************************************
 


[caendaq@cu332 ANIS]$ ./start_daq.sh

mlogger: no process found
fevme: no process found
manalyzer.exe: no process found
manalyzer_example_cxx.exe: no process found
roody: no process found
[ODBEdit,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is corrupted, mismatch of buffer name in shared memory ""

11:38:30 [ODBEdit,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is corrupted, mismatch of buffer name in shared memory ""
Becoming a daemon...
Becoming a daemon...
Please point your web browser to http://localhost:8081
To look at live histograms, run: roody -Hlocalhost
Or run: mozilla http://localhost:8081
[caendaq@cu332 ANIS]$ Frontend name          :     fevme
Event buffer size      :     1048576
User max event size    :     204800
User max frag. size    :     1048576
# of events per buffer :     5
 

Connect to experiment ANIS...

OK

[fevme,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is corrupted, mismatch of buffer name in shared memory ""
[fevme,ERROR] [mfe.cxx:596:register_equipment,ERROR] Cannot open event buffer "SYSTEM" size 33554432, bm_open_buffer() status 219

Has anyone ever encountered these issues?

Thanks for your time.

Isaac
Entry  05 Jan 2021, Isaac Labrie Boulay, Bug Report, Logger: Disk nearly full. 
Hi all,

I've run into a problem where my experiment gets interrupted with a message from 
the logger saying that my disk is nearly full. This does not make sense to me 
because I have deleted almost all the data files from my data directory. I'm 
guessing that somewhere the ODB perceives that the directory is full when in 
reality it's not.

Here is the exact message:

[ODBEdit,INFO] Run #252 stopped

09:22:19 [Logger,TALK] disk nearly full, stopping the run

09:22:19 [Logger,ERROR] [mlogger.cxx:4475:log_write,ERROR] Disk '/home/caendaq/ANIS/data/run00252.mid.lz4' is almost full: 81 MiBytes free out of 922497 MiBytes, stopping the run

Does anybody have a solution for this? Thanks so much.

Isaac
    Reply  06 Jan 2021, Stefan Ritt, Bug Report, Logger: Disk nearly full. 
The logger simply requests the disk free space level from the operating system in the same 
way as the "df" command does. Can you do a "df" on your system? I have seen that some file 
systems do not free up space immediately when you delete files, but some time later (like 24h).

Stefan
       Reply  06 Jan 2021, Isaac Labrie Boulay, Bug Report, Logger: Disk nearly full. 
> The logger simple requests the disk free space level from the operating system in the same 
> way as the "df" command does. Can you do a "df" on your system? I have seen that some file 
> systems free up space not immediately if you delete files, but some times later (like 24h).
> 
> Stefan

Thanks Stefan. Yes the files were still held open by some processes. It's solved now.

Cheers.

Isaac
Entry  17 Dec 2020, Amy Roberts, Suggestion, Improving variable functionality in Sequencer? 
We're using the sequencer to manage runs, and this typically looks something like:

1. save ODB keys to variables via ODBGET
2. set ODB keys to new values for a "pre-run" process
3. return ODB keys to values created in line 1
4. take data

The problem I'm running into is that the list of ODB keys to save is pretty 
unwieldy.  I'm wondering if there are sequencer features that exist or that I could 
request that might make this easier.

For example, having a way to list ODB keys, save ODB directories, and load ODB 
directories would be a much more concise way for me to write my script.

Another option might be to have some version of the ODBSET wildcards for ODBGET.  
Although for this, setting the variable names might be tricky.  

In any case, even being able to ODBGET an array and set that to one variable name 
would be a big improvement.
    Reply  05 Jan 2021, Amy Roberts, Suggestion, Improving variable functionality in Sequencer? 
Hello, just wanted to re-ping on this question now that folks are starting to get back from 
the holidays.
       Reply  06 Jan 2021, Stefan Ritt, Suggestion, Improving variable functionality in Sequencer? laser.msl
I guess you are using the wrong pattern here. There is no need to copy ODB values to local variables, 
then change them, then write them back. You can instead write values directly to the ODB. We run 
all our experiments that way and we can do what we want. So most of our scripts have sections 
like

 ODBSUBDIR "/Equipment/Laser/Variables"
   ODBSET "Setting[*]", 0, 0
   ODBSET "Output[1]", 0, 0
   ODBSET "Output[2]", 1, 0
   ODBSET "Output[3]", 0, 0
   ODBSET "Output[4]", 1, 1
 ENDODBSUBDIR

Note that both the path and the indices can contain wildcards, making this pattern more 
flexible. Wildcards are, however, not (yet) supported for local variables, which is why we use 
the ODBSET directive directly.

I attach a larger example from the MEG experiment here for your reference.

Stefan
Entry  18 Dec 2020, Stefan Ritt, Suggestion, Code formatting .clang-formatcnaf_callback_llvm.cxxcnaf_callback_root.cxxcnaf_callback_gnu.cxxcnaf_callback_google.cxx
May I ask for your quick opinion on code formatting. MIDAS had a coding style 
which pretty much followed the ROOT coding style described at

https://root.cern/contribute/coding_conventions/

so we followed the "3 spaces indent" convention, braces according to Kernigham & 
Ritchie and a few other things. I see however that code written by different 
people still is formatted differently, like spaces before and after comparators 
etc. I wonder if it would make sense to keep a consistent code formatting through 
the whole midas repository.

Looking again at what the ROOT guys do (see link above), they have a ClangFormat 
file, which I attached to this post. Putting this file into the root of midas 
ensures that all files are formatted in exactly the same way, which would greatly 
increase readability.

The nice thing with ClangFormat is that it can be integrated into my editor (CLion) as 
well as into emacs and vim:

https://clang.llvm.org/docs/ClangFormat.html

This would also make the emacs settings in our files obsolete:

/* emacs
 * Local Variables:
 * tab-width: 8
 * c-basic-offset: 3
 * indent-tabs-mode: nil
 * End:
 */

I don't like these because they are only for people using emacs. If everybody 
put such statements into the files for their favourite editor, all our source files 
would be cluttered quite a bit.

So the question now is which style to use? I attached different trials with a simple 
file from the distribution, so you can see the differences. They use the style from

- LLVM
- ROOT
- GNU
- Google

I consciously skipped the "Microsoft" style ;-)

Which one should we settle on? Any opinion? If I don't hear anything, I will pick a 
style at the end of this year, 2020. I have a slight preference for the ROOT style, although 
I don't like that the "case" is not indented there under the opening brace of the 
switch statement, which seems inconsistent to me. The only one doing that right is the 
Google format, but that one has an indentation of 2 chars instead of our usual 3 chars. 
At the end of the day I think it's not so important which style we agree on, as long 
as we DO have a common style for all midas files.

Best,
Stefan
    Reply  04 Jan 2021, Stefan Ritt, Suggestion, Code formatting .clang-format
After pondering over the holidays, I decided to use the widely used LLVM code formatting, 
just adapted slightly for 3 spaces and "case" indentation in a "switch" statement. This 
formatting is now very close to our original one. Nevertheless, I did not reformat all 
existing code, since that would screw up the git repository, and you then cannot see anymore 
who wrote which line of code. But with the .clang-format file now in the midas root, all 
NEW files will follow that standard. 

The CLion editor automatically picks up the .clang-format file if you enable ClangFormat 
via Preferences -> Code Style -> General -> Enable ClangFormat.

EMACS can also use this file by adding following lines to your .emacs:

(load "<path-to-clang>/tools/clang-format/clang-format.el")
(global-set-key [C-M-tab] 'clang-format-region)

One remaining problem is that if you check out midas on a new machine, you might not have your 
personal .emacs file there. If there is a way to ship a .emacs with midas which gets 
automatically loaded, I would be happy to put this into the distribution.

Stefan
Entry  16 Dec 2020, Isaac Labrie Boulay, Forum, Issues building banks. 
Hi all,

I'm currently trying to build events using block transfers. The worry was 
that organizing and packaging bank data into an array would produce too much dead 
time, causing too many missed events. Trying out that method, I'm running into all 
sorts of issues, such as unaligned transfers where the QDC events are unaligned, or 
improperly aligned banks. It's giving me a headache.

My question is, if I were to revert back to simple 32 bit read cycles and using 
the fevme.cxx template's method of organizing data before sending them to the 
buffer, what kind of deadtime should I expect? Am I wrong to assume that this 
would result in deadtime at all? I'm using a CAEN V792n 16 channel QDC and the hit 
frequency that I'm using to test is 20kHz.

Thanks.

Isaac
    Reply  16 Dec 2020, Konstantin Olchanski, Forum, Issues building banks. 
> I'm currently trying to build events through doing block transfers.

I am confused by your question. I assume you read a CAEN V792 ADC, but I do not know what VME master you 
use. The restrictions on data alignment come from the VME master.
I am mostly familiar with restrictions of UniverseII and tsi148 PCI-VME bridges.
I think there is no restriction for USB-VME bridges and similar.

Anyhow. Which block transfer do you use? 32-bit block transfer (BLT32)? 64-bit block transfer (MBLT64)? 
(no 128-bit 2eVME/2eSST transfers from the V792). Maybe the "simulated block transfer" (DMA engine uses 
single-word reads instead of block transfer)?

> The worry was that organizing and packaging bank data into an array would produce too much dead time 
causing too many missed events.

Valid concerns.

> I'm running into all sorts of issues such as unaligned transfers where the QDC events are unaligned, or 
improperly aligned banks.

You should not see any problems with unaligned transfers if you give the DMA engine
correct memory addresses as required by the hardware:

- always aligned to 32-bit (4 bytes, last two address bits set to 0)
- aligned to 64-bits for MBLT64 64-bit transfers, this would be the normal case for the V792 (8 bytes, 
last 3 address bits set to 0)
- aligned to 128-bits for 2eVME/2eSST transfers (16 bytes, last 4 bits of address are zero).

You also need to specify the correct amount of data to read: the number of bytes should be a multiple of 4 for 32-
bit transfers, a multiple of 8 for 64-bit transfers and a multiple of 16 for 128-bit transfers (2eVME/2eSST).

Very often this requires reading "extra" data words. Most VME modules can generate extra pad words to 
align event length to DMA restrictions. Sometimes you need to
enable this in a control register (V792, V1190).

> Giving me a headache.

Me too. MIDAS recently introduced the QWORD 64-bit data type, banks of this type
should have correct alignment for 64-bit VME block transfers. But for 2eVME/2eSST
transfers, I still have to ensure alignment "by hand" (SIS3820, VF48, etc).

With QWORD banks, you need to use bk_init32a() instead of bk_init32().
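
To make this concrete, a minimal sketch of a readout routine using a QWORD bank as a 64-bit-
aligned DMA target (the bank name "VME0" and the word count nwords64 are arbitrary 
placeholders; uint64_t comes from <cstdint>):

   bk_init32a(pevent);   // bank headers padded for 64-bit alignment
   uint64_t *pdata64;
   bk_create(pevent, "VME0", TID_QWORD, (void **)&pdata64);
   // ... point the DMA engine at pdata64 and run the block transfer ...
   bk_close(pevent, pdata64 + nwords64);   // nwords64 = number of 64-bit words read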

> My question is, if I were to revert back to simple 32 bit read cycles

Yes, I always test with single-word reads first, with the 32-bit block transfer second and try the 64-bit 
block transfer last.

Sometimes there are unrelated problems (with the VME modules, VME bus, etc, or
with bugs in the frontend, etc) and this approach helps to identify the source
of trouble.

> and using 
> the fevme.cxx template's method of organizing data before sending them to the 
> buffer, what kind of deadtime should I expect? Am I wrong to assume that this 
> would result in deadtime at all? I'm using a CAEN V792n 16 channel QDC and the hit 
> frequency that I'm using to test is 20kHz.

Yes, with asynchronous read using 64-bit block transfer, 20 kHz should be achievable.

The old fevme frontend is based on the mfe.c framework and implementing
async readout requires special contortions. The structure of the new TMFE C++ frontend
class is supposed to make it easier, but I do not have an example TMFE based fevme yet.

P.S. Without using block transfer, your max rate is limited to:

16 channels, 1 word per channel, plus 1 header and 1 footer = 18 words (by luck, 64-bit aligned for 
correct BLT64 block read).

using VME single-word read at 1 us per transfer, 18 us per event = 55 kHz repetition rate.

(you do not say if you have any other VME modules you have to read)

K.O.
       Reply  16 Dec 2020, Isaac Labrie Boulay, Forum, Issues building banks. 
Thanks for the quick reply,

> > I'm currently trying to build events through doing block transfers.
> 
> I am confused by your question. I assume you read a CAEN V792 ADC, but I do not know what VME master you 
> use. The restrictions on data alignment come from the VME master.
> I am mostly familiar with restrictions of UniverseII and tsi148 PCI-VME bridges.
> I think there is no restriction for USB-VME bridges and similar.
> 
> Anyhow. Which block transfer do you use? 32-bit block transfer (BLT32)? 64-bit block transfer (MBLT64)? 
> (no 128-bit 2eVME/2eSST transfers from the V792). Maybe the "simulated block transfer" (DMA engine uses 
> single-word reads instead of block transfer)?

I read a single CAEN V792n QDC, 18 words, and a single CAEN V1190 TDC, 2 channels so 8 words. When I poll, I 
read on every poll_event() and read whatever data is in whatever module (TDC_dataready || QDC_dataready). The 
VME master that I'm using to talk to the modules is a CAEN V1718. I am trying to read data by BLT32. Sorry for 
the confusing question (Can you tell I'm an intern?).

> > The worry was that organizing and packaging bank data into an array would produce too much dead time 
> causing too many missed events.
> 
> Valid concerns.
> 
> > I'm running into all sorts of issues such as unaligned transfers where the QDC events are unaligned, or 
> improperly aligned banks.
> 
> You should not see any problems with unaligned transfers if you give the DMA engine
> correct memory addresses as required by the hardware:
> 
> - always aligned to 32-bit (4 bytes, last two address bits set to 0)
> - aligned to 64-bits for MBLT64 64-bit transfers, this would be the normal case for the V792 (8 bytes, 
> last 3 address bits set to 0)
> - aligned to 128-bits for 2eVME/2eSST transfers (16 bytes, last 4 bits of address are zero).
> 
> You also need to specify correct amount of data to read: number of bytes should be multiple of 4 for 32-
> bit transfers, multiple to 8 for 64-bit transfers and multiple of 16 for 128-bit transfers (2eVME/2eSST).

I am transferring 32-bit words. Transferring 32-bit words should always read multiples of 4 bytes so that's 
good.

> Very often this requires reading "extra" data words. Most VME modules can generate extra pad words to 
> align event length to DMA restrictions. Sometimes you need to
> enable this in a control register (V792, V1190).
> 
> > Giving me a headache.
> 
> Me too. MIDAS recently introduced the QWORD 64-bit data type, banks of this type
> should have correct alignment for 64-bit VME block transfers. But for 2eVME/2eSST
> transfers, I still have to ensure alignment "by hand" (SIS3820, VF48, etc).
> 
> With QWORD banks, you need to use bk_init32a() instead of bk_init32().
> 
> > My question is, if I were to revert back to simple 32 bit read cycles
> 
> Yes, I always test with single-word reads first, with the 32-bit block transfer second and try the 64-bit 
> block transfer last.
> 
> Sometimes there are unrelated problems (with the VME modules, VME bus, etc, or
> with bugs in the frontend, etc) and this approach helps to identify the source
> of trouble.
> 
> > and using 
> > the fevme.cxx template's method of organizing data before sending them to the 
> > buffer, what kind of deadtime should I expect? Am I wrong to assume that this 
> > would result in deadtime at all? I'm using a CAEN V792n 16 channel QDC and the hit 
> > frequency that I'm using to test is 20kHz.
> 
> Yes, with asynchronous read using 64-bit block transfer, 20 kHz should be achievable.
> 
> The old fevme frontend is based on the mfe.c framework and implementing
> async readout requires special contortions. The structure of the new TMFE C++ frontend
> class is supposed to make it easier, but I do not have an example TMFE based fevme yet.
> 
> P.S. Without using block transfer, your max rate is limited to:
> 
> 16 channels, 1 word per channel, plus 1 header and 1 footer = 18 words (by luck, 64-bit aligned for 
> correct BLT64 block read).
> 
> using VME single-word read at 1 us per transfer, 18 us per event = 55 kHz repetition rate.
> 
> (you do not say if you have any other VME modules you have to read)
> 

Okay so transferring 18 + 6 words should give me close to 40kHz repetition rate. That's good news. I will just 
stick to 1 word transfers.

The way that transfers are done in fevme.cxx requires iterating through 16-word arrays a number of times (3 
times, I believe, if you include the iterations taking place in v792_EventRead()). Does that not pose a 
significant deadtime concern? 

> K.O.

Thanks again for taking the time to help me out!

Cheers.

Isaac
          Reply  16 Dec 2020, Konstantin Olchanski, Forum, Issues building banks. 
> > > I'm currently trying to build events through doing block transfers.
> > 
> > I am confused by your question. I assume you read a CAEN V792 ADC, but I do not know what VME master you 
> > use. The restrictions on data alignment come from the VME master.
> > I am mostly familiar with restrictions of UniverseII and tsi148 PCI-VME bridges.
> > I think there is no restriction for USB-VME bridges and similar.
> > 
> > Anyhow. Which block transfer do you use? 32-bit block transfer (BLT32)? 64-bit block transfer (MBLT64)? 
> > (no 128-bit 2eVME/2eSST transfers from the V792). Maybe the "simulated block transfer" (DMA engine uses 
> > single-word reads instead of block transfer)?
> 
> I read a single CAEN V792n QDC, 18 words, and a single CAEN V1190 TDC, 2 channels so 8 words. When I poll, I 
> read on every poll_event() and read whatever data is in whatever module (TDC_dataready || QDC_dataready). The 
> VME master that I'm using to talk to the modules is a CAEN V1718. I am trying to read data by BLT32. Sorry for 
> the confusing question (Can you tell I'm an intern?).
> 

Ok, I see. Using the normal mfe.c structure, you will not be able to read the VME modules
at maximum speed. This is because you must have two concurrent activities happening at the same time:

(1) tell the VME bridge to read data,
(2) package this data into midas banks and events and write it to the MIDAS event buffer.

If you do these tasks sequentially, obviously the VME bus will be idle during step (2),
and unless (2) takes 0 seconds (it does not) you will have a slow down.

So for maximum data rate, I prefer to have 3 threads:

thread 1: run the VME transfers, store data in circular buffer (today it would be std::deque<std::vector<char>>)
thread 2: encode the data into midas banks and midas events, store completed events in a circular buffer 
(std::deque<EVENT_HEADER*>).
thread 3: write data to midas event buffer (call bm_send_event(), etc)

This is very hard to do using the mfe.c frontend. (the main reason I wrote the TMFE C++ frontend class).
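
A bare-bones sketch of that 3-thread structure follows. read_vme_block(), encode_event() and 
send_event() are hypothetical stand-ins for the real VME readout, bank encoding and 
bm_send_event() call; a real frontend would block on a condition variable instead of spinning, 
and would handle shutdown and errors:

#include <deque>
#include <mutex>
#include <thread>
#include <vector>

static std::mutex gRawLock, gEvtLock;
static std::deque<std::vector<char>> gRawBuf;   // filled by thread 1
static std::deque<std::vector<char>> gEvtBuf;   // filled by thread 2

std::vector<char> read_vme_block() { return std::vector<char>(64); }          // stub
std::vector<char> encode_event(const std::vector<char>& raw) { return raw; }  // stub
void send_event(const std::vector<char>&) {}                                  // stub

int main()
{
   std::thread t1([] {          // thread 1: keep the VME bus busy
      while (true) {
         auto block = read_vme_block();
         std::lock_guard<std::mutex> l(gRawLock);
         gRawBuf.push_back(std::move(block));
      }
   });
   std::thread t2([] {          // thread 2: encode banks and events
      while (true) {
         std::vector<char> raw;
         {
            std::lock_guard<std::mutex> l(gRawLock);
            if (gRawBuf.empty()) continue;
            raw = std::move(gRawBuf.front());
            gRawBuf.pop_front();
         }
         auto evt = encode_event(raw);
         std::lock_guard<std::mutex> l(gEvtLock);
         gEvtBuf.push_back(std::move(evt));
      }
   });
   while (true) {               // thread 3: ship events to the midas buffer
      std::vector<char> evt;
      {
         std::lock_guard<std::mutex> l(gEvtLock);
         if (gEvtBuf.empty()) continue;
         evt = std::move(gEvtBuf.front());
         gEvtBuf.pop_front();
      }
      send_event(evt);
   }
   // t1.join(); t2.join();  (never reached in this sketch)
}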

>
> Okay so transferring 18 + 6 words should give me close to 40kHz repetition rate. That's good news. I will just 
> stick to 1 word transfers.
>

I do not know the timing of CAEN V1718 single-word transfers. It may be significantly longer than 1 us:

V7865: DWORD read - CPU - PCI bus - tsi148 - VME
V1718: encode request as USB packet - CPU - PCI bus - USB hub - USB bus - USB asic - FPGA - VME (on the way back, 
"extract data from USB packet")

> 
> The way that transfers are done in the fevme.cxx requires iterating through 16 word arrays a number of time (3 
> times I believe if you include the iterations taking place in v792_EventRead()). Does that not pose a 
> significant deadtime concern? 
> 

Hmm... I am not sure which fevme you refer to. I guess I can find a version of fevme.cxx where data is read at
maximum VME speed if you want it.

K.O.
             Reply  16 Dec 2020, Isaac Labrie Boulay, Forum, Issues building banks. 
> > > > I'm currently trying to build events through doing block transfers.
> > > 
> > > I am confused by your question. I assume you read a CAEN V792 ADC, but I do not know what VME master you 
> > > use. The restrictions on data alignment come from the VME master.
> > > I am mostly familiar with restrictions of UniverseII and tsi148 PCI-VME bridges.
> > > I think there is no restriction for USB-VME bridges and similar.
> > > 
> > > Anyhow. Which block transfer do you use? 32-bit block transfer (BLT32)? 64-bit block transfer (MBLT64)? 
> > > (no 128-bit 2eVME/2eSST transfers from the V792). Maybe the "simulated block transfer" (DMA engine uses 
> > > single-word reads instead of block transfer)?
> > 
> > I read a single CAEN V792n QDC, 18 words, and a single CAEN V1190 TDC, 2 channels so 8 words. When I poll, I 
> > read on every poll_event() and read whatever data is in whatever module (TDC_dataready || QDC_dataready). The 
> > VME master that I'm using to talk to the modules is a CAEN V1718. I am trying to read data by BLT32. Sorry for 
> > the confusing question (Can you tell I'm an intern?).
> > 
> 
> Ok, I see. Using the normal mfe.c structure, you will not be able to read the VME modules
> at maximum speed. This is because you must have two concurrent activities happening at the same time:
> 

I am using the mfe.cxx backend thread, I'm guessing that this is the file you are referring to.

> (1) tell the VME bridge to read data,
> (2) package this data into midas banks and events and write it to the MIDAS event buffer.
> 
> If you do these tasks sequentially, obviously the VME bus will be idle during step (2),
> and unless (2) takes 0 seconds (it does not) you will have a slow down.
> 

I see.


> So for maximum data rate, I prefer to have 3 threads:
> 
> thread 1: run the VME transfers, store data in circular buffer (today it would be std::deque<std::vector<char>>)
> thread 2: encode the data into midas banks and midas events, store completed events in a circular buffer 
> (std::deque<EVENT_HEADER*>).
> thread 3: write data to midas event buffer (call bm_send_event(), etc)
> 
> This is very hard to do using the mfe.c frontend. (the main reason I wrote the TMFE C++ frontend class).

Yes it seems like a bit of work

> >
> > Okay so transferring 18 + 6 words should give me close to 40kHz repetition rate. That's good news. I will just 
> > stick to 1 word transfers.
> >
> 
> I do not know the timing of CAEN V1718 single-word transfers. It may be significantly longer than 1 us:
> 
> V7865: DWORD read - CPU - PCI bus - tsi148 - VME
> V1718: encode request as USB packet - CPU - PCI bus - USB hub - USB bus - USB asic - FPGA - VME (on the way back, 
> "extract data from USB packet")

I found the following information in the CAEN V1718 manual:

"Transfer Rate = ~30MByte/s. Transfer rate supported in MBLT read cycles (block size = 32 kb), using a PC host with 
Windows XP or Linux and High Speed USB"

I'm guessing the sentence simply means that the rate increases with multiplexed block transfers. If the transfer rate 
is 30 MBytes/s, I should be able to transfer words at a rate of 7,500,000 words per second.

> 
> > 
> > The way that transfers are done in the fevme.cxx requires iterating through 16 word arrays a number of time (3 
> > times I believe if you include the iterations taking place in v792_EventRead()). Does that not pose a 
> > significant deadtime concern? 
> > 
> 
> Hmm... I am not sure what fevme you refer to. I guess I can find version of fevme.cxx where data is read at
> maximum VME speed if you want it.

This is the VME C++ frontend example in the directory /midas/examples/Triumf/c++/

If you can find a faster version of this code I would definitely like to check it out!

> 
> K.O.


Thanks again.

Isaac
             Reply  16 Dec 2020, Stefan Ritt, Forum, Issues building banks. 
> This is very hard to do using the mfe.c frontend. (the main reason I wrote the TMFE C++ frontend class).

Actually that's not true. Just look at 

midas/examples/mtfe/mtfe.c

this is an example of a frontend with an equipment with the EQ_USER flag, which easily allows you to run a separate 
thread (or more) for event collection and processing. Of course it's all old-fashioned C style (the code is from 2007), but it 
works.

Stefan
                Reply  16 Dec 2020, Isaac Labrie Boulay, Forum, Issues building banks. 
> > This is very hard to do using the mfe.c frontend. (the main reason I wrote the TMFE C++ frontend class).
> 
> Actually that's not true. Just look at 
> 
> midas/examples/mtfe/mtfe.c
> 
> this is an example for a frontend with equipment with the EQ_USER flag, which allows you easily to run a separate 
> thread (or more) for event collection and processing. Of course all old-fashioned C style (code is from 2007) but it 
> works.
> 
> Stefan

Thank you sir I'll give it a look.

Cheers

Isaac
Entry  24 Nov 2020, Isaac Labrie Boulay, Forum, Invalid name "Analyzer/Tests" 
Hi everyone,

I recently took the analyzer template from $MIDASSYS/examples/experiment and 
modified it to be able to use Roody on a very simple frontend setup. The 
analyzer works fine and I am able to view the online histograms, but my console 
prints out this error:

[Analyzer,ERROR] [odb.cxx:919:db_validate_name,ERROR] Invalid name 
"/Analyzer/Tests/Always true/Rate [Hz]" passed to db_create_key_wlocked: should 
not contain "["                      
[Analyzer,ERROR] [odb.cxx:919:db_validate_name,ERROR] Invalid name 
"/Analyzer/Tests/low_sum/Rate [Hz]" passed to db_create_key_wlocked: should not 
contain "["
[Analyzer,ERROR] [odb.cxx:919:db_validate_name,ERROR] Invalid name 
"/Analyzer/Tests/high_sum/Rate [Hz]" passed to db_create_key_wlocked: should not 
contain "["

The error keeps getting printed even after stopping the run.

I do have these 3 keys under Analyzer/Tests/ in my ODB but I do not know where 
they come from. Any suggestions on what the root of the issue is?

Thanks for the help!

Isaac
    Reply  27 Nov 2020, Konstantin Olchanski, Forum, Invalid name "Analyzer/Tests" 
> I've recently took the analyzer template from $MIDASSYS/examples/experiment and 
> modified it to be able to use Roody on a very simple frontend setup.

Hmm... the old midas analyzer framework is very old and I do not recommend
to use it for new experiments.

A newer analyzer system is ROOTANA and an even newer is the "m" analyzer (manalyzer). These
analyzers progressively introduce improved c++-style programming environments amongst other
improvements. If starting from scratch, I recommend that you use manalyzer (currently from the rootana
git repository).

> The analyzer works fine and I am able to view the online histograms but my console 
> prints out this error:
> 
> [Analyzer,ERROR] [odb.cxx:919:db_validate_name,ERROR] Invalid name 
> "/Analyzer/Tests/Always true/Rate [Hz]" passed to db_create_key_wlocked: should 
> not contain "["

The error says what it means. "[" is not a permitted character in odb names. It is used
by many odb functions to access array elements.

The midas analyzer example should be updated to change "[Hz]" to "(Hz)" or something similar.

K.O.
       Reply  27 Nov 2020, Konstantin Olchanski, Forum, Invalid name "Analyzer/Tests" 
https://bitbucket.org/tmidas/midas/issues/298/invalid-odb-names-in-example-midas
K.O.
          Reply  07 Dec 2020, Isaac Labrie Boulay, Forum, Invalid name "Analyzer/Tests" 
> https://bitbucket.org/tmidas/midas/issues/298/invalid-odb-names-in-example-midas
> K.O.

Hi K.O.

Ok I see, I will use the most up to date analyzer.

Thanks a ton for your help.

Isaac
Entry  24 Sep 2020, Gennaro Tortone, Forum, subrun  
Hi,

I was wondering if there is a "mechanism" to run an executable
file after each subrun is closed...

I need to convert .mid.lz4 subrun files to ROOT (TTree) files;

Thanks,
Gennaro
    Reply  01 Dec 2020, Stefan Ritt, Forum, subrun  
There is no "mechanism" foreseen to be executed after each subrun. But you could 
run a shell script after each run which loops over all subruns and converts them 
one after the other.

Stefan

> Hi,
> 
> I was wondering if there is a "mechanism" to run an executable
> file after each subrun is closed...
> 
> I need to convert .mid.lz4 subrun files to ROOT (TTree) files;
> 
> Thanks,
> Gennaro
       Reply  01 Dec 2020, Ben Smith, Forum, subrun  
We use the lazylogger for something similar to this. You can specify the path to a custom script, and it will be run for each midas file that gets written:
https://midas.triumf.ca/MidasWiki/index.php/Lazylogger#Using_a_script
This means that you don't have to wait until the end of the run to start processing.

If the ROOT conversion is going to be slow, but you have a batch system available, you could use the lazylogger script to submit a job to the batch system for each file.

> 
> > Hi,
> > 
> > I was wondering if there is a "mechanism" to run an executable
> > file after each subrun is closed...
> > 
> > I need to convert .mid.lz4 subrun files to ROOT (TTree) files;
> > 
> > Thanks,
> > Gennaro
Entry  30 Nov 2020, Konstantin Olchanski, Info, more wisdom from linux kernel people 
As you may know, I am a big fan of two software projects - the linux kernel and ROOT. The linux kernel is one of 
the few software projects "done right". ROOT is where normal people try to "get it right" with a real-world level 
of success. I use both daily and I try to apply their ways and methods to MIDAS as much as I can.

So just in time for our discussion of array indexes, a talk by gregkh shows
up on slashdot. The title is "how to keep your users happy". (Nobody
ever wants to be nasty to their users, but do read his talk).

https://git.sr.ht/~gregkh/presentation-application_summit/tree/main/keep_users_happy.pdf

The talk refers to some older material, still relevant, of course. In case you miss the links
in the pdf file, here they are:

https://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html
https://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html
https://ozlabs.org/~rusty/ols-2003-keynote/img0.html (click on "continue" to see next page)

K.O.
Entry  24 Nov 2020, Amy Roberts, Suggestion, ODBSET wildcards with array keys in Sequencer files 
I'm interested in using the matching feature for ODBSET explained on 
https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
array, like:

COMMENT "Ground the detectors"
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0

Currently I get an error when I try to run this script.  Is this expected?  Would it 
be possible to implement matching for array values?

Thanks!
    Reply  25 Nov 2020, Marco Francesconi, Suggestion, ODBSET wildcards with array keys in Sequencer files 
Hi,
I guess the issue is in the "[?]" part of the command: the indexing is handled differently from the odb path and does not 
support "?".
Are you trying to set only the first 9 channels?
Could you try with "[*]" or "[0-9]" instead?

Marco

> I'm interested in using the matching feature for ODBSET explained on 
> https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
> array, like:
> 
> COMMENT "Ground the detectors"
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0
> 
> Currently I get an error when I try to run this script.  Is this expected?  Would it 
> be possible to implement matching for array values?
> 
> Thanks!
       Reply  25 Nov 2020, Amy Roberts, Suggestion, ODBSET wildcards with array keys in Sequencer files 
The following all fail with "Cannot find ODB key "<key>""

ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0


> Hi,
> I guess the issue is in the "[?]" part of the command, the indexing is handled differently from the odb path and does not 
> support "?".
> Are you trying to set only the first 9 channels?
> Could you try with "[*]" or "[0-9]" instead?
> 
> Marco
> 
> > I'm interested in using the matching feature for ODBSET explained on 
> > https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
> > array, like:
> > 
> > COMMENT "Ground the detectors"
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0
> > 
> > Currently I get an error when I try to run this script.  Is this expected?  Would it 
> > be possible to implement matching for array values?
> > 
> > Thanks!
          Reply  25 Nov 2020, Marco Francesconi, Suggestion, ODBSET wildcards with array keys in Sequencer files 
I created some keys in my ODB to try to match yours.
The ODBSET commands you wrote are all working fine (of course with different results), except for "/Detectors/Det*/Settings/Charge/Bias (V)*", which I will have to 
look into.
In any case the error message I'm getting is "could not match any key" and not the one you are reporting.

Now I'm a bit puzzled:
Are you sure your ODB contains those keys?
Are you testing the ODBSET inside a more complex sequencer or on its own?

Maybe I can try to reproduce it using your ODB setup.
Could you send an ODB dump of the "/Detectors" folder using the "save" command of odbedit ("cd /Detectors" and then "save detector.odb")?

Best,

Marco


> The following all fail with "Cannot find ODB key "<key>""
> 
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> 
> 
> > Hi,
> > I guess the issue is in the "[?]" part of the command, the indexing is handled differently from the odb path and does not 
> > support "?".
> > Are you trying to set only the first 9 channels?
> > Could you try with "[*]" or "[0-9]" instead?
> > 
> > Marco
> > 
> > > I'm interested in using the matching feature for ODBSET explained on 
> > > https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
> > > array, like:
> > > 
> > > COMMENT "Ground the detectors"
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0
> > > 
> > > Currently I get an error when I try to run this script.  Is this expected?  Would it 
> > > be possible to implement matching for array values?
> > > 
> > > Thanks!
             Reply  25 Nov 2020, Amy Roberts, Suggestion, ODBSET wildcards with array keys in Sequencer files 
I think the issue may be the version of MIDAS I'm using.  Mine is current as of February 4, 2020.  

But since then there have been changes to the sequencer code, specifically parts that handle indexing.

I'll try this out with an updated version of MIDAS and report back if there are still any issues after updating.

> I created some keys in my ODB to try to match yours.
> The ODBSET commands you wrote are all working fine (of course with different results), except only for the "/Detectors/Det*/Settings/Charge/Bias (V)*" which I will have to 
> look into.
> In any case the error message I'm getting is "could not match ay key" and not the one you are reporting.
> 
> Now I'm a bit puzzled:
> Are you sure your ODB contains those keys?
> Are you testing the ODBSET inside a more complex sequencer or on its own?
> 
> Maybe I can try to reproduce it using your ODB setup.
> Could you send an ODB dump of the "/Detectors" folder using the "save" command of odbedit ("cd /Detectors" and then "save detector.odb")?
> 
> Best,
> 
> Marco
> 
> 
> > The following all fail with "Cannot find ODB key "<key>""
> > 
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> > 
> > 
> > > Hi,
> > > I guess the issue is in the "[?]" part of the command, the indexing is handled differently from the odb path and does not 
> > > support "?".
> > > Are you trying to set only the first 9 channels?
> > > Could you try with "[*]" or "[0-9]" instead?
> > > 
> > > Marco
> > > 
> > > > I'm interested in using the matching feature for ODBSET explained on 
> > > > https://midas.triumf.ca/MidasWiki/index.php/Sequencer for settings that are in an 
> > > > array, like:
> > > > 
> > > > COMMENT "Ground the detectors"
> > > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[?]" 0
> > > > 
> > > > Currently I get an error when I try to run this script.  Is this expected?  Would it 
> > > > be possible to implement matching for array values?
> > > > 
> > > > Thanks!
          Reply  27 Nov 2020, Konstantin Olchanski, Suggestion, ODBSET wildcards with array keys in Sequencer files 
> The following all fail with "Cannot find ODB key "<key>""
> 
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> 

It would be cool if ODB pattern matching in the sequencer
were consistent with the ODB pattern matching in the json-rpc
interface for web pages:

https://midas.triumf.ca/MidasWiki/index.php/Mjsonrpc#Supported_array_index_syntax

K.O.
             Reply  30 Nov 2020, Marco Francesconi, Suggestion, ODBSET wildcards with array keys in Sequencer files 
I totally agree that we should have a consistent formatting for array index expansion.
I had a look at the mjsonrpc code and I found the function parse_array_index_list(...) which does this job.
I have a similar function (adapted from previous code) in odb.cxx called strarrayindex(...) that is designed for the same "consistency" purposes between odbedit and the sequencer.

Let me list a few points that I noticed:
- mjsonrpc has a very different way to write the full array (no indexes given), while currently the sequencer requires "[*]" to do the same (otherwise it only changes the first value of the array)
- currently the sequencer and the underlying ODB calls use two indexes (which are the same if you want to write only one key), so we will need a serious rewrite to allow something like "ODBSET array[1,3,5]"
- if I correctly understood the code, mjsonrpc instead generates a list of indices and then calls an ODB write on each of them. That's not always a good thing: for example, if you are writing an array of n parameters on a DAQ 
board you will call the hotlink on that key n times
- in addition to that, the sequencer will also have to cope with variable-based indexes like "ODBSET array[$val]", but then how should it parse something like "[$a,1]" or "[$a*]"?

For the very first point I do not see a clean way to do this without breaking the compatibility of existing sequencers or having a difference between the two implementations.
For the others I guess we can find a way out; however, that's a major modification, so I will put it on my todo list for when I can find some free time.
In any case I would propose to merge the two functions, so we only have to maintain a single implementation of the parsing.

I guess it's a good moment to brainstorm about that, let me know what you think

Marco


> > The following all fail with "Cannot find ODB key "<key>""
> > 
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> > 
> 
> It would be cool if ODB pattern matching in the sequencer
> were consistent with the ODB pattern matching in the json-rpc
> interface for web pages:
> 
> https://midas.triumf.ca/MidasWiki/index.php/Mjsonrpc#Supported_array_index_syntax
> 
> K.O.
                Reply  30 Nov 2020, Konstantin Olchanski, Suggestion, ODBSET wildcards with array keys in Sequencer files 
> I totally agree that we should have a consistent formatting for array index expansion.
> I had a look to the mjsonrpc code and I found the function parse_array_index_list(...) which does this job.

Yes, it is good to review this stuff. I think the json-rpc call should accept more array index patterns:

a[*] - whole array (even though it is an unnatural use in javascript, we do not say "let a[*] = b[*]", we say "let a = b").
a[5-] - from 5th element to the end (in the case we do not know the length)
a[-10] - from first element to element 10 (this is same as a[0-10], but needed for consistency with previous case).

K.O.

> I have a similar function (adapted form previous code) in odb.cxx called strarrayindex(...) that is designed for the same "consistency" purposes between odbedit and sequencer.
> 
> Let me put few points that I noticed:
> - mjsonrpc has a very different way to write the full array (no indexes given) while currently sequencer requires "[*]" to do the same (otherwise it only changes the first value of the array)
> - currently the sequencer and the underlying ODB calls use two indexes (that are the same if you want to write only one key) so we will need a serious rewriting to allow something like "ODBSET array[1,3,5]"
> - if I correctly understood the code, mjsonrpc instead generates a list of indices and then calls an ODB write on each of them. That's not always a good thing, for example if you are writing an array of n parameters on a DAQ 
> board you will call the hotlink on that key n times
> - in addition to that the sequencer will also have to cope with variable-based indexes like "ODBSET array[$val]", but then how it should parse something like "[$a,1]" or "[$a*]"?
> 
> For the very first point I do not see a clean way to do this without breaking the compatibility of existing sequencers or having a difference between the two implementations.
> For the others I guess we can find a way out, however that's a major modification so I will put it on my todo list when I can find some free time.
> In any case I would propose to merge the two functions, so we have only to maintain a single implementation of the parsing.
> 
> I guess it's a good moment to brainstorm about that, let me know what you think
> 
> Marco
> 
> 
> > > The following all fail with "Cannot find ODB key "<key>""
> > > 
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[*]" 0
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[0-9]" 0
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)[1]" 0
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)*" 0
> > > ODBSET "/Detectors/Det*/Settings/Charge/Bias (V)" 0
> > > 
> > 
> > It would be cool if ODB pattern matching in the sequencer
> > were consistent with the ODB pattern matching in the json-rpc
> > interface for web pages:
> > 
> > https://midas.triumf.ca/MidasWiki/index.php/Mjsonrpc#Supported_array_index_syntax
> > 
> > K.O.
             Reply  30 Nov 2020, Stefan Ritt, Suggestion, ODBSET wildcards with array keys in Sequencer files 
Hi Konstantin,

we are considering making the range selection uniform among the json, sequencer and 
odbedit "set" commands. Having multiple ranges like [1,4-5] will be quite some work, so 
my question is: did you just implement it on the json side because it was easy, or are 
there experiments that really need it? Wouldn't it be enough to have

[*]
[n]
[n-m]

This way we always have only one db_set_data() call behind that. Any set of indices 
we would have to split into several db_set_data() calls, which especially for the front-end 
configuration can cause trouble by triggering a hot link on each access.

Stefan
                Reply  30 Nov 2020, Konstantin Olchanski, Suggestion, ODBSET wildcards with array keys in Sequencer files 
> 
> we are considering to make the range selection uniform among json, sequencer and 
> odbedit "set" command. Having multiple ranges like [1,4-5] will be quite some work, so 
> my question is did you just implement it on the json side because it was easy, or are 
> there experiments who really need it? Wouldn't it be enough to have
> 
> [*]
> [n]
> [n-m]
> 

It has been a long time, but most likely I designed the interface to work this
way to permit maximum flexibility for writing into an array using just one
rpc operation.

The generalized form is:
[range,range,range...]

where range is:
array index or
index1-index2 or
index2-index1 (write in reverse order)

This is all documented here:
https://midas.triumf.ca/MidasWiki/index.php/Mjsonrpc#Supported_array_index_syntax

I think it is too late to change it.

>
> This way we always have only one db_set_data() value behind that. Any set of indices 
> we have to split into several db_set_data()
> 

Sounds good. I think it is easy to have a common implementation.

One would need the following functions:
- parse the range selection from a string into a std::vector<int> of array indices (we already have it; a sketch is below)
- call db_set_data_range() (this is easy to add).
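A minimal sketch of such a parser (hypothetical helper name, not the actual midas
implementation; "[*]" and open-ended ranges are left out):

#include <string>
#include <vector>
#include <sstream>
#include <cstdlib>

std::vector<int> ParseIndexList(const std::string& spec) // e.g. "1,4-7,9"
{
   std::vector<int> idx;
   std::stringstream ss(spec);
   std::string range;
   while (std::getline(ss, range, ',')) {
      size_t dash = range.find('-');
      if (dash == std::string::npos) {
         idx.push_back(atoi(range.c_str()));          // single index "n"
      } else {
         int a = atoi(range.substr(0, dash).c_str()); // range "a-b"
         int b = atoi(range.substr(dash + 1).c_str());
         int step = (a <= b) ? 1 : -1;                // "7-4" writes in reverse order
         for (int i = a; i != b + step; i += step)
            idx.push_back(i);
      }
   }
   return idx;
}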

>
> trouble by triggering a hot link on each access.
>

There is no escaping this trouble. Half the experiments want notification
"per array", the other half, "per array element". We cannot choose one or the other for them,
we have to provide a way for the user to say how they want it.

P.S. With existing ODB calls, you cannot do [n-m] ranges. You can write the whole array or you
can write one element at a time.
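For reference, a minimal sketch of those two existing options (hDB/hKey and the
array contents are hypothetical):

float a[10] = {0};
/* whole array in one call (one hotlink notification): */
db_set_data(hDB, hKey, a, sizeof(a), 10, TID_FLOAT);
/* or one element at a time (one notification per element): */
db_set_data_index(hDB, hKey, &a[3], sizeof(float), 3, TID_FLOAT);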

K.O.
Entry  17 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
Today I addressed a topic which has bugged me for a long time. The ODB contains 
settings under /Equipment/<name>/Common which are a "mirror" of the equipment[] 
setting in a frontend (using the mfe.cxx framework). If the "Common" entry in 
the ODB is not present (fresh experiment), the equipment[] settings from the 
frontend are copied to the ODB. But if it exists, it takes precedence over the 
equipment[] entries, which is wrong in my opinion. For example, if you change some 
settings in equipment[] (like the logging period of the history), then recompile 
and restart the frontend, the old values in the ODB are kept and your 
modification in the frontend code has no effect.

Starting on commit c3017c6c on Nov. 17th 2020 I reversed the precedence: Now, on 
each start of the frontend program, the values from equipment[] are written to 
the ODB. They are still "live". If one changes them when the frontend is 
running, that change takes effect immediately. But on the next restart of the 
frontend, the old values from equipment[] are put back there.

I have fallen into this trap too many times, and I hope the modification helps 
everybody. If there are however experiments which rely on the fact that the 
common settings in the ODB are NOT overwritten by the frontend, please let me 
know and I can put a flag "EQUIPMENT_FE_PRECEDENCE = FALSE" somewhere to restore 
the old behaviour.

Stefan
    Reply  20 Nov 2020, Pierre-Andre Amaudruz, Info, Equipment "common" settings in ODB 
Indeed this "mirror" of the ODB in the settings can cause frustration, in 
particular when we think the ODB is empty but it is not.
On the other hand, over time the settings are adjusted to a particular 
configuration, touched or not by the individual run preset parameters. Later, if 
a bug or code correction requires multiple restarts of the fe, for every start of 
the application you lose the latest configuration. This can be frustrating as 
well, until you force a post-setting or restore the specific parameters in the fe 
code.
BTW I believe we originally went for the ODB priority for that specific reason.
 
I would be in favour of having a general flag (FALSE) in /experiment which would 
define this global behaviour.
PAA

> Today I addressed a topic which bugged me since long time. The ODB contains 
> settings under /Equipment/<name>/Common which are a "mirror" of the equipment[] 
> setting in a frontend (using the mfe.cxx framework). If the "Common" entry in 
> the ODB is not present (fresh experiment), the equipment[] settings from the 
> frontend are copied to the ODB. But if it exists, it takes precedence over the 
> equipment[] entries, which is wrong in my opinion. Like if you change some 
> settings in equipment[] (like the logging period of the history), then recompile 
> and restart the frontend, the old values in the ODB are kept and your 
> modification in the frontend code has no effect.
> 
> Starting on commit c3017c6c on Nov. 17th 2020 I reversed the precedence: Now, on 
> each start of the frontend program, the values from equipment[] are written to 
> the ODB. They are still "live". If one changes them when the frontend is 
> running, that change takes effect immediately. But on the next restart of the 
> frontend, the old values from equipment[] is put back there.
> 
> I fell too many times into this trap, and I hope the modification helps 
> everybody. If there are however experiments which rely on the fact that the 
> common settings in the ODB are NOT overwritten by the frontend, please let me 
> know and I can put a flag "EQUIPMENT_FE_PRECEDENCE = FALSE" somewhere to restore 
> the old behaviour.
> 
> Stefan
    Reply  27 Nov 2020, Konstantin Olchanski, Info, Equipment "common" settings in ODB 
> Today I addressed a topic which bugged me since long time.

Right. No easy subject. For me, too, this has been a problem in MIDAS for a long time.

> Now, on each start of the frontend program, the values from equipment[] are written to 
> the ODB. They are still "live". If one changes them when the frontend is 
> running, that change takes effect immediately. But on the next restart of the 
> frontend, the old values from equipment[] is put back there.

There is a downside to this behaviour.

If some values in equipment/common are "live" and the user is expected to change them,
the user will be unpleasantly surprised when their changes magically disappear (after reboot,
after frontend crash, after run restart if experiment requires restarting some frontends
before starting a new run).

This change will also break some experiments that rely on things like specifying
event buffer names through ODB. But experiments can adapt and specify buffer names
through a command line switch instead of ODB.

This new way also makes the "live" Common/Period unusable. Sure, I can speed up or slow
down a frontend even during the run, but if my change does not "stick", what good is it?

Personally, I think there is no easy solution for all these troubles.

I would advocate the following approach:

- think of MIDAS as a "mature" system,
- treasure backward compatibility
- (if we must break backward compatibility to introduce a new "must have" improvement, so be it)
- document how things work. If it is clearly written down what the different fields in "common" do, fewer people 
"get burned" by unexpected or illogical things (and any non-trivial system has plenty of those).

Going back to ODB equipment/common, my experience with midas and odb tells me
that one should avoid mixing together ODB entries set by user and ODB entries set by code.

For example, separating them as equipment/settings and equipment/variables works well. Mixing
them as in equipment/common and sequencer/state causes trouble.

So perhaps we should split Equipment/common into two pieces, user settable fields like
"Period" and "event buffer name" would move to equipment/settings or whatever.

This will open the discussion of which items in equipment/common should be user settable,
and some people would want event buffer specified in the code to prevail, while other
people would want the name from odb to prevail, and both are valid but conflicting preferences.

Or we could bite the bullet and say, equipment/common is controlled by the frontend code,
the user should not change it. (and mark it read-only in ODB).

For all the pain this may cause, at least this will make it self-consistent.

Per this proposal, in addition to Stefan's change, the hotlink on equipment/common goes away,
"period" is no longer "live" and the whole subdirectory is made "read-only".

K.O.
       Reply  27 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
Ok, so what about the following proposal:

- I change back the mfe.cxx code to behave like before (ODB has precedence and does not get overwritten when the 
front-end restarts)

- I add a global flag

BOOL equipment_common_overwrite;

and pre-set it to FALSE;

- So if nothing is changed the flag stays false and ODB keeps precedence

- If a frontend wants to overwrite equipment/common on each start, the user sets

BOOL equipment_common_overwrite = TRUE;

near the equipment[] structure in the front-end code. 

- If the flag is true, the mfe.cxx init code copies the equipment[] structure to the ODB on each frontend start

I believe this way we can keep backward compatibility, and add the new way with minimal effort. The only downside 
is that all frontends on this planet have to add at least "BOOL equipment_common_overwrite = FALSE;" in their 
code.

I know global variables are evil, but this way the user can just add the line above to the equipment[] array, so 
one sees it when editing that array, giving motivation to change it as needed. So the code would be



BOOL equipment_common_overwrite = TRUE;

EQUIPMENT equipment[] = {
 ....
}



An alternative way would be to add a function

  set_equipment_common_overwrite(TRUE);

into the frontend_init() code. That's somehow cleaner (still needs an internal global variable), but it has to go 
into frontend_init() so it won't be at the same place as the EQUIPMENT list in the frontend.

Thoughts?

Best,
Stefan
          Reply  27 Nov 2020, Konstantin Olchanski, Info, Equipment "common" settings in ODB 
Yes, I think this will work.

For old mfe.c frontends, a global variable set to "do it the new way" should be okay;
new experiments will have it the new way. Old experiments will be forced to add a one-line definition
of this global variable (otherwise mfe.o will not link), and at that time they get to choose "new way" or "old way".

For the new TMFE c++ frontend, this will work naturally when they create the Equipment Common object;
in the object constructor, you can see how it explicitly honors or overwrites the ODB common entries.

The TMFE frontend does not do a live "period", so there should be no issue with that.

Should I open a bitbucket issue "update TMFE frontend to new Equipment/Common scheme", to make sure
I do not forget about it?

K.O.
             Reply  30 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
Ok, I implemented it the following way:

- Added a boolean flag "equipment_common_overwrite", which must be contained in EACH frontend, preferably just 
before the EQUIPMENT structure, such as:

BOOL equipment_common_overwrite = TRUE;

EQUIPMENT equipment[] = {
...
};

- If that flag is TRUE, then the contents of the "equipment" structure are copied to the ODB on each start of the 
front-end

- If the flag is FALSE, then the ODB values are kept on the start of the front-end

The setting of the flag depends now on the philosophy of the experiment. Some experiments say that everything 
needed should be in the front-end code, so when it starts everything gets set correctly. They don't change the 
values in the ODB, but in the frontend code, which then goes into their repository. Other experiments just need 
some default values from the frontend code, and then fine-tune things by changing values in the ODB. These 
experiments should set this flag to FALSE.

*****

Please note that EVERY frontend now needs this flag, so all of you have to add it to all of your front-ends, 
otherwise the front-end will not compile! I could not figure out how this could be done without this 
requirement, since you can define a global variable only once.

*****


Stefan
                Reply  30 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB 
One more change: 

After using the new code for some hours, we realized that the "enabled" flag should not come from the frontend code, 
but always be defined by the ODB. So if you quickly have to disable some equipment because the associated hardware is 
off, you want to change this flag only in the ODB and not have to recompile the frontend. So we exclude that flag from 
being set by the frontend. It is special anyhow, because one sees all disabled equipment on the main midas status page, 
so one knows what's on and what's off.

Please comment here if you think that change causes problems. Anyhow, it's working now for the enabled flag as it did before 
all these changes.

Stefan
                   Reply  30 Nov 2020, Konstantin Olchanski, Info, Equipment "common" settings in ODB 
> One more change: 
> 
> After using the new code for some hours, we realized that the "enabled" flag should not come from the frontend code, 
> but always be defined by the ODB. So if you quickly have to disable some equipment because the associated hardware is 
> off, you want to change this flag only in the ODB and not have to recompile the frontend. So we exclude that flag from 
> being set by the frontend. It is anyhow special, because one sees all disable equipment in the main midas status page, 
> so one knows what's on and what's off.
> 
> Please comment here if you think that change causes problem. Anyhow it's working now for the enabled flag as before 
> all these changes.
> 

Good catch. I still think this is fundamentally impossible to "get right". But good, you
are now in the same boat as me. The documentation will read: "if flag is TRUE, these data fields
are read from ODB, if flag is FALSE, those other fields are read from ODB". I will have to check
how this will work out for the TMFE C++ frontend (I think both mfe.c and TMFE frontends should
work "the same").

I think we have at least one month to play with this, I do not think we can do the next release
of midas until January.

K.O.
Entry  06 Nov 2020, Alexandr Kozlinskiy, Suggestion, cmake build fixes 
hi,

there are several problems with the current cmake build files in midas:
- not all systems have cuda libs in /usr/local/cuda
- not all cmake versions like redefining vars
  (e.g. redefining ROOT_CXX_FLAGS)
- the c++ standard not matching the one used to build ROOT
- ROOTSYS is not needed to find ROOT (it is enough to have root in PATH)

I have posted pull request 'https://bitbucket.org/tmidas/midas/pull-requests/17'
which tries to fix some of the problems.
Tests and comments are welcome.
    Reply  27 Nov 2020, Konstantin Olchanski, Suggestion, cmake build fixes 
Hi, Alexandr, thank you for making improvements to MIDAS. I have some questions
about your suggestions:

> there are several problems with current cmake build files in midas:
> - not all systems have cuda libs in /usr/local/cuda
> - not all cmake version like when redefining vars

we do not see these problems with the normal cmake on our current linux systems,
centos-7 and -8, Ubuntu LTS 18.04, 20.04.

So you have something different? Can you be a bit more specific:
which version of cmake and which OS do you have that show these troubles?

> - c++ standard not matching the one used to build ROOT
> - ROOTSYS is not needed to find ROOT (it is enough to have root in PATH)

Again ROOT tangles with the build of MIDAS.

MIDAS does not use ROOT. As a convenience to the users, we have a "ROOT output" driver
in mlogger and we build a special executable rmlogger with ROOT. Only this special
executable should be linked with ROOT and compiled with ROOT-specific flags.

The rest of the MIDAS build should not be affected by presence or absence of ROOT.

One would have to read old messages on this forum to understand this situation.

> 
> I have posted pull request 'https://bitbucket.org/tmidas/midas/pull-requests/17'
> which tries to fix some of the problems.
> Tests and comments are welcome.
>

I look at the diffs:

- CUDA detection is changed to "find_package(CUDA)". This code was added by Joseph and Ben, and there 
must be a reason why they did not use find_package(CUDA). They will have to sign off on this change.

- ROOT-related logic assumes that all of MIDAS will be built "the ROOT way". CFLAGS are changed, the C++ 
standard is changed, etc. This assumption is wrong: only rmlogger and rmana should be built "with ROOT".

If you want to follow through on this, I suggest that you split the pull request into two,
one pull request for the CUDA changes and one pull request for the ROOT changes. Also rework
your ROOT changes as I explained above (but also read all ROOT-related messages on this forum).

K.O.
Entry  05 Nov 2020, Isaac Labrie Boulay, Forum, Building an experiment using CAEN VME interface - unknown type name 'VARIANT_BOOL' 
Hi everyone,

I have been building an experiment using the v1718 CAEN interface to talk to my modules and I am using the CAENVMElib Linux Library (2.50). I've managed to deal with data type issues by including additional libraries in my driver code, but there is one type error that persists:


In file included from /usr/include/CAENVMElib.h:27:0,
             	from include/v1718.h:25,
             	from v1718.c:26:
/usr/include/CAENVMEtypes.h:323:9: error: unknown type name ‘VARIANT_BOOL’
     	CAEN_BOOL cvDS0;  	/* Data Strobe 0 signal                     	*/


The header file used to define the CAEN types (CAENVMEtypes.h) defines 'CAEN_BOOL' like this:


#ifdef LINUX
#define CAEN_BYTE   	unsigned char
#define CAEN_BOOL   	int
#else
#define CAEN_BYTE   	byte
#define CAEN_BOOL   	VARIANT_BOOL
#endif


Has anyone ever ran into that problem when setting up an experiment using the CAEN standard?

Thanks for your help.

Isaac
    Reply  05 Nov 2020, Pierre-Andre Amaudruz, Forum, Building an experiment using CAEN VME interface - unknown type name 'VARIANT_BOOL' 
Hi,

You're building under a Linux-like system. You want to define LINUX and skip VARIANT_BOOL altogether.
PAA

> Hi everyone,
> 
> I have been building an experiment using the v1718 CAEN interface to talk to my modules and I am using the CAENVMElib Linux Library (2.50). I've managed to deal with data type issues by including additional libraries to my driver code but there is one type error 
that persists:
> 
> 
> In file included from /usr/include/CAENVMElib.h:27:0,
>              	from include/v1718.h:25,
>              	from v1718.c:26:
> /usr/include/CAENVMEtypes.h:323:9: error: unknown type name ‘VARIANT_BOOL’
>      	CAEN_BOOL cvDS0;  	/* Data Strobe 0 signal                     	*/
> 
> 
> The header file used to defined the CAEN types (CAENVMEtypes.h) defines 'CAEN_BOOL' like this:
> 
> 
> #ifdef LINUX
> #define CAEN_BYTE   	unsigned char
> #define CAEN_BOOL   	int
> #else
> #define CAEN_BYTE   	byte
> #define CAEN_BOOL   	VARIANT_BOOL
> #endif
> 
> 
> Has anyone ever ran into that problem when setting up an experiment using the CAEN standard?
> 
> Thanks for your help.
> 
> Isaac
       Reply  06 Nov 2020, Isaac Labrie Boulay, Forum, Building an experiment using CAEN VME interface - unknown type name 'VARIANT_BOOL' 
Yes, you are right. That fixed it and my frontend is compiling.

Thanks Pierre-Andre.

Isaac


> Hi,
> 
> You're building under Linux like. You want to define the LINUX and skip the VARIANT_BOOL all together.
> PAA
> 
> > Hi everyone,
> > 
> > I have been building an experiment using the v1718 CAEN interface to talk to my modules and I am using the CAENVMElib Linux Library (2.50). I've managed to deal with data type issues by including additional libraries to my driver code but there is one type error 
> that persists:
> > 
> > 
> > In file included from /usr/include/CAENVMElib.h:27:0,
> >              	from include/v1718.h:25,
> >              	from v1718.c:26:
> > /usr/include/CAENVMEtypes.h:323:9: error: unknown type name ‘VARIANT_BOOL’
> >      	CAEN_BOOL cvDS0;  	/* Data Strobe 0 signal                     	*/
> > 
> > 
> > The header file used to defined the CAEN types (CAENVMEtypes.h) defines 'CAEN_BOOL' like this:
> > 
> > 
> > #ifdef LINUX
> > #define CAEN_BYTE   	unsigned char
> > #define CAEN_BOOL   	int
> > #else
> > #define CAEN_BYTE   	byte
> > #define CAEN_BOOL   	VARIANT_BOOL
> > #endif
> > 
> > 
> > Has anyone ever ran into that problem when setting up an experiment using the CAEN standard?
> > 
> > Thanks for your help.
> > 
> > Isaac
    Reply  27 Nov 2020, Konstantin Olchanski, Forum, Building an experiment using CAEN VME interface - unknown type name 'VARIANT_BOOL' 
> 
> The header file used to defined the CAEN types (CAENVMEtypes.h) defines 'CAEN_BOOL' like this:
> 
> 
> #ifdef LINUX
> #define CAEN_BYTE   	unsigned char
> #define CAEN_BOOL   	int
> #else
> #define CAEN_BYTE   	byte
> #define CAEN_BOOL   	VARIANT_BOOL
> #endif
> 

Complain to CAEN.

The year is 2020 and they should use standard C/C++ data types from stdint.h (uint32_t, etc).

K.O.
Entry  19 Nov 2020, Joseph McKenna, Forum, History plot consuming too much memory 

A user reported an issue that if they were to plot some history data from 
2019 (a range of one day), the plot would spend ~4 minutes loading then 
crash the browser tab. This seems to affect chrome (under default settings) 
and not firefox

I can reproduce the issue, "Data Being Loaded" shows, then the page and 
canvas loads, then all variables get a correct "last data" timestamp, then 
the 'Updating data ...' status shows... then the tab crashes (chrome)


It seems that the browser is loading all data until the present day (maybe 4 
GB of data in this case). In chrome the tab then crashes. In firefox, I do 
not suffer the same crash, but I can see the single tab is using ~3.5 GB of 
RAM

Tested with midas-2020-08-a up until the HEAD of develop

I could propose the user use firefox, or increase the memory limit in 
chrome; however, are there plans to limit the data loaded when specifically 
plotting between two dates?
    Reply  19 Nov 2020, Stefan Ritt, Forum, History plot consuming too much memory 
The history code is right now programmed in such a way that when you request
an old time window, all data from that window until the present date
gets loaded. When we implemented that, this worked fine for data ranges of 
several years with a delay of just a few seconds. Of course one could load only
that specific window, but when the user then scrolls right, one has to
append new data to the "right side" of the array stored in the browser. If the
user jumps to another location, then the browser has to keep track of which 
windows are loaded and which are not, making the history code much more 
complicated. Therefore I'm only willing to spend a few days of solid work
if this really becomes a problem.

Are you sure that the delay comes from the browser or actually from mhttpd
digging through GBytes of history data? I realized that you need solid state
disks to get a real quick response.

Stefan
       Reply  20 Nov 2020, Joseph McKenna, Forum, History plot consuming too much memory 
Poking at the behavior of this, it's fairly clear the slow response is from the data 
being loaded off an HDD; when we upgrade this system we will allocate enough SSD 
storage for the histories.

Using Firefox has resolved this issue for the user's project here

Taking this down a tangent, I have a mild concern that a user could temporarily 
flood our gigabit network if we do have faster disks to read the history data. Have 
there been any plans or thoughts on limiting the bandwidth users can pull from 
mhttpd? I do not see this as a critical item as I can plan the future network 
infrastructure at the same time as the next system upgrade (putting critical data 
taking traffic on a separate physical network).

> Of course one can only
> load that specific window, but when the user then scrolls right, one has to
> append new data to the "right side" of the array stored in the browser. If the
> user jumps to another location, then the browser has to keep track of which 
> windows are loaded and which windows not, making the history code much more 
> complicated. Therefore I'm only willing to spend a few days of solid work
> if this really becomes a problem. 

For now the user here has retrieved all the data they need, and I can direct others 
towards mhist in the near future. Being able to load just a specific window would be 
very useful in the future, but I comprehend how it would be a spike in complexity.
          Reply  20 Nov 2020, Stefan Ritt, Forum, History plot consuming too much memory 
 > Taking this down a tangent, I have a mild concern that a user could temporarily 
> flood our gigabit network if we do have faster disks to read the history data. Have 
> there been any plans or thoughts on limiting the bandwidth users can pull from 
> mhttpd?

I guess this will not be network limited but CPU limited by the mhttpd process. But I'm 
not 100% sure, it depends on the actual hardware. But even if we improve the history 
retrieval to "window only", the user could request all data from 2010 to 2020. So one 
would need some code which estimates the amount of data and then tells the user "do you really 
want that?". But still, a novice user can simply click "yes" without much of a thought. So 
in conclusion I believe proper user training is better than software limits. Like the 
other guy: "I did 'rm -rf /', and now nothing works any more, can you help?".

Stefan
          Reply  27 Nov 2020, Konstantin Olchanski, Forum, History plot consuming too much memory 
>
> Taking this down a tangent, I have a mild concern that a user could temporarily 
> flood our gigabit network if we do have faster disks to read the history data.
>

By my measurements, right now our javascript code can reach 30-50-70% of Gige ethernet
bandwidth, so, no, we cannot flood the network just by making history plots.

(we cannot reach 100% because javascript code is not multithreaded,
it cycles through "request new data" and "decode json, make plot" states,
and the network is idle in this second state).

>
> Have there been any plans or thoughts on limiting the bandwidth users can pull from 
> mhttpd?
>

10gige networking is here (and 5 and 2.5 Gige, too). I would not worry too much
about saturating 1gige network interfaces.

>
> I do not see this as a critical item as I can plan the future network 
> infrastructure at the same time as the next system upgrade (putting critical data 
> taking traffic on a separate physical network).
>

10gige network between all computers, everything on SSD ZFS arrays, except
bulk data on ZFS HDD arrays (only for cost reasons $$$/TB).

K.O.
       Reply  27 Nov 2020, Konstantin Olchanski, Forum, History plot consuming too much memory 
> 
> Are you sure that the delay comes from the browser or actually from mhttpd
> digging through GBytes of history data?
>

I think we will need to address this question "head-on". The history plot
will need to display the following information:

"time to load data from disk: N seconds, time to transfer data to javascript: M 
seconds, time to make the plot: Q seconds".

The second and third items are already available, the first one will need
to be computed in mhttpd and passed to javascript.

K.O.
    Reply  27 Nov 2020, Konstantin Olchanski, Forum, History plot consuming too much memory 
> 
> Tested with midas-2020-08-a up until the HEAD of develop
> 

Just so you know, it took me and Stefan quite a bit of effort
to improve memory and data handling in the new history plots
to be able to plot 1 year of data without bogging down too much. I got
to learn the google-chrome javascript cpu profiler, memory profiler
and the intricacies of javascript shift() and unshift() operators.

Before midas-2020-08-a, pressing the zoom-out button you would never
reach the javascript memory limit, the code would go into "100% cpu use"
and the browser tab would become progressively unresponsive well before
running out of memory. With the original code, our alpha-g history plots
could go back a few weeks at most, with the current code, we can go back
about 11 months. Compared to the old "C" history plots that can
do "last 10 years", no problem.

Loading all the history data into the browser is a design choice.

It has benefits and downsides.

The main benefit is that looking at immediate live data is much easier.

The main downside is that "plot last 10 years" becomes impossible.

As they say, "appetite comes with eating"; we have learned about these
downsides as we developed the new system. When we started, we did not
know much about javascript memory limits, cpu limits, etc. We did learn
a lot, though.

With the current code, we are limited to loading history data up to 50% of
the javascript memory limit. I know how to change the code to get up to 100%,
but I think it is not worth it: it still does not get us to plot "last 10 years".

We think the solution to recovering "last 10 years" capability is to use
binned data (which the history system can already deliver to javascript).
With binned data, the data volume in Mbytes remains constant, javascript
memory use has an upper-bound (we never use more memory than X Mbytes)
and data movement over the network is reduced.

Another way to look at this - a typical display has only 1000-4000 horizontal pixels,
so it cannot physically display a bigger number of data points (no more
than 1 data point per pixel). So why load 1000000 data points when we can only
plot 1000-4000 of them?
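To make this concrete, a minimal sketch (not the actual midas history code) of how
binning bounds the data volume; keeping the min and max of each bin preserves spikes
that plain averaging would wash out:

#include <vector>
#include <algorithm>

struct Bin { double min, max; };

std::vector<Bin> BinData(const std::vector<double>& v, size_t nbins)
{
   std::vector<Bin> bins;
   if (v.empty() || nbins == 0)
      return bins;
   size_t per = (v.size() + nbins - 1) / nbins; // samples per bin, rounded up
   for (size_t i = 0; i < v.size(); i += per) {
      size_t end = std::min(i + per, v.size());
      Bin b = { v[i], v[i] };
      for (size_t k = i + 1; k < end; k++) {
         b.min = std::min(b.min, v[k]);
         b.max = std::max(b.max, v[k]);
      }
      bins.push_back(b);
   }
   return bins;
}

No matter how many samples come in, at most nbins bins (a few thousand, one per
pixel) go to the plotting code.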

So all the infrastructure for plotting binned data is already there,
but the javascript code still needs to be written. I think the biggest
challenge will be in blending or combining binned and unbinned data
on the same plot or in seamlessly switching the plot between binned and
unbinned data.

K.O.
       Reply  27 Nov 2020, Konstantin Olchanski, Forum, History plot consuming too much memory 
>
> With the current code, we are limited to loading history data up to 50% of
> the javascript memory limit.
>

The javascript memory limit itself seems to be a moving target (google "javascript 
memory limit", and good luck!).

Historically, javascript did not have any memory or cpu use limits, but with
the rise of abusive web sites, bitcoin miners, etc, I see browsers clamping down
on allowed/allocated CPU use (inactive tabs are throttled down). Memory use
is already clamped down severely: on a 64 GB computer, a browser tab
can only allocate a handful of GBs.

This throttling of browser tabs is already intrusive enough that we need
to be careful in programming midas web pages. For example, throttled events
do not fire at the same rate or in the same order as in active tabs.

One logical conclusion of these restrictions could be that, eventually,
google-chrome permits only just enough cpu and memory to run gmail.

K.O.
Entry  13 Oct 2020, Soichiro Kuribayashi, Info, About remote control of front end part of MIDAS on chip 
Hello!

My name is Soichiro Kuribayashi and I am a Ph.D. student at Kyoto University. 
I'm a T2K collaborator working on Super FGD, which is a new detector in ND280.

I'm a beginner with MIDAS and I've just started to develop the DAQ software with 
MIDAS for Super FGD.
For the DAQ of Super FGD, we will remotely run the front end part of MIDAS on a ZYNQ, 
which is a system on chip.

For this remote control of the front end part with mserver, we have to mount the home 
directory of the DAQ PC (CentOS 8) on that of the Linux system on the ZYNQ.
So I wonder if we should use NFS (network file system) + NIS (network information 
service) + autofs for the mounting. Is that correct?

If you have any information or any suggestions for the remote control on chip, 
please let me know.

Best regards,
Soichiro 
    Reply  13 Oct 2020, Konstantin Olchanski, Info, About remote control of front end part of MIDAS on chip 
> My name is Soichiro Kuribayashi and I am a Ph.D. student at Kyoto University. 
> I'm a T2K collaborator and working for Super FGD which is new detector in ND280.

Hi! I did much of the DAQ software for the original FGD. I hope I can help.

> For the DAQ of Super FGD, we will run remotely front end part of MIDAS on ZYNQ 
> which is system on chip.

This would be the same as the existing FGD. Inside the FGD DCC is a Virtex4 FPGA
with a 300MHz PPC CPU running Linux from a CompactFlash card (Kentaro-san did this 
part). On this linux system runs the FGD DCC midas frontend. It connects
to the FGD midas instance using the mserver. This frontend executable is
copied to the DCC using "scp", there is no common nfs mounted home directory.

> For this remote control of front end part with mserver, we have to mount home 
> directory of DAQ PC(Cent OS8) on that of Linux on ZYNQ.
> So I wonder if we should use NFS(Network file system) + NIS(Network information 
> service) + autofs for the mounting. Is it correct?

Since you have a bigger SOC and you can run pretty much a complete linux,
I do recommend that you go this route. During development it is very convenient
to have common home directories on the main machine and on the frontend fpga
machines.

But this is not necessary. The midas mserver connection does not require a
common (nfs-mounted) home directory: you can copy the files to the frontend
fpga using scp and rsync, and you can use the gdb "remote debugger" function.

I can also suggest that on your frontend SOC/FPGA machine, you boot linux
using the "nfs-root" method. This way, the local flash memory only
contains a boot loader (and maybe the linux kernel image, depending on
bootloader limitations). The rest of the linux rootfs can be on your
central development machine. This way, management of flash cards,
confusion with different contents of local flash and the need to make backups
of frontend machines are much reduced.

If you use a fast SSD and ZFS with deduplication, you will also have a good
performance gain (NFS over a 1gige network to a server with a fast SSD works
so much better than the very slow SD/MMC/NAND flash).

I can point you to some of my documentation on how we do this.

>
> If you have any information or any suggestion for the remote control on chip, 
> please let me know.
> 

I would say you are on a good track. For early development on just one board,
pretty much any way you do it will work, but once you start scaling up
beyond 3-4-5 frontends, you will start seeing benefits from common NFS-mounted
home directories, NFS-root booted linux, etc.

And of course you may want to study the existing ND280/FGD DAQ. I hope you
have access to the running system at Jparc. If not, I have a copy of
pretty much everything (except for running hardware, it is stored in the basement, 
dead) and I can give you access.

P.S. This reminds me that the cascade software from ND280 (the key part
for connecting the FGD, the TPC, the slow controls & etc into one experiment)
was never merged into the midas repository. I opened a ticket for this,
now we will not forget again:

https://bitbucket.org/tmidas/midas/issues/291/import-cascase-frontend-from-t2k-
nd280-fgd

K.O.
       Reply  13 Oct 2020, Soichiro Kuribayashi, Info, About remote control of front end part of MIDAS on chip 
Dear Konstantin,

Thank you very much for your reply and detailed information.
I would appreciate if you could help us.

> I can also suggest that on your frontend SOC/FPGA machine, you boot linux
> using the "nfs-root" method. This way, the local flash memory only
> contains a boot loader (and maybe the linux kernel image, depending on
> bootloader limitations). The rest of the linux rootfs can be on your
> central development machine. This way management of flash cards,
> confusion with different contents of local flash and need to make backups
> of frontend machines is much reduced.

As you said, we can run a complete Linux (Ubuntu 16) on the ZYNQ and I'm using a common NFS 
system now. However, I didn't know about the "nfs-root" method you mentioned, and this method 
seems to be a reasonable way to share just the linux rootfs.
First of all, I will try this method on a simpler system.

> If you use a fast SSD and ZFS with deduplication, you will also have good
> performance gain (NFS over 1gige network to server with fast SSD works
> so much better compared to the very slow SD/MMC/NAND flash).
>
> I can point you to some of my documentation how we do this.

I'm concerned about such performance, and I have roughly checked the performance of common NFS 
over a gige network with my DAQ PC (data transfer rate ~ O(10) MByte/sec). However, I 
didn't know about ZFS, or how we can gain performance with a fast SSD and ZFS.
Please let me know about your documentation on how to do it if possible.

> I would say you are on a good track. For early development on just one board,
> pretty much any way you do it will work, but once you start scaling up
> beyound 3-4-5 frontends, you will start seeing benefits from common NFS-mounted
> home directories, NFS-root booted linux, etc.

I'm developing with just one board and a common NFS mount now. I'm looking forward to 
seeing such benefits when I use multiple frontends.
 
> And of course you may want to study the existing ND280/FGD DAQ. I hope you
> have access to the running system at Jparc. If not, I have a copy of
> pretty much everything (except for running hardware, it is stored in the basement, 
> dead) and I can give you access.

I don't have access to the system at Jparc, but Nick has told us where the FGD DAQ code is.
Is the URL below everything for the FGD DAQ code?
https://git.t2k.org/hastings/fgddaq/-/tree/master

Best regards,
Soichiro
          Reply  20 Oct 2020, Stefan Ritt, Info, About remote control of front end part of MIDAS on chip 
We also use a Zynq chip and boot in the following order:

1. SD Card
   a. First Stage Bootloader
   b. PL Firmware
   c. UBOOT
2. NFS over Ethernet
   a. Linux kernel
   b. RootFS
   c. Mounting home directories


If you need details I can put you in contact with the person who actually implemented that.

Best,
Stefan
             Reply  21 Oct 2020, Soichiro Kuribayashi, Info, About remote control of front end part of MIDAS on chip 
Dear Stefan,

Thank you very much for your help.

I have already contacted someone who has used the ZYNQ in that order and it's working fine for now.
But I'll let you know if something goes wrong.

Best regards,
Soichiro 
Entry  29 Sep 2020, Amy Roberts, Forum, using python client to start and stop run 
I'm using a python client to start and stop runs, and the following code *appears* 
to set the MIDAS state to "Run"

client.odb_set("/Runinfo/State", 3)

However, it doesn't seem to do other things associated with a run, like start 
accumulating events.

Is there a different way I should start the run from the python client?

Thanks!
    Reply  29 Sep 2020, Ben Smith, Forum, using python client to start and stop run 
The ODB variable "/Runinfo/State" is a symptom of starting/stopping a run, rather than the cause.

In C++, one uses `cm_transition()` to start/stop runs.

In python code you can use the `start_run()` and `stop_run()` functions from `midas.client`: https://bitbucket.org/tmidas/midas/src/00ff089a836100186e9b26b9ca92623e672f0030/python/midas/client.py#lines-793:808
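For reference, a minimal sketch of the C++ cm_transition() route (the error-buffer
size is arbitrary; run number 0 lets midas pick the next run number):

char err[256];
INT status = cm_transition(TR_START, 0, err, sizeof(err), TR_SYNC, FALSE);
if (status != CM_SUCCESS)
   printf("cannot start run: %s\n", err);
/* ... run ... */
cm_transition(TR_STOP, 0, err, sizeof(err), TR_SYNC, FALSE);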
       Reply  06 Oct 2020, Konstantin Olchanski, Forum, using python client to start and stop run 
> The ODB variable "/Runinfo/State" is a symptom of starting/stopping a run, rather than the cause.
> 
> In C++, one uses `cm_transition()` to start/stop runs.
> 
> In python code you can use the `start_run()` and `stop_run()` functions from `midas.client`: https://bitbucket.org/tmidas/midas/src/00ff089a836100186e9b26b9ca92623e672f0030/python/midas/client.py#lines-793:808

one can also run an external command: "mtransition START" and "mtransition STOP"

K.O.
Entry  02 Sep 2020, Ruslan Podviianiuk, Forum, Transition status message issue.png
Hello,

I got an error after the start of a run and it would be good to show this error (or 
errors) in the UI that I am developing. I see this error in the Transition 
directory (please see the attached file). Is it possible to read the status 
message and error messages from the Transition directory using jsonrpc? If yes, 
could you please explain how to do this.

Thank you.
Ruslan  
    Reply  02 Sep 2020, Ben Smith, Forum, Transition status message 
The information you want is in the ODB:
* "/System/Transition/status" is the overall integer status code.
* "/System/Transition/error" is the overall error message string.

There is also per-client status information in the ODB:
* "/System/Transition/Clients/<client_name>/status"
* "/System/Transition/Clients/<client_name>/error"
       Reply  02 Sep 2020, Ruslan Podviianiuk, Forum, Transition status message 
> The information you want is in the ODB:
> * "/System/Transition/status" is the overall integer status code.
> * "/System/Transition/error" is the overall error message string.
> 
> There is also per-client status information in the ODB:
> * "/System/Transition/Clients/<client_name>/status"
> * "/System/Transition/Clients/<client_name>/error"


Thank you so much, Ben!
          Reply  08 Sep 2020, Konstantin Olchanski, Forum, Transition status message 
> > The information you want is in the ODB:
> > * "/System/Transition/status" is the overall integer status code.
> > * "/System/Transition/error" is the overall error message string.
> > 
> > There is also per-client status information in the ODB:
> > * "/System/Transition/Clients/<client_name>/status"
> > * "/System/Transition/Clients/<client_name>/error"

You can also use web page .../resources/transition.html as an example of how
to read transition (and other) data from ODB into your own web page. example.html
may also be helpful.

K.O.
             Reply  08 Sep 2020, Ruslan Podviianiuk, Forum, Transition status message 
> > > The information you want is in the ODB:
> > > * "/System/Transition/status" is the overall integer status code.
> > > * "/System/Transition/error" is the overall error message string.
> > > 
> > > There is also per-client status information in the ODB:
> > > * "/System/Transition/Clients/<client_name>/status"
> > > * "/System/Transition/Clients/<client_name>/error"
> 
> You can also use web page .../resources/transition.html as an example of how
> to read transition (and other) data from ODB into your own web page. example.html
> may also be helpful.
> 
> K.O.

Thank you Konstantin!

Ruslan
Entry  08 Sep 2020, Zaher Salman, Forum, json parser error 
I am getting the following error alert in a custom page whenever a run starts

json parser exception: SyntaxError: Unexpected token < in JSON at position 985, batch request: method: "db_get_values", params: [object Object], id: 1598691925697 method: "get_alarms", params: null, id: 1598691925697 method: "cm_msg_retrieve", params: [object Object], id: 1598691925697 method: "cm_msg_retrieve", params: [object Object], id: 1598691925697

Does anyone know why and what causes this? This does not affect anything and things seem to continue running fine.

thanks.
    Reply  08 Sep 2020, Konstantin Olchanski, Forum, json parser error 
> I am getting the following error alert in a custom page whenever a run starts
> json parser exception: SyntaxError: Unexpected token < in JSON at position 985, batch request: method: "db_get_values", params: [object Object], id: 1598691925697 method: "get_alarms", params: null, id: 1598691925697 method: "cm_msg_retrieve", params: [object Object], id: 1598691925697 method: "cm_msg_retrieve", params: [object Object], id: 1598691925697
> Does anyone know why and what causes this? This does not affect anything and things seem to continue running fine.

this is bug #242, https://bitbucket.org/tmidas/midas/issues/242/mjsonrpc-calls-should-return-valid-utf8

we read stuff from midas.log and push it to the web browser. we have seen this stuff
contain arbitrary binary data (both intentionally written into midas.log by cm_msg() and
file content corruption/truncation from computer crashes), the json decoder in the web browser
does not like that stuff - it is invalid utf-8 unicode - and throws an exception.

since we cannot ensure content of midas.log (and other files on disk) are always valid utf-8,
we have to sanitize it before sending it to the browser.

right now I am not sure of the best way to do this sanitizing. we do have a function to check
for valid utf-8 unicode, perhaps it should be extended to replace invalid unicode with spaces
or Xes or "?" or whatever, I am open to suggestions and ideas.

BTW, this is a recent change to how strings generally work. C NUL-terminated strings are
permitted to contain arbitrary binary data (except for NUL char, of course). C++ std::string
are permitted to contain arbitrary binary data. but javascript strings are only permitted
to contain valid unicode, and the json standard was recently amended to require that json
strings are valid utf-8 unicode. So there is a disconnect between C/C++ code written in the
last 50 years where strings can contain binary data and the javascript world requiring
valid utf-8 unicode pretty much everywhere.

K.O.
Entry  21 Aug 2020, Ruslan Podviianiuk, Forum, time information Running_time.png
Hello,

I have a few questions about time information:
1. Is it possible to get "Running time" using, for example, jsonrpc? (please see 
the attached file)
2. Is it possible to configure "Start time" and "Stop time" with time zone? For 
example, when I start a new run, the value of the "Start time" key is automatically changed 
to "Fri Aug 21 12:38:36 2020" without time zone. 

Thank you.
    Reply  24 Aug 2020, Stefan Ritt, Forum, time information 
> 1. Is it possible to get "Running time" using, for example, jsonrpc? (please see 
> the attached file)

You have in the ODB "/Runinfo/Start time binary" which is measured in seconds since 
1970. By subtracting this from the current time, you get the running time.
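
In C++ this could look like the following sketch (ss_time() returns the current time in
seconds):

HNDLE hDB;
cm_get_experiment_database(&hDB, NULL);

DWORD start_time;
INT size = sizeof(start_time);
db_get_value(hDB, 0, "/Runinfo/Start time binary", &start_time, &size, TID_DWORD, FALSE);

DWORD running_time = ss_time() - start_time;   // running time in seconds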

> 2. Is it possible to configure "Start time" and "Stop time" with time zone? For 
> example, when I start a new run, the value of the "Start time" key is automatically changed 
> to "Fri Aug 21 12:38:36 2020" without time zone. 

"Start time binary" and "Stop time binary" are in seconds since the 1970 in UTC, so no 
time zone involved there. The ASCII versions of the start/stop time are derived from 
the binary time using the server's local time zone. If you want to display them in a 
different time zone, you have to create a custom page and convert it to another time 
zone using JavaScript (note that the JavaScript Date constructor expects milliseconds, 
so the binary seconds have to be multiplied by 1000):

var d = new Date(start_time_binary * 1000);

Stefan
       Reply  25 Aug 2020, Ruslan Podviianiuk, Forum, time information 
Thank you, Stefan

Ruslan 



> > 1. Is it possible to get "Running time" using, for example, jsonrpc? (please see 
> > the attached file)
> 
> You have in the ODB "/Runinfo/Start time binary" which is measured in seconds since 
> 1970. By subtracting this from the current time, you get the running time.
> 
> > 2. Is it possible to configure "Start time" and "Stop time" with time zone? For 
> > example, when I start a new run, the value of the "Start time" key is automatically changed 
> > to "Fri Aug 21 12:38:36 2020" without time zone. 
> 
> "Start time binary" and "Stop time binary" are in seconds since the 1970 in UTC, so no 
> time zone involved there. The ASCII versions of the start/stop time are derived from 
> the binary time using the server's local time zone. If you want to display them in a 
> different time zone, you have to create a custom page and convert it to another time 
> zone using JavaScript (note that the JavaScript Date constructor expects milliseconds, 
> so the binary seconds have to be multiplied by 1000):
> 
> var d = new Date(start_time_binary * 1000);
> 
> Stefan
Entry  24 Aug 2020, Konstantin Olchanski, Release, midas-2020-08 
midas-2020-08-a is here.

new features and notable updates since midas-2020-03:

- new C++ ODB interface odbxx.h
- image history
- much improved history plots
- new sequencer pages
- UTF-8 clean ODB (complains if any TID_STRING is invalid UTF-8)
- mhttpd update to mongoose 6.16 with much improved multithreading
- mhttpd update to use MBEDTLS in preference to problematic OpenSSL
- MidasConfig.cmake contributed by Mathieu Guigue

plans for next development: major update of mlogger to simplify channel 
configuration in odb, improvements to mhttpd multithreading, new history plot 
configuration page, more c++ification.

To obtain this release, either checkout the top of branch release/midas-2020-08 
(recommended) or checkout the tag midas-2020-08-a.

K.O.
Entry  28 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indices fedummy.cxx Makefile
Hello experts,

I have been writing a SC frontend for a powersupply. I have used the model 
where the frontend can be started with "-i n" option so that each fe can 
control a different supply. During the development/testing of the program I 
would normally only run a single instance with "-i 1". However when I started
a second instance with "-i 2" I found problems with the history plots that
were being made for the original "-i 1" instance. The variable being plotted
seemed to randomly jump between the value from the "-i 1" instance and 
the "-i 2" instance.  confirmed that the "correct" values exist for each 
frontend in the odb under /Equipment/Foo01/Variables and 
/Equipment/Foo02/Variables

This is also not just a plotting artifact since I was also
able to see the two different values by running mhist.

I saw this behaviour using midas-2019-03 and also the head of the development
branch (686e4de2b55023b0d1936c60bcf4767c5e6caac0 from just under 48 hours ago). 

I was able to reproduce this with a stripped down frontend that just 
sets a variable that is equal to its frontend_index. Please find the code 
and Makefile attached. Presumably I've done something wrong in my 
implementation that hopefully a more experienced person can spot quite 
quickly, but please let me know if any more information is needed.

I have seen this behaviour on both Debian 10 and on a CentOS 7 Singularity 
image running on top of Debian 10.

Thanks,

Nick.

P.S. I made the topic of this post "Forum" and not "Bug Report" since I
expect the root of this problem is somewhere between the keyboard and chair.
    Reply  28 Aug 2019, Stefan Ritt, Forum, History plot problems for frontend with multiple indices 
My first question would be why are you using several front-ends at all? That makes things more 
complicated than needed. In the normal FE framework, you can define either several equipment 
served by one frontend, or even one equipment linked to several devices. In the MEG experiment 
we have one slow control frontend controlling ~100 devices without problem. In the old days there 
was a problem that some slow devices could throttle the readout, but since the invention of multi-
threaded slow control equipment, each device gets its own thread so they don't block each other.

Stefan
       Reply  28 Aug 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indices 
Hi Stefan,

thanks for your quick reply.

> My first question would be why are you using several front-ends at all?

Because I was following the model used for many of the frontends for the ND280 FGD.

> That makes things more 
> complicated than needed. In the normal FE framework, you can define either several equipment 
> served by one frontend, or even one equipment linked to several devices. In the MEG experiment 
> we have one slow control frontend controlling ~100 devices without problem. In the old days
> there was a problem that some slow devices could throttle the readout, but since the invention
> of multi-threaded slow control equipment, each device gets its own thread so they don't block
> each other.

Perhaps things have changed in the 10 years since the FGD SC code was written. I can do it 
differently but doing it that way seemed natural since around 90% of the frontend code that I
have seen does it that way.

Nick.
          Reply  28 Aug 2019, lcp, Forum, History plot problems for frontend with multiple indices 
hi, 

> > That makes things more 
> > complicated than needed. In the normal FE framework, you can define either several equipment 
> > served by one frontend, or even one equipment linked to several devices. In the MEG experiment 
> > we have one slow control frontend controlling ~100 devices without problem. In the old days
> > there was a problem that some slow devices could throttle the readout, but since the invention
> > of multi-threaded slow control equipment, each device gets its own thread so they don't block
> > each other.
> 

I agree with Stefan that it's probably better to run a multi-threaded setup than individual frontends.

The only place I've ever used the frontend index on startup is when I was testing and building
an eventbuilder.

https://midas.triumf.ca/MidasWiki/index.php/Event_Builder#Example

This might explain why your history is swapping between frontends, as in the event builder it gets
reconstructed.

Maybe this helps...

LCP


> Perhaps things have changed in the 10 years since the FGD SC code was written. I can do it 
> differently but doing it that way seemed natural since around 90% of the frontend code that I
> have seen does it that way.
             Reply  16 Sep 2019, Konstantin Olchanski, Forum, History plot problems for frontend with multiple indices 
> it's probably better to run a multi-threaded setup than individual frontends.

I recommend against using multiple threads if at all possible and unless absolutely required.

Only for one reason: multithreaded c++ programs are notoriously hard to debug.

In addition, one has to face several classes of bugs absent in single-threaded applications:

a) which thread "owns" which object
b) locking of all shared data
c) huge overheads from locking at high data rates (a performance bug)
d) correct locking order, dead locks, live locks
e) incomprehensible core dumps and stack traces
f) race conditions

To control 2 power supplies, run 2 frontend programs, 1 per power supply.

To control 64 frontend cards, run 1 frontend with many threads: 64 (per device) + 1 (main thread) + 1 (RPC handler) + 1 
(watchdog) + 1 (common event generator/data transmitter) + 1 (odb/web page status update). You *will* bump into each 
and every one of the problems (a) to (f) above.

K.O.
       Reply  16 Sep 2019, Konstantin Olchanski, Forum, History plot problems for frontend with multiple indices 
> My first question would be why are you using several front-ends at all? That makes things more 
> complicated than needed. In the normal FE framework, you can define either several equipment 
> served by one frontend, or even one equipment linked to several devices.

I am the culprit here, as I wrote the original code for T2K/ND280 that Nick is looking at.

At the time, we needed to control multiple units of identical equipment. Most of these equipments
needed to be controlled independently from each other, so we could not/did not want to use
one single frontend executable to control all of them at the same time. For example, for equipment
not in use, we can stop the corresponding frontend. In case of trouble, we can restart
the corresponding frontend without disrupting the frontends for the other equipments.

The successful operation of the T2K/ND280 experiment is sufficient defence for the validity of this approach.

One lesson learned was that the MIDAS frontend framework did not make it easy to have multiple identical frontends 
for controlling multiple identical equipments. (typical use is control of 2-3 Wiener power supplies, 1-2-3 UPS 
devices, etc). At the time (and today), only the "-i NNN" flag is available to tell the frontend "who am I?". To make it 
work, one has to use the hard-to-use "%02d" stuff in the equipment name, and there are other complications. For my 
"next generation" of frontends, I tried to specialize the frontend executables at compile time using C/C++ 
preprocessor defines (-Dwiener01, -Dwiener02, etc), this worked better, but still not super happy.

My current solution as implemented by the tmfe frontend framework is to give the user full control
over the command line arguments (mfe.c did not permit any "user arguments" and did not allow
access to argc/argv) and full control over the equipment names (mfe.c equipment names are fixed at compile time).

K.O.
    Reply  29 Aug 2019, Ben Smith, Forum, History plot problems for frontend with multiple indices 
Hi Nick,

I confirm that this issue appears when using the MIDAS history driver. The issue does not appear when using the MYSQL history driver.

One solution is to give each frontend instance a different Event ID (see example code below for doing this in frontend_init). The history system did still seem to be confused by the existing FeDummy equipments/events even after making this change. However, after changing EQ_NAME from FeDummy to FeDum (i.e. starting from a clean state history-wise) things behaved normally.

I will note that some experiments definitely have a need for the "-i" method, especially those that run on distributed clusters.

Ben


```
INT frontend_init()
{
   sprintf(eq_name, "%s%02d", EQ_NAME, get_frontend_index());

   // Ensure each FE gets a different Event ID in the history system (951, 952 etc)
   char keyname[128];
   HNDLE hkey;
   int status;
   sprintf(keyname, "/Equipment/%s/Common/Event ID", eq_name);
   status = db_find_key(hDB, 0, keyname, &hkey);
   if (status != DB_SUCCESS) abort();

   WORD new_ev_id = 950 + get_frontend_index();
   status = db_set_data_index(hDB, hkey, &new_ev_id, 2, 0, TID_WORD);
   if (status != DB_SUCCESS) abort();
   return SUCCESS;
}
```
       Reply  01 Sep 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indices 
Hi Ben,

thanks for your reply. I can confirm that your suggested workaround does indeed
make the problem disappear.

I guess this issue hasn't been seen at T2K since we use MYSQL for the history.

Thanks,

Nick.
          Reply  16 Sep 2019, Konstantin Olchanski, Forum, History plot problems for frontend with multiple indices 
> thanks for your reply. I can confirm that your suggested workaround does indeed
> make the problem disappear.
> I guess this issue hasn't been seen at T2K since we use MYSQL for the history.

I think you found the source of the problem, confused event id assignments. To confirm,
can you email me (or post here) the output of odbedit "ls -l /History/Events".

If that's the problem, you can avoid it completely by switching to a history storage method
that does not rely on magic mapping between equipment names and numeric event id's:
try the "FILE" method (set odb /Logger/History/FILE/Active to "y", restart the logger) or
the "MYSQL" method (you will need to setup a mysql database). You tell mhttpd and mhist which 
history data to read by setting ODB /History/LoggerHistoryChannel to one of the channel names 
from /logger/history/, restart mhttpd. (mhttpd and mhist used to print a message "reading history 
data from channel XXX", but somebody removed this message).

K.O.
             Reply  16 Sep 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indices 
Hi Konstantin,

thanks for your reply.

> > thanks for your reply. I can confirm that your suggested workaround does indeed
> > make the problem disappear.
> > I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
> 
> I think you found the source of the problem, confused event id assignments. To confirm,
> can you email me (or post here) the output of odbedit "ls -l /History/Events".

Sorry, do you want this for after I've applied the fix suggested by Ben or with the original code 
that I posted?

With the original code it only shows one fe even though both are running:

[local:e666:S]History>ls -l /History/Events
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
1                               STRING  1     10    2m   0   RWD  FeDummy02
0                               STRING  1     16    2m   0   RWD  Run transitions

[local:e666:S]History> scl
Name                Host
mhttpd              localhost           
fedummy01           localhost           
fedummy02           localhost           
ODBEdit             localhost           
Logger              localhost           
[local:e666:S]History>ls "/History/Display/Default/Dummy/
Timescale                       1h
Zero ylow                       n
Show run markers                y
Show values                     y
Sort Vars                       n
Log axis                        n
Minimum                         0
Maximum                         0
Variables
                                FeDummy01:Data
                                FeDummy02:Data
Label
                                
                                
Colour
                                #00AAFF
                                #FF9000
Factor
                                0
                                0
Offset
                                0
                                0
Buttons
                                10m
                                1h
                                3h
                                12h
                                24h
                                3d
                                7d
Formula
                                
                                
Show old vars                   n

> If that's the problem, you can avoid it completely by switching to a history storage method
> that does not rely on magic mapping between equipment names and numeric event id's:
> try the "FILE" method (set odb /Logger/History/FILE/Active to "y", restart the logger) or
> the "MYSQL" method (you will need to setup a mysql database). You tell mhttpd and mhist which 
> history data to read by setting ODB /History/LoggerHistoryChannel to one of the channel names 
> from /logger/history/, restart mhttpd. (mhttpd and mhist used to print a message "reading history 
> data from channel XXX", but somebody removed this message).

Using the original code I posted and switching from MIDAS history to FILE history did not seem to 
change the random behaviour in the history plots.

Regards,

Nick.
                Reply  17 Sep 2019, Konstantin Olchanski, Forum, History plot problems for frontend with multiple indices 
> [local:e666:S]History>ls -l /History/Events
> Key name                        Type    #Val  Size  Last Opn Mode Value
> ---------------------------------------------------------------------------
> 1                               STRING  1     10    2m   0   RWD  FeDummy02
> 0                               STRING  1     16    2m   0   RWD  Run transitions

Something is very broken. There should be more entries here, at least
there should be entries for "FeDummy01" and usually there is also an entry
for "FeDummy" because one invariably runs fedummy without "-i" at least once.

The fact that changing from "midas" storage to "file" storage makes no difference
also indicates that something is very broken.

I want to debug this.

Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" files
from the "file" storage.

K.O.
                   Reply  18 Sep 2019, Nick Hastings, Forum, History plot problems for frontend with multiple indices 
Hi Konstantin,

> > [local:e666:S]History>ls -l /History/Events
> > Key name                        Type    #Val  Size  Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > 1                               STRING  1     10    2m   0   RWD  FeDummy02
> > 0                               STRING  1     16    2m   0   RWD  Run transitions
> 
> Something is very broken. There should be more entries here, at least
> there should be entries for "FeDummy01" and usually there is also an entry
> for "FeDummy" because one invariably runs fedummy without "-i" at least once.

This is a fresh experiment that I started just to test this issue; that is why there are not many 
entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
 
> The fact that changing from "midas" storage to "file" storage makes no difference
> also indicates that something is very broken.
> 
> I want to debug this.
> 
> Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" 
files
> from the "file" storage.

When I started this experiment yesterday(?) I disabled the Midas history when I enabled the file 
history. Just now I reenabled the Midas history, so they are currently both active.

% ls -l work/online/{*.hst,mhf*.dat}
-rw-r--r-- 1 hastings hastings  14996 Sep 17 10:21 work/online/190917.hst
-rw-r--r-- 1 hastings hastings   3292 Sep 18 16:29 work/online/190918.hst
-rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
-rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
-rw-r--r-- 1 hastings hastings    166 Sep 17 10:17 work/online/mhf_1568683062_20190917_run_transitions.dat

And again, just as a sanity check:

% odbedit -c 'ls -l /History/Events'
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
1                               STRING  1     10    1m   0   RWD  FeDummy02
0                               STRING  1     16    1m   0   RWD  Run transitions

Regards,

Nick.
                      Reply  27 Sep 2019, Konstantin Olchanski, Forum, History plot problems for frontend with multiple indices 
We should fix this for midas-2019-10.

https://bitbucket.org/tmidas/midas/issues/193/confusion-in-history-event-ids

K.O.





> Hi Konstantin,
> 
> > > [local:e666:S]History>ls -l /History/Events
> > > Key name                        Type    #Val  Size  Last Opn Mode Value
> > > ---------------------------------------------------------------------------
> > > 1                               STRING  1     10    2m   0   RWD  FeDummy02
> > > 0                               STRING  1     16    2m   0   RWD  Run transitions
> > 
> > Something is very broken. There should be more entries here, at least
> > there should be entries for "FeDummy01" and usually there is also an entry
> > for "FeDummy" because one invariably runs fedummy without "-i" at least once.
> 
> This is a fresh experiment that I started just to test this issue; that is why there are not many 
> entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
>  
> > The fact that changing from "midas" storage to "file" storage makes no difference
> > also indicates that something is very broken.
> > 
> > I want to debug this.
> > 
> > Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> > with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" 
> files
> > from the "file" storage.
> 
> When I started this experiment yesterday(?) I disabled the Midas history when I enabled the file 
> history. Just now I reenabled the Midas history, so they are currently both active.
> 
> % ls -l work/online/{*.hst,mhf*.dat}
> -rw-r--r-- 1 hastings hastings  14996 Sep 17 10:21 work/online/190917.hst
> -rw-r--r-- 1 hastings hastings   3292 Sep 18 16:29 work/online/190918.hst
> -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
> -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
> -rw-r--r-- 1 hastings hastings    166 Sep 17 10:17 work/online/mhf_1568683062_20190917_run_transitions.dat
> 
> And again, just as a sanity check:
> 
> % odbedit -c 'ls -l /History/Events'
> Key name                        Type    #Val  Size  Last Opn Mode Value
> ---------------------------------------------------------------------------
> 1                               STRING  1     10    1m   0   RWD  FeDummy02
> 0                               STRING  1     16    1m   0   RWD  Run transitions
> 
> Regards,
> 
> Nick.
                         Reply  24 Aug 2020, Konstantin Olchanski, Forum, History plot problems for frontend with multiple indices 
This turned out to be a tricky problem. I am adding a warning about it in mlogger. This should go 
into midas-2020-07. Closing bug #193. K.O.


> We should fix this for midas-2019-10.
> 
> https://bitbucket.org/tmidas/midas/issues/193/confusion-in-history-event-ids
> 
> K.O.
> 
> 
> 
> 
> 
> > Hi Konstantin,
> > 
> > > > [local:e666:S]History>ls -l /History/Events
> > > > Key name                        Type    #Val  Size  Last Opn Mode Value
> > > > ---------------------------------------------------------------------------
> > > > 1                               STRING  1     10    2m   0   RWD  FeDummy02
> > > > 0                               STRING  1     16    2m   0   RWD  Run transitions
> > > 
> > > Something is very broken. There should be more entries here, at least
> > > there should be entries for "FeDummy01" and usually there is also an entry
> > > for "FeDummy" because one invariably runs fedummy without "-i" at least once.
> > 
> > This is a fresh experiment that I started just to test this issue; that is why there are not many 
> > entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
> >  
> > > The fact that changing from "midas" storage to "file" storage makes no difference
> > > also indicates that something is very broken.
> > > 
> > > I want to debug this.
> > > 
> > > Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> > > with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat"
> > > files from the "file" storage.)
> > 
> > When I started this experiment yesterday(?) I disabled the Midas history when I enabled the file 
> > history. Just now I reenabled the Midas history, so they are currently both active.
> > 
> > % ls -l work/online/{*.hst,mhf*.dat}
> > -rw-r--r-- 1 hastings hastings  14996 Sep 17 10:21 work/online/190917.hst
> > -rw-r--r-- 1 hastings hastings   3292 Sep 18 16:29 work/online/190918.hst
> > -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
> > -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
> > -rw-r--r-- 1 hastings hastings    166 Sep 17 10:17 work/online/mhf_1568683062_20190917_run_transitions.dat
> > 
> > And again, just as a sanity check:
> > 
> > % odbedit -c 'ls -l /History/Events'
> > Key name                        Type    #Val  Size  Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > 1                               STRING  1     10    1m   0   RWD  FeDummy02
> > 0                               STRING  1     16    1m   0   RWD  Run transitions
> > 
> > Regards,
> > 
> > Nick.
Entry  12 Aug 2020, Yan Liu, Suggestion, adding db_get_mode ti check access mode for keys 
Hello,

I am wondering if there is a function that checks the access mode for a key? I 
found the db_set_mode() function that allows me to set the access mode for a key, 
but failed to find its counterpart get function.

Thanks in advance,
Yan
    Reply  13 Aug 2020, Stefan Ritt, Suggestion, adding db_get_mode ti check access mode for keys 
> Hello,
> 
> I am wondering if there is a function that checks the access mode for a key? I 
> found the db_set_mode() function that allows me to set the access mode for a key, 
> but failed to find its counterpart get function.
> 
> Thanks in advance,
> Yan


  KEY k;
  db_get_key(hDB, handle, &k);
  std::cout << k.access_mode << std::endl;
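
If you want it wrapped as a get function, a small helper along these lines would do
(a sketch, not an existing midas function):

INT my_db_get_mode(HNDLE hDB, HNDLE hKey, WORD *mode)
{
   KEY key;
   INT status = db_get_key(hDB, hKey, &key);
   if (status == DB_SUCCESS)
      *mode = key.access_mode;   // combination of MODE_READ, MODE_WRITE, MODE_DELETE, ...
   return status;
}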

/Stefan
       Reply  13 Aug 2020, Yan Liu, Suggestion, adding db_get_mode ti check access mode for keys 
Thank you!

Yan

> > Hello,
> > 
> > I am wondering if there is a function that checks the access mode for a key? I 
> > found the db_set_mode() function that allows me to set the access mode for a key, 
> > but failed to find its counterpart get function.
> > 
> > Thanks in advance,
> > Yan
> 
> 
>   KEY k;
>   db_get_key(hDB, handle, &k);
>   std::cout << k.access_mode << std::endl;
> 
> /Stefan
Entry  07 Aug 2020, Konstantin Olchanski, Info, update of MYSQL history documentation 
I updated the documentation for setting up a MYSQL (MariaDB) database for 
recording MIDAS history: https://midas.triumf.ca/MidasWiki/index.php/History_System#Write_MYSQL-history_events

One thing to note: the "writer" user must have the "INDEX" permission, otherwise 
many things will not work correctly.

Included are the instructions for importing existing *.hst history files into the 
SQL database: mh2sql --mysql mysql_writer.txt *.hst

Let me know if there is interest in adding support for writing into Postgres SQL 
database. We used to support both MySQL and Postgres through the ODBC library, 
but in the new code, each database has to be supported through its native API. 
There is code for SQLITE, MYSQL, but no code for Postgres, although it is not too 
hard to add.

K.O.
Entry  15 Jul 2020, Stefan Ritt, Info, Minimal CMakeLists.txt for your midas front-end 
Since a few people asked me, here is a "minimal" CMakeLists.txt file for a user-written front-end 
program "myfe":

---------------------------

cmake_minimum_required(VERSION 3.0)
project(myfe)

# Check for MIDASSYS environment variable
if (NOT DEFINED ENV{MIDASSYS})
   message(SEND_ERROR "MIDASSYS environment variable not defined.")
endif()

set(CMAKE_CXX_STANDARD 11)
set(MIDASSYS $ENV{MIDASSYS})

if (${CMAKE_SYSTEM_NAME} MATCHES Linux)
   set(LIBS -lpthread -lutil -lrt)
endif()

add_executable(myfe myfe.cxx)

target_include_directories(myfe PRIVATE ${MIDASSYS}/include)
target_link_libraries(myfe ${MIDASSYS}/lib/libmfe.a ${MIDASSYS}/lib/libmidas.a ${LIBS})
Entry  28 Jun 2020, Konstantin Olchanski, Info, Makefile update 
I reworked the MIDAS Makefile to simplify things and to remove redundancy with functions 
provided by cmake.

When you say "make", the list of options is printed.

The first and main options are "make cmake" and "make cclean" to run the cmake build.

This is my recommended way to build midas - the output of "make cmake" was tuned to provide 
the information needed to debug build problems (all compiler commands, command line switches 
and file paths are reported). (normal "cmake VERBOSE=1" is tuned for debugging of cmake and 
for maximum obfuscation of problems building the actual project).

Build options are implemented through cmake variables:

options that can be added to "make cmake":
      NO_LOCAL_ROUTINES=1 NO_CURL=1
      NO_ROOT=1 NO_ODBC=1 NO_SQLITE=1 NO_MYSQL=1 NO_SSL=1 NO_MBEDTLS=1
      NO_EXPORT_COMPILE_COMMANDS=1

for example "make cmake NO_ROOT=1" to disable auto-detection of ROOT.

Two more make targets create reduced builds of midas:

"make mini" builds a subset of midas suitable for building frontend programs. Big programs 
like mlogger and mhttpd are excluded, optional components like CURL or SQLITE are not needed.

"make remoteonly" builds a subset of midas suitable for building remotely connected 
frontends. Big parts of midas are excluded, many system-dependent functions are excluded, 
etc. This is intended for embedded applications, such as fpga, uclinux, etc.

But wait, there is more. Here is the full list:

daqubuntu:midas$ make
Usage:

   make cmake     --- full build of midas
   make cclean    --- remove everything build by make cmake

   options that can be added to "make cmake":
      NO_LOCAL_ROUTINES=1 NO_CURL=1
      NO_ROOT=1 NO_ODBC=1 NO_SQLITE=1 NO_MYSQL=1 NO_SSL=1 NO_MBEDTLS=1
      NO_EXPORT_COMPILE_COMMANDS=1

   make dox       --- run doxygen, results are in ./html/index.html
   make cleandox  --- remove doxygen output

   make htmllint  --- run html check on resources/*.html

   make test      --- run midas self test

   make mbedtls   --- enable mhttpd support for https via the mbedtls https library
   make update_mbedtls --- update mbedtls to latest version
   make clean_mbedtls  --- remove mbedtls from this midas build

   make mtcpproxy --- build the https proxy to forward root-only port 443 to mhttpd https 
port 8443

   make mini      --- minimal build, results are in linux/{bin,lib}
   make cleanmini --- remove everything build by make mini

   make remoteonly      --- minimal build, remote connection only, results are in linux-remoteonly/{bin,lib}
   make cleanremoteonly --- remove everything build by make remoteonly

   make linux32   --- minimal x86 -m32 build, results are in linux-m32/{bin,lib}
   make clean32   --- remove everything built by make linux32

   make linux64   --- minimal x86 -m64 build, results are in linux-m64/{bin,lib}
   make clean64   --- remove everything built by make linux64

   make linuxarm  --- minimal ARM cross-build, results are in linux-arm/{bin,lib}
   make cleanarm  --- remove everything built by make linuxarm

   make clean     --- run all 'clean' commands

daqubuntu:midas$ 

K.O.
    Reply  15 Jul 2020, Stefan Ritt, Info, Makefile update 
Please note that you can also compile midas in the standard cmake way with

$ mkdir build
$ cd build
$ cmake ..
$ make install

in the root midas directory. You might have to use "cmake3" on some systems.

Stefan
Entry  28 Jun 2020, Konstantin Olchanski, Info, mhttpd https support openssl -> mbedtls 
For password protection of midas web pages, https is required, good old http 
with passwords transmitted in-the-clear is no longer considered secure. Latest 
recommendation is to run mhttpd behind an industry-standard https proxy, for 
example apache httpd. These proxies provide built-in password protection and 
have integration with certbot to provide automatic renewal of https 
certificates.

That said, for a long time now mhttpd provides native https support through the 
mongoose web server library and the openssl cryptography library.

Unfortunately, for years now, we have been running into trouble with the midas 
build process bombing out due to inconsistent versions and locations of system-
provided and user-installed openssl libraries. Despite our best efforts (and 
through the switch to cmake!) these problems keep coming back and coming back.

Luckily, latest versions of mongoose support the mbedtls cryptography library. I 
have tested it and it works well enough for me to switch the MIDAS default build 
from "openssl if found" to "mbedtls if-asked-for-by-user".

Starting with commit e7b02f9, cmake builds do not look for and do not try to use 
openssl. mhttpd is built without support for https. This is consistent with the 
recommendation to run it behind an apache httpd password protected https proxy.

To enable https support using mbedtls, run "make mbedtls". This will "git clone" 
the mbedtls library and add it to the midas build. mhttpd will be built with 
https support enabled.

To disable mbedtls support, use "make cmake NO_MBEDTLS=1" or run "make 
clean_mbedtls" (this will remove the mbedtls sources from the midas build).

To restore previous use of openssl, set the cmake variable "USE_OPENSSL".

In my test, mhttpd with https through mbedtls and a letsencrypt certificate gains 
a score of "A" from SSLlabs. (very good).

(you have to use progs/mtcpproxy to run this test - SSLlabs only probes port 443 
and mtcpproxy will forward it to mhttpd https port 8443. To build, run "make 
mtcpproxy").

References:
https://github.com/cesanta/mongoose
https://github.com/ARMmbed/mbedtls

K.O.
    Reply  28 Jun 2020, Konstantin Olchanski, Info, mhttpd https support openssl -> mbedtls 
To add. Using https with either openssl or mbedtls requires obtaining an https certificate. This can be self-
signed, or signed by a higher authority, or issued by the "let's encrypt" project.

mhttpd is looking for this certificate in the file ssl_cert.pem.

If this file does not exist, mhttpd will print the instructions for creating it using openssl (self-signed) or 
using certbot (instantaneously and automatically issued let's encrypt certificate).

The certbot route is recommended:

1) (as root) setup certbot (i.e. see my CentOS and Ubuntu instructions on DAQWiki)
2) (as root) copy /etc/letsencrypt/live/$HOME/fullchain.pem and privkey.pem to $MIDASSYS
3) cat fullchain.pem privkey.pem > ssl_cert.pem
4) start mhttpd, watch the first few lines it prints to confirm it found the right certificate file.

The only missing piece for using this in production is lack of integration
with certbot automatic certificate renewal:

- a script has to run for steps (2) and (3) above
- mhttpd has to tell openssl/mbedtls to reload the certificate file (alternative is to automatically restart 
mhttpd, bad!).

As an alternative, we can wait for the mongoose web server library and for the mbedtls crypto library to "grow" 
certbot-style automatic certificate renewal features. (unavoidable, in my view).

K.O.
Entry  24 Jun 2020, Stefan Ritt, Info, New image history system available Screenshot_2020-06-24_at_17.21.11_.png
I'm happy to report that the Corona Lockdown in Europe also had some positive side 
effects: Finally I found time to implement an image history system in midas, 
something I wanted to do since many years, but never found time for that.

The idea is that you can incorporate any network-connected WebCam into the midas 
history system. You specify an update interval (like one minute) and the logger 
fetches regularly images from that webcam. The images are stored as raw files in 
the midas history directory, and can be retrieved via the web browser similarly to 
the "normal" history. Attached is an image from the MEG Experiment at PSI to give 
you some idea.

The cool thing now is that you can go "backwards" in time and browse all stored 
images. The buttons at each image allow you to step backward, forward, and play a 
movie of images, forward or backward. You can query for a certain date/time and 
download a specific image to your local disk. You can even synchronize all time 
axes, drag left and right on each image to see your experiment from different 
cameras at the same time stamps. You see a blue ribbon below each image which shows 
time stamps for which an image is available. 

Initially, only the most recent image is loaded to speed up loading time. As soon 
as you click on the image or one of the arrow buttons, previous images are loaded 
progressively, which you can see in the ribbon bar becoming blue. For slow internet 
connections this can take some time. For typical webcams and one minute update 
period you get typically a few GB per week.

To make this happen, you define a new ODB subtree 

/History/Images/<name>/
  Name           Name of Camera
  Enabled        Boolean to enable readout of camera
  URL            URL to fetch an image from the camera
  Period         Time period in seconds to fetch a new image
  Storage hours  Number of hours to store the images (0 for infinite)
  Extension      Image file extension, usually ".jpg" or ".png"
  Timescale      Initial horizontal time scale (like 8h)
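
For illustration, such a subtree could also be created from C++ with the odbxx API
(all values here are placeholders, and the exact types of "Period" and "Storage hours"
are assumptions):

midas::odb cam = {
   {"Name", "Camera 1"},
   {"Enabled", true},
   {"URL", "http://mycam/axis-cgi/jpg/image.cgi"},
   {"Period", 60},
   {"Storage hours", 0},
   {"Extension", ".jpg"},
   {"Timescale", "8h"}
};
cam.connect("/History/Images/Camera 1", true);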

The tricky part is to obtain the URL from your camera. For some cameras you can get 
that from the manual, others you have to "hack": Display an image in your browser 
using the camera's internal web interface, inspect the source code of your web page 
and you get the URL. For AXIS cameras I use, the URL is typically

http://<name>/axis-cgi/jpg/image.cgi

For the Netatmo cameras I have at home (which I used during development in my home 
office), the procedure is more complicated, but you can google it. The logger is 
now linked against the CURL library to fetch images, so it also supports https://. 
If libcurl is not installed on your system, the image history functionality will be 
disabled.

I tested the system for a few days now and it seems stable, which however does not 
mean that it is bug-free. So please report back any issue. The change is committed 
to the current develop branch.

I hope this extension helps all those people who are forced to do more remote 
monitoring of experiments during these times.

Best,
Stefan
Entry  28 May 2020, Marius Koeppel, Suggestion, ODB++ API - documentation updates and odb view after key creation test_odb_api.cu
Hello everybody,

I really appreciate the development of the new odb++ API. So I directly started to rewrite the code for the Mu3e DAQ system.

I have a few questions / suggestions which came up during my work so far:

1. The documentation seems to be quite new, so some variables are wrongly named and there are small typos. I would like to fix them. Should I request an account, or what else is needed to change them?

2. When I create an ODB structure with the new API I do for example:

    midas::odb stream_settings = {
            {"Test_odb_api", {
                                      {"Divider", 1000},     // int
                                      {"Enable", false},     // bool
                              }},
    };
    stream_settings.connect("/Equipment/Test/Settings", true);

and with 

midas::odb datagen("/Equipment/Test/Settings/Test_odb_api");
std::cout << "Datagenerator Enable is " << datagen["Enable"] << std::endl;

I am getting back false, which looks nice, but when I look into the ODB via the browser the value is actually "y" meaning true, which is strange. I attached my frontend where I cleaned all functions, leaving only the frontend_init() one where I create this key. It's a CUDA program, but since I cleaned everything, no CUDA function is called anymore.

Thank you again for the nice development!

Cheers,
Marius 
    Reply  28 May 2020, Stefan Ritt, Suggestion, ODB++ API - documentation updates and odb view after key creation 
> 2. When I create an ODB structure with the new API I do for example:
> 
>     midas::odb stream_settings = {
>             {"Test_odb_api", {
>                                       {"Divider", 1000},     // int
>                                       {"Enable", false},     // bool
>                               }},
>     };
>     stream_settings.connect("/Equipment/Test/Settings", true);
> 
> and with 
> 
> midas::odb datagen("/Equipment/Test/Settings/Test_odb_api");
> std::cout << "Datagenerator Enable is " << datagen["Enable"] << std::endl;
> 
> I am getting back false, which looks nice, but when I look into the ODB via the browser the value is actually "y" meaning true, which is strange. I attached my frontend where I cleaned all functions, leaving only the frontend_init() one where I create this key. It's a CUDA program, but since I cleaned everything, no CUDA function is called anymore.

I cannot confirm this behaviour. Just put the following code in a standalone program:

cm_connect_experiment(NULL, NULL, "test", NULL);
   midas::odb::set_debug(true);

   midas::odb stream_settings = {
           {"Test_odb_api", {
               {"Divider", 1000},     // int
               {"Enable", false},     // bool
           }},
   };
   stream_settings.connect("/Equipment/Test/Settings", true);

   midas::odb datagen("/Equipment/Test/Settings/Test_odb_api");
   std::cout << "Datagenerator Enable is " << datagen["Enable"] << std::endl;

and run it. The result is:

...
Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
Datagenerator Enable is Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
false

Looking in the ODB, I also see 

[local:Online:S]/>cd Equipment/Test/Settings/Test_odb_api/
[local:Online:S]Test_odb_api>ls
Divider                         1000
Enable                          n
[local:Online:S]Test_odb_api>


So I am not sure what is different in your case. Are you looking at the same ODB? Maybe you have one remote, and one local? 
Note that the "true" flag in stream_settings.connect(..., true); forces all default values into the ODB. 
So if the ODB value is "y", it will be changed to "n".

Best,
Stefan
    Reply  30 May 2020, Stefan Ritt, Suggestion, ODB++ API - documentation updates and odb view after key creation 
Marius, has the problem been fixed in the meantime?

Stefan

> I am getting back false, which looks nice, but when I look into the ODB via the browser the value is actually "y" meaning true, which is strange. 
> I attached my frontend where I cleaned all functions, leaving only the frontend_init() one where I create this key. It's a CUDA program, but since 
> I cleaned everything, no CUDA function is called anymore.
       Reply  04 Jun 2020, Marius Koeppel, Suggestion, ODB++ API - documentation updates and odb view after key creation 
Hi Stefan,

your test program was only working for me after I changed the following lines inside odbxx.cxx

diff --git a/src/odbxx.cxx b/src/odbxx.cxx
index 24b5a135..48edfd15 100644
--- a/src/odbxx.cxx
+++ b/src/odbxx.cxx
@@ -753,7 +753,12 @@ namespace midas {
          }
       } else {
          u_odb u = m_data[index];
-         status = db_set_data_index(m_hDB, m_hKey, &u, rpc_tid_size(m_tid), index, m_tid);
+         if (m_tid == TID_BOOL) {
+             BOOL ss = bool(u);
+             status = db_set_data_index(m_hDB, m_hKey, &ss, rpc_tid_size(m_tid), index, m_tid);
+         } else {
+             status = db_set_data_index(m_hDB, m_hKey, &u, rpc_tid_size(m_tid), index, m_tid);
+         }
          if (m_debug) {
             std::string s;
             u.get(s);

Likely not the best fix, but otherwise I was always getting the following after running the test program:

[ODBEdit,INFO] Program ODBEdit on host localhost started
[local:Default:S]/>cd Equipment/Test/Settings/Test_odb_api/
key not found
makoeppe@office ~/mu3e/online/online (git)-[odb++_api] % test_connect
Created ODB key /Equipment/Test/Settings
Created ODB key /Equipment/Test/Settings/Test_odb_api
Created ODB key /Equipment/Test/Settings/Test_odb_api/Divider
Set ODB key "/Equipment/Test/Settings/Test_odb_api/Divider" = 1000
Created ODB key /Equipment/Test/Settings/Test_odb_api/Enable
Set ODB key "/Equipment/Test/Settings/Test_odb_api/Enable" = false
Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api"
Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Divider"
Get ODB key "/Equipment/Test/Settings/Test_odb_api/Divider": 1000
Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Enable"
Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Divider"
Get ODB key "/Equipment/Test/Settings/Test_odb_api/Divider": 1000
Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Enable"
Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
Datagenerator Enable is Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
false
makoeppe@office ~/mu3e/online/online (git)-[odb++_api] % odbedit
[ODBEdit,INFO] Program ODBEdit on host localhost started
[local:Default:S]/>cd Equipment/Test/Settings/Test_odb_api/
[local:Default:S]Test_odb_api>ls
Divider                         1000
Enable                          y 

> > I am getting back false, which looks nice, but when I look into the ODB via the browser the value is actually "y" meaning true, which is strange. 
> > I attached my frontend where I cleaned all functions, leaving only the frontend_init() one where I create this key. It's a CUDA program, but since 
> > I cleaned everything, no CUDA function is called anymore.
          Reply  05 Jun 2020, Stefan Ritt, Suggestion, ODB++ API - documentation updates and odb view after key creation 
Hi Marius,

your fix is good. Thanks for digging out this deep-lying issue, which would have haunted us if we had not fixed it. 
The problem is that in midas, the "BOOL" type is 4 Bytes long, actually modelled after MS Windows. Now I realized
that in c++, the "bool" type is only 1 Byte wide. So if we do the memcopy from a "c++ bool" to a "MIDAS BOOL", we
always copy four bytes, meaning that we copy three Bytes beyond the one-byte value of the c++ bool. So your fix
is absolutely correct, and I added it in one more place where we deal with bool arrays, where we need the same.
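
To illustrate the mismatch, a standalone sketch (BOOL4 here just stands in for the
4-byte midas BOOL):

#include <cstdio>
#include <cstring>

typedef unsigned int BOOL4;   // 4 bytes, like the midas BOOL

int main()
{
   bool b = true;             // c++ bool: 1 byte
   BOOL4 B = 0;
   // copying sizeof(BOOL4) bytes reads 3 bytes past the 1-byte bool,
   // so whatever happens to be on the stack ends up in the upper bytes of B
   memcpy(&B, &b, sizeof(BOOL4));
   printf("sizeof(bool)=%zu sizeof(BOOL4)=%zu B=0x%08X\n",
          sizeof(bool), sizeof(BOOL4), B);
   return 0;
}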

What I don't understand however is the fact why this fails for you. The ODB values are stored in the C union under

union {
  ...
  bool m_bool;
  double m_double;
  std::string *m_string;
  ...
}

Now the C compiler puts all values at the lowest address, so m_bool is at offset zero, and the string pointer reaches
over all eight bytes (we are on 64-bit OS).

Now when I initialize this union in odbxx.h:66, I zero the string pointer which is the widest object:

  u_odb() : m_string{} {};

which (at least on my Mac) sets all eight bytes to zero. If I then use the wrong code to set the bool value to the ODB 
in odbxx.cxx:756, I do 

  db_set_data_index(... &u, rpc_tid_size(m_tid), ...);

so it copies four bytes (=rpc_tid_size(TID_BOOL)) to the ODB. The first byte should be the c++ bool value (0 or 1),
and the other three bytes should be zero from the initialization above. Apparently on your system, this is not
the case, and I would like you to double check it. Maybe there is another underlying problem which I don't understand
at the moment but which we better fix.

Otherwise the change is committed and your code should work. But we should not stop here! I really want to understand
why this is not working for you, maybe I miss something.

Best,
Stefan

> Hi Stefan,
> 
> your test program was only working for me after I changed the following lines inside odbxx.cxx
> 
> diff --git a/src/odbxx.cxx b/src/odbxx.cxx
> index 24b5a135..48edfd15 100644
> --- a/src/odbxx.cxx
> +++ b/src/odbxx.cxx
> @@ -753,7 +753,12 @@ namespace midas {
>           }
>        } else {
>           u_odb u = m_data[index];
> -         status = db_set_data_index(m_hDB, m_hKey, &u, rpc_tid_size(m_tid), index, m_tid);
> +         if (m_tid == TID_BOOL) {
> +             BOOL ss = bool(u);
> +             status = db_set_data_index(m_hDB, m_hKey, &ss, rpc_tid_size(m_tid), index, m_tid);
> +         } else {
> +             status = db_set_data_index(m_hDB, m_hKey, &u, rpc_tid_size(m_tid), index, m_tid);
> +         }
>           if (m_debug) {
>              std::string s;
>              u.get(s);
> 
> Likely not the best fix, but otherwise I was always getting the following after running the test program:
> 
> [ODBEdit,INFO] Program ODBEdit on host localhost started
> [local:Default:S]/>cd Equipment/Test/Settings/Test_odb_api/
> key not found
> makoeppe@office ~/mu3e/online/online (git)-[odb++_api] % test_connect
> Created ODB key /Equipment/Test/Settings
> Created ODB key /Equipment/Test/Settings/Test_odb_api
> Created ODB key /Equipment/Test/Settings/Test_odb_api/Divider
> Set ODB key "/Equipment/Test/Settings/Test_odb_api/Divider" = 1000
> Created ODB key /Equipment/Test/Settings/Test_odb_api/Enable
> Set ODB key "/Equipment/Test/Settings/Test_odb_api/Enable" = false
> Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api"
> Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Divider"
> Get ODB key "/Equipment/Test/Settings/Test_odb_api/Divider": 1000
> Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Enable"
> Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
> Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Divider"
> Get ODB key "/Equipment/Test/Settings/Test_odb_api/Divider": 1000
> Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Enable"
> Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
> Datagenerator Enable is Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
> false
> makoeppe@office ~/mu3e/online/online (git)-[odb++_api] % odbedit
> [ODBEdit,INFO] Program ODBEdit on host localhost started
> [local:Default:S]/>cd Equipment/Test/Settings/Test_odb_api/
> [local:Default:S]Test_odb_api>ls
> Divider                         1000
> Enable                          y 
> 
> > > I am getting back false, which looks nice, but when I look into the ODB via the browser the value is actually "y" meaning true, which is strange. 
> > > I attached my frontend where I cleaned all functions, leaving only the frontend_init() one where I create this key. It's a CUDA program, but since 
> > > I cleaned everything, no CUDA function is called anymore.
             Reply  08 Jun 2020, Marius Koeppel, Suggestion, ODB++ API - documentation updates and odb view after key creation 
Hi Stefan,

I agree with your explanation about the size of BOOL and bool. 

I also checked the program on my Raspberry Pi, and there the old code works like on your Mac. I don't really understand 
why the behavior is different on my system. The initialization of the union should also work on my system. 

At the moment I am using:

Arch Linux
Linux office 5.4.42-1-lts #1 SMP Wed, 20 May 2020 20:42:53 +0000 x86_64 GNU/Linux
gcc version 10.1.0 (GCC)

One thing which makes me a bit suspicious is that if I do:

u_odb u = m_data[index];
char dest[rpc_tid_size(m_tid)];
memcpy(dest, &u, rpc_tid_size(m_tid));

CLion tells me "Clang-Tidy: Undefined behavior, source object type 'midas::u_odb' is not TriviallyCopyable". 

I am not sure if this is the problem since I am not so familiar with TriviallyCopyable. I need to investigate further here.

So far, that's the update from my side.

Cheers,
Marius

> Hi Marius,
> 
> your fix is good. Thanks for digging out this deep-lying issue, which would have haunted us if we had not fixed it. 
> The problem is that in midas, the "BOOL" type is 4 Bytes long, actually modelled after MS Windows. Now I realized
> that in c++, the "bool" type is only 1 Byte wide. So if we do the memcopy from a "c++ bool" to a "MIDAS BOOL", we
> always copy four bytes, meaning that we copy three Bytes beyond the one-byte value of the c++ bool. So your fix
> is absolutely correct, and I added it in one more place where we deal with bool arrays, where we need the same.
> 
> What I don't understand however is the fact why this fails for you. The ODB values are stored in the C union under
> 
> union {
>   ...
>   bool m_bool;
>   double m_double;
>   std::string *m_string;
>   ...
> }
> 
> Now the C compiler puts all values at the lowest address, so m_bool is at offset zero, and the string pointer reaches
> over all eight bytes (we are on 64-bit OS).
> 
> Now when I initialize this union in odbxx.h:66, I zero the string pointer which is the widest object:
> 
>   u_odb() : m_string{} {};
> 
> which (at least on my Mac) sets all eight bytes to zero. If I then use the wrong code to set the bool value to the ODB 
> in odbxx.cxx:756, I do 
> 
>   db_set_data_index(... &u, rpc_tid_size(m_tid), ...);
> 
> so it copies four bytes (=rpc_tid_size(TID_BOOL)) to the ODB. The first byte should be the c++ bool value (0 or 1),
> and the other three bytes should be zero from the initialization above. Apparently on your system, this is not
> the case, and I would like you to double check it. Maybe there is another underlying problem which I don't understand
> at the moment but which we better fix.
> 
> Otherwise the change is committed and your code should work. But we should not stop here! I really want to understand
> why this is not working for you, maybe I miss something.
> 
> Best,
> Stefan
> 
> > Hi Stefan,
> > 
> > your test program was only working for me after I changed the following lines inside the odbxx.cpp
> > 
> > diff --git a/src/odbxx.cxx b/src/odbxx.cxx
> > index 24b5a135..48edfd15 100644
> > --- a/src/odbxx.cxx
> > +++ b/src/odbxx.cxx
> > @@ -753,7 +753,12 @@ namespace midas {
> >           }
> >        } else {
> >           u_odb u = m_data[index];
> > -         status = db_set_data_index(m_hDB, m_hKey, &u, rpc_tid_size(m_tid), index, m_tid);
> > +         if (m_tid == TID_BOOL) {
> > +             BOOL ss = bool(u);
> > +             status = db_set_data_index(m_hDB, m_hKey, &ss, rpc_tid_size(m_tid), index, m_tid);
> > +         } else {
> > +             status = db_set_data_index(m_hDB, m_hKey, &u, rpc_tid_size(m_tid), index, m_tid);
> > +         }
> >           if (m_debug) {
> >              std::string s;
> >              u.get(s);
> > 
> > Likely not the best fix but otherwise I was always getting after running the test program:
> > 
> > [ODBEdit,INFO] Program ODBEdit on host localhost started
> > [local:Default:S]/>cd Equipment/Test/Settings/Test_odb_api/
> > key not found
> > makoeppe@office ~/mu3e/online/online (git)-[odb++_api] % test_connect
> > Created ODB key /Equipment/Test/Settings
> > Created ODB key /Equipment/Test/Settings/Test_odb_api
> > Created ODB key /Equipment/Test/Settings/Test_odb_api/Divider
> > Set ODB key "/Equipment/Test/Settings/Test_odb_api/Divider" = 1000
> > Created ODB key /Equipment/Test/Settings/Test_odb_api/Enable
> > Set ODB key "/Equipment/Test/Settings/Test_odb_api/Enable" = false
> > Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api"
> > Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Divider"
> > Get ODB key "/Equipment/Test/Settings/Test_odb_api/Divider": 1000
> > Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Enable"
> > Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
> > Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Divider"
> > Get ODB key "/Equipment/Test/Settings/Test_odb_api/Divider": 1000
> > Get definition for ODB key "/Equipment/Test/Settings/Test_odb_api/Enable"
> > Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
> > Datagenerator Enable is Get ODB key "/Equipment/Test/Settings/Test_odb_api/Enable": false
> > false
> > makoeppe@office ~/mu3e/online/online (git)-[odb++_api] % odbedit
> > [ODBEdit,INFO] Program ODBEdit on host localhost started
> > [local:Default:S]/>cd Equipment/Test/Settings/Test_odb_api/
> > [local:Default:S]Test_odb_api>ls
> > Divider                         1000
> > Enable                          y 
> > 
> > > > I am getting back false. Which looks nice but when I look into the odb via the browser the value is actually "y" meaning true which is stange. 
> > > > I added my frontend where I cleaned all function leaving only the frontend_init() one where I create this key. Its a cuda program but since 
> > > > I clean everything no cuda function is called anymore.
                Reply  16 Jun 2020, Marius Koeppel, Suggestion, ODB++ API - documentation updates and odb view after key creation 
Hi Stefan,

I played around with the code a bit more and I found out that if I do:

midas::odb test_settings = {{"Enable", false}};
test_settings.connect("/Equipment/Test/Test", true);

The correct value ends up in the ODB. In this case a u_odb instance is created
with a clean m_string. But if I run the other code, an odb instance is created and
the values of m_data are set in

odbxx.h:
        odb(std::initializer_list<std::pair<const char *, midas::odb>> list) : odb() {...

These values are coming from u_odb instances, since the code does:

odbxx.h:
        auto o = new midas::odb(element.second);

and then

odbxx.h:
        odb(T v):odb() {
           m_num_values = 1;
           m_data = new u_odb[1]{v};
           m_tid = m_data[0].get_tid();
           m_data[0].set_parent(this);
        }

and looking at 

odbxx.h:
        u_odb(bool v) : m_bool{v}, m_tid{TID_BOOL}, m_parent_odb{nullptr} {};   

only m_bool is set for this instance, meaning that only the first byte gets a value 
(a C++ bool still being only 1 byte wide). If I check m_string inside the u_odb::get function
of this instance, I am getting for a bool (I set false) something like 0x7f6633f67a00, and for an int 
(I set it to 1000) 0x7f66000003e8. Since BOOL is larger than bool, I am getting the 
wrong value. I checked this also on openSUSE, with the same behavior.

Like you, I am not getting this problem on my Mac. What compiler flags do you use on your Mac?

Cheers,
Marius
                   Reply  23 Jun 2020, Stefan Ritt, Suggestion, ODB++ API - documentation updates and odb view after key creation 
Hi Marius,

thanks for your help, you identified the problematic location. I changed that to

   u_odb(bool v) : m_tid{TID_BOOL}, m_parent_odb{nullptr} {m_string = nullptr; m_bool = v;};

which should initialize the full 8 bytes of the u_odb union. I committed it to develop. Can you 
please give it a try?
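
For the record, a standalone sketch (simplified union, not the actual odbxx code) of why initializing the widest member first matters:

#include <cstdio>
#include <cstring>
#include <string>

union u_sketch {
   bool         m_bool;    // 1 byte
   std::string *m_string;  // 8 bytes on a 64-bit OS (widest member)
};

int main() {
   u_sketch u;
   u.m_string = nullptr;   // zero all 8 bytes first (the fix) ...
   u.m_bool   = true;      // ... then set the one-byte bool value
   unsigned char raw[sizeof u];
   std::memcpy(raw, &u, sizeof u);
   for (unsigned char c : raw)
      std::printf("%02x ", c);          // in practice: 01 00 00 00 00 00 00 00
   std::printf("\n");
   return 0;
}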

Best,
Stefan


> and looking at 
> 
> odbxx.h:
>         u_odb(bool v) : m_bool{v}, m_tid{TID_BOOL}, m_parent_odb{nullptr} {};   
> 
> only m_bool is set for this instance meaning that only the first byte gets a value 
> (still having only 1 byte for bool in c++). If I check m_string inside the u_odb::get function
>  of this instance I am getting for a bool (I set false) stuff like 0x7f6633f67a00 and for an int 
> (I set the int to 1000) 0x7f66000003e8. Since the size of BOOL is larger I am getting the 
> wrong value. I checked this also on openSUSE having the same behavior.
                      Reply  24 Jun 2020, Marius Koeppel, Suggestion, ODB++ API - documentation updates and odb view after key creation 
Hi Stefan,

now everything works well (tested on openSUSE and Arch Linux) :) 

Thank you for the fix.

Cheers,
Marius

> Hi Marius,
> 
> thanks for your help, you identified the problematic location. I changed that to
> 
>    u_odb(bool v) : m_tid{TID_BOOL}, m_parent_odb{nullptr} {m_string = nullptr; m_bool = v;};
> 
> which should initialize the full 8 bytes of the u_odb union. I committed to develop. Can you 
> please give it a try?
> 
> Best,
> Stefan
> 
> 
> > and looking at 
> > 
> > odbxx.h:
> >         u_odb(bool v) : m_bool{v}, m_tid{TID_BOOL}, m_parent_odb{nullptr} {};   
> > 
> > only m_bool is set for this instance meaning that only the first byte gets a value 
> > (still having only 1 byte for bool in c++). If I check m_string inside the u_odb::get function
> >  of this instance I am getting for a bool (I set false) stuff like 0x7f6633f67a00 and for an int 
> > (I set the int to 1000) 0x7f66000003e8. Since the size of BOOL is larger I am getting the 
> > wrong value. I checked this also on openSUSE having the same behavior.
Entry  19 Jun 2020, Isaac Labrie-Boulay, Info, Building/running a Frontend Task 
To build a frontend task, the user code and system code are compiled and linked 
together with the required libraries, by running a Makefile (e.g. 
../midas/examples/experiment/Makefile in the MIDAS package).

I tried building the CAMAC example frontend and I get this error:

g++: error: /home/rcmp/packages/midas/drivers/camac/ces8210.c: No such file or 
directory
g++: error: /home/rcmp/packages/midas/linux/lib/libmidas.a: No such file or 
directory
make: *** [camac_init.exe] Error 1

Obviously, I'm running the "make all" command from the camac directory. Why 
would I get this "no such file" error? Do I need to download the MIDAS packages 
inside my experiment directory?

Thanks for helping me out.

Isaac
Entry  18 Jun 2020, Ruslan Podviianiuk, Forum, ODB key length 
Hello,

I have a question about the length of ODB key names.
Is it possible to create an ODB key containing more than 32 characters?

Thanks.
Ruslan
    Reply  18 Jun 2020, Stefan Ritt, Forum, ODB key length 
No. But if you need more than 32 characters, you are doing something wrong. The 
information you want to put into the ODB key name should probably be stored in 
another string key instead.
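
For example, keep the key name short and put the long text into the key's value (an illustrative odbedit sketch; path and prompt are from memory, so check against your odbedit):

[local:Default:S]/>create STRING /Test/Description
String length [32]: 256
[local:Default:S]/>set /Test/Description "this value can hold far more than the 32 characters allowed in a key name"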

Stefan


> Hello,
> 
> I have a question about length of the name of ODB key.
> Is it possible to create an ODB key containing more than 32 characters?
> 
> Thanks.
> Ruslan
Entry  15 Jun 2020, Isaac Labrie Boulay, Bug Report, Killing and ODB - Removed ODB client because process pid does not exists 
Hey everyone,

When I run mhttpd I get the following error message:

[mhttpd,ERROR] [odb.cxx:1720:db_open_database,ERROR] Removed ODB client 
'mhttpd', index 0 because process pid 4531 does not exists
[mhttpd,INFO] Removed open record flag from "/Experiment/Security/RPC 
hosts/Allowed hosts"
[mhttpd,INFO] Removed exclusive access mode from "/Experiment/Security/RPC 
hosts/Allowed hosts"
[mhttpd,INFO] Removed open record flag from "/Experiment/Security/mhttpd 
hosts/Allowed hosts"
[mhttpd,INFO] Removed exclusive access mode from "/Experiment/Security/mhttpd 
hosts/Allowed hosts"
[mhttpd,INFO] Removed open record flag from "/Sequencer/State"
[mhttpd,INFO] Removed exclusive access mode from "/Sequencer/State"
[mhttpd,INFO] Corrected 3 ODB entries
[mhttpd,INFO] Deleted entry '/System/Clients/4531' for client 'mhttpd' because 
it is not connected to ODB
[mhttpd,INFO] Client 'mhttpd' on buffer 'SYSMSG' removed by bm_open_buffer 
because process pid 4531 does not exist
Mongoose web server will not use password protection
mongoose web server is listening on the HTTP port 8080

So mhttpd works, as I have access to it through my browser, but mlogger does not 
work when I try running it (Alarm: Program Logger is not running). I've 
managed to get mlogger working before, and I think the problem might be 
another instance of the ODB running without me knowing. 

Has anyone ever had this issue?

Thanks so much for your time.

Isaac
Entry  15 Jun 2020, Martin Mueller, Bug Report, deprecated function stime() 
Hi

I had a problem with the compilation of midas after an OS update to the recent version of openSUSE Tumbleweed. The function stime() in system.cxx:3196 is no longer available. 

In the documentation it is also marked as deprecated with the suggestion to use clock_settime instead:
https://man7.org/linux/man-pages/man2/stime.2.html

Replacing system.cxx:3196 with the clock_settime() method of system.cxx:3200-3204 also for OS_UNIX seems to solve the problem, but I'm not sure if this will cause problems on older OSes.
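
For reference, a minimal sketch of such a replacement (the function name is illustrative, not the actual system.cxx code):

#include <time.h>

int set_system_time(time_t seconds)
{
   struct timespec ts;
   ts.tv_sec  = seconds;
   ts.tv_nsec = 0;
   /* needs root or CAP_SYS_TIME, same as the old stime() */
   return clock_settime(CLOCK_REALTIME, &ts);
}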

Martin
    Reply  15 Jun 2020, Stefan Ritt, Bug Report, deprecated function stime() 
The function stime() has been replaced by clock_settime() in Feb. 2020:

https://bitbucket.org/tmidas/midas/commits/c732120e7c68bbcdbbc6236c1fe894c401d9bbbd

Please always pull before submitting bug reports.

Best,
Stefan
Entry  09 Jun 2020, Isaac Labrie Boulay, Info, Preparing the VME hardware - VME address jumpers. VME_address_jumpers_broken_link.PNG
Hey folks,

I'm currently working on setting up a MIDAS experiment and I am following the 
"Setup MIDAS experiment at Triumf" page on 
MidasWiki (https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF).

The 3rd line of the hardware checklist under the "Prepare VME hardware" section 
has a link to a page that doesn't exist anymore. I'm trying to figure out how to 
set up the VME address jumpers on the VME modules.

Does anyone know how to set up the VME modules? Or can anyone send me a link to 
instructions?

Thanks a lot for your time.

Isaac
    Reply  10 Jun 2020, Konstantin Olchanski, Info, Preparing the VME hardware - VME address jumpers. 
Hi, if you are not using any VME hardware, then you have no VME address jumpers to 
set. https://en.wikipedia.org/wiki/VMEbus

K.O.
       Reply  12 Jun 2020, Isaac Labrie Boulay, Info, Preparing the VME hardware - VME address jumpers. 
> Hi, if you are not using any VME hardware, then you have no VME address jumpers to 
> set. https://en.wikipedia.org/wiki/VMEbus
> 
> K.O.

Hi, thanks for taking the time to help me out. I am using a VME-MWS in this experiment.

Let me know what you think.

Isaac
Entry  10 Jun 2020, Ivo Schulthess, Forum, slow-control equipment crashes when running multi-threaded on a remote machine 
Dear all

To reduce the time needed by Midas between runs, we want to change some of our periodic equipment to multi-threaded slow-control equipment. To do that, I wanted to start from 
the slowcont example with the multi/hv class driver, the nulldev device driver, and the null bus driver. The example runs fine as it is on the local midas machine and also on remote 
machines. When adding the DF_MULTITHREAD flag to the device driver list, it no longer runs on remote machines but aborts with the following assertion:

scfe: /home/neutron/packages/midas/src/midas.cxx:1569: INT cm_get_path(char*, int): Assertion `_path_name.length() > 0' failed.

Running the frontend in GDB with a breakpoint at the exit leads to the following: 

(gdb) where
#0  0x00007ffff68d599f in raise () from /lib64/libc.so.6
#1  0x00007ffff68bfcf5 in abort () from /lib64/libc.so.6
#2  0x00007ffff68bfbc9 in __assert_fail_base.cold.0 () from /lib64/libc.so.6
#3  0x00007ffff68cde56 in __assert_fail () from /lib64/libc.so.6
#4  0x000000000041efbf in cm_get_path (path=0x7fffffffd060 "P\373g", path_size=256)
    at /home/neutron/packages/midas/src/midas.cxx:1563
#5  cm_get_path (path=path@entry=0x7fffffffd060 "P\373g", path_size=path_size@entry=256)
    at /home/neutron/packages/midas/src/midas.cxx:1563
#6  0x0000000000453dd8 in ss_semaphore_create (name=name@entry=0x7fffffffd2c0 "DD_Input", 
    semaphore_handle=semaphore_handle@entry=0x67f700 <multi_driver+96>)
    at /home/neutron/packages/midas/src/system.cxx:2340
#7  0x0000000000451d25 in device_driver (device_drv=0x67f6a0 <multi_driver>, cmd=<optimized out>)
    at /home/neutron/packages/midas/src/device_driver.cxx:155
#8  0x00000000004175f8 in multi_init(eqpmnt*) ()
#9  0x00000000004185c8 in cd_multi(int, eqpmnt*) ()
#10 0x000000000041c20c in initialize_equipment () at /home/neutron/packages/midas/src/mfe.cxx:827
#11 0x000000000040da60 in main (argc=1, argv=0x7fffffffda48)
    at /home/neutron/packages/midas/src/mfe.cxx:2757

I also tried to use the generic class driver, which results in the same. I am not sure if this is a problem of the multi-threaded frontend running on a remote machine, or 
something in our system that is not properly set up. Anyway, I am running out of ideas how to solve this and would appreciate any input. 

Thanks in advance,
Ivo
    Reply  10 Jun 2020, Konstantin Olchanski, Forum, slow-control equipment crashes when running multi-threaded on a remote machine 
Yes, it is supposed to crash. On a remote frontend, cm_get_path() cannot be used
(we are on a different computer; the filesystems may not be the same!), so the path is actually not set, and
a trap is triggered if something tries to use it (this is the crash you see).

The caller to cm_get_path() is a MIDAS semaphore function.

And I think there is a mistake here. It is unusual for the driver framework to use a semaphore. For multithreaded
protection inside the frontend, a mutex would normally be used (and mutexes do not use cm_get_path(), so
all is good). But if a semaphore is used, then all frontends running on the same computer become
serialized across the locked section. This is the right thing to do if you have multiple frontends
sharing the same hardware, e.g. a /dev/ttyUSB serial line, but why would a generic framework function
do this? I am not sure; I will have to take a look at why there is a semaphore and what it is locking/protecting.

(In midas, semaphores are normally used to protect global memory, such as ODB, or global resources, such as alarms,
against concurrent access by multiple programs, but of course that does not work for remote frontends,
since they are on a different computer! Semaphores only work locally; they do not work across the network!)

K.O.

> 
> scfe: /home/neutron/packages/midas/src/midas.cxx:1569: INT cm_get_path(char*, int): Assertion `_path_name.length() > 0' failed.
> 
> Running the frontend with GDB and set a breakpoint at the exit leads to the following: 
> 
> (gdb) where
> #0  0x00007ffff68d599f in raise () from /lib64/libc.so.6
> #1  0x00007ffff68bfcf5 in abort () from /lib64/libc.so.6
> #2  0x00007ffff68bfbc9 in __assert_fail_base.cold.0 () from /lib64/libc.so.6
> #3  0x00007ffff68cde56 in __assert_fail () from /lib64/libc.so.6
> #4  0x000000000041efbf in cm_get_path (path=0x7fffffffd060 "P\373g", path_size=256)
>     at /home/neutron/packages/midas/src/midas.cxx:1563
> #5  cm_get_path (path=path@entry=0x7fffffffd060 "P\373g", path_size=path_size@entry=256)
>     at /home/neutron/packages/midas/src/midas.cxx:1563
> #6  0x0000000000453dd8 in ss_semaphore_create (name=name@entry=0x7fffffffd2c0 "DD_Input", 
>     semaphore_handle=semaphore_handle@entry=0x67f700 <multi_driver+96>)
>     at /home/neutron/packages/midas/src/system.cxx:2340
> #7  0x0000000000451d25 in device_driver (device_drv=0x67f6a0 <multi_driver>, cmd=<optimized out>)
>     at /home/neutron/packages/midas/src/device_driver.cxx:155
> #8  0x00000000004175f8 in multi_init(eqpmnt*) ()
> #9  0x00000000004185c8 in cd_multi(int, eqpmnt*) ()
> #10 0x000000000041c20c in initialize_equipment () at /home/neutron/packages/midas/src/mfe.cxx:827
> #11 0x000000000040da60 in main (argc=1, argv=0x7fffffffda48)
>     at /home/neutron/packages/midas/src/mfe.cxx:2757
> 
> I also tried to use the generic class driver which results in the same. I am not sure if this is a problem of the multi-threaded frontend running on a remote machine or is it 
> something of our system which is not properly set up. Anyway I am running out of ideas how to solve this and would appreciate any input. 
> 
> Thanks in advance,
> Ivo
       Reply  10 Jun 2020, Stefan Ritt, Forum, slow-control equipment crashes when running multi-threaded on a remote machine 
Few comments:

- As KO writes, we might need semaphores also on a remote front-end, in case several programs share the same hardware. So it should work, and cm_get_path() should not just exit.

- When I wrote the multi-threaded device drivers, I did use semaphores instead of mutexes, but I forgot why. Might be that midas semaphores have a timeout and mutexes do not, or 
something along those lines.

- I do need either semaphores or mutexes, since in a multi-threaded slow-control front-end (too many dashes...) several threads have to access an internal data exchange buffer, which 
needs protection in multi-threaded environments.

So we can now either fix cm_get_path() or replace all semaphores with mutexes in midas/src/device_driver.cxx. I have kind of a feeling that we should do both. And what about 
switching to C++ std::mutex instead of pthread mutexes?
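
For what it's worth, a minimal sketch of the std::mutex variant (names are illustrative, not the actual device_driver.cxx code):

#include <cstddef>
#include <mutex>
#include <vector>

static std::mutex dd_mutex;           // protects the exchange buffer
static std::vector<float> dd_buffer;  // data shared between threads

void dd_store(float value)
{
   std::lock_guard<std::mutex> lock(dd_mutex);  // scoped lock, released on return
   dd_buffer.push_back(value);
}

float dd_fetch(std::size_t i)
{
   std::lock_guard<std::mutex> lock(dd_mutex);
   return dd_buffer.at(i);
}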

Stefan
          Reply  12 Jun 2020, Ivo Schulthess, Forum, slow-control equipment crashes when running multi-threaded on a remote machine 
Thank you two once again for the very fast answers. I tested the example on the local machine and it works perfectly fine. In the meantime I also created two new drivers for our devices 
and everything works with them; the improvement in time is significant, and I will create drivers for all our devices where possible. Once they are in a working state, I can also provide 
them to be added to the Midas drivers. Of course, if it were possible to run the front-end also on our remote machines, that would be even better. I am not experienced in any multi-threaded 
programming, but if I can provide any help or input, please let me know. 

Have a great weekend,
Ivo
Entry  04 Jun 2020, Lars Martin, Bug Report, midasodb.cxx RBA appends instead of replacing 
I am on branch develop and use the tmfe frontends. I found that a bool vector 
gets bigger every time I read it from the ODB.

Turns out in midasodb.cxx (as of commit 813f696, lines 478ff) the output vector 
"value" gets appended to without resizing.

Since after line 474 xvalue.size()==value.size() it would make sense to simply 
replace value->push_back() with value[i]= .
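
A sketch of the suggested change (paraphrased; the actual code is in midasodb.cxx around line 478):

// before: the output vector grows on every read
//    value->push_back(xvalue[i]);
// suggested: overwrite in place, since the sizes already match
for (size_t i = 0; i < xvalue.size(); i++)
   (*value)[i] = xvalue[i];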
Entry  30 May 2020, Gennaro Tortone, Bug Report, wrong run number 
Hi,
I built MIDAS and ROOTANA using the same tags (midas-2020-03-a, rootana-2020-03-a):

if I build the examples in ROOTANA, I get the wrong run number (always 0):

[root@lxgentor examples]# ./ana.exe -r9090

Using THttpServer in read/write mode
TMidasOnline::connect: Connecting to experiment "exo" on host 
"lxgentor.na.infn.it"
MVOdb::SetMidasStatus: Error: MIDAS db_get_value() at ODB path "//runinfo/Run 
number" returned status 312
Opened output file with name : output00000000.root
TDT724Waveform done init...... 
Create Histos
Create Histos
TMidasOnline::eventRequest: Event request: buffer "SYSTEM" (2), event id 
0xffffffff, trigger mask 0xffffffff, sample 2, request id: 0

it seems that some function tries to get "//runinfo/Run number" (double slash) 
instead of "/runinfo/Run number"...

Thanks in advance,
Gennaro
    Reply  30 May 2020, Thomas Lindner, Bug Report, wrong run number 
Hi,

I fixed this particular case, so that now I get the run number correctly.

But Konstantin will need to explain how this class is supposed to be used more generally.  The example programs are a mix: some need leading slashes and others do not:

Thomass-MacBook-Pro-3:rootana lindner$ grep -s 'runinfo/Run' */*.c*
libAnalyzer/TRootanaEventLoop.cxx:   fODB->RI("runinfo/Run number", &fCurrentRunNumber);
manalyzer/manalyzer.cxx:   int run_number = midas->odbReadInt("/runinfo/Run number");
manalyzer/manalyzer_v0.cxx:   int run_number = midas->odbReadInt("/runinfo/Run number");
old_analyzer/analyzer.cxx:   gOdb->RI("runinfo/Run number", &gRunNumber);

Cheers,
Thomas

> 
> Hi,
> I build MIDAS and ROOTANA using same tag (midas-2020-03-a, rootana-2020-03-a):
> 
> if I build examples in ROOTANA I got wrong run number (always 0):
> 
> [root@lxgentor examples]# ./ana.exe -r9090
> 
> Using THttpServer in read/write mode
> TMidasOnline::connect: Connecting to experiment "exo" on host 
> "lxgentor.na.infn.it"
> MVOdb::SetMidasStatus: Error: MIDAS db_get_value() at ODB path "//runinfo/Run 
> number" returned status 312
> Opened output file with name : output00000000.root
> TDT724Waveform done init...... 
> Create Histos
> Create Histos
> TMidasOnline::eventRequest: Event request: buffer "SYSTEM" (2), event id 
> 0xffffffff, trigger mask 0xffffffff, sample 2, request id: 0
> 
> it seems that some function try to get "//runinfo/Run number" (double slash) 
> instead of "/runinfo/Run number"...
> 
> Thanks in advance,
> Gennaro
       Reply  30 May 2020, Gennaro Tortone, Bug Report, wrong run number 
Hi,

thanks a lot for your grep... I temporarily fixed my local ROOTANA code with this:

diff --git a/libAnalyzer/TRootanaEventLoop.cxx b/libAnalyzer/TRootanaEventLoop.cxx
index 57111b6..90cf384 100644
--- a/libAnalyzer/TRootanaEventLoop.cxx
+++ b/libAnalyzer/TRootanaEventLoop.cxx
@@ -733,7 +733,7 @@ int TRootanaEventLoop::ProcessMidasOnline(TApplication*app, const char* hostname
    /* fill present run parameters */
 
    fCurrentRunNumber = 0;
-   fODB->RI("/runinfo/Run number", &fCurrentRunNumber);
+   fODB->RI("runinfo/Run number", &fCurrentRunNumber);
 
    //   if ((fODB->odbReadInt("/runinfo/State") == 3))
    //startRun(0,gRunNumber,0);

Regards,
Gennaro

> Hi,
> 
> I fixed this particular case, so that I now I get the run number correctly.
> 
> But Konstantin will need to explain how this class is supposed to be used more generally.  The example programs have a mix with sometimes needing leading slashes and other times not:
> 
> Thomass-MacBook-Pro-3:rootana lindner$ grep -s 'runinfo/Run' */*.c*
> libAnalyzer/TRootanaEventLoop.cxx:   fODB->RI("runinfo/Run number", &fCurrentRunNumber);
> manalyzer/manalyzer.cxx:   int run_number = midas->odbReadInt("/runinfo/Run number");
> manalyzer/manalyzer_v0.cxx:   int run_number = midas->odbReadInt("/runinfo/Run number");
> old_analyzer/analyzer.cxx:   gOdb->RI("runinfo/Run number", &gRunNumber);
> 
> Cheers,
> Thomas
> 
> > 
> > Hi,
> > I build MIDAS and ROOTANA using same tag (midas-2020-03-a, rootana-2020-03-a):
> > 
> > if I build examples in ROOTANA I got wrong run number (always 0):
> > 
> > [root@lxgentor examples]# ./ana.exe -r9090
> > 
> > Using THttpServer in read/write mode
> > TMidasOnline::connect: Connecting to experiment "exo" on host 
> > "lxgentor.na.infn.it"
> > MVOdb::SetMidasStatus: Error: MIDAS db_get_value() at ODB path "//runinfo/Run 
> > number" returned status 312
> > Opened output file with name : output00000000.root
> > TDT724Waveform done init...... 
> > Create Histos
> > Create Histos
> > TMidasOnline::eventRequest: Event request: buffer "SYSTEM" (2), event id 
> > 0xffffffff, trigger mask 0xffffffff, sample 2, request id: 0
> > 
> > it seems that some function try to get "//runinfo/Run number" (double slash) 
> > instead of "/runinfo/Run number"...
> > 
> > Thanks in advance,
> > Gennaro
       Reply  03 Jun 2020, Konstantin Olchanski, Bug Report, wrong run number 
> 
> But Konstantin will need to explain how this class is supposed to be used more generally.
>

MVOdb is a replacement for VirtualOdb. It has many functions that were missing in VirtualOdb
and it implements access to both XML and JSON ODB dumps.

>  The example programs have a mix with sometimes needing leading slashes and other times not:
> 
> libAnalyzer/TRootanaEventLoop.cxx:   fODB->RI("runinfo/Run number", &fCurrentRunNumber);
> old_analyzer/analyzer.cxx:   gOdb->RI("runinfo/Run number", &gRunNumber);

RI() is MVOdb, no absolute paths, leading "/" not permitted.

> manalyzer/manalyzer.cxx:   int run_number = midas->odbReadInt("/runinfo/Run number");
> manalyzer/manalyzer_v0.cxx:   int run_number = midas->odbReadInt("/runinfo/Run number");

Hmmm... good catch. These are VirtualOdb calls, but they bypass the VirtualOdb interface (which was removed)
and call the odb access methods directly from the TMidasOnline class. They should be replaced
with MVOdb RI() calls (and the leading "/" removed).

I was going to look at the TMidasOnline class next - many things need to be updated,
but it will have to wait until I update the MVOdb and the tmfe documentation and until
I update midasio to read and write the new bank32a data files.

K.O.
    Reply  03 Jun 2020, Konstantin Olchanski, Bug Report, wrong run number 
> I build MIDAS and ROOTANA using same tag (midas-2020-03-a, rootana-2020-03-a):
>
> MVOdb::SetMidasStatus: Error: MIDAS db_get_value() at ODB path "//runinfo/Run 
> number" returned status 312
>
> it seems that some function try to get "//runinfo/Run number" (double slash) 
> instead of "/runinfo/Run number"...
> 

You made a mistake somewhere.

rootana release rootana-2020-03 uses VirtualOdb, not MVOdb, so there should be no 
messages from "MVOdb". ODB path "/runinfo/run number" is correct for the 
VirtualOdb classes. MVOdb classes use relative paths; an absolute path starting with 
"/" is not permitted, hence the error.

You most likely are using the master branch of rootana.

Commit switching rootana from VirtualOdb to mvodb was made after the release 2020-03, in May:
https://bitbucket.org/tmidas/rootana/commits/522cd07181c59f557e9ef13a70223ec44be44bc9

(I confirm the incorrect call to RI("/runinfo/..."), Thomas already fixed it in 
the repository, big thanks!).

The dust has not fully settled yet on the refactoring of rootana; until then, I 
recommend that people use the release version(s).

K.O.
       Reply  03 Jun 2020, Gennaro Tortone, Bug Report, wrong run number 
> > I build MIDAS and ROOTANA using same tag (midas-2020-03-a, rootana-2020-03-a):
> >
> > MVOdb::SetMidasStatus: Error: MIDAS db_get_value() at ODB path "//runinfo/Run 
> > number" returned status 312
> >
> > it seems that some function try to get "//runinfo/Run number" (double slash) 
> > instead of "/runinfo/Run number"...
> > 
>
> You made a mistake somewhere.

you are right !
I used rootana-2020-03-a instead of release/rootana-2020-03... my fault !

I have to (re)compile MIDAS for the same error...

Thanks !
Gennaro

> 
> rootana release rootana-2020-03 uses VirtualOdb, not MVOdb, so there should be no 
> messages from "MVOdb". ODB path "/runinfo/run number" is correct for the 
> VirtualOdb classes. MVOdb classes use relative paths, absolute path starting from 
> "/" is not permitted, hence the error.
> 
> You most likely are using the master branch of rootana.
> 
> Commit switching rootana from VirtualOdb to mvodb was made after the release 2020-
> 03, in May:
> https://bitbucket.org/tmidas/rootana/commits/522cd07181c59f557e9ef13a70223ec44be44
> bc9
> 
> (I confirm the incorrect call to RI("/runinfo/..."), Thomas already fixed it in 
> the repository, big thanks!).
> 
> The dust is not fully settled yet on the refactoring of rootana, until then, I 
> recommend that people use the release version(s).
> 
> K.O.
          Reply  04 Jun 2020, Konstantin Olchanski, Bug Report, wrong run number 
> > You made a mistake somewhere.
> 
> you are right !
> I used rootana-2020-03-a instead of release/rootana-2020-03... my fault !
> 
> I have to (re)compile MIDAS for the same error...

The MIDAS version, including which branch you used, is reported on the midas "help" page and by 
the odbedit "version" command.

For example my midas reports:
Tue Mar 24 20:54:11 2020 -0700 - midas-2020-03-a-98-g8b462cc9 on branch develop

This version string includes:
date of commit
git tag and commit number (see "git describe")
"-dirty" if you have modified sources ("git status" shows modified files)
and which git branch you have (I have "develop", you should have "release/midas-2020-03")

I am not sure how ROOTANA reports the version and build strings. I shall check.

K.O.
Entry  04 Jun 2020, Lukas Gerritzen, , stime() deprecated in glibc 2.31 
In glibc 2.31, the stime function was deprecated:

* The obsolete function stime is no longer available to newly linked
  binaries, and its declaration has been removed from <time.h>.
  Programs that set the system time should use clock_settime instead.

https://sourceware.org/legacy-ml/libc-announce/2020/msg00001.html

This creates a problem in src/system.cxx:3197:4
Entry  04 Jun 2020, Hisataka YOSHIDA, Forum, Template of slow control frontend 
I'm a beginner with Midas, and I am trying to develop a slow control front-end with the latest Midas.
I found scfe.cxx in the examples, but it is not enough of a reference for writing a front-end for my own devices 
because it contains only the nulldevice and null bus driver case...
(I did succeed in running the HV front-end for the ISEG MPod, because its device driver already exists...)

Can I get some frontend examples for simple TCP/IP and/or RS232 devices?
Hopefully, I would like to have examples of both a frontend and a device driver.
(If any device driver included in the package is similar, please tell me.)

Thanks a lot.
    Reply  04 Jun 2020, Pintaudi Giorgio, Forum, Template of slow control frontend MIDAS_frontend_sample.zip
> I’m beginner of Midas, and trying to develop the slow control front-end with the latest Midas.
> I found the scfe.cxx in the “example”, but not enough to refer to write the front-end for my own devices 
> because it contains only nulldevice and null bus driver case...
> (I could have succeeded to run the HV front-end for ISEG MPod, because there is the device driver...)
> 
> Can I get some frontend examples such as simple TCP/IP and/or RS232 devices?
> Hopefully, I would like to have examples of frontend and device driver.
> (if any device driver which is included in the package is similar, please tell me.)
> 
> Thanks a lot.

Dear Yoshida-san,
my name is Giorgio and I am a Ph.D. student working on the T2K experiment.
I had to write many MIDAS frontends recently, so I think that my code could be of some help to you.

As you might already know, the MIDAS slow control system is structured into three layers/levels.

 - The highest layer is the "class" layer that directly interfaces with the user and the ODB. It is called 
"class" layer because it refers to a class of devices (for example all the high voltage power supplies, 
etc...). The idea is that in the same experiment you can have many different models of power supplies but 
all of them can be controlled with a single class driver.

 - Then there is the "device" layer that implements the functions specific to the particular device.

 - Finally, there is the "BUS" layer that directly communicates with the device. The BUS can be Ethernet 
(TCP/IP), Serial (RS-232 / RS-422 / RS-485), USB, etc ...

You can read more about the MIDAS slow control system here:
https://midas.triumf.ca/MidasWiki/index.php/Slow_Control_System

Anyway, you need to write code for all those layers. If you are lucky, you can reuse some of the already 
existing MIDAS code. Keep in mind that all the examples that you find in the MIDAS documentation and the 
MIDAS source code are written in C (even if it is then compiled with g++). But you can write a frontend in 
C++ without any problem, so choose whichever language you are most familiar with.
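
To make the layering concrete, here is a minimal sketch in the style of the MIDAS slowcont example (the equipment name, channel counts and period are illustrative, and nulldev/null stand in for a real device and bus driver):

#include "midas.h"
#include "mfe.h"
#include "multi.h"     /* class driver (class layer)   */
#include "nulldev.h"   /* device driver (device layer) */
#include "null.h"      /* bus driver (BUS layer)       */

/* the device driver list ties device drivers to a bus driver */
DEVICE_DRIVER multi_driver[] = {
   {"Input",  nulldev, 4, null, DF_INPUT},
   {"Output", nulldev, 4, null, DF_OUTPUT},
   {""}
};

EQUIPMENT equipment[] = {
   {"Environment",                  /* equipment name */
    {10, 0,                         /* event ID, trigger mask */
     "SYSTEM",                      /* event buffer */
     EQ_SLOW,                       /* slow-control equipment */
     0,                             /* event source */
     "FIXED",                       /* format */
     TRUE,                          /* enabled */
     RO_RUNNING | RO_TRANSITIONS,   /* read when running and on transitions */
     60000,                         /* read every 60 s */
     0, 0, 1,                       /* event limit, sub-events, log history */
     "", "", ""},
    cd_multi_read,                  /* readout routine of the class driver */
    cd_multi,                       /* class driver main routine */
    multi_driver},                  /* device driver list from above */
   {""}
};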

I am attaching an archive with some sample code directly taken from our experiment. It is just a small 
fraction of the code, not meant to be compilable. The code is released under the GPL3 license, so you can use 
it as you please, but if you do, please cite my name and the WAGASCI-T2K experiment somewhere visible.

In the archive, you can find two example frontends with the respective drivers. The "Triggers" frontend is 
written in C++ (or C+ if you consider that the mfe.cxx API is very C-like). The "WaterLevel" frontend is 
written in plain C. The "Triggers" frontend controls our trigger board called CCC and the "WaterLevel" 
frontend controls our water level sensors called PicoLog 1012. They share a custom implementation of the 
TCP/IP bus. Anyway, this is not relevant to you. You may just want to take a look at the code structure.

Finally, recently there have been some very interesting developments regarding the ODB C++ API. I would 
definitely take a look at that. I wish I had that when I was developing these frontends.

Good luck

--
Pintaudi Giorgio, Ph.D. student
Neutrino and Particle Physics Minamino Laboratory
Faculty of Science and Engineering, Yokohama National University
giorgio.pintaudi.kx@ynu.jp
TEL +81(0)45-339-4182
       Reply  04 Jun 2020, Hisataka YOSHIDA, Forum, Template of slow control frontend 
Dear Giorgio,

Thank you very much for your kind and quick reply!

I appreciate you giving me such a nice explanation, your experience, and great sample code (this is what I desired!).
It is all useful for me. I will try to write my frontend code using this gift from you.

Thank you again!

Best regards,
Hisataka Yoshida
    Reply  04 Jun 2020, Stefan Ritt, Forum, Template of slow control frontend 
> I’m beginner of Midas, and trying to develop the slow control front-end with the latest Midas.
> I found the scfe.cxx in the “example”, but not enough to refer to write the front-end for my own devices 
> because it contains only nulldevice and null bus driver case...
> (I could have succeeded to run the HV front-end for ISEG MPod, because there is the device driver...)
> 
> Can I get some frontend examples such as simple TCP/IP and/or RS232 devices?
> Hopefully, I would like to have examples of frontend and device driver.
> (if any device driver which is included in the package is similar, please tell me.)

Have you checked the documentation?

https://midas.triumf.ca/MidasWiki/index.php/Slow_Control_System

Basically you have to replace the nulldevice driver with a "real" driver. You find all existing drivers under 
midas/drivers/device. If your favourite is not there, you have to write it. Use one which is close to the one 
you need and modify it.

Best,
Stefan
       Reply  04 Jun 2020, Hisataka YOSHIDA, Forum, Template of slow control frontend 
Dear Stefan,

Thank you for your quick reply.


> Have you checked the documentation?
> 
> https://midas.triumf.ca/MidasWiki/index.php/Slow_Control_System

Yes, I have read the wiki, but it is not easy to figure out how to treat my individual case.

> Basically you have to replace the nulldevice driver with a "real" driver. You find all existing drivers under 
> midas/drivers/device. If your favourite is not there, you have to write it. Use one which is close to the one 
> you need and modify it.

Okay, I will try to write drivers for my own devices using the existing drivers as a starting point.
(Maybe I can find some device drivers which use TCP/IP or RS232.)

Best regards,
Hisataka Yoshida
Entry  24 Apr 2020, Pintaudi Giorgio, Forum, API to read MIDAS format file 
Dear MIDAS people,
I need to borrow your wisdom for a bit.
I am developing a piece of software that should read the history data stored in a
.midas file (MIDAS format) and integrate it into the WAGASCI data quality output.
In other words, I need to read some temperature values stored in a .midas file and
compare them with the MPPC gains and check for temperature/gain dependence.
I see three possibilities:
  • write a custom parser in C++ using the instructions contained in the Mhformat page;
  • call the mhist program from within my application;
  • call the mhdump program from within my application;
Which solution do you think is the best?
Because there is no need for raw performance, if possible, I would like to write my application in Python3 but C++ is also an option.
    Reply  24 Apr 2020, Stefan Ritt, Forum, API to read MIDAS format file 
I guess all three options would work. I just tried mhist and it still works with the "FILE" history

mhist -e <equipment name> -v <variable name> -h 10

for dumping a variable for the last 10 hours.

I could not get mhdump to work with current history files; maybe it only works with "MIDAS" history and not "FILE" history (see https://midas.triumf.ca/MidasWiki/index.php/History_System#History_drivers). Maybe Konstantin, who wrote mhdump, has some idea.

Writing your own parser is certainly possible (even in Python), but of course more work.

Stefan
       Reply  24 Apr 2020, Pintaudi Giorgio, Forum, API to read MIDAS format file 

> I guess all three options would work. I just tried mhist and it still works with the "FILE" history
> 
> mhist -e <equipment name> -v <variable name> -h 10
> 
> for dumping a variable for the last 10 hours.
> 
> I could not get mhdump to work with current history files, maybe it only works with "MIDAS" history and not "FILE" history (see https://midas.triumf.ca/MidasWiki/index.php/History_System#History_drivers). Maybe Konstantin who wrote mhdump has some idea.
> 
> Writing your own parser is certainly possible (even in Python), but of course more work.
> 
> Stefan


Dear Stefan,
thank you very much for the quick reply. Sorry if my message was not very clear: actually, we are using the "MIDAS" history format and not the "FILE" one. So both mhist and mhdump should be OK (however, I have only tested mhist).
Hypothetically, which of the two lends itself better to being "batched"? I mean, to being read and controlled by a program/routine. For example, some programs give the option to have the output formatted in JSON, etc...
          Reply  24 Apr 2020, Stefan Ritt, Forum, API to read MIDAS format file 

> Hypothetically which one between the two lends itself the better to being "batched"? I mean to be read and controlled by a program/routine. For example, some programs give the option to have the output formatted in json, etc...


Can't say off the top of my head. Both programs are pretty old (written well before JSON was invented, so there is no JSON support in either). mhist was written by me, mhdump by Konstantin. I would give both a try and see which you like more.

Stefan
    Reply  25 Apr 2020, Konstantin Olchanski, Forum, API to read MIDAS format file 
> Dear MIDAS people, I need to borrow your wisdom for a bit. I am developing a piece of 
> software that should read the history data stored in a .midas file (MIDAS format) and 
> integrate it into the WAGASCI data quality output. In other words, I need to read some 
> temperature values stored in a .midas file and compare them with the MPPC gains and 
> check for temperature/gain dependence. I see three possibilities: write a custom parser 
> in C++ using the instructions contained in the Mhformat page 
> (https://midas.triumf.ca/MidasWiki/index.php/Mhformat); call the mhist program from 
> within my application; call the mhdump program from within my application. 
> Which solution do you think is the best? Because there is no need for raw performance, 
> if possible, I would like to write my application in Python3 but C++ is also an option.

(Please write messages in plain text format, thank you)

The format of .hst midas history files is pretty simple, and mhdump.cxx is an easy-to-read 
illustration of how to read it from first principles (without going through the midas library, 
which can be somewhat complicated). The newer "FILE" format for history is even simpler 
to read, because it is just fixed-record-size binary data prepended by a text header.

You can also use the mh2sql program to import history data into an SQL database (mysql 
and sqlite should work) or to convert .hst files to "FILE" format files. This works well
for "archiving" history data, because the "FILE" format works better for looking at old data,
and for looking at data on "months" or "years" timescales.

Back to your question, you can certainly use "mhdump" as is, using a pipe (popen()), or 
you can package mhdump.cxx as a c++ class and use it in your application. If you go this 
route, your contribution of such a c++ class back to midas would be very welcome.
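
As an illustration of the popen() route (a sketch only; the file name is made up, and the exact output format should be checked against what mhdump actually prints):

#include <cstdio>

int main()
{
   /* run mhdump on one history file and read its text output line by line */
   FILE *p = popen("mhdump 130813.hst", "r");
   if (!p) { perror("popen"); return 1; }
   char line[1024];
   while (fgets(line, sizeof(line), p))
      fputs(line, stdout);   /* parse each record line here */
   pclose(p);
   return 0;
}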

You can also use mhist, but the mhist code cannot be trivially packaged as a c++ class
to use in your application.

You can also suggest that we write an easier-to-use history utility; we are always open to 
suggested improvements.

Let us know how it works out for you. Good luck!

K.O.
       Reply  03 May 2020, Pintaudi Giorgio, Forum, API to read MIDAS format file 
> The format of .hst midas history files is pretty simple and mhdump.cxx is an easy to read 
> illustration on how to read it from basic principles (without going through the midas library, 
> which can be somewhat complicated). The newer "FILE" format for history is even simpler 
> to read because it is just fixed-record-size binary data prepended by a text header.
> 
> You can also use the mh2sql program to import history data into an sql database (mysql 
> and sqlite should work) or to convert .hst files to "FILE" format files. This works well
> for "archiving" history data, because the "FILE" format works better for looking at old data,
> and for looking at data in "months" or "years" timescale.
> 
> Back to your question, you can certainly use "mhdump" as is, using a pipe (popen()), or 
> you can package mhdump.cxx as a c++ class and use it in your application. If you go this 
> route, your contribution of such a c++ class back to midas would be very welcome.
> 
> You can also use mhist, but the mhist code cannot be trivially packaged as a c++ class
> to use in your application.
> 
> You can also suggest that we write an easier to use history utility, we are always open to 
> suggested improvements.
> 
> Let us know how it works out for you. Good luck!
> 
> K.O.

Dear Konstantin,
thank you very much for the wealth of information you provided.
I have thought about it and I see two options:

- One is to convert to SQL format and then use a SQLite library to import the data into my 
application.

- The other is to encapsulate the mhdump.cxx code into a C++ class, as you say.

I am leaning towards the first option for three reasons:
1. I have never used a SQLite database, so it is a good learning opportunity for me.
2. The SQLite database format is very well known and widespread, so there are tons of tools to 
handle it.
3. I have taken a look at the mhdump.cxx source code and I think it is a beautiful piece of code, 
but it has a very "functional" taste with little encapsulation. Basically, all the fun happens 
inside the readHstFile function and there is no trivial way to get the data out of it. I don't mean 
that it would be difficult to wrap it in a C++ class, but I feel that I can learn more by going 
the SQL way.

PS: Some time ago, you or Stefan (I don't remember which) recommended CLion as a C++ IDE. I have tried it 
(together with PyCharm) and I must admit that it is really good. It took me years to configure Emacs 
as an IDE, while it took me minutes to get much better results in CLion. Thank you very much for 
your recommendation.
          Reply  03 May 2020, Konstantin Olchanski, Forum, API to read MIDAS format file 
>
> - One is to convert to SQL format and then use a SQLite library to import the data in my 
> application.
> 

You can also configure midas to write history directly to an SQLite database. I have not used
it recently, but it should still work. In terms of efficiency, the sqlite file size is about the same
as that of .hst files. The sqlite file and table naming is similar to the SQL and FILE implementations.

(But note that back when I implemented the SQLite history writer, the sqlite database corruption
recovery instructions were "delete the file, restore from backup". And indeed, in every test
experiment I tried, the sqlite history databases eventually corrupted themselves. You see the
same thing with google-chrome: lots of sqlite errors (bad locking, corrupted table, etc)
in its terminal output.)

>
> - The other is to encapsulate the mhdump.cxx code into a C++ class, as you say.
> 

If I were to write this today, there would be a C++ class that takes a history file,
iterates over all records and calls "callback" classlets. You can see this style in history.h
(HistoryBufferInterface) and in tmfe.h (RpcHandlerInterface, etc).

I think this style of OO programming originally comes from Java. If you so desire,
an "mhdump" class could be a nice way to learn it.

> 
> PS some time ago, I don't remember if you or Stefan, recommended CLion as C++ IDE. I have tried it 
> (together with PyCharm) and I must admit that it is really good. It took me years to configure Emacs 
> as a IDE, while it took me minutes to have much better results in CLion. Thank you very much for 
> your recommendation.
>

I remember, years ago, the Borland TurboC IDE was like a gift from Gods. But today, I think IDEs have
declined in quality and usefulness. They clog the screen with too much eye candy and fluff, use hard
to read fonts and silly colours, insist on using tabs where I want spaces, reformat the text even as I type it,
and detract from productive work with distracting popups ("try this new function!", "let's upgrade now!").

For serious programming, I use emacs with minimal decorations. I can easily open 3 or 4 windows at the same
time and still have enough screen space left for a terminal to run "make". And it is the only editor that can
edit the same file in two or more windows at the same time. You do not know you need this until
you work on odb.cxx.

K.O.
             Reply  04 May 2020, Pintaudi Giorgio, Forum, API to read MIDAS format file 
> (But note that back when I implemented the SQLITE history writer, sqlite database corruption
> recovery instructions were "delete the file, restore from backup". And indeed in every test
> experiment I tried, the sqlite history databases eventually corrupted themselves. You see
> same thing with google-chrome, lots of sqlite errors (bad locking, corrupted table, etc)
> in it's terminal output).

Thank you for the info. But I do not quite understand the comment above.
Do you mean that there is something wrong with the SQLite library itself or with the way that MIDAS creates the SQLite 
database?
          Reply  03 May 2020, Stefan Ritt, Forum, API to read MIDAS format file ReferenceCardForMac.pdf
> PS some time ago, I don't remember if you or Stefan, recommended CLion as C++ IDE. I have tried it 
> (together with PyCharm) and I must admit that it is really good. It took me years to configure Emacs 
> as a IDE, while it took me minutes to have much better results in CLion. Thank you very much for 
> your recommendation.

Was probably me. I use it as my standard IDE and am quite happy with it. All the things KO likes with emacs, plus much 
more. Especially the CMake integration is nice, since you don't have to leave the IDE for editing, compiling and debugging. 
The tooltips the IDE gave me in the past months made me write code much better. So quite an opposite opinion compared 
with KO, but luckily this planet has space for all kinds of opinions. I made myself the cheat sheet attached, which lets me 
do things much faster. Maybe you can use it.

Stefan
             Reply  26 May 2020, Pintaudi Giorgio, Forum, API to read MIDAS format file 
Eventually, I settled on the SQLite format.
I could convert the MIDAS history files (.hst) to an SQLite
database (.sqlite3) using the utility mh2sql.
It worked out nicely, thank you for the advice.

However, as Konstantin predicted, I did notice some
database corruption when a couple of problematic .hst
files were read. I solved the issue by just deleting
those .hst files (I think they were empty anyway).

Now I am developing a piece of code to read the
database using the SOCI library and integrate it
into a TTree, but I think this is not relevant for MIDAS.

Thank you again for the discussion.
Entry  22 May 2020, Thomas Lindner, Bug Report, More trouble with openssl on macos 
For the record, here's my report of difficulties getting mongoose to compile with macos. This is similar to a 
problem reported before, but with slightly different error messages, so I put them here for posterity.

Setup: 
- macos: 10.13.6
- xcode: 9.2
- gcc: Thomass-MacBook-Pro-3:build lindner$ gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-
dir=/usr/include/c++/4.2.1
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.7.0
- midas: today's version

Start with the openssl version I had already installed.  cmake says 

-- Found OpenSSL: /usr/lib/libcrypto.dylib (found version "1.0.2s") 
-- MIDAS: Found OpenSSL version 1.0.2s

make install fails with this error:

[ 35%] Linking CXX executable mhttpd
cd /Users/lindner/packages/midas/build/progs && /usr/local/Cellar/cmake/3.15.0/bin/cmake -E 
cmake_link_script CMakeFiles/mhttpd.dir/link.txt --verbose=1
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++  -O2 -g -
DNDEBUG -Wl,-search_paths_first -Wl,-headerpad_max_install_names  CMakeFiles/mhttpd.dir/mhttpd.cxx.o 
CMakeFiles/mhttpd.dir/mongoose616.cxx.o CMakeFiles/mhttpd.dir/mgd.cxx.o 
CMakeFiles/mhttpd.dir/__/mscb/src/mscb.cxx.o  -o mhttpd ../libmidas.a /usr/lib/libssl.dylib 
/usr/lib/libcrypto.dylib -lz -lcurl -lsqlite3
Undefined symbols for architecture x86_64:
  "_SSL_CTX_set_psk_client_callback", referenced from:
      _mg_ssl_if_conn_init in mongoose616.cxx.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Use macports to upgrade to newer openssl

sudo port selfupdate
sudo port upgrade outdated

Now cmake says

-- Found OpenSSL: /usr/lib/libcrypto.dylib (found version "1.1.1g") 
-- MIDAS: Found OpenSSL version 1.1.1g

Error message is different now:

cd /Users/lindner/packages/midas/build/progs && /usr/local/Cellar/cmake/3.15.0/bin/cmake -E 
cmake_link_script CMakeFiles/mhttpd.dir/link.txt --verbose=1
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++  -O2 -g -
DNDEBUG -Wl,-search_paths_first -Wl,-headerpad_max_install_names  CMakeFiles/mhttpd.dir/mhttpd.cxx.o 
CMakeFiles/mhttpd.dir/mongoose616.cxx.o CMakeFiles/mhttpd.dir/mgd.cxx.o 
CMakeFiles/mhttpd.dir/__/mscb/src/mscb.cxx.o  -o mhttpd ../libmidas.a /usr/lib/libssl.dylib 
/usr/lib/libcrypto.dylib -lz -lcurl -lsqlite3
Undefined symbols for architecture x86_64:
  "_OPENSSL_init_ssl", referenced from:
      _mg_mgr_init_opt in mongoose616.cxx.o
      _mg_ssl_if_init in mongoose616.cxx.o
  "_SSL_CTX_set_options", referenced from:
      _mg_ssl_if_conn_init in mongoose616.cxx.o
  "_SSL_CTX_set_psk_client_callback", referenced from:
      _mg_ssl_if_conn_init in mongoose616.cxx.o
ld: symbol(s) not found for architecture x86_64

Fine.  Doing 'cmake -D NO_SSL=1 ..' to build still works fine; I will stick with that since I don't need SSL on my 
laptop.

Perhaps we should disable SSL by default on Macos? People may only have a ~50% chance of getting it to 
work.
    Reply  22 May 2020, Konstantin Olchanski, Bug Report, More trouble with openssl on macos 
> For the record, here's my report of difficulties getting mongoose to compile with macos. 
> -- MIDAS: Found OpenSSL version 1.0.2s
> -- MIDAS: Found OpenSSL version 1.1.1g
> ... [ all of them did not work ]

For the record, I get this on mac os 10.15.4 and it works.
-- MIDAS: Found OpenSSL version 1.1.1g

I think I am quite fed up with this openssl business, too.

What I will do in MIDAS is fix the mbedtls detection, add mbedtls instructions
in the documentation and remove openssl from the mhttpd build.

The result will be:
- the default build will have mhttpd without https support, which works in 100% of our use cases at TRIUMF.
- if users do not want to use an apache https proxy, they have to "git clone" mbedtls, build it, and rebuild mhttpd; then
they get https support, but for https certificate management (getting certificates, renewing them, etc.) they are
on their own.

Since mhttpd has no integration with certbot, automatic management of https certificates does not work,
so good luck again.

In theory, I can try to add certbot integration, but even the most basic tools are missing, for example, openssl
does not report certificate expiration dates (I guess I must write my own code to examine the certificate
and hope my idea of expiration matches their idea). Since I do not see certificate expiration dates, every day I could
blindly run "certbot renew" and restart openssl with the updated certificate (but I think openssl does
not have a "restart" function, so again, forget about it). Adding insult to injury, by default, certbot stores certificates
in a secret location in /etc where mhttpd cannot access them.

Bottom line: the powers-that-be messed up https certificate management, and until that is sorted out and is easy
to integrate with custom web servers, I can only recommend that mhttpd run behind an https proxy supported by the OS.
K.O.
Entry  16 Mar 2020, Konstantin Olchanski, Release, midas-2020-03-a 
midas-2020-03-a is here.

Accumulated changes and bug fixes since last tag midas-2019-09-i.

After this release, expect some instability on the develop branch as I commit the update of mhttpd to mongoose web server library 
version 6.16. More on that later.

To obtain this release, either checkout the top of branch release/midas-2020-03 (recommended)
or checkout the tag midas-2020-03-a.

K.O.
    Reply  22 May 2020, Konstantin Olchanski, Release, midas-2020-03-a 
> midas-2020-03-a is here.
> checkout the top of branch release/midas-2020-03 (recommended) or
> checkout the tag midas-2020-03-a.

Since the release of midas-2020-03, we are in a cycle of rapid development of midas,
with many changes made daily.

For production use, unless you rely on the latest changes and/or bug fixes, please use the midas-2020-03 release branch.

Some of the recent changes broke compatibility with ROOTANA.

The current ROOTANA release 2020-03 is meant to work with and is compatible with midas-2020-03. Going forward
we will try to keep releases of midas and rootana in "lock step".

K.O.
Entry  12 May 2020, Stefan Ritt, Info, New ODB++ API odbxx_test.cxx
Since the beginning of the lockdown I have been working hard on a new object-oriented interface to the online database ODB. I have the code now in an initial state where it is ready for 
testing and commenting. The basic idea is that there is an object midas::odb, which represents a value or a sub-tree in the ODB. Reading, writing and watching are done through this
object. To get started, the new API has to be included with

   #include <odbxx.hxx>

To create ODB values under a certain sub-directory, you can either create one key at a time like:

   midas::odb o;
   o.connect("/Test/Settings", true);   // this creates /Test/Settings
   o.set_auto_create(true);            // this turns on auto-creation
   o["Int32 Key"] = 1;                 // create all these keys with different types
   o["Double Key"] = 1.23;
   o["String Key"] = "Hello";

or you can create a whole sub-tree at once like:

  midas::odb o = {
    {"Int32 Key", 1},
    {"Double Key", 1.23},
    {"String Key", "Hello"},
    {"Subdir", {
      {"Another value", 1.2f}
    }}
  };
  o.connect("/Test/Settings");

To read and write to the ODB, just read and write to the odb object

   int i = o["Int32 Key"];
   o["Int32 Key"] = 42;
   std::cout << o << std::endl;

This works with basic types, strings, std::array and std::vector. Each read access to this object triggers an underlying read from the ODB, and each write access triggers a write to the 
ODB. To watch a value for changes in the ODB (the old db_watch() functionality), you can now use C++ lambdas like:

   o.watch([](midas::odb &o) {
      std::cout << "Value of key \"" + o.get_full_path() + "\" changed to " << o << std::endl;
   });
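
As an illustration of the std::vector support mentioned above, here is a minimal, hedged sketch (the key names are
invented, and the exact array syntax should be checked against the attached test program):

   midas::odb o;
   o.connect("/Test/Settings");
   o["Thresholds"] = std::vector<int>{10, 20, 30};  // creates an ODB integer array of size 3
   o["Thresholds"][1] = 25;                         // writes a single array element
   std::vector<int> v = o["Thresholds"];            // reads the whole array back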

Attached is a full running example, which is now also part of the midas repository. I have tested most things, but would not yet use it in a production environment. I am not
100% sure if there are any memory leaks. If someone could valgrind the test program, I would appreciate it (valgrind currently does not work on my Mac).

Have fun!

Stefan

  
    Reply  20 May 2020, Konstantin Olchanski, Info, New ODB++ API 
>    midas::odb o;
>    o["foo"] = 1;

This is an excellent development.

ODB is a tree-structured database, JSON is a tree-structured data format,
and they seem to fit together like hand and glove. For programming
web pages, Javascript and JSON-style access to ODB seems to work really well.

And now with modern C++ we can have a similar API for working with ODB tree data,
as if it were Javascript JSON tree data.

Let's see how well it works in practice!

K.O.
    Reply  20 May 2020, Stefan Ritt, Info, New ODB++ API odbxx_test.cxx
In the meantime, there have been minor changes and improvements to the API:

Previously, we had:

>    midas::odb o;
>    o.connect("/Test/Settings", true);   // this creates /Test/Settings
>    o.set_auto_create(true);            // this turns on auto-creation
>    o["Int32 Key"] = 1;                 // create all these keys with different types
>    o["Double Key"] = 1.23;
>    o["String Key"] = "Hello";

Now, we only need:

      o.connect("/Test/Settings");
      o["Int32 Key"] = 1;                 // create all these keys with different types
      ...

no "true" needed any more. If the ODB tree does not exist, it gets created. Similarly, set_auto_create() can be dropped, it's on by default (thought this makes more sense). Also the iteration over subkeys has 
been changed slightly.
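
As a hedged sketch, the new subkey iteration presumably looks something like this (based on the current odbxx
headers; check the attached example for the authoritative syntax):

   midas::odb o;
   o.connect("/Test/Settings");
   for (midas::odb& subkey : o)   // iterate over all subkeys of the sub-tree
      std::cout << subkey.get_name() << " = " << subkey << std::endl;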

The full example attached has been updated accordingly. 

Best,
Stefan
    Reply  20 May 2020, Pintaudi Giorgio, Info, New ODB++ API 
All this is very good news. I really wish this had been available some months ago: it would have helped me immensely. The old C API was clunky at best.
I really like the idea and am looking forward to using it (even if at the moment I do not have the need to) ...
       Reply  20 May 2020, Konstantin Olchanski, Info, New ODB++ API 
> All this is very good news. I really wish this were available some months ago: it would have helped me immensely. The old C API was clunky at best.
> I really like the idea and looking forward to using it (even if at the moment I do not have the need to) ...

Yes, I have designed new C-style MIDAS ODB APIs twice now (VirtualOdb in ROOTANA and MVOdb in ROOTANA and MIDAS),
and I was never happy with the results. There are too many corner cases and odd behaviours. Let's see how
this C++ interface shakes out.

For use in analyzers, Stefan's C++ interface still needs to be virtualized - right now it has only one implementation
with the MIDAS ODB backend. In analyzers, we need XML, JSON (and a NULL ODB) backends. The API looks
to be clean enough to add this, but I have not looked at the implementation yet. So "watch this space" as they say.

K.O.
Entry  12 May 2020, Ruslan Podviianiuk, Forum, List of sequencer files 
Hello,

We are going to implement a list of sequencer files to allow users to select one
of them. The name of the selected file will be written to the
/ODB/Sequencer/State/Filename field of ODB.

Is it possible to get a list of Sequencer files from MIDAS? Is there a jrpc 
command for this?

Thanks.

Best,
Ruslan
    Reply  13 May 2020, Stefan Ritt, Forum, List of sequencer files Screenshot_2020-05-13_at_9.11.55_.png
If you load a file into the sequencer from the web interface, you get a list of all files in that directory. 
This basically gives you a list of possible sequencer files. It's even more powerful, since you can 
create subdirectories and thus group the sequencer files. Attached is an example from our
experiment.

Stefan
       Reply  18 May 2020, Ruslan Podviianiuk, Forum, List of sequencer files 
> If you load a file into the sequencer from the web interface, you get a list of all files in that directory. 
> This basically gives you a list of possible sequencer files. It's even more powerful, since you can 
> create subdirectories and thus group the sequencer files. Attached an example from our 
> experiment.
> 
> Stefan


Dear Stefan,

Thank you for the explanation.

Ruslan
       Reply  19 May 2020, Ruslan Podviianiuk, Forum, List of sequencer files 
> If you load a file into the sequencer from the web interface, you get a list of all files in that directory. 
> This basically gives you a list of possible sequencer files. It's even more powerful, since you can 
> create subdirectories and thus group the sequencer files. Attached an example from our 
> experiment.
> 
> Stefan

Dear Stefan,

Could you please answer one more question:

We have a custom webpage and are trying to get the list of sequencer files from it, so we need a jrpc command to show the list
on the custom page. Is there a jrpc command to get this file list?

Thanks,
          Reply  20 May 2020, Konstantin Olchanski, Forum, List of sequencer files 
> 
> We have a custom webpage and are trying to get the list of sequencer files from it, so we need a jrpc command to show the list
> on the custom page. Is there a jrpc command to get this file list?
> 

The rpc method used by the sequencer web pages is "seq_list_files". For how to use it, see resources/load_script.html.

To see list of all rpc methods implemented by your mhttpd, see "help"->"json-rpc schema, text format".

As a general explanation, so far we have successfully resisted the desire to turn mhttpd into a generic NFS file
server - if we automatically give all web pages access to all files accessible to the midas user account, it is easy
to lose control of system security (i.e. bad things will happen if web pages can read the ssh private keys ~/.ssh/id_rsa and
modify ~/.ssh/authorized_keys). Generally it is impossible to come up with a whitelist or blacklist of "secrets" that
need to be "hidden" from web pages. But we did implement methods to access files from specific subdirectory trees
defined in ODB which hopefully do not contain any "secrets".

K.O.
Entry  07 May 2020, Estelle, Bug Report, Conflic between Rootana and midas about the redefinition of TID_xxx data types  
Dear Midas and Rootana people,

We have tried to update our midas DAQ with the new TID definitions described in https://midas.triumf.ca/elog/Midas/1871

And we have noticed an incompatibility of these new definitions with Rootana when reading an XmlOdb in our offline analyzer.

The problem comes from the function FindArrayPath in XmlOdb.cxx and the comparison of bank types as strings.
Ex: comparing the strings "DWORD" and "UINT32"

A naive solution would be to print the number associated with the type (ex: '6' for DWORD/UINT32), but that would mean changing the Rootana and Midas source code. Moreover, it would decrease the readability of the XmlOdb file.
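
Another possible workaround (a hypothetical sketch, not code from Rootana or Midas) would be a small compatibility shim that maps both the old and the new type names to the same TID number before comparing:

   #include <map>
   #include <string>

   // hypothetical helper: treat the old ("DWORD") and new ("UINT32") type names as equal
   static bool SameMidasType(const std::string& a, const std::string& b)
   {
      static const std::map<std::string, int> tid = {
         { "DWORD", 6 }, { "UINT32", 6 },   // TID 6, old and new name
         { "INT",   7 }, { "INT32",  7 },   // TID 7, old and new name
         // ... and so on for the remaining TID_xxx types
      };
      auto ia = tid.find(a);
      auto ib = tid.find(b);
      return ia != tid.end() && ib != tid.end() && ia->second == ib->second;
   }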


Thanks for your time.
Estelle
    Reply  20 May 2020, Konstantin Olchanski, Bug Report, Conflic between Rootana and midas about the redefinition of TID_xxx data types  
> Dear Midas and Rootana people,
> 
> We have tried to update our midas DAQ with the new TID definitions described in https://midas.triumf.ca/elog/Midas/1871
> 
> And we have noticed an incompatibility of these new definitions with Rootana when reading an XmlOdb in our offline analyzer.
> 
> The problem comes from the function FindArrayPath in XmlOdb.cxx and the comparison of bank types as strings.
> Ex: comparing the strings "DWORD" and "UINT32"
> 
> A naive solution would be to print the number associated with the type (ex: '6' for DWORD/UINT32), but that would mean changing the Rootana and Midas source code. Moreover, it would decrease the readability of the XmlOdb file.
> 

Hi, it is unfortunate that a change was made in MIDAS that is incompatible with existing analysis software. I shall update the ROOTANA package to deal with this ASAP.

K.O.
Entry  01 May 2020, Joseph McKenna, Forum, Taking MIDAS beyond 64 clients 

Hi all,

I have been experimenting with a frontend solution for my experiment 
(ALPHA). The intention is to replace how we log data from PCs running LabVIEW.

I am at the proof of concept stage. So far I have some promising 
performance, able to handle 10-100x more data in my test setup (the current
limitation is just network bandwidth; MIDAS is impressively efficient).

==========================================================================

Our experiment has many PCs using LabVIEW which all log to MIDAS; the
experiment has grown such that we need some sort of load balancing in our 
frontend.

The concept was to have a 'supervisor frontend' and an array of 'worker 
frontend' processes. 
-A LabVIEW client would connect to the supervisor, then be referred to a 
worker frontend for data logging. 
-The supervisor could start a 'worker frontend' process as the demand 
required.

To increase accountability within the experiment, I intend to have a 'worker 
frontend' per PC connecting. Then any rogue behavior would be clear from the 
MIDAS frontpage.

Presently there are around 20-30 of these LabVIEW PCs, but given how the group 
is growing, I want to be sure that my data logging solution will be viable 
for the next 5-10 years. With the increased use of single board computers, I 
chose the target of benchmarking up to 1000 worker frontends... but I quickly 
hit the '64 MAX CLIENTS' and '64 RPC CONNECTION' limit. Ok...

branching and updating these limits:
https://bitbucket.org/tmidas/midas/branch/experimental-beyond_64_clients

I have two commits. 
1. update the memory layout assertions and use MAX_CLIENTS as a variable
https://bitbucket.org/tmidas/midas/commits/302ce33c77860825730ce48849cb810cf366df96?at=experimental-beyond_64_clients
2. Change the MAX_CLIENTS and MAX_RPC_CONNECTION
https://bitbucket.org/tmidas/midas/commits/f15642eea16102636b4a15c84113309696ce3df1?at=experimental-beyond_64_clients

Unintended side effects:
I break compatibility of existing ODB files... the database layout has 
changed and I read my old ODB as corrupt. In my test setup I can start from 
scratch but this would be horrible for any existing experiment.

Edit: I noticed the 'make testdiff' pipeline is failing... it also fails locally... 
investigating

Early performance results:
In early tests, ~700 PCs logging 10 unique arrays of 10 doubles into 
Equipment variables in the ODB seems to perform well... All transactions 
from client PCs are finished within a couple of ms or less

==========================================================================

Questions:

Does the community here have strong opinions about increasing the 
MAX_CLIENTS and MAX_RPC_CONNECTION limits? 
Am I looking at this problem in a naive way?


Potential solutions other than increasing the MAX_CLIENTS limit:
-Make worker threads inside the supervisor (not a separate process), I am 
using TMFE, so I can dynamically create equipment. I have not yet taken a 
deep dive into how any multithreading is implemented
-One could have a round robin system to load balance between a limited pool 
of 'worker frontend' processes. I don't like this solution as I want to be
able to clearly see which client PCs have been set up to log too much data

==========================================================================
    Reply  01 May 2020, Stefan Ritt, Forum, Taking MIDAS beyond 64 clients 
Hi Joseph,

here some thoughts from my side:

- Breaking ODB compatibility in the master/develop midas branch is very bad, since almost all experiments worldwide are affected if they just blindly do a pull and want to recompile and rerun. Currently, 
even during our Corona crisis, still some experiments are running and monitored remotely.

- On the other hand, if we have to break compatibility, now is maybe a good time since most accelerators worldwide are off. But before doing so, I would like to get feedback from the main experiments 
around the world (MEG, T2K, g-2, DEAP besides ALPHA).

- Having a maximum of 64 clients was originally decided when memory was scarce. In the early days one had just a couple of megabytes of shared memory. Now this is not an issue any more, but I see 
another problem. The main status page gives a nice overview of the experiment. This only works because there is a limited number of midas clients and equipments. If we blow up to 1000+, the status 
page would be rather long and we have to scroll up and down forever. In such a scenario one would have at least to redesign the status and program pages. To start your experiment, you would have to 
click 1000 times to start each front-end, also not very practicable.

- Having 100's or 1000's of front-ends calls rather for a hierarchical design, like the LHC experiments have. That would be a major change of midas and cannot be done quickly. It would also result in 
much slower run start/stops.

- If you see limitations with your LabVIEW PCs, have you considered multi-threading on your front-ends? Note that the standard midas slow control system supports multithreaded devices 
(DF_MULTITHREAD). In MEG, we use about 800 microcontrollers via the MSCB protocol. They are grouped together and each group is a multithreaded device in the midas slow control lingo, meaning the 
group gets its own thread for control and readout in the midas frontend. This way, one group cannot slow down all other groups. There is one front-end for all groups, which can be started/stopped with 
a single click, it shows up just as one line in the status page, and still it's pretty fast. Have you considered such a scheme? So your LabVIEW PCs would not be individual front-ends, but just make a 
network connection to the midas front-end which then manages all LabVIEW PCs. The midas slow control system allows one to define custom commands (besides the usual read/write commands for slow 
control data), so you could maybe integrate all you need into that scheme.

Best,
Stefan

> 
> Hi all,
> I have been experimenting with a frontend solution for my experiment 
> (ALPHA). The intention is to replace how we log data from PCs running LabVIEW. 
> I am at the proof of concept stage. So far I have some promising 
> performance, able to handle 10-100x more data in my test setup (the current 
> limitation is just network bandwidth; MIDAS is impressively efficient). 
> ==========================================================================
> Our experiment has many PCs using LabVIEW which all log to MIDAS, the 
> experiment has grown such that we need some sort of load balancing in our 
> frontend.
> The concept was to have a 'supervisor frontend' and an array of 'worker 
> frontend' processes. 
> -A LabVIEW client would connect to the supervisor, then be referred to a 
> worker frontend for data logging. 
> -The supervisor could start a 'worker frontend' process as the demand 
> required.
> To increase accountability within the experiment, I intend to have a 'worker 
> frontend' per PC connecting. Then any rogue behavior would be clear from the 
> MIDAS frontpage.
> Presently there are around 20-30 of these LabVIEW PCs, but given how the group 
> is growing, I want to be sure that my data logging solution will be viable 
> for the next 5-10 years. With the increased use of single board computers, I 
> chose the target of benchmarking up to 1000 worker frontends... but I quickly 
> hit the '64 MAX CLIENTS' and '64 RPC CONNECTION' limit. Ok...
> branching and updating these limits:
> https://bitbucket.org/tmidas/midas/branch/experimental-beyond_64_clients
> I have two commits. 
> 1. update the memory layout assertions and use MAX_CLIENTS as a variable
> https://bitbucket.org/tmidas/midas/commits/302ce33c77860825730ce48849cb810cf366df96?at=experimental-beyond_64_clients
> 2. Change the MAX_CLIENTS and MAX_RPC_CONNECTION
> https://bitbucket.org/tmidas/midas/commits/f15642eea16102636b4a15c84113309696ce3df1?at=experimental-beyond_64_clients
> Unintended side effects:
> I break compatibility of existing ODB files... the database layout has 
> changed and I read my old ODB as corrupt. In my test setup I can start from 
> scratch but this would be horrible for any existing experiment.
> Edit: I noticed the 'make testdiff' pipeline is failing... it also fails locally... investigating
> Early performance results:
> In early tests, ~700 PCs logging 10 unique arrays of 10 doubles into 
> Equipment variables in the ODB seems to perform well... All transactions 
> from client PCs are finished within a couple of ms or less
> ==========================================================================
> Questions:
> Does the community here have strong opinions about increasing the 
> MAX_CLIENTS and MAX_RPC_CONNECTION limits? 
> Am I looking at this problem in a naive way?
> 
> Potential solutions other than increasing the MAX_CLIENTS limit:
> -Make worker threads inside the supervisor (not a separate process), I am 
> using TMFE, so I can dynamically create equipment. I have not yet taken a 
> deep dive into how any multithreading is implemented
> -One could have a round robin system to load balance between a limited pool 
> of 'worker frontend' processes. I don't like this solution as I want to be 
> able to clearly see which client PCs have been set up to log too much data
> ==========================================================================
       Reply  01 May 2020, Pierre Gorel, Forum, Taking MIDAS beyond 64 clients 
> - On the other hand, if we have to break compatibility, now is maybe a good time since most accelerators worldwide are off. But before doing so, I would like to get feedback from the main experiments 
> around the world (MEG, T2K, g-2, DEAP besides ALPHA).

Hello Stefan,
For what it's worth, DEAP will not be impacted: as we have been taking data around the clock for the last few years, we froze the code running on the computers. We may have some window of opportunity for an upgrade in a few months, but such a move has not been discussed yet.

Best regards,
Pierre
          Reply  02 May 2020, Stefan Ritt, Forum, Taking MIDAS beyond 64 clients 
TRIUMF stayed quiet, probably they have other things to do.

I allowed myself to move the maximum number of clients back to its original value, in order not to break running experiments. 
This does not mean that the increase is a bad idea, we just have to be careful not to break running experiments. Let's discuss it
more thoroughly here before we make a decision in that direction.

Best regards,
Stefan
             Reply  02 May 2020, Joseph McKenna, Forum, Taking MIDAS beyond 64 clients 
Thank you very much for feedback.

I am satisfied with not changing the 64 client limit. I will look at re-writing my frontend to spawn threads rather than 
processes. The load of my frontend is low, so I do not anticipate issues with a threaded implementation. 

In this threaded scenario, it will be a reasonable amount of time until ALPHA bumps into the 64 client limit.

If it avoids confusion, I am happy for my experimental branch 'experimental-beyond_64_clients' to be deleted.

Perhaps an item for future discussion would be for the odbinit program to be able to 'upgrade' the ODB and enable some backwards 
compatibility.

Thanks again
Joseph
                Reply  02 May 2020, Stefan Ritt, Forum, Taking MIDAS beyond 64 clients 
> Perhaps an item for future discussion would be for the odbinit program to be able to 'upgrade' the ODB and enable some backwards 
> compatibility.

We had this discussion already a few times. There is an ODB version number (DATABASE_VERSION 3 in midas.h) which is intended for that. If we break the 
binary compatibility, programs should complain "ODB version has changed, please run ...", then odbinit (written by KO) should have a well-defined 
procedure to upgrade existing ODBs by re-creating them, but keeping all old contents. This should be tested on a few systems.

Stefan
    Reply  02 May 2020, Konstantin Olchanski, Forum, Taking MIDAS beyond 64 clients 
> 
> Does the community here have strong opinions about increasing the 
> MAX_CLIENTS and MAX_RPC_CONNECTION limits? 
> Am I looking at this problem in a naive way?
> 

I think MAX_CLIENTS set at 64 is on the low side for today.

And in the past, we did have experiments that did not work without increasing MAX_CLIENTS. I 
think T2K/ND280 needed MAX_CLIENTS bumped to about 100 (200?).

If ALPHA needs MAX_CLIENTS bigger than the default 64, nothing stops the experiment
from changing this number in the local copy of MIDAS.

It is not necessary to change it in the central repository for everybody.
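
For concreteness, such a local change would look something like this (a sketch with illustrative values; note that
all midas programs of the experiment must be rebuilt against the same values, since these constants define the
shared memory layout):

   /* in the local copy of the midas headers
      (MAX_RPC_CONNECTION may live in a different header than MAX_CLIENTS) */
   #define MAX_CLIENTS          64   /* bump e.g. to 128 for a large experiment */
   #define MAX_RPC_CONNECTION   64   /* usually bumped together with MAX_CLIENTS */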

K.O.
       Reply  02 May 2020, Konstantin Olchanski, Forum, Taking MIDAS beyond 64 clients 
> > 
> > Does the community here have strong opinions about increasing the 
> > MAX_CLIENTS and MAX_RPC_CONNECTION limits? 
> > Am I looking at this problem in a naive way?
> > 

The issue is binary compatibility.

MIDAS has been binary compatible with itself for a long time, 20 years now, easily.

If we are to give this up, we must gain more than we lose.

On the technical level, bumping MAX_CLIENTS from 64 to 100 gives us nothing. Tomorrow an experiment
will come along asking for 101 clients. Any number you pick, it is too small for somebody. And MIDAS
already has a solution for this: edit midas.h, hit make, done.

If we are to break binary compatibility, we should go big. Remove these limits completely!

Move the MAX_CLIENTS & co fixed-size arrays out of the headers in ODB and in event buffers; put
them where they can be resized as needed.

That's a binary-compatibility breaking solution I would vote for.

K.O.
          Reply  02 May 2020, Konstantin Olchanski, Forum, Taking MIDAS beyond 64 clients 
> > > 
> > > Does the community here have strong opinions about increasing the 
> > > MAX_CLIENTS and MAX_RPC_CONNECTION limits? 
> > > Am I looking at this problem in a naive way?
> > > 

The issue is: how to organize an experiment? how many frontends should I have?

There are two extremes:

- collect all data in 1 frontend (and today with c++ threads and c++ ring buffers, this is trivial)
- instantiate 1 frontend for each data source. (for example, ALPHA-g detector has 8 ADCs, 64 PWBs plus some
small fish. No that's wrong. Each ADC looks like 48 individual data sources, each PWB looks like 4 data sources,
so this would be 8*48+4*64=640 data sources, could be 640 frontends easily, plus small fish).

Which way is best? Every experiment is different, but consider simple things:

640 frontends writing into 1 event buffer will probably cause large contention for the event buffer lock. bad.
640 frontends running on a 4 core CPU will probably cause unhappiness in the OS. bad.
starting and stopping 640 frontends requires some scripting, monitoring that they all still run, etc. extra work. bad.
640 frontends on the midas status page? your cell phone web browser will explode. bad.

What I am saying is - arbitrary limits are good for you. They make you think about what is going on before throwing 
resources at the problem.

K.O.
Entry  03 Apr 2020, Francesco Renga, Info, CLOCK_REALTIME on MacOS 
Dear all,
       I'm trying to compile MIDAS on MacOS 10.10 and I get this error:

/Users/francesco/MIDAS/midas/src/system.cxx:3187:18: error: use of undeclared identifier 
'CLOCK_REALTIME'
   clock_settime(CLOCK_REALTIME, &ltm);

Is it related to my (old) version of MacOS? Can I fix it somehow?

Thank you,
      Francesco
    Reply  03 Apr 2020, Stefan Ritt, Info, CLOCK_REALTIME on MacOS 
> Dear all,
>        I'm trying to compile MIDAS on MacOS 10.10 and I get this error:
> 
> /Users/francesco/MIDAS/midas/src/system.cxx:3187:18: error: use of undeclared identifier 
> 'CLOCK_REALTIME'
>    clock_settime(CLOCK_REALTIME, &ltm);
> 
> Is it related to my (old) version of MacOS? Can I fix it somehow?
> 
> Thank you,
>       Francesco

If I see this correctly, you need at least MacOSX 10.12. If you can't upgrade, you can just remove line 3187 
from system.cxx. This function is only used in an online environment, where you would run a frontend on your 
Mac, which you probably don't do. So removing it does not hurt you.

Stefan
       Reply  25 Apr 2020, Konstantin Olchanski, Info, CLOCK_REALTIME on MacOS 
> > /Users/francesco/MIDAS/midas/src/system.cxx:3187:18: error: use of undeclared identifier 
> > 'CLOCK_REALTIME'
> >    clock_settime(CLOCK_REALTIME, &ltm);
> > 
> > Is it related to my (old) version of MacOS? Can I fix it somehow?

I think the "set clock" function is a holdover from embedded operating systems
that did not keep track of clock time, i.e. VxWorks, and similar. Here a midas program
will get the time from the mserver and set it on the local system. Poor man's ntp,
poor man's ntpd/chronyd.

We should check if this function is called by anything, and if nothing calls it, maybe remove it?

K.O.
          Reply  26 Apr 2020, Stefan Ritt, Info, CLOCK_REALTIME on MacOS 
> > > /Users/francesco/MIDAS/midas/src/system.cxx:3187:18: error: use of undeclared identifier 
> > > 'CLOCK_REALTIME'
> > >    clock_settime(CLOCK_REALTIME, &ltm);
> > > 
> > > Is it related to my (old) version of MacOS? Can I fix it somehow?
> 
> I think the "set clock" function is a holdover from embedded operating systems
> that did not keep track of clock time, i.e. VxWorks, and similar. Here a midas program
> will get the time from the mserver and set it on the local system. Poor man's ntp,
> poor man's ntpd/chronyd.
> 
> We should check if this function is called by anything, and if nothing calls it, maybe remove it?
> 
> K.O.

It's called in mfe.cxx via cm_synchronize:

/* set time from server */
#ifdef OS_VXWORKS
   cm_synchronize(NULL);
#endif

This was for old VxWorks systems which had no ntp/crond. It was asked for by Pierre a long time ago. I don't use it 
(have no VxWorks). We can either remove it completely, or remove just the MacOSX part and just exit the program 
if called with an error message "not implemented on this OS".

Stefan
Entry  26 Apr 2020, Yu Chen (SYSU), Forum, Questions and discussions on the Frontend ODB tree structure. 
Dear MIDAS developers and colleagues,

    This is Yu CHEN of the School of Physics, Sun Yat-sen University, China, working in the PandaX-III collaboration, an experiment under development to search for neutrinoless double 
beta decay. We are working on the DAQ and slow control systems and would like to use the Midas framework to communicate with the custom hardware systems, generally via Ethernet 
interfaces. So currently we are focusing on the development of the FRONTEND program of Midas and have some questions and discussion points on its ODB structure. Since I'm still not 
experienced with the framework, it would be precious if you could provide some suggestions on the topic.
    The current structure of the frontend ODB tree we have designed, together with our understanding on them, is as follows:
      /Equipment/<frontend_name>/
        ->  /Common/: Basic controls and left unchanged.
        ->  /Variables/: (ODB records with MODE_READ) Monitored values that are AUTOMATICALLY updated from the bank data within each packed event. It is done by the update_odb() 
function in mfe.cxx.
        ->  /Statistics/: (ODB records with MODE_WRITE) Default status values that are AUTOMATICALLY updated by the calling of db_send_changed_records() within the mfe.cxx.
        ->  /Settings/: All the user defined data can be located here. 
            -> /stat/: (ODB records with MODE_WRITE) All the monitored values as well as program internal status. The update operation is done simultaneously when 
db_send_changed_records() is called within the mfe.cxx.
            -> /set/: (ODB records with MODE_READ) All the “Control” data used to configure the custom program and custom hardware.

    For our application, some of our detector equipment outputs a small amount of status and monitoring data together with the event data, so we currently choose not to use EQ_SLOW 
and the 3-layer drivers for the readout. Our solution is to create two ODB sub-trees in /Settings/, similar to what the device_driver does. However, this could introduce two 
problems:
    1) For /Settings/stat/: To avoid potentially destroying the hot-links of the /Variables/ and /Statistics/ sub-trees, all our status and monitored data are stored separately in 
the /Settings/stat/ sub-tree. Another consideration is that some monitored data do not come directly from the raw data, so packaging them into the Bank for later display in /Variables/ 
could lead to a complicated readout() function. However, this solution may be complicated for history logging. I have found that the ANALYZER modules could provide some means 
of writing to the /Variables/ sub-tree, so I would like to know whether an analyzer should be used in this case.
    2) For /Settings/set/: The “control” data (similar to the “demand” data in the EQ_SLOW equipment) are currently put in several /Settings/set/ sub-trees where each key 
is hot-linked to a pre-defined hardware configuration function. However, some control operations are not related to a single ODB key or are related to several ODB keys (e.g. 
configuring the Ethernet sockets), so the dispatcher function would have to be assigned to the whole sub-tree, which I think can slow the response of the software. What we 
currently use is a dedicated “control key”, where different input values mean different operations (e.g. 1 means socket opening, 2 means sending the UDP 
packets to the target hardware, etc.). This “control key” is also used to implement the buttons shown on the Status/Custom webpage. However, we would like to have your 
suggestions or better solutions for that, considering the stability and fast response of the control.

    We are not sure whether the above understanding of and troubles with the Midas framework are correct, or whether they are just due to the limits of our knowledge of the framework, so we 
really appreciate your knowledge and help towards making better use of Midas. Thank you so much!
    Reply  26 Apr 2020, Stefan Ritt, Forum, Questions and discussions on the Frontend ODB tree structure. 
Dear Yu Chen,

in my opinion, you can follow two strategies:

1) Follow the EQ_SLOW example from the distribution, and write a device driver for the hardware you want to control. There are dozens of experiments worldwide which use that scheme and it works ok for lots of 
different devices. By doing that, you automatically get things like a write cache (only write a value to the actual device if it really has changed) and a dead band (only write measured data to the ODB if it changes by more 
than a certain value, to suppress noise). Furthermore, you get events created automatically and history for all your measured variables. This scheme might look complicated at first glance, and it's quite old-
fashioned (no C++ yet), but it has proven stable over the last ~20 years or so. Maybe the biggest advantage of this system is that each device gets its own readout and control thread, so one slow device cannot 
block other devices. When this was introduced more than 10 years ago, we saw a big improvement in all experiments.

2) Do everything by yourself. This is the way you describe below. Of course you are free to do whatever you like, and you will find "special" solutions for all your problems. But then you move away from the "standard 
scheme" and you lose all the benefits I described under 1), and of course you are on your own. If something does not work, it will be in your code and nobody from the community can help you.

So choose carefully between 1) and 2).

Best regards,
Stefan

> Dear MIDAS developers and colleagues,
> 
>     This is Yu CHEN of the School of Physics, Sun Yat-sen University, China, working in the PandaX-III collaboration, an experiment under development to search for neutrinoless double 
> beta decay. We are working on the DAQ and slow control systems and would like to use the Midas framework to communicate with the custom hardware systems, generally via Ethernet 
> interfaces. So currently we are focusing on the development of the FRONTEND program of Midas and have some questions and discussion points on its ODB structure. Since I'm still not 
> experienced with the framework, it would be precious if you could provide some suggestions on the topic.
>     The current structure of the frontend ODB tree we have designed, together with our understanding on them, is as follows:
>       /Equipment/<frontend_name>/
>         ->  /Common/: Basic controls and left unchanged.
>         ->  /Variables/: (ODB records with MODE_READ) Monitored values that are AUTOMATICALLY updated from the bank data within each packed event. It is done by the update_odb() 
> function in mfe.cxx.
>         ->  /Statistics/: (ODB records with MODE_WRITE) Default status values that are AUTOMATICALLY updated by the calling of db_send_changed_records() within the mfe.cxx.
>         ->  /Settings/: All the user defined data can be located here. 
>             -> /stat/: (ODB records with MODE_WRITE) All the monitored values as well as program internal status. The update operation is done simultaneously when 
> db_send_changed_records() is called within the mfe.cxx.
>             -> /set/: (ODB records with MODE_READ) All the “Control” data used to configure the custom program and custom hardware.
> 
>     For our application, some of our detector equipment outputs a small amount of status and monitoring data together with the event data, so we currently choose not to use EQ_SLOW 
> and the 3-layer drivers for the readout. Our solution is to create two ODB sub-trees in /Settings/, similar to what the device_driver does. However, this could introduce two 
> problems:
>     1) For /Settings/stat/: To avoid potentially destroying the hot-links of the /Variables/ and /Statistics/ sub-trees, all our status and monitored data are stored separately in 
> the /Settings/stat/ sub-tree. Another consideration is that some monitored data do not come directly from the raw data, so packaging them into the Bank for later display in /Variables/ 
> could lead to a complicated readout() function. However, this solution may be complicated for history logging. I have found that the ANALYZER modules could provide some means 
> of writing to the /Variables/ sub-tree, so I would like to know whether an analyzer should be used in this case.
>     2) For /Settings/set/: The “control” data (similar to the “demand” data in the EQ_SLOW equipment) are currently put in several /Settings/set/ sub-trees where each key 
> is hot-linked to a pre-defined hardware configuration function. However, some control operations are not related to a single ODB key or are related to several ODB keys (e.g. 
> configuring the Ethernet sockets), so the dispatcher function would have to be assigned to the whole sub-tree, which I think can slow the response of the software. What we 
> currently use is a dedicated “control key”, where different input values mean different operations (e.g. 1 means socket opening, 2 means sending the UDP 
> packets to the target hardware, etc.). This “control key” is also used to implement the buttons shown on the Status/Custom webpage. However, we would like to have your 
> suggestions or better solutions for that, considering the stability and fast response of the control.
> 
>     We are not sure whether the above understanding of and troubles with the Midas framework are correct, or whether they are just due to the limits of our knowledge of the framework, so we 
> really appreciate your knowledge and help towards making better use of Midas. Thank you so much!
Entry  25 Apr 2020, Konstantin Olchanski, Info, new mac! 
I received my new 2020 MacBook Air, so between Stefan and myself, MacOS support for 
MIDAS is assured for at least 5 more years. K.O.
Entry  07 Apr 2020, Ivo Schulthess, Suggestion, Sequencer loop break 
I am using the Midas sequencer to run subsequent measurements in a loop, without 
knowing the number of iterations in advance. Therefore, I am using the "infinity" 
option. Since I have other commands after the loop, it would be nice to have the 
possibility to break the loop, but let the sequencer then finish the rest of the 
commands. 
Cheers,
Ivo
    Reply  21 Apr 2020, Stefan Ritt, Suggestion, Sequencer loop break 
> I am using the Midas sequencer to run subsequent measurements in a loop, without 
> knowing how many iterations in advance. Therefore, I am using the "infinity" 
> option. Since I have other commands after the loop, it would be nice to have the 
> possibility to break the loop, but let the sequencer then finish the rest of the 
> commands. 
> Cheers,
> Ivo

You can do that with the "GOTO" statement, jumping to the first line after the loop.

Here is a working example:


LOOP runs, 5
     WAIT Seconds 3
     IF $runs > 2
         GOTO 7
     ENDIF
ENDLOOP
MESSAGE "Finished", 1

Best,
Stefan
       Reply  23 Apr 2020, Ivo Schulthess, Suggestion, Sequencer loop break 
> You can do that with the "GOTO" statement, jumping to the first line after the loop.
> 
> Here is a working example:
> 
> 
> LOOP runs, 5
>      WAIT Seconds 3
>      IF $runs > 2
>          GOTO 7
>      ENDIF
> ENDLOOP
> MESSAGE "Finished", 1
> 
> Best,
> Stefan

Hoi Stefan

Thanks for your answer. As I understand it, this has to be in the sequence script before 
running. So, in the end, it is no different from just saying "LOOP runs, 2", and 
therefore the number of runs has to be known in advance as well. Or is there an option to 
change the script at runtime? What I would like is to start a sequence with "LOOP runs, 
infinite" and, when I come back to the experiment after falling asleep, to be able to break 
the loop after the next iteration, but still execute everything after ENDLOOP, i.e. the 
MESSAGE statement in your example. Because if I do a "Stop after current run", this seems 
not to happen.

Best, Ivo
          Reply  23 Apr 2020, Stefan Ritt, Suggestion, Sequencer loop break 
> > You can do that with the "GOTO" statement, jumping to the first line after the loop.
> > 
> > Here is a working example:
> > 
> > 
> > LOOP runs, 5
> >      WAIT Seconds 3
> >      IF $runs > 2
> >          GOTO 7
> >      ENDIF
> > ENDLOOP
> > MESSAGE "Finished", 1
> > 
> > Best,
> > Stefan
> 
> Hoi Stefan
> 
> Thanks for your answer. As I understand it, this has to be in the sequence script before 
> running. So, in the end, it is not different than just saying "LOOP runs, 2" and 
> therefore the number of runs has to be known in advance as well. Or is there an option to 
> change the script on runtime? What I would like, is to start a sequence with "LOOP runs, 
> infinite" and when I come back to the experiment after falling asleep being able to break 
> the loop after the next iteration, but still execute everything after ENDLOOP, i.e. the 
> MESSAGE statement in your example. Because if I do a "Stop after current run", this seems 
> not to happen. 
> 
> Best, Ivo

First, you have the sequencer button "Stop after current run", but that does of course not
execute anything after ENDLOOP. 

Second, you can put anything in the IF statement - for example, a variable from the ODB. Create a variable like
/Experiment/Run parameters/Stop loop and set it to zero. Then put in your script:

...
ODBGET /Experiment/Run parameters/Stop loop, flag
IF $flag == 1
   GOTO 7
...

So once you want to stop the loop, set the flag in the ODB to one.

Best,
Stefan
       Reply  25 Apr 2020, Konstantin Olchanski, Suggestion, Sequencer loop break 
> LOOP runs, 5
> ...
> ENDLOOP

Classical loop constructs usually include "break" (exit the loop) and "continue" (loop again!) 
constructs. I would say it is an unfortunate omission if they are not present in the midas sequencer.

But Stefan is right, of course, both commands are just funny names for "goto".

K.O.
Entry  16 Mar 2020, Konstantin Olchanski, Info, mhttpd mongoose 6.16 update 
the update of mhttpd to mongoose version 6.16 was committed to the develop branch of midas. If you do not want to use this 
updated code or if it causes problems, please use the mhttpd6 executable or midas from the midas-2020-03 release branch.

new features:

- IPv6 support
- built-in http proxy
- fine-grained locking - serving "resource" files (html, css, etc.) and serving json-rpc requests no longer take the global lock
- reduced number of DNS queries when checking host list access (DNS replies are cached)
- (I decided to not implement caching of password requests and dynamic reload of password file - it is too hard).

internal changes:

Recent versions of the mongoose web server library have removed all their internal multithreading,
leaving the library fully single-threaded. This resulted in major simplification of many things. An improvement.
(The civetweb fork of mongoose retains the old multithreading code; that model seems to work better and is what
is used inside ROOT.) As implemented in mhttpd, all network connections are handled by the main thread, and
all midas http requests are handled by worker threads that are started on an as-needed basis.

The old mongoose 6.4 based mhttpd code survived almost without changes - as a compile-time
option - so now I build 2 mhttpd executables: mhttpd with the new code and mhttpd6 with the old code
so people have something to run in case the new code bombs.

http proxy:

Experiments that use private networks usually configure the apache httpd as a web proxy to allow
access from the outside to the web-controlled devices on the private network. Making changes
to this proxy requires root access, requires restarting httpd, etc. To make things simpler, mhttpd now
includes a web proxy (almost the complete implementation is provided by the mongoose library). Configuration
is done from ODB, restarting mhttpd is not needed.

improved multithreading:

Since most of the MIDAS library is now thread-safe, mhttpd no longer needs to take the "big midas lock"
to service most web requests. Access to files, access to ODB, etc is now fully threaded. Some parts
of MIDAS are not thread-safe, i.e. access to history and log files, so a flag was added to the mjsonrpc library
to mark which RPC methods are not thread-safe.

Note that despite these improvements, mhttpd still suffers from "http head-of-queue blocking"
https://en.wikipedia.org/wiki/Head-of-line_blocking
because the web browser (i.e. google chrome) tends to use just 1 TCP connection for all JSONRPC requests:
after a request for a history read (which can take a long time), all subsequent requests for web page updates, etc.
have to wait until it completes, causing an unresponsive user experience (it looks as if mhttpd were single-threaded!).

A solution for this problem is HTTP/2, which is not yet implemented by mongoose and is not quite yet available
for apache httpd.

More later...
K.O.
    Reply  16 Mar 2020, Konstantin Olchanski, Info, mhttpd mongoose 6.16 update 
> the update of mhttpd to mongoose version 6.16 was committed to the develop branch of midas.

The new code implements 3 http ports:

- localhost port 8080 - enabled by default - suitable for "I want to test midas on my laptop" and for connecting from the apache httpd
https password protected gateway.

- insecure http port 8081 - disabled by default - with optional password protection (HTTP Digest auth), and optional hostlist access
control - for the case when the https gateway is running on a different computer (i.e. ALPHA at CERN).

(My reading of "internet opinions" about HTTP Digest authentication over unencrypted HTTP is
that while considered very obsolete, there are no specific security problems and exploits
against it - other than the usual - man-in-the-middle and "steal the password file" attacks.
So while I do not recommend using it, I do not feel justified in removing/disabling it on security grounds.
It provides alternative password protection when use of SSL/HTTPS is too difficult.)

- https port 8443 - disabled by default - also with optional password protection (HTTP Digest auth), and optional hostlist access
control. HTTP Digest password protection over HTTPS is deemed as secure at "HTTP Basic" password protection over HTTPS and
that is what is used by apache httpd password protection.

(The main problem with mhttpd support of HTTPS is obtaining an https certificate. Right now mhttpd
instructs the user to generate a self-signed certificate. But there are 2 problems: modern browsers dislike self-signed
certificates (even when explicitly marked "trust it!") and there is no check for certificate expiration.
I guess one could try to integrate mhttpd with certbot and the let's-encrypt system, but there
are problems, i.e. the certificate files live in readable-only-by-root directories, etc. I would rather
wait until mongoose implements certbot integration in their code).

More later...
K.O.
       Reply  16 Mar 2020, Konstantin Olchanski, Info, mhttpd mongoose 6.16 update 
> > the update of mhttpd to mongoose version 6.16 was committed to the develop branch of midas.

Configuration is done by ODB /WebServer:

---------------------------------------------------------------------------
[local:javascript1:S]/WebServer>ls -l
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
mime.types                      DIR
Enable localhost port           BOOL    1     4     2h   0   RWD  y
localhost port                  INT     1     4     2h   0   RWD  8080
localhost port passwords        BOOL    1     4     2h   0   RWD  n
Enable insecure port            BOOL    1     4     12h  0   RWD  n
insecure port                   INT     1     4     2h   0   RWD  8081
insecure port passwords         BOOL    1     4     2h   0   RWD  y
insecure port host list         BOOL    1     4     2h   0   RWD  y
Enable https port               BOOL    1     4     12h  0   RWD  n
https port                      INT     1     4     2h   0   RWD  8443
https port passwords            BOOL    1     4     2h   0   RWD  y
https port host list            BOOL    1     4     2h   0   RWD  y
Host list                       STRING  10    32    2h   0   RWD
                                        [0]             localhost
                                        [1]
                                        [2]
                                        [3]
                                        [4]
                                        [5]
                                        [6]
                                        [7]
                                        [8]
                                        [9]
Enable IPv6                     BOOL    1     4     2h   0   RWD  y
Proxy                           DIR
---------------------------------------------------------------------------

Most entries are self-obvious, but note:

- mime.types contains the mapping of file extensions to file content-types, telling the browser what to do:

---------------------------------------------------------------------------
[local:javascript1:S]/WebServer>ls -l mime.types/
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
.HTML                           STRING  1     10    2h   0   RWD  text/html
.HTM                            STRING  1     10    2h   0   RWD  text/html
.CSS                            STRING  1     9     2h   0   RWD  text/css
---------------------------------------------------------------------------

- the Proxy directory configures the http proxy (as implemented by mongoose; I am
not sure I understand all of its limitations):

---------------------------------------------------------------------------
[local:javascript1:S]/WebServer>ls -l Proxy/
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
example                         STRING  1     27    17h  0   RWD  #http://localhost:8080
---------------------------------------------------------------------------

("#" means - commented-out)

http://localhost:8080/proxy/example/foo/bar/baz proxies to http://localhost:8080/foo/bar/baz

- "Enable IPv6" tells mhttpd to also listen on the IPv6 ports. The best I can tell IPv6 works on the Mac,
and with luck will get some testing at CERN where IPv6 is in use.

Documentation on the midas wiki still needs to be updated for this.
K.O.
          Reply  17 Mar 2020, Konstantin Olchanski, Info, mbedtls, mhttpd mongoose 6.16 update 
> > > the update of mhttpd to mongoose version 6.16 was committed to the develop branch of midas.

current code looks for the mbedtls library in ../mbedtls (next to midas)

if cmake misdetects it, turn it off by setting NO_MBEDTLS (same as NO_ROOT & co)

if you do want to build mhttpd with mbedtls, do this:

cd .../midas
cd ../
git clone https://github.com/ARMmbed/mbedtls.git
cd mbedtls
git submodule update --init ### this will populate the "crypto" directory
make ### if "python2" is missing, building of test suite programs will fail, but the libraries needed for midas will be built

cd ../midas
make cmake...

K.O.
             Reply  30 Mar 2020, Stefan Ritt, Info, mbedtls, mhttpd mongoose 6.16 update 
I had a quick look at the new mongoose code and didn't find anything I dislike. I did a quick test of the proxy, which worked and is nice to have. 
I agree with all KO said about authentication.

So if there are no complaints, I would suggest that we move the summary of this thread into the official documentation.

Stefan
Entry  25 Mar 2020, Andreas Suter, Forum, mlogger: misleading error messages for ROOT  
Dear All,

At our experiment we write ROOT files. When starting/stopping runs we get the following error messages:

[Logger,ERROR] [mlogger.cxx:3358:root_write,ERROR] Cannot write system event into ROOT file, event_id 0xffff8000

[Logger,ERROR] [mlogger.cxx:3358:root_write,ERROR] Cannot write system event into ROOT file, event_id 0xffff8001

Looking into the source code I found that log_write (line 4248) sends these Midas System Events (BOR, EOR) to root_write without filtering them. root_write() checks in a first step if it gets such Midas System Events and, if yes, moans.

Wouldn't it be better just to filter these events in log_write, before calling root_write, avoiding unnecessary error messages?

Is there something I am missing?

Thanks,
  Andreas
    Reply  25 Mar 2020, Konstantin Olchanski, Forum, mlogger: misleading error messages for ROOT  
> [Logger,ERROR] [mlogger.cxx:3358:root_write,ERROR] Cannot write system event into ROOT file, event_id 0xffff8000

Hi, Andreas, please open a bug report for this problem on bitbucket; there are now at least 2 bugs against
the ROOT writer (some events are sometimes written in duplicate), and I hope to fix this next time I review
the mlogger (RSN!). The biggest problem is that I do not use the ROOT output myself, so I have no way
to know if ROOT files produced by mlogger are correct or make sense (without setting up some kind
of test environment with a ROOT file reader).

Thank you for reporting this problem here, so more people know about it.

If somebody has a patch to fix this, please send it in!

K.O.
    Reply  27 Mar 2020, Stefan Ritt, Forum, mlogger: misleading error messages for ROOT  
The simplest solution seems to me to just remove the error message generation and silently ignore the BOR/EOR events. 

Committed that change.

Stefan
       Reply  27 Mar 2020, Andreas Suter, Forum, mlogger: misleading error messages for ROOT  
Hi Stefan,

I think this only partially resolves the issue, in log_write:

#ifdef HAVE_ROOT
   } else if (log_chn->format == FORMAT_ROOT) {
      status = root_write(log_chn, pevent, pevent->data_size + sizeof(EVENT_HEADER));
#endif
   }

   actual_time = ss_millitime();
   if ((int) actual_time - (int) start_time > 3000)
      cm_msg(MINFO, "log_write", "Write operation on \'%s\' took %d ms", log_chn->path.c_str(), actual_time - start_time);

   if (status != SS_SUCCESS && !stop_requested) {
      cm_msg(MTALK, "log_write", "Error writing output file, stopping run");
      cm_msg(MERROR, "log_write", "Cannot write \'%s\', error %d, stopping run", log_chn->path.c_str(), status);
      stop_the_run(0);

      return status;
   }

In your solution root_write returns quietly, but status == SS_INVALID_FORMAT (not SS_SUCCESS), and hence I get another misleading error message: "Error writing output file, stopping run".

In order to prevent this you would also need to change the return value to SS_SUCCESS.
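
For reference, the suggested change in root_write() would look something like this (a hedged sketch; the exact
surrounding code in mlogger.cxx may differ):

   /* skip midas system events (BOR, EOR, MESSAGE) quietly and report success,
      so that log_write() does not stop the run */
   if (pevent->event_id == EVENTID_BOR ||
       pevent->event_id == EVENTID_EOR ||
       pevent->event_id == EVENTID_MESSAGE)
      return SS_SUCCESS;   /* was: return SS_INVALID_FORMAT */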
          Reply  27 Mar 2020, Stefan Ritt, Forum, mlogger: misleading error messages for ROOT  
Ok, changed.

Stefan
Entry  23 Mar 2020, Ivo Schulthess, Forum, Save data to FTP 
Dear all
I am trying to save data to an FTP server but don't get any data on the server. Midas does not complain or print any error message, but nothing gets saved either. Does somebody have experience with this? I use the following ODB mlogger channel settings: Type: FTP, Filename: server.com, 21, user, pw, ., run%06d.mid, Format: MIDAS, Output: FILE. What would the Output: FTP setting be for? I tried it but it does not work at all. 
Thanks in advance,
Ivo
    Reply  23 Mar 2020, Konstantin Olchanski, Forum, Save data to FTP 
> I am trying to save data to an FTP server but don't get any data on the server. Midas does not complain or print any error message, but nothing gets saved either. Does somebody have experience with this? I use the following ODB mlogger channel settings: Type: FTP, Filename: server.com, 21, user, pw, ., run%06d.mid, Format: MIDAS, Output: FILE. What would the Output: FTP setting be for? I tried it but it does not work at all. 

Hi, Ivo, good to hear from a midas user in these difficult times.

We do not use FTP at TRIUMF, but Stefan asked us to keep FTP alive and working, so we should be able
to get you going. I will try to find the FTP instructions for you, I am pretty sure I have them somewhere.

In the meantime, I am very curious why you are using FTP to record data. Is it some kind
of data appliance where the simplest input for data is FTP? Does using NFS not work, or is it too hard?

Also, for example at CERN, we write data to Castor and EOS; for this, mlogger writes data to local disk,
then the lazylogger runs a script to move the data to Castor and EOS. The example lazylogger
scripts for this are in the MIDAS "progs" directory. But maybe you do not have a local disk and this would
not work for you.

In other news, I hope to work on mlogger and lazylogger support for cloud storage (swift and s3 apis?),
would that be useful as a replacement for FTP?

K.O.
       Reply  24 Mar 2020, Ivo Schulthess, Forum, Save data to FTP 
> > I am trying to save data to an FTP server but don't get any data on the server. Midas does not complain or print any error message, but nothing gets saved either. Does somebody have experience with this? I use the following ODB mlogger channel settings: Type: FTP, Filename: server.com, 21, user, pw, ., run%06d.mid, Format: MIDAS, Output: FILE. What would the Output: FTP setting be for? I tried it but it does not work at all. 
> 
> Hi, Ivo, good to hear from a midas user in these difficult times.
> 
> We do not use FTP at TRIUMF, but Stefan asked us to keep FTP alive and working, so we should be able
> to get you going. I will try to find the FTP instructions for you; I am pretty sure I have them somewhere.
> 
> In the meantime, I am very curious why you are using FTP to record data. Is it some kind
> of data appliance where the simplest input for data is FTP? Does NFS not work, or is it too hard?
> 
> Also, for example at CERN, we write data to Castor and EOS. For this, mlogger writes data to local disk,
> then the lazylogger runs a script to move the data to Castor and EOS. The example lazylogger
> scripts for this are in the MIDAS "progs" directory. But maybe you do not have a local disk and this would
> not work for you.
> 
> In other news, I hope to work on mlogger and lazylogger support for cloud storage (swift and s3 apis?),
> would that be useful as a replacement for FTP?
> 
> K.O.
>
Good Morning Konstantin

Thanks for the fast reply. Yes, it is; Midas is one of the things we can at least improve from home.

Our experiment is planned to run (soon) at ILL. Since we use our own equipment/detector rather than the
beamline's, all the data from Midas is saved on the local drive. This is fine in the first instance,
but then we also need a proper backup. Since our experiment is quite small, the easiest solution I came up with
is to copy all of our data to the ILL storage, which has enough space and is properly backed up. The ILL data
storage allows only SFTP connections, nothing else. Since Midas has the FTP feature, having a separate FTP
logger channel seemed the easiest way to go.

Thanks for your input, I will look into how to mount SFTP; then this would also be a solution.

Since ILL only provides access via SFTP and everything else is nonexistent or blocked (not even ssh is possible),
this is the only thing we can work with for now.

Best regards,
Ivo
          Reply  24 Mar 2020, Stefan Ritt, Forum, Save data to FTP 
Logging directly from the midas logger to FTP is a bit cumbersome. In case of delays during login etc. this can throttle the whole DAQ chain. 
What we use in our lab is to write to local disk, then use the lazylogger (https://midas.triumf.ca/MidasWiki/index.php/Lazylogger) to copy the 
local files to a remote FTP server. This way we de-couple data taking from backup, making the system much more swift.

Best,
Stefan
             Reply  24 Mar 2020, Ivo Schulthess, Forum, Save data to FTP 
> Logging directly from the midas logger to FTP is a bit cumbersome. In case of delays during login etc. this can throttle the whole DAQ chain. 
> What we use in our lab is to write to local disk, then use the lazylogger (https://midas.triumf.ca/MidasWiki/index.php/Lazylogger) to copy the 
> local files to a remote FTP server. This way we de-couple data taking from backup, making the system much more swift.
> 
> Best,
> Stefan

Yes, I see this now too. I will therefore try to set up the lazylogger properly.
          Reply  24 Mar 2020, Konstantin Olchanski, Forum, Save data to FTP 
> 
> Since ILL only provides access via SFTP and everything else is not existent or blocked (not even ssh is possible),
> this is the only thing we can work with by now. 
> 

Oops. SFTP != FTP.

SFTP uses SSH for data transport, so we cannot do it directly from C++ code in MIDAS. (We could use libssh, etc., but...)

I suggest you use lazylogger with the lazy_dcache script, replacing "dccp" with "sftp" and "nsls" with an sftp "ls" command.
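
The copy logic itself is simple enough to sketch. Everything below is made up for illustration (the argument convention, user, host and remote directory), so check the lazy_dcache example in the "progs" directory for the real script interface:

// toy lazylogger copy helper: assume lazylogger invokes it with the name
// of the file to copy; it pipes a "put" command into sftp.
#include <cstdio>
#include <cstdlib>
#include <string>

int main(int argc, char *argv[])
{
   if (argc < 2) {
      fprintf(stderr, "usage: %s <file-to-copy>\n", argv[0]);
      return 1;
   }
   std::string cmd = "echo 'put " + std::string(argv[1]) + "' | sftp daquser@storage.example.org:/backup/data";
   int status = system(cmd.c_str());
   return (status == 0) ? 0 : 1;   // non-zero exit should make lazylogger retry, if I remember right
}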

If you get it working, please consider contributing your lazylogger script to midas (it does not have to be written in perl; python should work equally well).

For setting up lazylogger with the script method, I am pretty sure I posted the instructions to the forum (ages ago);
let me know if you cannot find them.

Good luck.

K.O.
Entry  08 Aug 2019, Konstantin Olchanski, Info, MIDAS will use C++11 
After much discussion, and following the MIDAS workshop at TRIUMF, we made the decision to use C++11 in MIDAS.

There are many benefits, and only one drawback - no c++11 compilers in the default OS install on older computers (i.e. 
RHEL/SL/CentOS before el7). (the same applies to our use of cmake).

Specifically for el6, the solution is to use c++11 compatible gcc-8 from devtoolset-8, see 
https://midas.triumf.ca/elog/Midas/1649

The c++11 features we welcome most: initialization of class members at declaration time (no more forgetting to add the
initialization to each and every constructor), c++ threads and mutexes, lambdas and "auto".
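
For illustration only (a toy example, not MIDAS code), most of these features in one small program:

#include <mutex>
#include <thread>
#include <vector>

class EventCounter {
   int fCount = 0;     // initialized at declaration time, in every constructor
   std::mutex fMutex;  // c++11 mutex, no pthread boilerplate
public:
   void Add(int n) {
      std::lock_guard<std::mutex> lock(fMutex);
      fCount += n;
   }
   int Get() {
      std::lock_guard<std::mutex> lock(fMutex);
      return fCount;
   }
};

int main()
{
   EventCounter counter;
   std::vector<std::thread> threads;
   for (int i = 0; i < 4; i++)
      threads.emplace_back([&counter]() { counter.Add(1); });  // c++11 lambda
   for (auto &t : threads)  // c++11 "auto" and range-for
      t.join();
   return (counter.Get() == 4) ? 0 : 1;
}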

K.O.
    Reply  16 Mar 2020, Konstantin Olchanski, Info, MIDAS will use C++11 
> After much discussion, and following the MIDAS workshop at TRIUMF, we made the decision to use C++11 in MIDAS.
> 
> There are many benefits, and only one drawback - no c++11 compilers in the default OS install on older computers (i.e. 
> RHEL/SL/CentOS before el7). (the same applies to our use of cmake).
>

It turns out that support for the c++11 "regex" feature is missing on el7 (CentOS-7, our most common platform at TRIUMF).

According to https://stackoverflow.com/questions/12530406/is-gcc-4-8-or-earlier-buggy-about-regular-expressions
gcc 4.9.0 is the first one to implement c++11 regular expressions. el7 comes with gcc-4.8.5 and I confirm
that examples of using std::regex_replace() do not compile. I was looking to use std::regex_replace to implement URL rewriting
in the reverse proxy code in mhttpd.
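
For reference, the kind of rewrite I had in mind (the URL prefix is a made-up example; per the above, this compiles with gcc 4.9 and later, but not with el7's stock gcc 4.8.5):

#include <cstdio>
#include <regex>
#include <string>

int main()
{
   // rewrite a hypothetical proxied URL: /proxy/fe01/Status -> /fe01/Status
   std::string url = "/proxy/fe01/Status";
   std::string out = std::regex_replace(url, std::regex("^/proxy/([^/]+)/"), "/$1/");
   printf("%s -> %s\n", url.c_str(), out.c_str());
   return 0;
}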

I do not need this feature immediately, but I am surprised that such a thing can happen, so I thought others should know.

K.O.
       Reply  16 Mar 2020, Pintaudi Giorgio, Info, MIDAS will use C++11 
About the boost library, that is exactly what I did for a project of mine (the calibration software for the WAGASCI
experiment). It turned out not so easy to maintain, because different Linux distros package different versions of boost.

The reason I went down the "c++11 plus boost" road is that the official T2K OS is CentOS7 as well.

Looking back, I think that using c++17 and requiring a more recent version of the compiler is much easier to maintain
than the combo of c++11 + boost. In CentOS it is just a matter of installing a recent devtool package ...

Another solution might be to repackage boost into MIDAS, so you have full control of the environment.

> > After much discussion, and following the MIDAS workshop at TRIUMF, we made the decision to use C++11 in MIDAS.
> > 
> > There are many benefits, and only one drawback - no c++11 compilers in the default OS install on older computers (i.e. 
> > RHEL/SL/CentOS before el7). (the same applies to our use of cmake).
> >
> 
> It turns out that support for the c++11 "regex" feature is missing on el7 (CentOS-7, our most common platform at TRIUMF).
> 
> According to https://stackoverflow.com/questions/12530406/is-gcc-4-8-or-earlier-buggy-about-regular-expressions
> gcc 4.9.0 is the first one to implement c++11 regular expressions. el7 comes with gcc-4.8.5 and I confirm
> that examples of using std::regex_replace() do not compile. I was looking to use std::regex_replace to implement URL rewriting
> in the reverse proxy code in mhttpd.
> 
> I do not need this feature immediately, but I am surprised that such a thing can happen, so I thought others should know.
> 
> K.O.
Entry  10 Mar 2020, Konstantin Olchanski, Info, MIDAS vs JSROOT web pages 
Just FYI, I am looking at the ROOT web programming component JSROOT, and I notice that its RPC mechanism is quite
different from the JSON-RPC I implemented for MIDAS.

https://github.com/root-project/jsroot/blob/master/docs/HttpServer.md (explanation of JSROOT RPC and server side machinery)
https://github.com/root-project/jsroot/blob/master/docs/JSROOT.md (explanation of JSROOT javascript library)

Then I looked at the dates:
MIDAS mjsonrpc was done at the end of 2013
JSROOT main development started at the end of 2014.

The web server component in both projects is (almost) the same - vanilla mongoose in mhttpd
and civetweb, a fork of an older version of mongoose, in ROOT/JSROOT.

The web server in both projects is partially multithreaded:
- ROOT THttpServer/TCivetWeb uses multiple threads to handle the network connections and some file access,
but interaction with ROOT is done in the main thread of ROOT. (The main thread must periodically call ProcessRequests()).
- mhttpd uses a single thread to multiplex the network connections (it is a change from old mongoose/civetweb to current mongoose 6.16),
but all requests are farmed out to a pool of threads and execute in parallel (unless they are not thread-safe, e.g. accessing history files).

Both implementations suffer from "head of queue" blocking, a "slow" request i.e. a slow file read, will
delay subsequent quick requests, see https://en.wikipedia.org/wiki/Head-of-line_blocking#In_HTTP
A solution for this problem is to use HTTP/2, once it becomes supported in mongoose/civetweb/apache httpd (in el7).

It will be interesting to see which one of the two systems works better for building "user facing" web pages... especially
hybrid pages that have to pull data both from midas (using mjsonrpc) and from online ROOT analyzers (using jsroot).

K.O.
Entry  06 Mar 2020, Lars Martin, Forum, RPC error 
I ported a bunch of frontends to C++ and now I'm occasionally getting this RPC 
error message:

http error: readyState: 4, HTTP status: 502 (Proxy Error), batch request: method: 
"db_get_values", params: [object Object], id: 1583456958869 method: "get_alarms", 
params: null, id: 1583456958869 method: "cm_msg_retrieve", params: [object 
Object], id: 1583456958869 method: "cm_msg_retrieve", params: [object Object], 
id: 1583456958869

I'm assuming I'm doing something wrong somewhere, but does this message contain
information on where to look? Does the ID mean something?
    Reply  08 Mar 2020, Konstantin Olchanski, Forum, RPC error 
I do not see this error, but there was one more report (they did not clearly say what http errors
they see): https://bitbucket.org/tmidas/midas/issues/209/get-rid-of-mjsonrpc-dialogs-put-it-to-the

To debug this, I need to know: what version of MIDAS, what version of what web browser, what 
computer is mhttpd running on? (so I can go look at the log files).

Also can you say more when you see these errors? Every time from every midas web page, or only 
some pages or only when you do something specific (push some button, etc?).

> I ported a bunch of frontends to C++ and now I'm occasionally getting this RPC 
> error message:
> 
> http error: readyState: 4, HTTP status: 502 (Proxy Error), batch request: method: 
> "db_get_values", params: [object Object], id: 1583456958869 method: "get_alarms", 
> params: null, id: 1583456958869 method: "cm_msg_retrieve", params: [object 
> Object], id: 1583456958869 method: "cm_msg_retrieve", params: [object Object], 
> id: 1583456958869
> 
> I'm assuming I'm doing something wrong somewhere, but does this message contain 
> information on where to look? Does the ID mean something?

It is unlikely that this error has anything to do with the frontends: web page interaction usually
goes through web browser - network - apache httpd - localhost - mhttpd - midas odb.

http error 502 is very generic, does not tell us much about what happened, there may be more 
information in the httpd log files.

The json-rpc request "id" is generated by the midas code in the web browser, and it is currently a
timestamp. It is not used for anything, but it is required by the json-rpc standard.

K.O.
Entry  29 Jan 2020, Berta Beltran, Bug Report, Compiling Midas in OS 10.15 Catalina  
Hi all, 

I have updated our daq computer to the latest OS 10.15 with the idea that I will then get all our daq
programs running and will not need to update again for years to come.
But now I am running into even more problems trying to compile Midas.
OS 10.15.3, Xcode 11.3 with command line tools.

Darrens-Mac-mini:~ betacage$ cd packages/midas/build/
Darrens-Mac-mini:build betacage$ cmake .. 
-- MIDAS: cmake version: 3.16.3
-- The C compiler identification is AppleClang 11.0.0.11000033
-- The CXX compiler identification is AppleClang 11.0.0.11000033
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc
-- Check for working C compiler: /Library/Developer/CommandLineTools/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++
-- Check for working CXX compiler: /Library/Developer/CommandLineTools/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- MIDAS: CMAKE_INSTALL_PREFIX: /Users/betacage/packages/midas
-- MIDAS: Found ROOT version 6.18/04
-- Found ZLIB: /usr/lib/libz.dylib  
-- MIDAS: Found ZLIB version 
-- MIDAS: Found OpenSSL version 1.1.1d
-- MIDAS: MySQL not found
-- MIDAS: ODBC not found
-- MIDAS: SQLITE not found
-- MIDAS: nvidia-smi not found
-- Found Git: /usr/bin/git (found version "2.21.0 (Apple Git-122.2)") 
-- MIDAS example/experiment: MIDAS in-tree-build
-- MIDAS: Found ZLIB version 
-- MIDAS example/experiment: Found ROOT version 6.18/04
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/betacage/packages/midas/build
Darrens-Mac-mini:build betacage$ make install
[  1%] Building CXX object CMakeFiles/mfeo.dir/src/mfe.cxx.o
In file included from /Users/betacage/packages/midas/src/mfe.cxx:15:
In file included from /Users/betacage/packages/midas/include/mfe.h:13:
/Users/betacage/packages/midas/include/midas.h:159:10: fatal error: 'pthread.h' file not found
#include <pthread.h>
         ^~~~~~~~~~~
1 error generated.
make[2]: *** [CMakeFiles/mfeo.dir/src/mfe.cxx.o] Error 1
make[1]: *** [CMakeFiles/mfeo.dir/all] Error 2


I guess that I am maybe the first one trying to install MIDAS on this OS, so I am willing to help as much as I
can with getting this going. I have found a suggested solution, downgrading Xcode to version 10:
https://stackoverflow.com/questions/58524715/failing-to-compile-a-c-application-under-macos-catalina-10-15
If you don't have other solutions, I will try that tomorrow.

Thanks 

Berta 
    Reply  02 Feb 2020, Konstantin Olchanski, Bug Report, Compiling Midas in OS 10.15 Catalina  
> I have updated our daq computer to the latest OS 10.15 ...

FWIW, I do not have macos 10.15. I have 10.13 at home and 10.14 in the office. Maybe Stefan has it?

> In file included from /Users/betacage/packages/midas/include/mfe.h:13:
> /Users/betacage/packages/midas/include/midas.h:159:10: fatal error: 'pthread.h' file not found
> #include <pthread.h>
>          ^~~~~~~~~~~

Hmm... pthread.h should be there; I do not see any notice of its removal in macos.

So I suspect a mis-installation of the c++ compilers. In the past we had problems with this:
in addition to installing Xcode from the app store, one must install some missing stuff
manually ("Xcode command line tools"); if I have it right, the command is "xcode-select --install".

If you figure out how to build it, I think midas will work just fine on the latest macos.

K.O.
       Reply  03 Feb 2020, Stefan Ritt, Bug Report, Compiling Midas in OS 10.15 Catalina  
> > I have updated our daq computer to the latest OS 10.15 ...
> 
> FWIW, I do not have macos 10.15. I have 10.13 at home and 10.14 in the office. Maybe Stefan has it?

No, I'm stuck with 10.14 due to the current lack of 64-bit support in some programs.

I would try KO's proposal to run "xcode-select --install".

Stefan
          Reply  04 Feb 2020, Konstantin Olchanski, Bug Report, Compiling Midas in OS 10.15 Catalina  
> > > I have updated our daq computer to the latest OS 10.15 ...
> > 
> > FWIW, I do not have macos 10.15. I have 10.13 at home and 10.14 in the office. Maybe Stefan has it?
> 
> No, I'm stuck with 10.14 due to the current lack of 64-bit support in some programs.
> 

Ok, in this case, I will update my office mac mini to 10.15.

K.O.
             Reply  06 Feb 2020, Berta Beltran, Bug Report, Compiling Midas in OS 10.15 Catalina  
> 
> Ok, in this case, I will update my office mac mini to 10.15.
> 
> K.O.


Hi Konstantin, 

Any luck with Midas in OS 10.15? 
I have downgraded to Xcode 10.2.1, as suggested in the post linked in my first post, without any luck; I keep getting the same pthread.h error.
My gcc uses the 10.15 sdk, despite having v 10.2.1 for Xcode and the command line tools:

Darrens-Mac-mini:~ betacage$ gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-
dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/c++/4.2.1
Apple LLVM version 10.0.1 (clang-1001.0.46.4)
Target: x86_64-apple-darwin19.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin


and that is where I suspect the problem is. Will keep investigating next week. Thanks for your help.

Berta 
                Reply  10 Feb 2020, Konstantin Olchanski, Bug Report, Compiling Midas in OS 10.15 Catalina  
> Any luck with Midas in OS 10.15? 

Best I can tell, the problem is not in midas: pthread.h should be there, somewhere.

K.O.
                   Reply  11 Feb 2020, Berta Beltran, Bug Report, Compiling Midas in OS 10.15 Catalina  
> > Any luck with Midas in OS 10.15? 
> 
> Best I can tell, the problem is not in midas: pthread.h should be there, somewhere.
> 
> K.O.


Hi Konstantin,

Thanks for your reply. I have been investigating this issue of pthread in OS 10.15. I found this forum post
https://github.com/wjakob/nori/issues/16

that seems to suggest that pthread.h is in a different directory from OS 10.14 onward. At the end of the post the
developer suggests that he has found a way to fix the CMake build to make it work, so I have checked his code updates here:
https://github.com/wjakob/nori/commit/be46cccc01a75b21dad1a3f61baa108fe644fc4b

I have added his lines:


if(APPLE)
   # Try to auto-detect a suitable SDK
   execute_process(COMMAND bash -c "xcodebuild -version -sdk | grep MacOSX | grep Path | head -n 1 | cut -f 2 -d ' '" OUTPUT_VARIABLE CMAKE_OSX_SYSROOT)
   string(REGEX REPLACE "(\r?\n)+$" "" CMAKE_OSX_SYSROOT "${CMAKE_OSX_SYSROOT}")
   string(REGEX REPLACE "^.*X([0-9.]*).sdk$" "\\1" CMAKE_OSX_DEPLOYMENT_TARGET "${CMAKE_OSX_SYSROOT}")
endif()

to my local packages/midas/CMakeLists.txt and ran cmake .. and make install again, but now I get a different error:

[  1%] Building CXX object CMakeFiles/mfeo.dir/src/mfe.cxx.o
clang: error: invalid version number in '-mmacosx-version-
min=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/DriverKit19.0.sdk'
clang: warning: using sysroot for 'DriverKit' but targeting 'MacOSX' [-Wincompatible-sysroot]


I will continue investigating this issue tomorrow, but please let me know if you have any suggestions.

Also my version of cmake is 3.16.3

Thanks 

Berta 
                      Reply  11 Feb 2020, Stefan Ritt, Bug Report, Compiling Midas in OS 10.15 Catalina  
For your reference, here on my MacOSX 10.14.6 with XCode 11.3.1 the pthread.h file is present in locations listed below.

Did you execute "xcode-select --install" ?


~$ locate pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Developer/SDKs/AppleTVOS.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Developer/SDKs/AppleTVOS.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVSimulator.platform/Developer/SDKs/AppleTVSimulator.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVSimulator.platform/Developer/SDKs/AppleTVSimulator.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/WatchOS.platform/Developer/SDKs/WatchOS.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/WatchOS.platform/Developer/SDKs/WatchOS.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/WatchSimulator.platform/Developer/SDKs/WatchSimulator.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/WatchSimulator.platform/Developer/SDKs/WatchSimulator.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk/usr/include/pthread.h
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/pthread/pthread.h
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/pthread.h
                         Reply  12 Feb 2020, Berta Beltran, Bug Report, Compiling Midas in OS 10.15 Catalina  
> For your reference, here on my MacOSX 10.14.6 with XCode 11.3.1 the pthread.h file is present in locations listed below.
> 
> Did you execute "xcode-select --install" ?
> 
> 
> ~$ locate pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Developer/SDKs/AppleTVOS.sdk/usr/include/pthread/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Developer/SDKs/AppleTVOS.sdk/usr/include/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVSimulator.platform/Developer/SDKs/AppleTVSimulator.sdk/usr/include/pthread/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVSimulator.platform/Developer/SDKs/AppleTVSimulator.sdk/usr/include/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/pthread/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/WatchOS.platform/Developer/SDKs/WatchOS.sdk/usr/include/pthread/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/WatchOS.platform/Developer/SDKs/WatchOS.sdk/usr/include/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/WatchSimulator.platform/Developer/SDKs/WatchSimulator.sdk/usr/include/pthread/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/WatchSimulator.platform/Developer/SDKs/WatchSimulator.sdk/usr/include/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/usr/include/pthread/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/usr/include/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk/usr/include/pthread/pthread.h
> /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk/usr/include/pthread.h
> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/pthread/pthread.h
> /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/pthread.h

Hi Stefan,

Thanks for looking into this. Yes, I have done  "xcode-select --install"

This is my output when I try to locate pthread.h

Darrens-Mac-mini:~ betacage$ locate pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Developer/SDKs/AppleTVOS.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform/Developer/SDKs/AppleTVOS.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVSimulator.platform/Developer/SDKs/AppleTVSimulator.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/AppleTVSimulator.platform/Developer/SDKs/AppleTVSimulator.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/WatchOS.platform/Developer/SDKs/WatchOS.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/WatchOS.platform/Developer/SDKs/WatchOS.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/WatchSimulator.platform/Developer/SDKs/WatchSimulator.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/WatchSimulator.platform/Developer/SDKs/WatchSimulator.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/usr/include/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk/usr/include/pthread/pthread.h
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator.sdk/usr/include/pthread.h
/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/pthread/pthread.h
/Library/Developer/CommandLineTools/SDKs/MacOSX10.14.sdk/usr/include/pthread.h
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/pthread/pthread.h
/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/pthread.h

and this is what I see in my SDKs folder 

Darrens-Mac-mini:SDKs betacage$ pwd
/Library/Developer/CommandLineTools/SDKs
Darrens-Mac-mini:SDKs betacage$ ls
MacOSX.sdk	MacOSX10.14.sdk	MacOSX10.15.sdk

So it looks like on this computer one needs to use the versioned SDK path, as pthread.h is not in MacOSX.sdk but in MacOSX10.15.sdk.

Thanks 

Berta 
                            Reply  12 Feb 2020, Stefan Ritt, Bug Report, Compiling Midas in OS 10.15 Catalina  
Another thought: Can you delete the midas build directory and run cmake again? Like

$ cd midas/build
$ rm -rf *
$ cmake ..
$ make VERBOSE=1

If you have the old build cache and upgraded your OS in the meantime, the cache needs to be rebuilt. The VERBOSE
flag shows you the compiler options, so you can see where the compiler looks for the SDK directory.

Cheers,
Stefan
                               Reply  12 Feb 2020, Berta Beltran, Bug Report, Compiling Midas in OS 10.15 Catalina  
> Another thought: Can you delete the midas build directory and run cmake again? Like
> 
> $ cd midas/build
> $ rm -rf *
> $ cmake ..
> $ make VERBOSE=1
> 
> If you have the old build cache and upgraded your OS in the meantime, the cache needs to be rebuilt. The VERBOSE 
> flag shows you the compiler options, so you can see where the compiler looks for the SDK directory.
> 
> Cheers,
> Stefan

Hi Stefan, 

Thanks again for your reply. Yes! The suggestion to rebuild the cache was the right one. This is the output of cmake ..

Darrens-Mac-mini:~ betacage$ cd packages/midas/build/
Darrens-Mac-mini:build betacage$ rm -rf *
Darrens-Mac-mini:build betacage$ cmake ..
-- MIDAS: cmake version: 3.16.3
-- The C compiler identification is AppleClang 11.0.0.11000033
-- The CXX compiler identification is AppleClang 11.0.0.11000033
-- Check for working C compiler: /Applications/Xcode.app/C