Back Midas Rome Roody Rootana
  Midas DAQ System, Page 12 of 49  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
Entry  29 Sep 2021, Richard Longland, Bug Report, nstall clash between MIDAS 2020-08 and mscb 
Thank you, Stefan.

I found these instructions under
1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)

I do see reference to updating the submodules under the TRIUMF install 
instructions 
(https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Inst
all_MIDAS) although perhaps it can be clarified.

Cheers,
Richard
    Reply  29 Sep 2021, Stefan Ritt, Bug Report, nstall clash between MIDAS 2020-08 and mscb 
> Thank you, Stefan.
> 
> I found these instructions under
> 1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
> 2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)
> 
> I do see reference to updating the submodules under the TRIUMF install 
> instructions 
> (https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Inst
> all_MIDAS) although perhaps it can be clarified.
> 
> Cheers,
> Richard

Hi Richard,

I updated the documentation at

https://midas.triumf.ca/MidasWiki/index.php/Changelog#Updating_midas

by putting the submodule update command everywhere.

Best,
Stefan
Entry  28 Sep 2021, Richard Longland, Bug Report, Install clash between MIDAS 2020-08 and mscb 
All,

I am performing a fresh install of MIDAS on an Ubuntu linux box. I follow the 
usual installation procedure:

1) git clone https://bitbucket.org/tmidas/midas --recursive
2) cd midas
3) git checkout release/midas-2020-08
4) mkdir build
5) cd build
6) cmake ..
7) make

Step 3 warns me that 
"warning: unable to rmdir 'manalyzer': Directory not empty" and 
"warning: unable to rmdir 'midasio': Directory not empty"

Step 7 fails.
Compilation fails with an mhttp error related to mscb:
mhttpd.cxx:8224:59: error: too few arguments to function 'int mscb_ping(int, 
short unsigned int, int, int)'
 8224 |             status = mscb_ping(fd, (unsigned short) ind, 1);


I was able to get around this by rolling mscb back to some old version (commit 
74468dd), but am extremely nervous about mix-and-matching the code this way.

Any advice would be greatly appreciated.

Cheers,
Richard 
    Reply  28 Sep 2021, Stefan Ritt, Bug Report, Install clash between MIDAS 2020-08 and mscb 
> 1) git clone https://bitbucket.org/tmidas/midas --recursive
> 2) cd midas
> 3) git checkout release/midas-2020-08
> 4) mkdir build
> 5) cd build
> 6) cmake ..
> 7) make

When you do step 3), you get

~/tmp/midas$ git checkout release/midas-2020-08
warning: unable to rmdir 'manalyzer': Directory not empty
warning: unable to rmdir 'midasio': Directory not empty
M	mjson
M	mscb
M	mvodb
M	mxml

The 'M' in front of the submodules like mscb tell you that you
have an older version of midas (namely midas-2020-08), but the 
*current* submodules, which won't match. So you have to roll back
also the submodules with:

3.5) git submodule update --recursive

This fetched those versions of the submodules which match the
midas version 2020-08. See here for details: 

https://git-scm.com/book/en/v2/Git-Tools-Submodules

From where did you get the command

git checkout release/xxxx ???

If you tell me the location of that documentation, I will take
care that it will be amended with the command

git submodule update --recursive

Best,
Stefan
Entry  19 Sep 2021, Stefan Ritt, Bug Fix, Chat working again Screenshot_2021-09-19_at_21.27.19_.png
Not sure how many people are using it, but the Chat facility in midas was broken 
for some time now and got fixed today again.

Just for your information: Chat can be used like WhatsApp & Co, and connects all 
people who access a midas experiment through their browser. It's good to 
communicate between shift crew members located at different places. One advantage 
is that the chat messages can get 'spoken' by the text-to-speech engine of your 
browser, so it can be used to "wake up" shifters. Can be configured through the 
"Config" page.

Stefan
Entry  06 Sep 2021, Andreas Suter, Forum, mhttpd crash 
midas version used: midas-2019-05-cxx-1461-g906be8b

I find in the systemd log every couple of days/weeks the following error message related to the mhttpd:

[mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!

with various socket numbers of course.

Can anybody hint me what is going wrong here?

The bad thing on the crash is, that sometimes it is leading to a "chain-reaction" killing multiple midas frontends, which essentially stop the experiment.

Help would be very much appreciated!

Andreas 
    Reply  06 Sep 2021, Konstantin Olchanski, Forum, mhttpd crash 
> [mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!
> Can anybody hint me what is going wrong here?
> The bad thing on the crash is, that sometimes it is leading to a "chain-reaction" killing multiple midas frontends, which essentially stop the experiment.

This is my code. I am the culprit. I had a bit of discussion about this with Stefan.

Bottom line is something is rotten in the multithreading code inside mhttpd and under conditions unknown,
it sends the wrong data into the wrong socket. This causes midas web pages to be really confused (RPC replies
processed as CSS file, HTML code processed at RPC replies, a mess), this wrong data is cached by the browser,
so restarting mhttpd does not fix the web pages. So a mess.

I find this is impossible to replicate, and so cannot debug it, cannot fix it. Best I was able to do
is to add a check for socket numbers, and thankfully it catches the condition before web browser caches
become poisoned. So, broken web pages replaced by mhttpd crash.

This situation reinforces my opinion that multi-threading and C++ classes "do not mix" (like H2 and O2 do not mix).
If you write a multithreaded C++ program and it works, good for you, if there is a malfunction, good luck with it,
C++ just does not have any built-in support for debugging typical multithreading problems. I think others have come
to the same conclusion and invented all these new "safe" programming languages, like Rust and Go.

Back to your troubles.

1) If you see a way to replicate this crash, or some way to reliably cause
the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
and I wish to fix this problem very much.

2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
are useless for debugging the present problem. There is just no way to examine the state of each
thread and of each http request using gdb by hand.

3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
other MIDAS program to stop, timeout and die. You can try to fix this by disabling core dumps (set "enable core dumps"
to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
the "wrong socket" check, but most likely you will not like the result).

4) run mhttpd inside a script: "while (1) { start mhttpd; sleep 1 sec; rinse, repeat; }" (run mhttpd without "-D", yes?)

In other news, the mongoose web server library have a new version available, they again changed their
multithreading scheme (I think it is an improvement). If I update mhttpd to this new version, it is very
likely the code with the "wrong socket" bug will be deleted. (with new bugs added to replace old bugs, of course).

K.O.
       Reply  07 Sep 2021, Andreas Suter, Forum, mhttpd crash 
Dear Konstantin,

thanks for the prompt response, this helps a lot!

> 1) If you see a way to replicate this crash, or some way to reliably cause
> the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
> and I wish to fix this problem very much.

I wished I could! This happens 3-4 times per year only, so close to impossible to trigger.

> 2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
> are useless for debugging the present problem. There is just no way to examine the state of each
> thread and of each http request using gdb by hand.
> 
> 3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
> other MIDAS program to stop, timeout and die. You can try to fix this by disabling core dumps (set "enable core dumps"
> to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
> the "wrong socket" check, but most likely you will not like the result).
> 

I changed now to exit() rather than abort on the production machine. Perhaps this should be the default?

Andreas
          Reply  17 Sep 2021, Stefan Ritt, Forum, mhttpd crash mhttpdScreenshot_2021-09-17_at_21.11.15_.png
To limit the impact of the numerous crashes of mhttpd, I installed the monit tool at MEG at PSI 
(https://en.wikipedia.org/wiki/Monit). It monitors mhttpd, and if it cannot connect to it for a certain
time, it kills the process and restarts it. This covers endless loops, simple crashes (caused by the
known multi-threading issue in mongoose), and also cases where mhttpd develops a memory leak and becomes
unresponsive. 

To configure monit for mhttpd, first install the package, make sure the daemon gets started automatically
after reboot (typically "sysemctl enable monit"), and put the attached file into

/etc/monit.d/mhttpd

You have to adjust the <path-to-midas> according to your midas installation, and probably also the port
under which mhttpd is listening (8082 in my case). Put 

set daemon 10

into /etc/monitrc if you want monit to check mhttpd every 10 seconds (default is 30 seconds). Then, every
10 seconds monit request "midas.css" from mhttpd, and if it cannot obtain it after 30 seconds, it kills
mhttpd and restarts it.

Loading long history plots taking more than 30 seconds should probably not be an issue since mhttpd is 
multi-threaded, but I haven't tested this in detail.

Attached below is a typical status page produced by monit, which has its own built-in web server (normally
listening at port 2812, accessible only from localhost by default).

I hope this helps some of you.

Stefan
Entry  24 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
I am updating the history plots. Main changes:

- the old history display code should again be easily usable (use the "open in old history display" checkbox)
- the history plot editor has an "edit in ODB" button that takes as to the plot definition in ODB (sometimes it is 
easier to editing things in the ODB editor)
- error in history plot editor that created "formula" entry of incorrect size should be fixed
- "reorder" (and "delete entry") functions in the history plot editor should work again (plus added explanation text)
- "factor" and "offset" restored in the history plot editor
- added the long desired "voffset" to simplify plot scaling and positioning
- (factor, offset and voffset do not yet work in the new history plots, TBI ASAP)
- history plot editor and generate_hist_graph() now use the same code to read plot definitions from ODB. There should 
be no more confusion about content of history plot entries in ODB and what each entry is supposed to do.

These changes have been precipitated by our inability to plot high voltage voltage and current on the same plot,
see bug https://bitbucket.org/tmidas/midas/issues/308/history-plot-formula-cannot-be-used-to

Voltage is in the range 0..1000 (volts) and current is in the range 0..50 and 0..0.100, autoscaling on voltage
makes the currents invisible at the zero line. In the past, we used the "factor" setting to scale
the graphs so we can see both voltage and currents at the same time (currents scaled up by factor 25 and 600,
as example).

The new "formula" feature was supposed to replace (and improve upon) the "factor" and "offset". But if I use
the formula "x*25", suddenly the plot is telling us that current values are not 50 uA, but 1250 uA (50*25),
and this is just wrong. We do not want to scale the micro-amps, we want to better position the plot on the graph,
like the old "factor" and "offset" allowed us to do.

So the idea is to use this computation:

y_position_on_plot = offset + factor*(formula(history_value) - voffset)

- "formula" is to transform history values into physical values (i.e. pressure meter reports bars, but we want atm, or 
voltmeter is reading in discrete units of 0.125V, we want to see volts)
- "factor" and "offset" is to position the graphs on the plot for best visual presentation of data
- I also added is the much desired "voffset", you only know it is needed if you have a non-zero "offset" and you need 
to change the "factor", surprise, "offset" has ot be changed, too, and good luck recalculating it correctly in one 
try.

The way to use this stuff:
- adjust "voffset" to bring the graph to around y=0
- increase the "factor" to zoom-in on features and stuff
- adjust "offset" to move the graph up and down relative to all the other graphs on the plot
- now one can zoom in and out as needed by changing the "factor" and the plot will stay roughly in the right place 
without having to readjust the offsets.

K.O.
    Reply  24 Jun 2021, Stefan Ritt, Bug Fix, changes in history plots 
I disagree with the proposed change to scale the HV current for a "nice" display. If values are scaled, the axis should be 
scaled in the same way. Otherwise people might read the current from the plot, look at the axis, and again get the wrong 
value (the factor of 25x you mention). Sure you can hover with the cursor over the graph, and see the right value, but think 
of taking a screen shot, putting this into a publication, and get complaints from the reviewer.

The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a 
new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For 
the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.

Stefan
       Reply  25 Jun 2021, Marco Francesconi, Bug Fix, changes in history plots 
We are using the new history formula as a quick way to convert signals from sensors to actual physical values (for example Voltage->Temperature, Voltage->relative humidity 
...), so it is great that the shown voltage is the calculated one.

I would like to add a point to this discussion.
In our collaboration people attach images of history plots to elogs, meeting presentation and/or physical logbooks.
The proposed scaling formula may work fine online using the cursors, but, once an image is created, I do not understand how it is possible to extract the value for a scaled 
variables.
Suppose you see a graph in a presentation with a current increase by some PSU and the current was scaled to be in the same plot of the voltage.
Looking at the delta in the image, how can you judge the current increase without any axis/grid to refer to?

So I support Stefan proposal for a secondary axis, as long as it is clear which value belong to which axis.
Maybe marking the channels in the description or using different line styles/thickness?

Best,
Marco
          Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
I will have to post an example of a scaled plot. I figure everybody forgot how they look like.

K.O.


> We are using the new history formula as a quick way to convert signals from sensors to actual physical values (for example Voltage->Temperature, Voltage->relative humidity 
> ...), so it is great that the shown voltage is the calculated one.
> 
> I would like to add a point to this discussion.
> In our collaboration people attach images of history plots to elogs, meeting presentation and/or physical logbooks.
> The proposed scaling formula may work fine online using the cursors, but, once an image is created, I do not understand how it is possible to extract the value for a scaled 
> variables.
> Suppose you see a graph in a presentation with a current increase by some PSU and the current was scaled to be in the same plot of the voltage.
> Looking at the delta in the image, how can you judge the current increase without any axis/grid to refer to?
> 
> So I support Stefan proposal for a secondary axis, as long as it is clear which value belong to which axis.
> Maybe marking the channels in the description or using different line styles/thickness?
> 
> Best,
> Marco
       Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> I disagree ...

I am happy with disagreement and differences of opinions. Zest of life, driver of progress and improvements, etc.

I am even more happy with solutions to problems. The current problem is that the offset and factor feature
of history plots has been removed without much discussion.

I stress, we have been using this feature to run experiments for the last 20 years.

I do not understand objections to it being restored. If you do not want to use it, do not use it.

K.O.

> with the proposed change to scale the HV current for a "nice" display. If values are scaled, the axis should be 
> scaled in the same way. Otherwise people might read the current from the plot, look at the axis, and again get the wrong 
> value (the factor of 25x you mention). Sure you can hover with the cursor over the graph, and see the right value, but think 
> of taking a screen shot, putting this into a publication, and get complaints from the reviewer.
> 
> The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a 
> new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For 
> the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.
> 
> Stefan
          Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> > The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a 
> > new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For 
> > the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.

In the past, we have done some useful plots with maybe 10 variables plotted
at the same time with different scaling and positioning on the graph.

Having 2 vertical axis is maybe useful for the specific case of plotting high voltages,
but not in the general case.

Actually, just 2 vertical axis will not work to plot high voltages in ALPHA-g, because
we have anode currents on the scale 0..0.1 uA and cathode currents on the scale 50..60 uA.

K.O.
    Reply  25 Jun 2021, Stefan Ritt, Bug Fix, changes in history plots 
A general warning: With the recent history changes implemented in the develop branch, starting from a fresh ODB and editing 
any history panel, on gets tons of errors and debug output from mhttpd:

MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Log axis" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
Load from ODB History/Display/Default/Trigger rate: hist plot: 2 variables
timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, show_run_markers: 1, show_values: 1, 
show_fill: 1
var[0] event [System][Trigger per sec.] formula [], colour [#00AAFF] label [] factor 1.000000 offset 0.000000 voffset 
0.000000 order 10
var[1] event [System][Trigger kB per sec.] formula [], colour [#FF9000] label [] factor 1.000000 offset 0.000000 voffset 
0.000000 order 20



This has to be fixed by the original author. I strongly recommend to make such modifications on a separate branch not to 
break running experiments.

Stefan
       Reply  25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> A general warning: With the recent history changes implemented in the develop branch, starting from a fresh ODB and editing 
> any history panel, on gets tons of errors and debug output from mhttpd: ...

This is the reason most projects have separate development and production branches.

I recommend everybody to use the released tagged versions of midas for production.

> I strongly recommend to make such modifications on a separate branch not to 
> break running experiments.

Is there something that does not work anymore? Did I break something? The debug messages I am still
tuning.

K.O.


> 
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Log axis" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> Load from ODB History/Display/Default/Trigger rate: hist plot: 2 variables
> timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, show_run_markers: 1, show_values: 1, 
> show_fill: 1
> var[0] event [System][Trigger per sec.] formula [], colour [#00AAFF] label [] factor 1.000000 offset 0.000000 voffset 
> 0.000000 order 10
> var[1] event [System][Trigger kB per sec.] formula [], colour [#FF9000] label [] factor 1.000000 offset 0.000000 voffset 
> 0.000000 order 20
> 
> 
> 
> This has to be fixed by the original author. I strongly recommend to make such modifications on a separate branch not to 
> break running experiments.
> 
> Stefan
    Reply  30 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> I am updating the history plots.
> So the idea is to use this computation:
> y_position_on_plot = offset + factor*(formula(history_value) - voffset)

Stefan and myself did some brain storming on zoom. Writing it down the way I remember it.

- we distilled the gist of the problem - the numerical values we show in the plot labels and in hover-over-the-graph
are before formula is applied or after the formula is applied?

- I suggested a universal solution using a double formula: use formula1 for one case;
  use formula2 for the other case;
  use formula1 for "physics calibration", use formula2 for factor and offset for composite plots:
     numeric_value = formula1(history_value)
     plotted_value = formula2(numeric_value)

- we agree that this is way too complicated, difficult to explain and difficult to coherently present in the history editor

- Stefan suggested a simple solution, a checkbox labeled "show raw value" next to each history variable. by default, the 
value after the formula is plotted and displayed. if checked, the raw value (before the formula) is displayed, and the 
value after the formula is plotted. (so this works the same as the factor and offset on the old history plots).

- if "show raw value" is enabled, the numerical values shown will be inconsistent against the labels on the vertical axis. 
Our solution it to turn the axis labels off. (for composite plots, like oscillator frequency in Hz vs oscillator 
temperature in degC, both scaled to see their correlation, the vertical axis is unit-less "arbitrary units", of course)

- to simplify migration of old history plots that use custom factor and offset settings, we think in the direction of 
automatically moving them to the "formula". (factor=2, offset=10 automatically populates formula with "2*x+10", "show raw 
value" checked/enabled). Thus we can avoid implementing factor and offset in the new history code (an unwelcome 
complication).

- I think this covers all the use cases I have seen in the past, so we will move in this direction.

K.O.
       Reply  14 Jul 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
Moving in the direction of this proposal. History plot editor is updated according to it. Remaining missing piece is the "show 
raw value" buttons and code behind them.

Changes:

- "show factor and offset" moved to the top of the page, "off" by default
- factor and offset (if not zero) are automatically migrated to the formula field (if it is empty), one needs to save the panel 
for this to take effect.

K.O.


> > I am updating the history plots.
> > So the idea is to use this computation:
> > y_position_on_plot = offset + factor*(formula(history_value) - voffset)
> 
> Stefan and myself did some brain storming on zoom. Writing it down the way I remember it.
> 
> - we distilled the gist of the problem - the numerical values we show in the plot labels and in hover-over-the-graph
> are before formula is applied or after the formula is applied?
> 
> - I suggested a universal solution using a double formula: use formula1 for one case;
>   use formula2 for the other case;
>   use formula1 for "physics calibration", use formula2 for factor and offset for composite plots:
>      numeric_value = formula1(history_value)
>      plotted_value = formula2(numeric_value)
> 
> - we agree that this is way too complicated, difficult to explain and difficult to coherently present in the history editor
> 
> - Stefan suggested a simple solution, a checkbox labeled "show raw value" next to each history variable. by default, the 
> value after the formula is plotted and displayed. if checked, the raw value (before the formula) is displayed, and the 
> value after the formula is plotted. (so this works the same as the factor and offset on the old history plots).
> 
> - if "show raw value" is enabled, the numerical values shown will be inconsistent against the labels on the vertical axis. 
> Our solution it to turn the axis labels off. (for composite plots, like oscillator frequency in Hz vs oscillator 
> temperature in degC, both scaled to see their correlation, the vertical axis is unit-less "arbitrary units", of course)
> 
> - to simplify migration of old history plots that use custom factor and offset settings, we think in the direction of 
> automatically moving them to the "formula". (factor=2, offset=10 automatically populates formula with "2*x+10", "show raw 
> value" checked/enabled). Thus we can avoid implementing factor and offset in the new history code (an unwelcome 
> complication).
> 
> - I think this covers all the use cases I have seen in the past, so we will move in this direction.
> 
> K.O.
          Reply  14 Jul 2021, Konstantin Olchanski, Bug Fix, changes in history plots 
> Moving in the direction of this proposal. Remaining missing piece is the "show 
> raw value" buttons and code behind them.

added "show raw value" button, updated on-page instructions.

I think this is the final layout of the history panel editor, conversion
to html+javascript will be done "as is". If you have suggestions to improve
the layout (add/remove/move things around, etc), please shoult out (on the elog
here or by direct email to me).

I am thinking in the direction of changing the control flow of the history editor:

- midas "history" manu button click redirects to
- current history panel selection (with checkbox to open old history plots), click on "new plot" button redirects to
- new page for creating new plots. this will present a list of all history variables, click on variable name creates a new history 
panel containing just this one variable and redirects to it.

In other words, to see the history for any history variable:
- click on "history" menu button
- click on "new"
- click on desired history variable
- see this history plot

From here, click on the "wheel" button to open the existing history panel editor and add any additional variables, change settings, 
etc.

In the history panel editor, I am thinking in the direction of replacing the existing drop-down selection of history variables (now 
very workable for large experiments) with an overlay dialog to show all history variables, with checkboxes to select them, basically 
the same history variable select page as described above. Not sure yet how this will work visually.

K.O.
             Reply  24 Aug 2021, Stefan Ritt, Bug Fix, changes in history plots 
One addition I would be in favour of is to remove the "Order" and replace it with drag&drop handles, because this is what people are more 
used to today. Only the old guys like us remember the /etc/init.d/xx_yy scheme where one uses an integer number in the file name to 
determine an order. 

See for example: https://jsbin.com/hijetos/edit?js,output

But instead of relying on a foreign library, I would rather implement that myself, since I need the same thing later for the to-be-
implemented ODB editor (next year? next lockdown?)

Stefan
Entry  19 Aug 2021, Konstantin Olchanski, Bug Report, select() FD_SETSIZE overrun 
I am looking at the mlogger in the ALPHA anti-hydrogen experiment at CERN. It is 
mysteriously misbehaving during run start and stop.

The problem turns out to be with the select() system call.

The corresponding FD_SET(), FD_ISSET() & co operate on a an array of fixed size 
FD_SETSIZE, value 1024, in my case. But the socket number is 1409, so we overrun 
the FD_SET() array. Ouch.

I see that all uses of select() in midas have no protection against this.

(we should probably move away from select() to newer poll() or whatever it is)

Why does mlogger open so many file descriptors? The usual, scaling problems in the 
history. The old midas history does not reuse file descriptors, so opens the same 
3 history files (.hst, .idx, etc) for each history event. The new FILE history 
opens just one file per history event. But if the number of events is bigger than 
1024, we run into same trouble.

(BTW, the system limit on file descriptors is 4096 on the affected machine, 1024 
on some other machines, see "limit" or "ulimit -a").

K.O.
    Reply  20 Aug 2021, Stefan Ritt, Bug Report, select() FD_SETSIZE overrun 
> I am looking at the mlogger in the ALPHA anti-hydrogen experiment at CERN. It is 
> mysteriously misbehaving during run start and stop.
> 
> The problem turns out to be with the select() system call.
> 
> The corresponding FD_SET(), FD_ISSET() & co operate on a an array of fixed size 
> FD_SETSIZE, value 1024, in my case. But the socket number is 1409, so we overrun 
> the FD_SET() array. Ouch.
> 
> I see that all uses of select() in midas have no protection against this.
> 
> (we should probably move away from select() to newer poll() or whatever it is)
> 
> Why does mlogger open so many file descriptors? The usual, scaling problems in the 
> history. The old midas history does not reuse file descriptors, so opens the same 
> 3 history files (.hst, .idx, etc) for each history event. The new FILE history 
> opens just one file per history event. But if the number of events is bigger than 
> 1024, we run into same trouble.
> 
> (BTW, the system limit on file descriptors is 4096 on the affected machine, 1024 
> on some other machines, see "limit" or "ulimit -a").
> 
> K.O.

I cannot imagine that you have more than 1024 different events in ALPHA. That wouldn't 
fit on your status page. 

I have some other suspicion: The logger opens a history file on access, then closes it 
again after writing to it. In the old days we had a case where we had a return from the 
write function BEFORE the file has been closed. This is kind of a memory leak, but with 
file descriptors. After some time of course you run out of file descriptors and crash. 
Now that bug has been fixed many years ago, but it sounds to me like there is another 
"fd leak" somewhere. You should add some debugging in the history code to print the 
file descriptors when you open a file and when you leave that routine. The leak could 
however also be somewhere else, like writing to the message file, ODB dump, ...

The right thing of course would be to rewrite everything with std::ofstream which 
closes automatically the file when the object gets out of scope.

Stefan
Entry  12 May 2021, Mathieu Guigue, Bug Report, mhttpd WebServer ODBTree initialization 
Hi,

Using midas version 12-2020,  I am trying to run mhttpd from within a docker container using docker-compose.
Starting from an empty ODB, I simply run `mhttpd` and this is the output I have:
midas_hatfe_1  | <Warning> Starting mhttpd...
midas_hatfe_1  | [mhttpd,INFO] ODB subtree /Runinfo corrected successfully
midas_hatfe_1  | MVOdb::SetMidasStatus: Error: MIDAS db_find_key() at ODB path "/WebServer/Host list" returned status 312
midas_hatfe_1  | Mongoose web server will not use password protection
midas_hatfe_1  | Mongoose web server will not use the hostlist, connections from anywhere will be accepted
midas_hatfe_1  | Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
midas_hatfe_1  | [mhttpd,ERROR] [mhttpd.cxx:19160:mongoose_listen,ERROR] Cannot mg_bind address "[::1]:8080"

According to the documentation, the WebServer tree should be created automatically when starting the mhttpd; but it seems not as it doesn't find the entry "/WebServer/Host list".
If I create it by end (using "create STRING /WebServer/Host list"), I still get the error message that mhttpd didn't bind properly to the local port 8080.
I am not sure what it wrong, as mhttpd is working perfectly well in this exact container for midas 03-2020.

Any idea what difference makes it not possible anymore to run into these container?

Thanks very much for your help.
Cheers
Mathieu
    Reply  12 May 2021, Ben Smith, Bug Report, mhttpd WebServer ODBTree initialization 
> midas_hatfe_1  | Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
> midas_hatfe_1  | [mhttpd,ERROR] [mhttpd.cxx:19160:mongoose_listen,ERROR] Cannot mg_bind address "[::1]:8080"

It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.
       Reply  12 May 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization 
> It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.

We had this issue already several times. This info should be put into the documentation at a prominent location.

Stefan
          Reply  13 May 2021, Mathieu Guigue, Bug Report, mhttpd WebServer ODBTree initialization 
> > It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.
> 
> We had this issue already several times. This info should be put into the documentation at a prominent location.
> 
> Stefan

Thanks a lot, this solved my issue!
             Reply  14 May 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization 
> Thanks a lot, this solved my issue!

... or we should turn IPv6 off by default, since not many people use this right now.
                Reply  02 Jun 2021, Konstantin Olchanski, Bug Report, mhttpd WebServer ODBTree initialization 
> > Thanks a lot, this solved my issue!
> 
> ... or we should turn IPv6 off by default, since not many people use this right now.

IPv6 certainly works and is used at CERN.

But I am not sure why people see this message. I do not see it on any machines at 
TRIUMF, even those with IPv6 turned off.

K.O.
                   Reply  05 Aug 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization 
Well, we all see it here at PSI, so this is enough reason to turn this off by default. Shall 
I do it?
Entry  04 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
Hi,

if I check out midas and try to configure it with 

cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas

I do get the error messages:

  Target "midas" INTERFACE_INCLUDE_DIRECTORIES property contains path:

    "<path>/tmidas/midas/include"

  which is prefixed in the source directory.

Is the cmake setup not relocatable? This is new and was working until recently:

MIDAS version:      2.1
GIT revision:       Thu May 27 12:56:06 2021 +0000 - midas-2020-08-a-295-gfd314ca8-dirty on branch HEAD
ODB version:        3
    Reply  04 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas

good timing, I am working on cmake for manalyzer and rootana and I have not tested
the install prefix business.

now I know to test it for all 3 packages.

I will also change find_package(Midas) slightly, (see my other message here),
I hope you can confirm that I do not break it for you.

K.O.
    Reply  04 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> Is the cmake setup not relocatable? This is new and was working until recently:

Indeed. Not relocatable. This is because we do not install the header files.

When you use the CMAKE_INSTALL_PREFIX, you get MIDAS "installed" in:

prefix/lib
prefix/bin
$MIDASSYS/include <-- this is the source tree and so not "relocatable"!

Before, this was kludged and cmake did not complain about it.

Now I changed cmake to handle the include path "the cmake way", and now it knows to complain about it.

I am not sure how to fix this: we have a conflict between:

- our normal way of using midas (include $MIDASSYS/include, link $MIDASSYS/lib, run $MIDASSYS/bin)
- the cmake way (packages *must be installed* or else! but I do like install(EXPORT)!)
- and your way (midas include files are in $MIDASSYS/include, everything else is in your special location)

I think your case is strange. I am curious why you want midas libraries to be in prefix/lib instead of in 
$MIDASSYS/lib (in the source tree), but are happy with header files remaining in the source tree.

K.O.
       Reply  04 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > Is the cmake setup not relocatable? This is new and was working until recently:
> 
> Indeed. Not relocatable. This is because we do not install the header files.
> 
> When you use the CMAKE_INSTALL_PREFIX, you get MIDAS "installed" in:
> 
> prefix/lib
> prefix/bin
> $MIDASSYS/include <-- this is the source tree and so not "relocatable"!
> 
> Before, this was kludged and cmake did not complain about it.
> 
> Now I changed cmake to handle the include path "the cmake way", and now it knows to complain about it.
> 
> I am not sure how to fix this: we have a conflict between:
> 
> - our normal way of using midas (include $MIDASSYS/include, link $MIDASSYS/lib, run $MIDASSYS/bin)
> - the cmake way (packages *must be installed* or else! but I do like install(EXPORT)!)
> - and your way (midas include files are in $MIDASSYS/include, everything else is in your special location)
> 
> I think your case is strange. I am curious why you want midas libraries to be in prefix/lib instead of in 
> $MIDASSYS/lib (in the source tree), but are happy with header files remaining in the source tree.
> 
> K.O.

We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
If I think an all other packages I am working with, e.g. ROOT, the includes are also installed under CMAKE_INSTALL_PREFIX. 
Up until recently there was no issue to work with CMAKE_INSTALL_PREFIX, accepting that the includes stay under 
$MIDASSYS/include, even though this is not quite the standard way, but no problem here. Anyway, since CMAKE_INSTALL_PREFIX 
is a standard option from cmake, I think things should not "break" if you want to use it.

A.S.
          Reply  08 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > Is the cmake setup not relocatable? This is new and was working until recently:
> > Not relocatable. This is because we do not install the header files.
> 
> We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 

hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
mean by this? you install midas in a secret location to prevent somebody from linking to it?

> If I think an all other packages I am working with, e.g. ROOT, the includes are also installed under CMAKE_INSTALL_PREFIX.

cmake and other frameworks tend to be like procrustean beds (https://en.wikipedia.org/wiki/Procrustes),
pre-cmake packages never quite fit perfectly, and either the legs or the heads get cut off. post-cmake packages
are constructed to fit the bed, whether it makes sense or not.

given how this situation is known since antiquity, I doubt we will solve it today here.

(I exercise my freedom of speech rights to state that I object being put into
such situations. And I would like to have it clear that I hate cmake (ask me why)).

>
> Up until recently there was no issue to work with CMAKE_INSTALL_PREFIX, accepting that the includes stay under 
> $MIDASSYS/include, even though this is not quite the standard way, but no problem here.
>

I think a solution would be to add install rules for include files. There will be a bit of trouble,
normal include path is $MIDASSYS/include,$MIDASSYS/mxml,$MIDASSYS/mjson,etc, after installing
it will be $CMAKE_INSTALL_PREFIX/include (all header files from different git submodules all
dumped into one directory). I do not know what problems will show up from that.

I think if midas is used as a subproject of a bigger project, this is pretty much required
(and I have seen big experiments, like STAR and ND280, do this type of stuff with CMT,
another horror and the historical precursor of cmake)

The problem is that we do not have any super-project like this here, so I cannot ever
be sure that I have done everything correctly. cmake itself can be helpful, like
in the current situation where it told us about a problem. but I will never trust
cmake completely, I see cmake do crazy and unreasonable things way too often.

One solution would be for you or somebody else to contribute such a cmake super-project,
that would build midas as a subproject, install it with a CMAKE_INSTALL_PREFIX and
try to link some trivial frontend or analyzer to check that everything is installed
correctly. It would become an example for "how to use midas as a subproject").
Ideally, it should be usable in a bitbucket automatic build (assuming bitbucket
has correct versions of cmake, which it does not half the time).

P.S. I already spent half-a-week tinkering with cmake rules, only to discover
that I broke a kludge that allows you to do something strange (if I have it right,
the CMAKE_PREFIX_INSTALL code is your contribution). This does not encourage
 me to tinker with cmake even more. who knows against what other
kludge I bump into. (oh, yes, I know, I already bumped into the nonsense
find_package(Midas) implementation).

K.O.
             Reply  09 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > Not relocatable. This is because we do not install the header files.
> > 
> > We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
> 
> hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> mean by this? you install midas in a secret location to prevent somebody from linking to it?
> 

This was a wrong wording from my side. We do not want the the users have write access to the midas installation libs and bins.
I have submitted the pull request which should resolve this without interfere with your usage.
Hope this will resolve the issue.
                Reply  10 Jun 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > > Not relocatable. This is because we do not install the header files.
> > > 
> > > We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
> > 
> > hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> > mean by this? you install midas in a secret location to prevent somebody from linking to it?
> > 
> 
> This was a wrong wording from my side. We do not want the the users have write access to the midas installation libs and bins.
> I have submitted the pull request which should resolve this without interfere with your usage.
> Hope this will resolve the issue.

Excellent. I think it is good to have midas "install" in a sane manner.

But I still struggle to understand what you do. Presumably you can "install" midas
in the "midas account", which is not writable by the experiment and user accounts.
Then it does not matter if you "install" it in it's build directory (like we do)
or in some other location (like you do now).

This does not work of course if you only have one account, so do you build midas
as root? or install it as root?

I do ask because in the current computing world, doing things as root requires
a certain amount of trust, which may not be there anymore, see the recent "supply side" attacks
against python packages, solar winds hack, linux kernel malicious patches from umn, etc.

Personally, I do not want to answer questions "is midas safe to run as root?",
"can I trust the midas install scripts to run as root?" and certainly I do not want to hear
about "I installed midas and 100 other packages as root and got hacked 7 days later".

(and running midas as root was never safe. neither mhttpd nor mserver will pass
a security audit).

Anyhow, looks like I will look at cmake again next week. Right now I have a major
breakthrough in the ALPHA-g experiment, my big 96-port Juniper switch suddenly
has working ethernet flow control and I can record data at 600 Mbytes/sec without
any UDP packet loss. Above that, my event builder explodes. I want to fix it and get
it up to 1000 Mbytes/sec, the limit of my 10gige network link. (In this system I do not
have the disk subsystem to record data at this rate, but I have build 8-disk ZFS arrays
that would sink it, no problem). And the day has come when I ran out of CPU cores.
The UDP packet receivers are multithreaded, the event builder is multithreaded and I am using
all 4 of the available cores (intel cpu). As soon as I can get a rackmounted AMD Ryzen
or Threadripper machine, we will likely upgrade. (need at least one more CPU core to run
the online analyzer!). Exciting.

K.O.
                   Reply  10 Jun 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
> > > > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > > > Not relocatable. This is because we do not install the header files.
> > > > 
> > > > We do it this way, since the lib and bin needs to be in a place where standard users have no access to. 
> > > 
> > > hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> > > mean by this? you install midas in a secret location to prevent somebody from linking to it?
> > > 
> > 
> > This was a wrong wording from my side. We do not want the the users have write access to the midas installation libs and bins.
> > I have submitted the pull request which should resolve this without interfere with your usage.
> > Hope this will resolve the issue.
> 
> Excellent. I think it is good to have midas "install" in a sane manner.
> 
> But I still struggle to understand what you do. Presumably you can "install" midas
> in the "midas account", which is not writable by the experiment and user accounts.
> Then it does not matter if you "install" it in it's build directory (like we do)
> or in some other location (like you do now).
> 
> This does not work of course if you only have one account, so do you build midas
> as root? or install it as root?
> 

We work the following way: there is a production Midas under let's say /usr/local/midas (make install as sudo/root). This is for the running experiment. Since we are doing muSR, we 
have experiments on a daily base, rather than month and years as it is the case for a particle physics experiment. Now, still we would like to test updates, new features of Midas on 
the same machine. For this we us the repo directly. If we are happy with the new feature, and fixes, we again do a 'make install' and hence freeze for the production a specific 
snapshot. Of course we could use various local copies of the Midas repo, but over the last years this approach was very convenient and productive. Hope this explains a bit better 
why we want to work with a CMAKE_INSTALL_PREFIX.

AS
                      Reply  11 Jul 2021, Konstantin Olchanski, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
big thanks to Andreas S. for getting most of this figured out. I now understand
much better how cmake installs things and how it generates config files, both
find_package(midas) style and install(export) style.

with the latest updates, CMAKE_INSTALL_PREFIX should work correctly. I now understand how it works,
how to use it and how to test it, it should not break again.

for posterity, my commends to Andreas's pull request:

thank you for providing this code, it was very helpful. at the end I implemented things slightly differently. It took me a while to understand that I have to provide 2 “install” modes, for your case, I need to 
“install” the header files and everything works “the cmake way”, for our normal case, we use include files in-place and have to include all the git submodules to the include path. I am quite happy with the 
result. K.O.

K.O.
                         Reply  02 Aug 2021, Andreas Suter, Bug Report, cmake with CMAKE_INSTALL_PREFIX fails 
Dear Konstantin,

I have tried your adopted version. You did already quite a job which is more consistent than what I was suggesting.
Yet, I still have a problem (git sha2 2d3872dfd31) when starting on a clean system (i.e. no midas present yet): 
Without CMAKE_INSTALL_PREFIX set, everything is fine. 
However, when setting CMAKE_INSTALL_PREFIX, I get the following error message on the build level (cmake --build ./ -- VERBOSE=1) from the manalyzer:

[ 32%] Building CXX object manalyzer/CMakeFiles/manalyzer.dir/manalyzer.cxx.o
cd /home/l_musr_tst/Tmp/midas/build/manalyzer && /usr/bin/c++  -DHAVE_FTPLIB -DHAVE_MIDAS -DHAVE_ROOT_HTTP -DHAVE_THTTP_SERVER -DHAVE_TMFE -DHAVE_ZLIB -D_LARGEFILE64_SOURCE -I/home/l_musr_tst/Tmp/midas/manalyzer -I/usr/local/root/include  -O2 -g -Wall -Wformat=2 -Wno-format-nonliteral -Wno-strict-aliasing -Wuninitialized -Wno-unused-function -std=c++11 -pipe -fsigned-char -pthread -DHAVE_ROOT -std=gnu++11 -o CMakeFiles/manalyzer.dir/manalyzer.cxx.o -c /home/l_musr_tst/Tmp/midas/manalyzer/manalyzer.cxx
In file included from /home/l_musr_tst/Tmp/midas/manalyzer/manalyzer.cxx:14:0:
/home/l_musr_tst/Tmp/midas/manalyzer/manalyzer.h:13:21: fatal error: midasio.h: No such file or directory
 #include "midasio.h"
                     ^
compilation terminated.

Obviously, still some include paths are missing. I tried quickly to see if an easy fix is possible, but I failed.

Question: is it possible to use manalyzer without midas? I am asking since the MIDAS_FOUND flag is confusing me.

> big thanks to Andreas S. for getting most of this figured out. I now understand
> much better how cmake installs things and how it generates config files, both
> find_package(midas) style and install(export) style.
> 
> with the latest updates, CMAKE_INSTALL_PREFIX should work correctly. I now understand how it works,
> how to use it and how to test it, it should not break again.
> 
> for posterity, my commends to Andreas's pull request:
> 
> thank you for providing this code, it was very helpful. at the end I implemented things slightly differently. It took me a while to understand that I have to provide 2 “install” modes, for your case, I need to 
> “install” the header files and everything works “the cmake way”, for our normal case, we use include files in-place and have to include all the git submodules to the include path. I am quite happy with the 
> result. K.O.
> 
> K.O.
Entry  31 Jul 2021, Peter Kunz, Bug Report, ss_shm_name: unsupported shared memory type, bye! 
I ran into a problem trying to compile the latest MIDAS version on a Fedora 
system.

mhttpd and odbedit return:
ss_shm_name: unsupported shared memory type, bye!

check_shm_type: preferred POSIXv4_SHM got SYSV_SHM

The check returns SYSV_SHM which doesn't seem to be supported in ss_shm_name.

Is there an easy solution for this?

Thanks.
Entry  09 Jul 2021, Konstantin Olchanski, Bug Report, cmake question 
cmake check and mate in 1 move. please help.

the midas cmake file has a typo in the ROOT_CXX_FLAGS, I fixed it and now I am dead in the 
water, need help from cmake experts and pushers.

On Ubuntu:
ROOT_CXX_FLAGS has -std=c++14
midas cmake defines -std=gnu++11 (never mind that I asked for c++11, not "c++11 with GNU 
extensions")

the two compiler flags collide and the build explodes, the best I can tell c++11 prevails 
and ROOT header files blow up because they expect c++14.

if I remove the midas cmake request for c++11, -std=gnu++11 is gone, there is no conflict 
with ROOT C++14 request and the build works just fine.

but now it explodes on CentOS-7 because by default, c++11 is not enabled. (include <mutex> 
blows up).

what a mess.

K.O.
    Reply  13 Jul 2021, Konstantin Olchanski, Bug Report, cmake question 
> cmake check and mate in 1 move. please help.
> -std=c++11 and -std=c++14 collision...

I have a solution implemented for this, I am not happy with it, Stefan is not happy with it. See 
discussion: https://bitbucket.org/tmidas/midas/commits/50a15aa70a4fe3927764605e8964b55a3bb1732b

K.O.
       Reply  14 Jul 2021, Konstantin Olchanski, Bug Report, cmake question 
> > cmake check and mate in 1 move. please help.
> > -std=c++11 and -std=c++14 collision...
> 
> I have a solution implemented for this, I am not happy with it, Stefan is not happy with it. See 
> discussion: https://bitbucket.org/tmidas/midas/commits/50a15aa70a4fe3927764605e8964b55a3bb1732b
>

I figured it out, solution is to use:

target_compile_features(midas PUBLIC cxx_std_11)

this is how it works:

- centos-7 (g++ has c++11 off by default): -std=gnu++11 is added automatically (not -std=c++11, but 
probably correct, as some c++11 functions were available as gnu extensions)
- ubuntu-20.04 LTS without ROOT: nothing added (I guess correct, g++ has c++11 is enabled by default)
- ubuntu-20.04 LTS with -std=c++14 from ROOT: nothing added, c++14 as requested by ROOT is in affect.
- macos without ROOT: -std=gnu++11 is added automatically
- macos with -std=c++11 from ROOT: ditto, so both -std=c++11 and -std=gnu++11 are present in this order, 
wrong-ish, but works.

and good luck figuring this out just from cmake documentation:
https://cmake.org/cmake/help/latest/command/target_compile_features.html

K.O.
Entry  10 Aug 2020, Mathieu Guigue, Info, MidasConfig.cmake usage 
As the Midas software is installed using CMake, it can be easily integrated into 
other CMake projects using the MidasConfig.cmake file produced during the Midas 
installation.

This file points to the location of the include and libraries of Midas using three 
variables:
- MIDAS_INCLUDE_DIRS
- MIDAS_LIBRARY_DIRS
- MIDAS_LIBRARIES

Then the CMakeLists file of the new project can use the CMake find_package 
functionalities like:
```
find_package (Midas REQUIRED)
if (MIDAS_FOUND)
    MESSAGE(STATUS "Found midas: libraries ${MIDAS_LIBRARIES}")
    pbuilder_add_ext_libraries (${MIDAS_LIBRARIES})
else (MIDAS_FOUND)
    message(FATAL "Unable to find midas")
endif (MIDAS_FOUND)
include_directories (${MIDAS_INCLUDE_DIR})
```
pbuilder_add_ext_libraries is a CMake macro allowing to automatically add the 
libraries into the project: this macro can be found here: 
https://github.com/project8/scarab/blob/master/cmake/PackageBuilder.cmake
If such macro doesn't exist, the linkage to each executable/library can be done 
similarly to https://midas.triumf.ca/elog/Midas/1964 using: 

```
target_link_libraries(crfe ${MIDAS_LIBARIES} ${LIBS})
```

The current version of the MidasConfig.cmake is minimal and could for example 
include a version number: this would allow to define a e.g. minimal version of 
Midas needed by the new project.
    Reply  28 May 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
How does "find_package (Midas REQUIRED)" find the location of MIDAS?

The best I can tell from the current code, the package config files are installed
inside $MIDASSYS somewhere and I see "find_package MIDAS" never find them (indeed,
find_package() does not know about $MIDASSYS, so it has to use telepathy or something).

Does anybody actually use "find_package(midas)", does it actually work for anybody?

Also it appears that "the cmake way" of importing packages is to use
the install(EXPORT) method.

In this scheme, the user package does this:

include(${MIDASSYS}/lib/midas-targets.cmake)
target_link_libraries(myprogram PUBLIC midas)

this causes all the midas include directories (including mxml, etc)
and dependancy libraries (-lutil, -lpthread, etc) to be automatically
added to "myprogram" compilation and linking.

of course MIDAS has to generate a sensible targets export file,
working on it now.

K.O.
       Reply  28 May 2021, Marius Koeppel, Info, MidasConfig.cmake usage 
> Does anybody actually use "find_package(midas)", does it actually work for anybody?

What we do is to include midas as a submodule and than we call find_package:

    add_subdirectory(midas)
    list(APPEND CMAKE_PREFIX_PATH ${CMAKE_CURRENT_SOURCE_DIR}/midas)
    find_package(Midas REQUIRED)

For us it works fine like this but we kind of always compile Midas fresh and don't use a version on our system (keeping the newest version). 

Without the find_package the build does not work for us.
          Reply  28 May 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> > Does anybody actually use "find_package(midas)", does it actually work for anybody?
> 
> What we do is to include midas as a submodule and than we call find_package:
> 
>     add_subdirectory(midas)
>     list(APPEND CMAKE_PREFIX_PATH ${CMAKE_CURRENT_SOURCE_DIR}/midas)
>     find_package(Midas REQUIRED)
> 
> For us it works fine like this but we kind of always compile Midas fresh and don't use a version on our system (keeping the newest version). 
> 
> Without the find_package the build does not work for us.

Ok, I see. I now think that for us, this "find_package" business an unnecessary complication:

since one has to know where midas is in order to add it to CMAKE_PREFIX_PATH,
one might as well import the midas targets directly by include(.../midas/lib/midas-targets.cmake).

From what I see now, the cmake file is much simplifed by converting
it from "find_package(midas)" style MIDAS_INCLUDES & co to more cmake-ish
target_link_libraries(myexe midas) - all the compiler switches, include paths,
dependant libraires and gunk are handled by cmake automatically.

I am not touching the "find_package(midas)" business, so it should continue to work, then.

K.O.
             Reply  31 May 2021, Stefan Ritt, Info, MidasConfig.cmake usage 
MidasConfig.cmake might at some point get included in the standard Cmake installation (or some add-on). It will then reside in the Cmake system path 
and you don't have to explicitly know where this is. Just the find_package(Midas) will then be enough. 

Even if it's not there, the find_package() is the "traditional" way CMake discovers external packages and users are used to that (like ROOT does the 
same). In comparison, your "midas-targets.cmake" way of doing things, although this works certainly fine, is not the "standard" way, but a midas-
specific solution, other people have to learn extra.

Stefan
                Reply  02 Jun 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> MidasConfig.cmake might at some point get included in the standard Cmake installation (or some add-on). It will then reside in the Cmake system path 
> and you don't have to explicitly know where this is. Just the find_package(Midas) will then be enough.

Hi, Stefan, can you say more about this? If MidasConfig.cmake is part of the cmake distribution,
(did I understand you right here?) and is installed into a system-wide directory,
how can it know to use midas from /home/agmini/packages/midas or from /home/olchansk/git/midas?

Certainly we do not do system wide install of midas (into /usr/local/bin or whatever) because
typically different experiments running on the same computer use different versions of midas.

For ROOT, it looks as if for find_package(ROOT) to work, one has to add $ROOTSYS to the Cmake package
search path. This is what we do in our cmake build.

As for find_package() vs install(EXPORT), we may have the same situation as with my "make cmake",
where my one line solution is no good for people who prefer to type 3 lines of commands.

Specifically, the install(EXPORT) method defines the "midas" target which brings with it
all it's dependent include paths, libraries and compile flags. So to link midas you need
two lines:

include(.../midas/lib/midas-targets.cmake)
target_link_libraries(myexe midas)
target_link_libraries(myfrontend mfe)

whereas find_package() defines a bunch of variables (the best I can tell) and one has
to add them to the include paths and library paths and compile flags "by hand".

I do not know how find_package() handles the separate libmidas, libmfe and librmana. (and
the separate libmanalyzer and libmanalyzer_main).

K.O.
                   Reply  04 Jun 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> find_package(Midas)

I am testing find_package(Midas). There is a number of problems:

1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".

This seem to be an incomplete list of all libraries build by midas (rmana is missing).

This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):

- we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
- midas-c-compat is for building python interfaces, not for linking midas programs
- mfe contains a main() function, it will collide with the user main() function

So I think this should be changed to just "midas" and midas linking dependancy
libraries (-lutil, -lrt, -lpthread) should also be added to this list.

Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)

2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories

Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.

Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.

I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
this second method seems to be much simpler, everything is exported automatically into one file,
and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).

So how much time should I spend in fixing find_package(Midas) to make it generally usable?

- include path is incomplete
- library list is nonsense
- compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
- dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)

K.O.
                      Reply  04 Jun 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> > find_package(Midas)
> 
> So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> 
> - include path is incomplete
> - library list is nonsense
> - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> 

I think I give up on find_package(Midas). It seems like a lot of work to straighten
all this out, when install(EXPORT) does it all automatically and is easier to use
for building user frontends and analyzers.

K.O.
                      Reply  20 Jun 2021, Lukas Gerritzen, Suggestion, MidasConfig.cmake usage 
I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us 
problem 1 is less of an issue, becaues we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas", 
not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include), 
especially because ${MIDASSYS} is not set in cmake. 

I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to 
${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)

Cheers
Lukas
 
> > find_package(Midas)
> 
> I am testing find_package(Midas). There is a number of problems:
> 
> 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> 
> This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> 
> This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> 
> - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> - midas-c-compat is for building python interfaces, not for linking midas programs
> - mfe contains a main() function, it will collide with the user main() function
> 
> So I think this should be changed to just "midas" and midas linking dependancy
> libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> 
> Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> 
> 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> 
> Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> 
> Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> 
> I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> this second method seems to be much simpler, everything is exported automatically into one file,
> and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> 
> So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> 
> - include path is incomplete
> - library list is nonsense
> - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> 
> K.O.
                         Reply  20 Jun 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage 
> I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us 
> problem 1 is less of an issue, becaues we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas", 
> not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include), 
> especially because ${MIDASSYS} is not set in cmake.

So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.

Problem still remains with required auxiliary libraries for linking "-lmidas". Sometimes you
need "-lutil" and "-lrt" and "-lpthread", sometimes not. Some way to pass this information
automatically would be nice.

Problem still remains that I cannot do these changes because I have no test harness
for any of this. Would be great if you could contribute this and post the documentation
blurb that we can paste into the midas wiki documentation.

And I still do not understand why we have to do all this work when cmake "import(EXPORT)"
already does all of this automatically. What am I missing?

K.O.

> 
> I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to 
> ${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)
> 
> Cheers
> Lukas
>  
> > > find_package(Midas)
> > 
> > I am testing find_package(Midas). There is a number of problems:
> > 
> > 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> > 
> > This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> > 
> > This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> > 
> > - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> > - midas-c-compat is for building python interfaces, not for linking midas programs
> > - mfe contains a main() function, it will collide with the user main() function
> > 
> > So I think this should be changed to just "midas" and midas linking dependancy
> > libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> > 
> > Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> > 
> > 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> > 
> > Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> > 
> > Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> > of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> > 
> > I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> > this second method seems to be much simpler, everything is exported automatically into one file,
> > and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> > 
> > So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> > 
> > - include path is incomplete
> > - library list is nonsense
> > - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> > - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> > 
> > K.O.
                            Reply  22 Jun 2021, Lukas Gerritzen, Suggestion, MidasConfig.cmake usage 
> So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.

A more moderate option would be to remove mfe from ${MIDAS_LIBRARIES}, but as far as I understand mfe is not the only problem, so nuking might be the 
better option after all. In addition, setting ${MIDASSYS} in MidasConfig.cmake would probably improve compatibility.

>Sometimes you need "-lutil" and "-lrt" and "-lpthread", sometimes not. 
>Some way to pass this information automatically would be nice.

I do not properly understand when you need this and when not, but can't this be communicated with the PUBLIC keyword of target_link_libraries()? If I 
understand if we can use PUBLIC for -lutil, -lrt and -lpthread, I can write something, test it here and create a pull request.

> And I still do not understand why we have to do all this work when cmake "import(EXPORT)"
> already does all of this automatically. What am I missing?

Does this not require midas to be built every time you import it? I know, it's a bit the "billions of flies can't be wrong" argument, but I've never seen 
any package that uses import(EXPORT) over find_package().

> > I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us 
> > problem 1 is less of an issue, becaues we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas", 
> > not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include), 
> > especially because ${MIDASSYS} is not set in cmake.
> 
> So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> 
> Problem still remains with required auxiliary libraries for linking "-lmidas". Sometimes you
> need "-lutil" and "-lrt" and "-lpthread", sometimes not. Some way to pass this information
> automatically would be nice.
> 
> Problem still remains that I cannot do these changes because I have no test harness
> for any of this. Would be great if you could contribute this and post the documentation
> blurb that we can paste into the midas wiki documentation.
> 
> And I still do not understand why we have to do all this work when cmake "import(EXPORT)"
> already does all of this automatically. What am I missing?
> 
> K.O.
> 
> > 
> > I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to 
> > ${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)
> > 
> > Cheers
> > Lukas
> >  
> > > > find_package(Midas)
> > > 
> > > I am testing find_package(Midas). There is a number of problems:
> > > 
> > > 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> > > 
> > > This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> > > 
> > > This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> > > 
> > > - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> > > - midas-c-compat is for building python interfaces, not for linking midas programs
> > > - mfe contains a main() function, it will collide with the user main() function
> > > 
> > > So I think this should be changed to just "midas" and midas linking dependancy
> > > libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> > > 
> > > Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> > > 
> > > 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> > > 
> > > Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> > > 
> > > Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> > > of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> > > 
> > > I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> > > this second method seems to be much simpler, everything is exported automatically into one file,
> > > and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> > > 
> > > So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> > > 
> > > - include path is incomplete
> > > - library list is nonsense
> > > - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> > > - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> > > 
> > > K.O.
                               Reply  24 Jun 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage 
> > So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> A more moderate option ...

For the record, I did not disappear. I have a very short time window
to complete commissioning the alpha-g daq (now that the network
and the event builder are cooperating). To add to the fun, our high voltage
power supply turned into a pumpkin, so plotting voltages and currents 
on the same history plot at the same time (like we used to be able to do)
went up in priority. 

K.O.
                                  Reply  11 Jul 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage 
> > > So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> > A more moderate option ...
> 
> For the record, I did not disappear. I have a very short time window
> to complete commissioning the alpha-g daq (now that the network
> and the event builder are cooperating). To add to the fun, our high voltage
> power supply turned into a pumpkin, so plotting voltages and currents 
> on the same history plot at the same time (like we used to be able to do)
> went up in priority. 
> 

in the latest update, find_package(midas) should work correctly, the include path is right, 
the library list is right.

please test.

I find that the cmake install(export) method is simpler on the user side (just one line of 
code) and is easier to support on the midas side (config file is auto-generated).

I request that proponents of the find_package(midas) method contribute the documentation and 
example on how to use it. (see my other message).

K.O.
    Reply  13 Jul 2021, Stefan Ritt, Info, MidasConfig.cmake usage 
Thanks for the contribution of MidasConfig.cmake. May I kindly ask for one extension:

Many of our frontends require inclusion of some midas-supplied drivers and libraries 
residing under

$MIDASSYS/drivers/class/
$MIDASSYS/drivers/device
$MIDASSYS/mscb/src/
$MIDASSYS/src/mfe.cxx

I guess this can be easily added by defining a MIDAS_SOURCES in MidasConfig.cmake, so 
that I can do things like:

add_executable(my_fe
  myfe.cxx
  $(MIDAS_SOURCES}/src/mfe.cxx
  ${MIDAS_SOURCES}/drivers/class/hv.cxx
  ...)

Does this make sense or is there a more elegant way for that?

Stefan
       Reply  13 Jul 2021, Konstantin Olchanski, Info, MidasConfig.cmake usage 
> $MIDASSYS/drivers/class/
> $MIDASSYS/drivers/device
> $MIDASSYS/mscb/src/
> $MIDASSYS/src/mfe.cxx
> 
> I guess this can be easily added by defining a MIDAS_SOURCES in MidasConfig.cmake, so 
> that I can do things like:
> 
> add_executable(my_fe
>   myfe.cxx
>   $(MIDAS_SOURCES}/src/mfe.cxx
>   ${MIDAS_SOURCES}/drivers/class/hv.cxx
>   ...)

1) remove $(MIDAS_SOURCES}/src/mfe.cxx from "add_executable", add "mfe" to 
target_link_libraries() as in examples/experiment/frontend:

add_executable(frontend frontend.cxx)
target_link_libraries(frontend mfe midas)

2) ${MIDAS_SOURCES}/drivers/class/hv.cxx surely is ${MIDASSYS}/drivers/...

If MIDAS is built with non-default CMAKE_INSTALL_PREFIX, "drivers" and co are not 
available, as we do not "install" them. Where MIDASSYS should point in this case is
anybody's guess. To run MIDAS, $MIDASSYS/resources is needed, but we do not install
them either, so they are not available under CMAKE_INSTALL_PREFIX and setting
MIDASSYS to same place as CMAKE_INSTALL_PREFIX would not work.

I still think this whole business of installing into non-default CMAKE_INSTALL_PREFIX
location has not been thought through well enough. Too much thinking about how cmake works
and not enough thinking about how MIDAS works and how MIDAS is used. Good example
of "my tool is a hammer, everything else must have the shape of a nail".

K.O.
Entry  09 Jul 2021, Konstantin Olchanski, Info, cannot push to bitbucket 
the day has arrived when I cannot git push to bitbucket. cloud computing rules!

I have never seen this error before and I do not think we have any hooks installed,
so it must be some bitbucket stuff. their status page says some kind of maintenance
is happening, but the promised error message is "repository is read only" or something 
similar.

I hope this clears out automatically. I am updating all the cmake crud and I have no idea 
which changes I already pushed and which I did not, so no idea if anything will work for 
people who pull from midas until this problem is cleared out.

daq00:mvodb$ git push
X11 forwarding request failed on channel 0
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 12 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 247 bytes | 247.00 KiB/s, done.
Total 2 (delta 1), reused 0 (delta 0)
remote: null value in column "attempts" violates not-null constraint
remote: DETAIL:  Failing row contains (13586899, 2021-07-10 01:13:28.812076+00, 1970-01-01 
00:00:00+00, 1970-01-01 00:00:00+00, 65975727, null).
To bitbucket.org:tmidas/mvodb.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'git@bitbucket.org:tmidas/mvodb.git'
daq00:mvodb$

K.O.
Entry  08 Jul 2021, Francesco Renga, Forum, Problem with python file reader 
Dear experts,
       while trying to readout a MIDAS file from a python script. I get the error below at the very first event. Any hint?

Thank you very much,
            Francesco

  File "/home/cygno/DAQ/offline/file_reader.py", line 9, in <module>
    for event in mfile:
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 159, in __next__
    ev = self.read_next_event()
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 264, in read_next_event
    return self.read_this_event_body()
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 307, in read_this_event_body
    self.event.unpack_body(body_data, 0, self.use_numpy)
  File "/home/cygno/DAQ/python/midas/event.py", line 648, in unpack_body
    bank.fill_header_from_bytes(bank_header_data, self.is_bank_32(), self.is_bank_data_64bit_aligned())
  File "/home/cygno/DAQ/python/midas/event.py", line 298, in fill_header_from_bytes
    self.name = "".join(x.decode('ascii') for x in unpacked[:4])
  File "/home/cygno/DAQ/python/midas/event.py", line 298, in <genexpr>
    self.name = "".join(x.decode('ascii') for x in unpacked[:4])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 0: ordinal not in range(128)
    Reply  09 Jul 2021, Ben Smith, Forum, Problem with python file reader 
Hi Francesco,

Can you send me an example file to look at please? Either attached to the elog or sent directly to bsmith@triumf.ca

Thanks,
Ben
Entry  29 Jun 2021, Lukas Gerritzen, Bug Report, modbcheckbox behaves erroneous with UINT32 variables 
For boolean and INT32 variables, modbcheckbox works as expected. You click, it 
sets the variable to true or 1, the checkbox stays checked until you click again 
and it's being set back to 0.

For UINT32 variables, you can turn the variable "on", but the checkbox visually 
becomes unchecked immediately. Clicking again does not set the variable to 
0/false and the tick visually appears for a fraction of a second, but vanishes 
again.
    Reply  30 Jun 2021, Stefan Ritt, Bug Report, modbcheckbox behaves erroneous with UINT32 variables 
> For boolean and INT32 variables, modbcheckbox works as expected. You click, it 
> sets the variable to true or 1, the checkbox stays checked until you click again 
> and it's being set back to 0.
> 
> For UINT32 variables, you can turn the variable "on", but the checkbox visually 
> becomes unchecked immediately. Clicking again does not set the variable to 
> 0/false and the tick visually appears for a fraction of a second, but vanishes 
> again.

Thanks for reporting that bug. Fixed in

https://bitbucket.org/tmidas/midas/commits/4ef26bdc5a32716efe8e8f0e9ce328bafad6a7bf

Stefan
       Reply  30 Jun 2021, Lukas Gerritzen, Bug Report, modbcheckbox behaves erroneous with UINT32 variables 
Thanks for the quick fix.
Entry  28 Jun 2021, Marco Francesconi, Suggestion, ODB Load in Sequencer 
Hi all,
for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
Let me know if you think this is a good approach.

Marco F
    Reply  28 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
> Hi all,
> for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> Let me know if you think this is a good approach.
> 
> Marco F

How can people judge your modification if they cannot see it? Why don't you make a pull request, so it can be properly reviewed.

Stefan
    Reply  28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer 
> Hi all,
> for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> Let me know if you think this is a good approach.
> 

Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
should be named "ODBLoadJSON" to be clear about this.

(JSON is preferred over .odb and .xml for many reasons (ask me))

K.O.
       Reply  28 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
> > Hi all,
> > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > Let me know if you think this is a good approach.
> > 
> 
> Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> should be named "ODBLoadJSON" to be clear about this.
> 
> (JSON is preferred over .odb and .xml for many reasons (ask me))

What if some experiment keep some files in .xml format (ask me!). The routine should check for the extension and support all three formats.

Stefan
          Reply  28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer 
> > > Hi all,
> > > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > > Let me know if you think this is a good approach.
> > > 
> > 
> > Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> > should be named "ODBLoadJSON" to be clear about this.
> > 
> > (JSON is preferred over .odb and .xml for many reasons (ask me))
> 
> What if some experiment keep some files in .xml format (ask me!). The routine should check for the extension and support all three formats.
> 

Yes, hard to tell without seeing his full proposal, including the code. If it is load from file,
sure we look at the file extension, I think the existing code already would do this and support all 3 formats.

But if he wants to load ODB data from a text literal or from a string,
we might as well stick to json. I guess we could support the other formats, but I do not see anybody
using anything other than json for new code like this.

ODBPasteJSON("/foo/bar/baz", '{"var1":1, "var2":"somestr"}');

K.O.
             Reply  28 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
> > > > Hi all,
> > > > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > > > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > > > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > > > Let me know if you think this is a good approach.
> > > > 
> > > 
> > > Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> > > should be named "ODBLoadJSON" to be clear about this.
> > > 
> > > (JSON is preferred over .odb and .xml for many reasons (ask me))
> > 
> > What if some experiment keep some files in .xml format (ask me!). The routine should check for the extension and support all three formats.
> > 
> 
> Yes, hard to tell without seeing his full proposal, including the code. If it is load from file,
> sure we look at the file extension, I think the existing code already would do this and support all 3 formats.
> 
> But if he wants to load ODB data from a text literal or from a string,
> we might as well stick to json. I guess we could support the other formats, but I do not see anybody
> using anything other than json for new code like this.
> 
> ODBPasteJSON("/foo/bar/baz", '{"var1":1, "var2":"somestr"}');

I agree that if one would paste a string to the ODB, then JSON would be best.

But at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.

Stefan
                Reply  28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer 
> ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.

same here, lots of historical .odb and .xml files.

I think the .odb and .xml support is here to stay. Best I remember, latest things I fixed in both
was support for unlimited string length (and removal of associated buffer overflows). Right now,
I am not sure if both are UTF-8 clean and if they properly escape all control characters,
something to fix as we go or as we bump into problems.

K.O.
                   Reply  28 Jun 2021, Marco Francesconi, Suggestion, ODB Load in Sequencer 
My idea was to collect some feedback instead of blindly submitting code for a pull request.

Currently I'm just calling db_load() with a given file, so it is only supporting .odb formatting.
It is pretty easy to extend to json by calling the db_load_json() depending on the file extension.
I do not see a similar call for the .xml format, maybe I can study tomorrow how it is implemented in odbedit and port it to the sequencer.

I guess that the ODBPasteJSON can be a solution as well but I find it a bit too technical.
Anyway it is easy to implement just by calling db_paste_json(), I will keep this in mind.

I'll try to sort this out and make a commit soon.
Best,

Marco



> > ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.
> 
> same here, lots of historical .odb and .xml files.
> 
> I think the .odb and .xml support is here to stay. Best I remember, latest things I fixed in both
> was support for unlimited string length (and removal of associated buffer overflows). Right now,
> I am not sure if both are UTF-8 clean and if they properly escape all control characters,
> something to fix as we go or as we bump into problems.
> 
> K.O.
                      Reply  29 Jun 2021, Marco Francesconi, Suggestion, ODB Load in Sequencer 
I just submitted a pull request for this feature, I did quite a lot of testing and it looks good to me.
Let me know if something is not clear.

I'll take care of adding the relevant informations to the wiki once it is merged.
Best,

Marco


> My idea was to collect some feedback instead of blindly submitting code for a pull request.
> 
> Currently I'm just calling db_load() with a given file, so it is only supporting .odb formatting.
> It is pretty easy to extend to json by calling the db_load_json() depending on the file extension.
> I do not see a similar call for the .xml format, maybe I can study tomorrow how it is implemented in odbedit and port it to the sequencer.
> 
> I guess that the ODBPasteJSON can be a solution as well but I find it a bit too technical.
> Anyway it is easy to implement just by calling db_paste_json(), I will keep this in mind.
> 
> I'll try to sort this out and make a commit soon.
> Best,
> 
> Marco
> 
> 
> 
> > > ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.
> > 
> > same here, lots of historical .odb and .xml files.
> > 
> > I think the .odb and .xml support is here to stay. Best I remember, latest things I fixed in both
> > was support for unlimited string length (and removal of associated buffer overflows). Right now,
> > I am not sure if both are UTF-8 clean and if they properly escape all control characters,
> > something to fix as we go or as we bump into problems.
> > 
> > K.O.
                         Reply  30 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer 
I quickly checked the pull request and could not find any obvious problem, so I merged it.
Entry  18 Jun 2021, Konstantin Olchanski, Bug Report, my html modbvalue thing is not working? 
I have a web page and I try to use modbvalue, but nothing happens. The best I can tell, I follow the documentation 
(https://midas.triumf.ca/MidasWiki/index.php/Custom_Page#modbvalue).

<td id=setv0><div class="modbvalue" data-odb-path="/Equipment/CAEN_hvps01/Settings/VSET[0]" data-odb-editable="1">(ch0)</div></td>

I suppose I could add debug logging to the javascript framework for modbvalue to find out why it is not seeing
or how it is not liking my web page.

But how would a non-expert user (or an expert user in a hurry) would debug this?

Should the modbvalue framework log more error messages to the javascrpt console ("I am ignoring your modbvalue entry because...")?

Should it have a debug mode where it reports to the javascript console all the tags it scanned, all the tags it found, etc
to give me some clue why it does not find my modbvalue tag?

Right now I am not even sure if this framework is activated, perhaps I did something wrong in how I load the page
and the modbvalue framework is not loaded. The documentation gives some magic incantations but does not explain
where and how this framework is loaded and activated. (But I do not see any differences between my page and
the example in the documentation. Except that I do not load control.js, I do not need all the thermometer bars, etc.
If I do load it, still my modbvalue does not work).

K.O.
    Reply  25 Jun 2021, Stefan Ritt, Bug Report, my html modbvalue thing is not working? 
Can you post your complete page here so that I can have a look?

Stefan
Entry  21 Jun 2021, Lars Martin, Bug Report, ELog documentation inconsistency 
The documentation fro the Elog ODB tree here:
https://midas.triumf.ca/MidasWiki/index.php//Elog_ODB_tree#Url

says:

The Built-in elog will ignore this key.

If using an Built-in Elog, this key must NOT be present.

I assume this is an artifact from amending the documentation, but it's unclear if 
the key has to be removed or not. I.e. if the key exists and is empty, will the 
built-in elog work? In what way will it break?
Entry  17 Jun 2021, Joseph McKenna, Info, Add support for rtsp camera streams in mlogger (history_image.cxx) unnamed.png
mlogger (history_image) now supports rtsp cameras, in ALPHA we have 
acquisitioned several new network connected cameras. Unfortunately they dont 
have a way of just capturing a single frame using libcurl


========================================
Motivation to link to OpenCV libraries
========================================

After looking at the ffmpeg libraries, it seemed non trivial to use them to 
listen to a rtsp stream and write a series of jpgs.

OpenCV became an obvious choice (it is itself linked to ffmpeg and 
gstreamer), its a popular, multiplatform, open source library that's easy to 
use. It is available in the default package managers in centos 7 and ubuntu 
(an is installed by default on lxplus).

========================================
How it works:
========================================

The framework laid out in history_image.cxx is great. A separate thread is 
dedicated for each camera. This is continued with the rtsp support, using 
the same periodicity:

if (ss_time() >= o["Last fetch"] + o["Period"]) {
An rtsp camera is detected by its URL, if the URL starts with ‘rtsp://’ its 
obvious its using the rtsp protocol and the cv::VideoCapture object is 
created (line 147).

If the connection fails, it will continue to retry, but only send an error 
message on the first 10 attempts (line 150). This counter is reset on 
successful connection
If MIDAS has been built without OpenCV, mlogger will send an error message 
that OpenCV is required if a rtsp URL is given (line 166)
The VideoCapture ‘stays live' and will grab frames from the camera based on 
the sleep, saving to file based on the Period set in the ODB.

If the VideoCapture object is unable to grab a frame, it will release() the 
camera, send an error message to MIDAS, then destroy itself, and create a 
new version (this destroy and create fully resets the connection to a 
camera, required if its on flaky wifi)
If the VideoCapture gets an empty frame, it also follows the same reset 
steps.
If the VideoCaption fills a cv::Frame object successfully, the image is 
saved to disk in the same way as the curl tools.

========================================
Concerns for the future:
========================================

VideoCapture is decoding the video stream in the background, allowing us to 
grab frames at will. This is nice as we can be pretty agnostic to the video 
format in the stream (I tested with h264 from a TP-LINK TAPO C100, but the 
CPU usage is not negligible.

I noticed that this used ~2% of the CPU time on an intel i7-4770 CPU, given 
enough cameras this is considerable. In ALPHA, I have been testing with 10 
cameras:
elog:2220/1 

My suggestion / request would be to move the camera management out of 
mlogger and into a new program (mcamera?), so that users can choose to off 
load the CPU load to another system (I understand the OpenCV will use GPU 
decoders if available also, which can also lighten the CPU load).
    Reply  18 Jun 2021, Konstantin Olchanski, Info, Add support for rtsp camera streams in mlogger (history_image.cxx) 
> mlogger (history_image) now supports rtsp cameras

my goodness, we will drive the video surveillance industry out of business.

> My suggestion / request would be to move the camera management out of 
> mlogger and into a new program (mcamera?), so that users can choose to off 
> load the CPU load to another system (I understand the OpenCV will use GPU 
> decoders if available also, which can also lighten the CPU load).

every 2 years I itch to separate mlogger into two parts - data logger
and history logger.

but then I remember that the "I" in MIDAS stands for "integrated",
and "M" stands for "maximum" and I say, "nah..."

(I guess we are not maximum integrated enough to have mhttpd, mserver
and mlogger to be one monolithic executable).

There is also a line of thinking that mlogger should remain single-threaded
for maximum reliability and ease of debugging. So if we keep adding multithreaded
stuff to it, perhaps it should be split-apart after all. (anything that makes
the size of mlogger.cxx smaller is a good thing, imo).

K.O.
Entry  15 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved! 
I am sure everybody else has 10gige and 40gige networks and are sending terabytes of data before breakfast.

Myself, I only have one computer with a 10gige network link and sufficient number of daq boards to fill
it with data. Here is my success story of getting all this data through MIDAS.

This is the anti-matter experiment ALPHA-g now under final assembly at CERN. The main particle detector is a long but 
thin cylindrical TPC. It surrounds the magnetic bottle (particle trap) where we make and study anti-hydrogen. There are 
64 daq boards to read the TPC cathode pads and 8 daq boards to read the anode wires and to form the trigger. Each daq 
board can produce data at 80-90 Mbytes/sec (1gige links). Data is sent as UDP packets (no jumbo frames). Altera FPGA 
firmware was done here at TRIUMF by Bryerton Shaw, Chris Pearson, Yair Lynn and myself.

Network interconnect is a 96-port Juniper switch with a 10gige uplink to the main daq computer (quad core Intel(R) 
Xeon(R) CPU E3-1245 v6 @ 3.70GHz, 64 GBytes of DDR4 memory).

MIDAS data path is: UDP packet receiver frontend -> event builder -> mlogger -> disk -> lazylogger -> CERN EOS cloud 
storage.

First chore was to get all the UDP packets into the main computer. "U" in UDP stands for "unreliable", and at first, UDP 
packets have been disappearing pretty much anywhere they could. To fix this, in order:

- reading from the udp socket must be done in a dedicated thread (in the midas context, pauses to write statistics or 
check alarms result in lost udp packets)
- udp socket buffer has to be very big
- maximum queue sizes must be enabled in the 10gige NIC
- ethernet flow control must be enabled on the 10gige link
- ethernet flow control must be enabled in the switch (to my surprise many switches do not have working end-to-end 
ethernet flow control and lose UDP packets, ask me about this. our big juniper switch balked at first, but I got it 
working eventually).
- ethernet flow control must be enabled on the 1gige links to each daq module
- ethernet flow control must be enabled in the FPGA firmware (it's a checkbox in qsys)
- FPGA firmware internally must have working back pressure and flow control (avalon and axi buses)
- ideally, this back-pressure should feed back to the trigger. ALPHA-g does not have this (it does not need it).

Next chore was to multithread the UDP receiver frontend and to multithread the event builder. Stock single-threaded 
programs quickly max out with 100% CPU use and reach nowhere near 10gige data speeds.

Naive multithreading, with two threads, reader (read UDP packet, lock a mutex, put it into a deque, unlock, repeat) and 
sender (lock a mutex, get a packet from deque, unlock, bm_send_event(), repeat) spends all it's time locking and 
unlocking the mutex and goes nowhere fast (with 1500 byte packets, about 600 kHz of lock/unlock at 10gige speed).

So one has to do everything in batches: reader thread: accumulate 1000 udp packets in an std::vector, lock the mutex, 
dump this batch into a deque, unlock, repeat; sender thread: lock mutex, get 1000 packets from the deque, unlock, stuff 
the 1000 packets into 1 midas event, bm_send_event(), repeat.

It takes me 5 of these multithreaded udp reader frontends to keep up with a 10gige link without dropping any UDP packets. 
My first implementation chewed up 500% CPU, that's all of it, there is only 4 CPU cores available, leaving nothing
for the event builder (and mlogger, and ...)

I had to:
a) switch from plain socket read() to socket recvmmsg() - 100000 udp packets per syscall vs 1 packet per syscall, and
b) switch from plain bm_send_event() to bm_send_event_sg() - using a scatter-gather list to avoid a memcpy() of each udp 
packet into one big midas event.

Next is the event builder.

The event builder needs to read data from the 5 midas event buffers (one buffer per udp reader frontend, each midas event 
contains 1000 udp packets as indovidual data banks), examine trigger timestamps inside each udp packet, collect udp 
packets with matching timestamps into a physics event, bm_send_event() it to the SYSTEM buffer. rinse and repeat.

Initial single threaded implementation maxed out at about 100-200 Mbytes/sec with 100% busy CPU.

After trying several threading schemes, the final implementation has these threads:
- 5 threads to read the 5 event buffers, these threads also examine the udp packets, extract timestamps, etc
- 1 thread to sort udp packets by timestamp and to collect them into physics events
- 1 thread to bm_send_event() physics events to the SYSTEM buffer
- main thread and rpc handler thread (tmfe frontend)

(Again, to reduce lock contention, all data is passed between threads in large batches)

This got me up to about 800 Mbytes/sec. To get more, I had to switch the event builder from old plain bm_send_event() to 
the scatter-gather bm_send_event_sg(), *and* I had to reduce CPU use by other programs, see steps (a) and (b) above.

So, at the end, success, full 10gige data rate from daq boards to the MIDAS SYSTEM buffer.

(But wait, what about the mlogger? In this experiment, we do not have a disk storage array to sink this
much data. But it is an already-solved problem. On the data storage machines I built for GRIFFIN - 8 SATA NAS HDDs using 
raidz2 ZFS - the stock MIDAS mlogger can easily sink 1000 Mbytes/sec from SYSTEM buffer to disk).

Lessons learned:

- do not use UDP. dealing with packet loss will cost you a fortune in headache medicines and hair restorations.
- use jumbo frames. difference in per-packet overhead between 1500 byte and 9000 byte packets is almost a factor of 10.
- everything has to be done in bulk to reduce per-packet overheads. recvmmsg(), batched queue push/pop, etc
- avoid memory allocations (I has a per-packet std::string, replaced it with char[5])
- avoid memcpy(), use writev(), bm_send_event_sg() & co

K.O.

P.S. Let's counting the number of data copies in this system:

x udp reader frontend:
- ethernet NIC DMA into linux network buffers
- recvmmsg() memcpy() from linux network buffer to my memory
- bm_send_event_sg() memcpy() from my memory to the MIDAS shared memory event buffer

x event builder:
- bm_receive_event() memcpy() from MIDAS shared memory event buffer to my event buffer
- my memcpy() from my event buffer to my per-udp-packet buffers
- bm_send_event_sg() memcpy() from my per-udp-packet buffers to the MIDAS shared memory event buffer (SYSTEM)

x mlogger:
- bm_receive_event() memcpy() from MIDAS SYSTEM buffer
- memcpy() in the LZ4 data compressor
- write() syscall memcpy() to linux system disk buffer
- SATA interface DMA from linux system disk buffer to disk.

Would a monolithic massively multithreaded daq application be more efficient?
("udp receiver + event builder + logger"). Yes, about 4 memcpy() out of about 10 will go away.

Would I be able to write such a monolithic daq application?

I think not. Already, at 10gige data rates, for all practical purposes, it is impossible
to debug most problems, especially subtle trouble in multithreading (race conditions)
and in memory allocations. At best, I can sprinkle assert()s and look at core dumps.

So the good old divide-and-conquer approach is still required, MIDAS still rules.

K.O.
    Reply  15 Jun 2021, Stefan Ritt, Info, 1000 Mbytes/sec through midas achieved! frontend.cxx
In MEG II we also kind of achieved this rate. Marco F. will post an entry soon to describe the details. There is only one thing 
I want to mention, which is our network switch. Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade 
switch. We have "rack switches", which are collector switch for each rack receiving up to 10 x 1GBit inputs, and outputting 1 x 
10 GBit to an "aggregation switch", which collects all 10 GBit lines form rack switches and forwards it with (currently a single 
) 10 GBit line. For the rack switch we use a 

MikroTik CRS354-48G-4S+2Q+RM 54 port

and for the aggregation switch

MikroTik CRS326-24S-2Q+RM 26 Port

both cost in the order of 500 US$. We were astonished that they don't loose UDP packets when all inputs send a packet at the 
same time, and they have to pipe them to the single output one after the other, but apparently the switch have enough buffers 
(which is usually NOT written in the data sheets). 

To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is 
completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the 
details.

Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your 
bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend 
attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk. I got

Event size: 7 MB

No logging: 900 events/s = 6.7 GBytes/s

Logging with LZ4 compression: 155 events/s = 1.2 GBytes/s

Logging without compression: 170 events/s = 1.3 GBytes/s

So with this simple approach I got already more than 1 GByte of "dummy data" through midas, indicating that the buffer 
management is not so bad. I did use the plain mfe.c frontend framework, no bm_send_event_sg() (but mfe.c uses rpc_send_event() which is an 
optimized version of bm_send_event()).

Best,
Stefan
       Reply  16 Jun 2021, Marco Francesconi, Info, 1000 Mbytes/sec through midas achieved! 
As reported by Stefan, in MEG II we have very similar ethernet throughputs.
In total, we have 34 crates each with 32 DRS4 digitiser chips and a single 1 Gbps readout link through a Xilinx Zynq SoC.
The data arrives in push mode without any external intervention, the only throttling being an optional prescaling on the trigger rate.
We discovered the hard way that 1 Gbps throughput on Zynq is not trivial at all: the embedded ethernet MAC does not support jumbo frames (always read the fine prints in the manuals!) and the embedded Linux ethernet stack seems to struggle when we go beyond 250 Mbps of UDP traffic.

Anyhow, even with the reduced speed, the maximum throughput at network input is around 8.5 Gbps which passes through the Mikrotik switches mentioned by Stefan.
We had very bad experiences in the past with similar price-point switches, observing huge packet drops when the instantaneous switching capacity cannot cope with the traffic, but so far we are happy with the Mikrotik ones.

On the receiver side, we have the DAQ server with an Intel E5-2630 v4 CPU and a 10 Gbit connection to the network using an Intel X710 Network card.
In the past, we used also a "cheap" 10 Gbit card from Tehuti but the driver performance was so bad that it could not digest more than 5 Gbps of data.

The current frontend is based on the mfe.c scheme for historical reasons (the very first version dates back to 2015).
We opted for a monolithic multithread solution so we can reuse the underlying DAQ code for other experiments which may not have the complete Midas backend.
Just to mention them: one is the FOOT experiment (which afaik uses an adapted version of Altas DAQ) and the other is the LOLX experiment (for which we are going to ship to Canada soon a small 32 channel system using Midas).
A major modification to Konstantin scheme is that we need to calibrate all WFMs online so that a software zero suppression can be applied to reduce the final data size (that part is still to be implemented).
This requirement results in additional resource usage to parse the UDP content into floats and calibrate them.
Currently, we have 7 packet collector threads to digest the full packet flow (using recvmmsg), followed by an event building stage that uses 4 threads and 3 other threads for WFM calibration.
We have progressive packet numbers on each packet generated by the hardware and a set of flags marking the start and end of the event; combining the packet number difference between the start and end of the event and the total received packets for that event it is really easy to understand if packet drops are happening.

All the thread infrastructure was tested and we could digest the complete throughput, we still have to finalise the full 10 Gbit connection to Midas because the final system has been installed only recently (April).
We are using EQ_USER flag to push events into mfe.c buffers with up to 4 threads, but I was observing that above ~1.5 Gbps the rb_get_wp() returns almost always DB_TIMEOUT and I'm forced to drop the event.
This conflicts with the measurements reported by Stefan (we were discussing this yesterday), so we are still investigating the possible cause.

It is difficult to report three years of development in a single Elog, I hope I put all the relevant point here.
It looks to me that we opted for very complementary approaches for high throughput ethernet with Midas, and I think there are still a lot of details that could be worth reporting.
In case someone organises some kind of "virtual workshop" on this, I'm willing to participate.
Best,

Marco


> In MEG II we also kind of achieved this rate. Marco F. will post an entry soon to describe the details. There is only one thing 
> I want to mention, which is our network switch. Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade 
> switch. We have "rack switches", which are collector switch for each rack receiving up to 10 x 1GBit inputs, and outputting 1 x 
> 10 GBit to an "aggregation switch", which collects all 10 GBit lines form rack switches and forwards it with (currently a single 
> ) 10 GBit line. For the rack switch we use a 
> 
> MikroTik CRS354-48G-4S+2Q+RM 54 port
> 
> and for the aggregation switch
> 
> MikroTik CRS326-24S-2Q+RM 26 Port
> 
> both cost in the order of 500 US$. We were astonished that they don't loose UDP packets when all inputs send a packet at the 
> same time, and they have to pipe them to the single output one after the other, but apparently the switch have enough buffers 
> (which is usually NOT written in the data sheets). 
> 
> To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is 
> completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the 
> details.
> 
> Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your 
> bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend 
> attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk. I got
> 
> Event size: 7 MB
> 
> No logging: 900 events/s = 6.7 GBytes/s
> 
> Logging with LZ4 compression: 155 events/s = 1.2 GBytes/s
> 
> Logging without compression: 170 events/s = 1.3 GBytes/s
> 
> So with this simple approach I got already more than 1 GByte of "dummy data" through midas, indicating that the buffer 
> management is not so bad. I did use the plain mfe.c frontend framework, no bm_send_event_sg() (but mfe.c uses rpc_send_event() which is an 
> optimized version of bm_send_event()).
> 
> Best,
> Stefan
          Reply  18 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved! 
> ... MEG II ... 34 crates each with 32 DRS4 digitiser chips and a single 1 Gbps readout link through a Xilinx Zynq SoC.
>
> Zynq ... embedded ethernet MAC does not support jumbo frames (always read the fine prints in the manuals!)
> and the embedded Linux ethernet stack seems to struggle when we go beyond 250 Mbps of UDP traffic.

that's an ouch. we use the altera ethernet mac, and jumbo frames are supported, but the firmware data path
was originally written assuming 1500-byte packets and it is too much work to rewrite it for jumbo frames.

we send the data directly from the FPGA fabric to the ethernet, there is an avalon/axi bus multiplexer
to split the ethernet packets to the NIOS slow control CPU. not sure if such scheme is possible
for SoC FPGAs with embedded ARM CPUs.

and yes, a 1 GHz ARM CPU will not do 10gige. You see it yourself, measure your memcpy() speed. Where
typical PC will have dual-channel 128-bit wide memory (and the famous for it's low latency
Intel memory controller), ARM SoC will have at best 64-bit wide memory (some boards are only 32-bit wide!),
with DDR3 (not DDR4) severely under-clocked (i.e. DDR3-900, etc). This is why the new Apple ARM chips
are so interesting - can Apple ARM memory controller beat the Intel x86 memory controller?

> On the receiver side, we have the DAQ server with an Intel E5-2630 v4 CPU

that's the right gear for the job. quad-channel memory with nominal "Max Memory Bandwidth 68.3 GB/s",
10 CPU cores. My benchmark of memcpy() for the much older duad-channel memory i7-4820 with DDR3-1600 DIMMs
is 20 Gbytes/sec. waiting for ARM CPU with similar specs.

> and a 10 Gbit connection to the network using an Intel X710 Network card.
> In the past, we used also a "cheap" 10 Gbit card from Tehuti but the driver performance was so bad that it could not digest more than 5 Gbps of data.

yup, same here. use Intel ethernet exclusively, even for 1gige links.

> A major modification to Konstantin scheme is that we need to calibrate all WFMs online so that a software zero suppression

I implemented hardware zero suppression in the FPGA code. I think 1 GHz ARM CPU does not have the oomph for this.

> rb_get_wp() returns almost always DB_TIMEOUT

replace rb_xxx() with std::deque<std::vector<char>> (protected by a mutex, of course). lots of stuff in the mfe.c frontend
is obsolete in the same way. check out the newer tmfe frontends (tmfe.md, tmfe.h and tmfe examples).

> It is difficult to report three years of development in a single Elog

but quite successful at it. big thanks for your write-up. I think our info is quite useful for the next people.

K.O.
       Reply  18 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved! 
> In MEG II we also kind of achieved this rate.
>
> Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade switch.

Right. We built this DAQ system about 3 years ago and the cheep Chineese switches arrived
on the market about 1 year after we purchased the big 96 port juniper switch. Bad timing/good timing.

Actually I have a very nice 24-port 1gige switch ($2000 about 3 years ago), I could have
used 4 of them in parallel, but they were discontinued and replaced with a $5000 switch
(+$3000 for a 10gige uplink. I think I got the last very last one cheap switch).

But not all Chineese switches are equal. We have an Ubiquity 10gige switch, and it does
not have working end-to-end ethernet flow control. (yikes!).

BTW, for this project we could not use just any cheap switch, we must have 64 fiber SFP ports
for connecting on-TPC electronics. This narrows the market significantly and it does
not match the industry standard port counts 8-16-24-48-96.

> MikroTik CRS354-48G-4S+2Q+RM 54 port
> MikroTik CRS326-24S-2Q+RM 26 Port

We have a hard time buying this stuff in Vancouver BC, Canada. Most of our regular suppliers
are US based and there is a technology trade war still going on between the US and China.
I guess we could buy direct on alibaba, but for the risk of scammers, scalpers and iffy shipping.

> both cost in the order of 500 US$

tell one how much we overpay for US based stuff. not surprising, with how Cisco & co can afford
to buy sports arenas, etc.

> We were astonished that they don't loose UDP packets when all inputs send a packet at the 
> same time, and they have to pipe them to the single output one after the other,
> but apparently the switch have enough buffers.

You probably see ethernet flow control in action. Look at the counters for ethernet pause frames
in your daq boards and in your main computer.

> (which is usually NOT written in the data sheets).

True, when I looked into this, I found a paper by somebody in Berkley for special
technique to measure the size of such buffers.

(The big Juniper switch has only 8 Mbytes of buffer. The current wisdom for backbone networks
is to have as little buffering as possible).

> To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is 
> completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the 
> details.

We do not do this. (very bad!). When each trigger arrives, all 64+8 DAQ boards send a train of UDP packets
at maximum line speed (64+8 at 1 gige) all funneled into one 10 gige ((64+8)/10 oversubscription).

Before we got ethernet flow control to work properly, we had to throttle all the 1gige links by about 60%
to get any complete events at all. This would not have been acceptable for physics data taking.

> Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your 
> bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend 
> attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk.

Dummy frontend is not very representative, because limitation is the memory bandwidth
and CPU load, and a real ethernet receiver has quite a bit of both (interrupt processing,
DMA into memory, implicit memcpy() inside the socket read()).

For example, typical memcpy() speeds are between 22 and 10 Gbytes/sec for current
generation CPUs and DRAM. This translates for a total budget of 22 and 10 memcpy()
at 10gige speeds. Subtract from this 1 memcpy() to DMA data from ethernet into memory
and 1 memcpy() to DMA data from memory to storage. Subtract from this 2 implicit
memcpy() for read() in the frontend and write() in mlogger. (the Linux sendfile() syscall
was invented to cut them out). Subtract from this 1 memcpy() for instruction and incidental
data fetch (no interesting program fits into cache). Subtract from this memory bandwidth
for running the rest of linux (systemd, ssh, cron jobs, NFS, etc). Hardly anything
left when all is said and done. (Found it, the alphagdaq memcpy() runs at 14 Gbytes/sec,
so total budget of 14 memcpy() at 10gige speeds).

And the event builder eats up 2 CPU cores to process the UDP packets at 10gige rate,
there is quite a bit of CPU-expensive data unpacking, inspection and processing
going on that cannot be cut out. (alphagdaq has 4 cores, "8 threads").

K.O.

P.S. Waiting for rack-mounted machines with AMD "X" series processors... K.O.
Entry  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
There have been times in ALPHA that an alarm is triggered and the shift crew 
are unclear who to contact if they aren't trained to fix the specific 
failure mode.

I wish to add the property 'Users responsible' to the ODB for Alarms and 
Programs.

I have drafted what this might look like in a new pull request:
https://bitbucket.org/tmidas/midas/pull-requests/22/add-users-responsible-
field-for-specific

It requires changing of several data structures, I think I have found all 
instances of the definitions so the ODB should 'repair' any of the old 
structures adding in users responsible.

If 'Users responsible' is set, MIDAS messages append them after the message 
in brackets '()'. If used in conjunction with the MIDAS messenger 
(mmessenger), the users responsible can be 'tagged' directly.

I.e, for slack, simply set the 'users responsible' to <@UserID|Nickname>, 
for mattermost '@username', for discord '<@userid>'. Note that discord 
doesn't allow you to tag by username, but numeric userid



I have expanded char array in 'al_trigger_class' to handle the potentially 
longer MIDAS messages. Perhaps since I'm touching these lines I should 
change these temporary containers to std::string (line 383 and 386 of 
alarm.cxx)?

I have tested this quite a bit for my system, I am not sure how I can test 
mjsonrpc.
    Reply  28 May 2021, Stefan Ritt, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
I think this is a good idea and I support it. We have a similar problem in MEG and 
we solved that with external (bash) scripts called in case of alarms. One feature 
there we have is that for some alarms, several people want to get notified. So 
people can "subscribe" to certain alarms. The subscription are now handled inside 
Slack which I like better, but maybe it would be good to have more than one "user 
responsible". Like if one person is sleeping/traveling, it's good to have a 
substitute. Can you make an array out of that? Or a comma-separated list?

Best,
Stefan
       Reply  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> I think this is a good idea and I support it. We have a similar problem in MEG and 
> we solved that with external (bash) scripts called in case of alarms. One feature 
> there we have is that for some alarms, several people want to get notified. So 
> people can "subscribe" to certain alarms. The subscription are now handled inside 
> Slack which I like better, but maybe it would be good to have more than one "user 
> responsible". Like if one person is sleeping/traveling, it's good to have a 
> substitute. Can you make an array out of that? Or a comma-separated list?
> 
> Best,
> Stefan

Presently there are 256 characters in the 'users responsible' field, so you can just 
list many users (no space, space or comma whatever). Discord, slack and mattermost 
don't care, they just parse the user tags.

I can still make this an array and pass a std::vector<std::string> into 
al_trigger_class function?
          Reply  28 May 2021, Stefan Ritt, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> I can still make this an array and pass a std::vector<std::string> into 
> al_trigger_class function?

Maybe 256 chars are enough at the moment. If other people complain in the future, we can 
re-visit.

Stefan
             Reply  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> > I can still make this an array and pass a std::vector<std::string> into 
> > al_trigger_class function?
> 
> Maybe 256 chars are enough at the moment. If other people complain in the future, we can 
> re-visit.
> 
> Stefan

Thinking about it, an array of maybe 80 character would give enough space for a name, a tag 
and phone number. Do I need to budget memory very strictly? Would 32 entries of 80 
characters be too much? 
                Reply  28 May 2021, Stefan Ritt, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> > > I can still make this an array and pass a std::vector<std::string> into 
> > > al_trigger_class function?
> > 
> > Maybe 256 chars are enough at the moment. If other people complain in the future, we can 
> > re-visit.
> > 
> > Stefan
> 
> Thinking about it, an array of maybe 80 character would give enough space for a name, a tag 
> and phone number. Do I need to budget memory very strictly? Would 32 entries of 80 
> characters be too much? 

On that level memory is cheap.

Stefan
                   Reply  28 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 

I've updated the branch / pull request to use an array of 10 entries (80 chars each). 32 felt a 
little overkill when I saw it on screen, but absolutely happy to set it to any number you 
recommend.

The array gets flattened out when an alarm is triggered, currently the formatting produces

AlarmClass : AlarmMessage (Flattened List Of Users Responsible Array With Space Separators)

If experiments want to use Discord / Slack / Mattermost tags and or add phone numbers, that 
should fit in 80 characters
                      Reply  31 May 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
This list of responsible being attached to alarm message strings will be great for the 
mmessenger, however, perhaps its going to generate very long messages for the speaker programs 
(web interface and mlxspeaker ):

AlarmClass : AlarmMessage (ResponsibleUser1 ResponsibleUser2 ResponsibleUser3 ResponsibleUser4 
... ResponsibleUser4)

especially if people put in user tags or emergency contact details...

Should we add a key word or character for the programs that create audio to parse that silence 
the list of responsible users? I'd be tempted to use a single character but there is a risk 
users might have that in a custom alarm message. Maybe something usual like the 'bel' 
character? '|'? 

Perhaps use the string 'Responsible:' or 'Users:' to trim out the Users Responsible list from 
the message string?

AlarmClass : AlarmMessage Responsible:(ResponsibleUser1 ResponsibleUser2 ResponsibleUser3 
ResponsibleUser4 ... ResponsibleUser4)

AlarmClass : AlarmMessage Users:(ResponsibleUser1 ResponsibleUser2 ResponsibleUser3 
ResponsibleUser4 ... ResponsibleUser4)
                         Reply  02 Jun 2021, Konstantin Olchanski, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> This list of responsible being attached to alarm message strings ...

This is a great idea. But I think we do not need to artificially limit ourselves
to string and array lengths.

The code in alarm.c should be changes to use std::string and std::vector<std::string> (STRING_LIST 
#define), db_get_record() should be replaced with individual ODB reads (that's what it does behind 
the scenes, but in a non-type and -size safe way).

I think the web page code will work correctly, it does not care about string lengths.

K.O.
                            Reply  09 Jun 2021, Joseph McKenna, Suggestion, Have a list of 'users responsible' in Alarms and Programs odb entries 
> > This list of responsible being attached to alarm message strings ...
> 
> This is a great idea. But I think we do not need to artificially limit ourselves
> to string and array lengths.
> 
> The code in alarm.c should be changes to use std::string and std::vector<std::string> (STRING_LIST 
> #define), db_get_record() should be replaced with individual ODB reads (that's what it does behind 
> the scenes, but in a non-type and -size safe way).
> 
> I think the web page code will work correctly, it does not care about string lengths.
> 
> K.O.

Auto growing lists is an excellent plan. I am making decent progress and should have something to 
report soon
ELOG V3.1.4-2e1708b5