ELOG Midas

Back Midas Rome Roody Rootana

Midas DAQ System, Page 21 of 52

Not logged in

Find | Login | Help

New entries since:

Wed Dec 31 16:00:00 1969

Full | Summary | Threaded | Collapse | Expand

1021 Entries

Goto page Previous 1, 2, 3 ... 20, 21, 22 ... 50, 51, 52 Next

27 Sep 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use

I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser
"task manager" and by other tools.

For example, size of the "status" page was observe to reach 200, 600 and even 900 Mbytes. The "programs" page (which
does not have nearly as much stuff as the status page), was observed to reach 200-600 Mbytes. This is comparable to the
New York Times front page, which has much more stuff, but usually runs at about 200 Mbytes. (they do force a periodic full
page reload, to deal with exactly this same type of trouble, I suspect).

Also I observed the midas web pages consume an unusual amount of CPU - 5-10-15% - all in inactive tabs in minimized
windows.

All this was quite noticeable in my oldish mac laptop with only 8 GBytes of RAM.

Using the google-chrome performance analyzer I was able to identify the reason of high memory use - our 1/sec periodic
updates leak "too many" DOM "nodes" and I suspect that due to throttling of inactive tabs, the garbage collector simply
does not keep up with us.

(Note that javascript features automatic memory management with garbage collection. In practice in means that where in
C/C++ we have malloc() and free(), in javascript we only have malloc() and no free(), and cannot explicitly release memory
we know we no longer need. In the C/C++ sense, all memory allocations are leaked, and one relies on a janitor to "clean it all
up" eventually, later).

The source of node leakage was unexpected (unexpected to me). It turns out that each assignment to e.innerHTML creates
a new node, even if the new contents is the same as the old contents. (also the html parser has to run, consuming extra cpu
cycles).

Obvious solution is to write code like this:
if (v !== e.innerHTML) { e.innerHTML = v };

This helped quite a bit on the "programs" page, but not as much as expected, and hardly at all on the "status" page.

It turns out, that read of innerHTML does not necessarily return the same string as it was written into it.
For example, if "v" is "a&b", e.innerHTML will return "a&amp;b" and the comparison will misfire.
There is more cases like this, see the section "Test set and get e.innerHTML" on the "example" midas page.

To help dealing with this, I suggest that instead of "inline" comparison (as above), one writes this:
mhttpd_set_if_changed(e, v);

Then to check that the comparison is effective, go to mhttpd.js and uncomment the console.log() call in
mhttpd_set_if_changed(), reload the page and look at the javascript console to see all calls that result
in assignment of innerHTML (and leakage of DOM nodes).

This done, after replacing many "&" with "&amp;" and many "\'" with "\"", node leakage on the "programs" page was reduced
to 1 node per 1/sec update: the unavoidable change to the timestamp on the top-right of the page.

Luckily, Stefan pointed me to the solution for this: use of e.firstChild.data instead of e.innerHTML. The only quirk is that the
node should not be empty, which was easy to arrange by setting the initial value of the timestamp to a dummy value.

With these changes, the "programs" page (and most other pages) now leak 0 nodes (from the 1/sec periodic updates).

There is still some small memory leakage from making the RPC requests and from receiving the RPC replies, but the
garbage collector seems to have no trouble with them.

Typical memory use for all midas pages is now 50-60 Mbytes (down from 100-200 Mbytes).

The "status" page took a bit more work to fix due to it's curious coding, but it, too now uses 50-60 Mbytes as well. It still
leaks quite a few nodes (to be fixed!), but the garbage collector seems to keep up with the allocations.

K.O.

27 Sep 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use (alarm sound)

> I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser 
> "task manager" and by other tools.
> 
> For example, size of the "status" page was observe to reach 200, 600 and even 900 Mbytes.
> [this was fixed by using set_if_changed(e, v);
> 
> Also I observed the midas web pages consume an unusual amount of CPU - 5-10-15% - all in inactive tabs in minimized 
> windows.
> 

The case of high CPU use turned out to be quite nasty.

The symptoms:
- the "programs" page in an inactive tab in a minimized window sits "doing nothing" for a day or two.
- uses about 0 to 0.1 to 1% CPU and 40-50-60 Mbytes of RAM (after the previous improvements)
- suddenly I see it use 10-15-20% CPU, continuously, non stop
- I open this tab
- suddenly, CPU use goes to 100%, memory use quickly grows from 40-50-60 Mbytes to 100-200 Mbytes.
- after a few seconds everything settles down, CPU use is back to 0-0.1-1%, but memory use does not go down.
- WTH?!?

The culprit turned out to be the playing of the alarm sound. (I have all tabs "muted" by default, also speakers usually powered down).

If I comment-out the playing of the alarm sound, this problem goes away completely. Pretty conclusive, I think.

After adding lots of debug console.log() calls, I think I identified the problem: audio objects were being created,
but they were not starting to play their sound files. When I opened the tab, all of them (about 400) at the same time
loaded the mp3 file (resulting in memory use going from 50 Mbytes to 190 Mbytes, typical) and started playing
(as seen on the audio event activity in the cpu profile traces from the google-chrome "performance" tool).

I think I am looking at an unexpected interaction between audio objects and google-chrome throttling of inactive tabs.

To muddy the waters some more, google-chrome periodically fails audio.play() with an exception to the effect of
"we will not play audio because user is not interacting with this page enough". See
https://bitbucket.org/tmidas/midas/issues/191/exception-on-audioplay

Now I think I have this sort of fixed. I have to handle the audio.play() failure (which is not a normal exception,
but a rejected promise, the handler is quite different), and I do not allow creating new audio objects if previous
audio object did not finish playing.

(note the "normal" timing: periodic update every 1 sec, playing of alarm sound event 60 seconds, length of alarm sound file is 3 sec,
two sound files should never overlap. now a console.log message is printed if overlap is detected)

This leaves us with the problem of alarm sound not playing "because the user didn't interact with the document first",
and I think there is nothing I can do about that.

K.O.

P.S. Another quirk is I discovered: go to the "config" page and press the new buttons "play test sound" and "speak test message". In muted
tabs, the test sound will not sound, but the test message will be shouted out loudly. This seems inconsistent to me. Unwanted audio/video ads
are blocked but loud shouting of "shave with burma-shave" is permitted. I also wonder if speaking is subject to this
"user did not interact" business. If not, we could replace the playing of our relaxing alarm beep with the yelling of "alarm! alarm! alarm!".

K.O.

28 Nov 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use (alarm sound and fit_message)

> > I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser 
> > "task manager" and by other tools.

The final fix is in. (plus a fix from Stefan).

When the audio.play() promise is rejected, one must clear audio.src, otherwise the browser will continue
with loading the audio file (but will not play it at the end).

Normally, this should not be a problem, but in inactive tabs, all activity is throttled down, and it so happens
that these audio objects accumulate (they are in the state of "we are trying to load the sound file, but
browser slows us down so much!"), consume huge amounts of memory (page memory use goes from ~50 Mbytes
to ~100-200 Mbytes) and consume huge amounts of CPU (not clear how, probably it's the firing of "loading", "canplay", etc
event handlers).

It does not help that mhttpd_fit_message() had a performance bug and consumed large amounts of CPU causing even
more slowing down by the be browser.

After adding audio.src="", all this is gone. I see no special CPU use, and I do not see any strange large memory use.

I still sometimes see inactive tabs grow from ~50 Mbytes to amount ~100 Mbytes. After I open them (activate them),
they quickly shrink back to ~50 Mbytes. I conclude that the browser is slowing down the garbage collector in inactive
tabs so much that it does not keep up with our 1/sec data polling.

So Stefan's fix to reduce polling from 1/sec to 1/10sec should help with this, too. (plus reduction of CPU use by fit_message() should
leave more time for the garbage collector to run).

P.S. General rules for browser slow down of inactive tabs seem to be written here:
https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API

Slowdown of timers is written here:
https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/setTimeout#Notes

K.O.

28 Nov 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use (alarm sound and fit_message)

> > > I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser
> > > "task manager" and by other tools.

The work on this problem has been blogged in the bitbucket issue tracker:
https://bitbucket.org/tmidas/midas/issues/158/midas-status-page-memory-leak

K.O.

Below is a dump of the issue for posterity ---

Team Midas MIDAS related packages midas Issues
midas status page memory leak
Create issue
Issue #158 RESOLVEDOpen Workflow More Edit
dd1 created an issue 2018-12-25
I have the midas status page (https://daq16.triumf.ca/) open in macos google chrome 71.0.3578.98 and I watch in the "task manager" how the memory use is
246 Mbytes and growing at around 1 Mbyte every 2-3 seconds. CPU use is around 3-5%, network use is 47 kBytes/sec. The slowly growing memory use
indicates that we have a memory leak. (Note that javascript uses "automatic garbage collection" memory management, which does not eliminate memory
leaks. Only capability to explicitly free unused memory is eliminated). K.O.

Comments (35)
dd1 REPORTER
Actual memory use goes up to around 250-something MBytes, then drops down to 240-something, them slowly grows back up, drops down, rinse, repeat.
This is the javascript garbage collection in action. So there is no memory leak on the status page, but still why do we generate around 1 Mbyte/sec of
javascript memory allocations? As comparison, the NYTimes front page consumes 270 Mbytes. One would expect the midas front page to be much more light
weight... K.O.

Edit Pin to top Mark as spam Delete 2018-12-27
dd1 REPORTER
Then there is a question of memory use by the "message" page. This page does grow infinitely large by design - as new messages are added to midas.log -
as as the user keeps scrolling the messages back in time. Perhaps we should somehow limit the total memory use there... K.O.

Edit Pin to top Mark as spam Delete 2018-12-27
Stefan Ritt
changed status to closed
I see the same behaviour. The relatively large memory allocation by Chrome probably comes from some bitmap caching. The browser prints the page
contents into some temporary bitmap and then flushes it to the screen. That can easily take a few MB. I monitor such behaviour since several years now (for
other processes) and concluded that I don't need to worry about JavaScript memory consumption.

Concerning the messages page: One line takes about 100 Bytes. If you scroll really fast, you can do maybe 30 lines per second, thus 3kB. If we allow the
browser to consume another 100 MB (should be easily possible these days), you have to continuously scroll for 100000kb/3kb=30000 seconds or eight
hours. Good luck!

Closing this topic if no complaints.

Pin to top Mark as spam Delete 2019-01-08
dd1 REPORTER
changed status to open
still see high memory use by midas pages. K.O.

Edit Pin to top Mark as spam Delete 2019-09-15
dd1 REPORTER
See high memory use from long running (days-weeks) web pages:

status page of my test experiment - 953 MB - 155 MB after reload
odb editor - 661 MB - 80 MB after reload
programs page - 602 MB - 64 MB after reload
sequencer - 253 MB - 151 MB after reload
sequencer - ??? (very big) - reloaded before I wrote it down
I think we are leaking memory somewhere. Or causing unnecessary allocations that the javascript garbage collector does not keep up with or does not
cleanup correctly. K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-09-15
dd1 REPORTER
I am suspicious of memory use trouble from periodic-update code that keeps setting innerHTML to the same value as it was before, unnecessarily. (this also
causes other problems - cannot cut-and-paste affected parts of the web page, high cpu use to redraw the (unchanged) page). K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-09-15
Stefan Ritt
For setting innerHTML we should always use

if (text !== control.inner HTML)

control.innerHTML = text;

&#8204;

I thought I caught most of the cases, but I might have missed some. Please add as needed.

&#8204;

Stefan

Pin to top Mark as spam Delete 2019-09-15
dd1 REPORTER
Strange things continue. Just say huge CPU usage from 3 midas web pages (odb editor, programs page and the new sequencer page). All 3 pages are tabs in
an iconized browser window. Suddenly machine feels slow, and I see all 3 use 25% CPU each (by the chrome-browser task manager window). Opened the
browser window, sent to the offending tabs, nothing looks amiss, CPU usage went back to 0%. WTH? (all 3 pages have 100 Mbyte memory use, all 3 pages
update at 1 Hz). K.O.

Edit Pin to top Mark as spam Delete 2019-09-16
dd1 REPORTER
looked at the �programs� page. learned how to use the google-chrome �performance� tool. I was definitely leaking html nodes. The leak was in an
unexpected place - innerHTML with a link was miscomparing because of unexpected string transformation:

xbad: "<a href='?cmd=odb&odb_path=System/Clients/" + key + "'>" + host + "</a>";
good: "<a href=\"?cmd=odb&amp;odb_path=System/Clients/" + key + "\">" + host + "</a>";
Now node leak from my periodic update went from 35 nodes to 2 nodes per update. The performance tool fails to identify where these last 2 nodes are
coming from.

K.O.

Edit Pin to top Mark as spam Delete 2019-09-17
dd1 REPORTER
Forgot to add - the periodic update from mhttpd_init() is also leaking nodes. I will look at it some other time. K.O.

Edit Pin to top Mark as spam Delete 2019-09-17
dd1 REPORTER
after improvement to the �programs� page, the tab is staying at 50-60 Mbytes. promising� K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
Fixed node leak in mhttpd_refresh(): the alarm display was setting e.innerHTML even if it did not change.

There only remains an unavoidable node leak with �mheader_last_updated� where we set the current time every 1 second. If I comment this out, there is no
node leak on the �programs� page.

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
�programs� page memory use now sits around 40 Mbytes. K.O.

Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
Stefan points me to the use of e.firstChild.data instead of e.innerHTML, per https://medium.com/@ok.bayat/fixing-memory-leak-problem-in-javascript-
application-ed3a2d9d92df

K.O.

Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
implemented this for the timestamp update and (i.e.) the �programs� page now leaks 0 nodes. memory use for all pages sits around 40-60 Mbytes. K.O.

Edit Pin to top Mark as spam Delete 2019-09-26
dd1 REPORTER
see problem of high cpu usage again, after google-chrome restarted after an update to latest version. for example, �program� page is 65 Mbytes, uses 20%
CPU. (in an inactive tab). If I open this tab, for maybe 10 seconds, it goes to 100+ Mbytes with big CPU usage (>100%), then drops down to 90 Mbytes, 0%
CPU usage. I do not see any other web pages or tabs doing this. Only our midas pages. WTH!?! K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-10-13
dd1 REPORTER
figured out high cpu usage reported as �rendering�. Open �devtools�, goto �performance�, press Command-Shift-P, start typing �rendering�, select �fps
meter�. A black square will open in top-left, showing graphics activity (frame rate, GPU usage, etc).

Now wait for new message to appear in the top status bar. It will be �yellow� at first, that it will fade to �gray�. During this fading, GPU use is 100% during
about 1 second, FPS is about 50 frames/sec.

K.O.

Edit Pin to top Mark as spam Delete 2019-10-14
dd1 REPORTER
quick google search shows much discussion about css animations using �too much CPU�, i.e. google �css pulsing background�, but no clear way to tell the
browser to slow down. It looks to me like the background-color animation tries to run at maximum possible frame-rate, as if electricity is free. (Since I am
debugging high-cpu and high-memory use of inactive tabs, there is nobody looking at these animations). K.O.

Edit Pin to top Mark as spam Delete 2019-10-14
Stefan Ritt
New messages are displayed with a yellow background and fade to grey after 5 seconds. This is handeled in mhttpd.js around line 2144. You can try to
remove the lines

d.style.setProperty("-webkit-transition", "background-color 3s", "");
d.style.setProperty("transition", "background-color 3s", "");
and see if the CPU load goes down.

Pin to top Mark as spam Delete 2019-10-14
dd1 REPORTER
captured another trace of midas page using 20% cpu in an inactive tab, iconized browser window. capturing is difficult, requires very fast mousing to: select
the right tab, right-click to �inspect�, select �performance� tab, click on �start capture�, and hope that by this time the web page activity does not complete.
this time I got the last 200 ms or so.

what I see is again is �media activity� (only identified as �task�), GPU activity (only identified as �GPU activity�) and main thread activity (identified as an
infinitely repeating sequence of �receive response 206 audio/mpeg�, �receive data 39287 bytes�, �finish loading�, then the same sequence again. 39287 is
the file size of resources/beep.mp3. There is no corresponding network activity, so the loading of beep.mp3 must be coming from cache. On the javascript
console, there are the usual �not allowed to play audio because user did not interact� messages repeating about every 1-2-3 minutes.

I read this as: for reasons unknown, a huge number of audio requests becomes queued (the tab was inactive/iconized for many days) then they start trying to
play (load beep.mp3, do not play it because �not allowed�, move on to the next audio object, load � etc). This is consistent with the cpu use, with the
captured traces and with the quick growth in memory size (beep.mp3 objects are created, consume memory, cannot be free�d until garbage collector runs
later. much later).

The above scenario is impossible with how the current audio playing code is written (only one audio object can exist at a time, new audio object can only be
created after the previous one finished playing).

Two possible explanations: (a) the code running in the web page is not the same code as in mhttpd.js (running an old version from cache) or (b) the code
�one audio object at time� is not working correctly if javascript code is throttled /delayed/stopped in inactive tabs.

Following code will have this problem:

var only_one = null;
function foo() { /* runs from periodic timer */
if (!only_one) {
a = new Audio(�beep.mp3�);
/* throttled/suspended/delayed here */
/* multiple Audio objects created because "only_one" is still null */
only_one = a;
}
}
K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-10-17
dd1 REPORTER
fading background - yes, I found the code. pretty neat. I moved it around to remove the timer - I am suspicious of how the timers run in inactive tabs. but no
time to study it.

but the current problem is clearly with audio objects, and the only audio we have is the periodic playing of beep.mp3. who knew there will be so much trouble.

there is still the unexplained use of GPU, but maybe playing/decoding mp3 files uses the GPU.

I am also puzzled why the status page from midas-2019-03 does not show any of these problems. it just sits there using no memory (50 Mbytes) and no
CPU. perhaps we changed something in the playing of audio files since last March (when midas-2019-03 was tagged).

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-10-17
dd1 REPORTER
For the first time I saw my message �mhttpd_alarm_play: Cannot play alarm sound: previous alarm sound did not finish playing yet� reported on the javascript
console. This confirms my guess that playing of audio is actually delayed and indeed we need to check that the previous audio finished playing before
creating new audio objects. But the check in the current code has a race condition. If the delay/stall is inside �new Audio()�, we will create multiple audio
objects as �last_audio� is still in the �finished playing� state, we only change it after the return from �new Audio()�. K.O.

Edit Pin to top Mark as spam Delete 2019-10-28
dd1 REPORTER
see big improvement. now inactive tabs grow from 50-ish Mbytes to 170-ish Mbytes, then when I open them, there is some cpu use (GC, I guess) and
memory use drops back to 50-ish Mbytes. So we are not leaking any memory anymore. Looking at the console messages, I see that my fixes are helping -
there is messages about attempts to create new Audio() when previous one did not finish yet. K.O.

Edit Pin to top Mark as spam Delete 2019-11-04
dd1 REPORTER
I guess, inactive tabs are throttled by google-chrome so much that their GC (memory garbage collection) does not keep up with our 1/sec data updates. I do
not think we need to keep updating inactive tabs at this high frequency, but I am not sure how to detect if we are active or inactive. Maybe I can detect the
throttling instead. K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-04
dd1 REPORTER
see consistent behaviour from google-chrome:

have all these midas tabs open, inactive, window iconized, typical tab size is 50-ish Mbytes.
google-chrome update arrives
update is installed, all windows and tabs automatically closed, then reopened.
the midas tabs are still inactive, window is iconized
after a few days, see behaviour as described before:
midas tabs use 20-30% CPU, size is 100-ish Mbytes
if I open one of these tabs, it�s cpu usage goes up to 160%, size grows to 250-ish Mbytes, then within 5-10 seconds drops to 100 Mbytes, CPU usage goes
from 160% to zero.
when looking at this, if I am quick enough, I can right-click �inspect�, go to the �performance� tab, and press the �start collecting data� button and I capture
the very tail end of all this strange activity. This is the traces I have been describing so far.
K.O.

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
see big blob of activity:

timer activation
mhttpd_message()
first call to mhttpd_fit_message()
a long cycle of (maybe 10-20) �recalculate size�, �layout�, �parse html�
second call to mhttpd_fit_message()
same long cycle of �
The way I understand this, mhttpd_fit_message() changes the size of some html element that causes the whole window to be re-layed-out.

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
trying to figure out what triggers a long run of the �rasterizer� thread. see a very strange call sequence

timer fires
mhttpd_alarm_play()
mhttpdConfig()
�Layout�
mhttpd_alarm_play() calls mhttpdConfig() (3 times) to find out if alarm sound is enabled, the period, the file name, etc. so far so good. but mhttpdConfig()
does not touch any DOM objects, so why is it shown as calling �Layout�?!?

other than this trace, I see nothing else that would trigger the rasterizer thread�

(note) this time, mhttpd_alarm_play() does not call mhttpd_alarm_play_now(), so �new Audio� and stuff does not enter this picture.

K.O.

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
in the early part of the trace, where I think the meat of �tab is using cpu and memory� is, I see the audio events firing in rapid sequence: loadeddata, canplay,
canplaythrough, rinse, repeat.

It turns out that promise rejection from audio.play() does not stop the loading of the sound file. This is easy to see by attaching the event handlers to these
events and by observing these event handlers print something to the javascript console.

If that is what is happening, it explains what I see: all my previous attempts to prevent the piling up of sound files are unsuccessful, and when I open the
previously inactive tab, all the queued sound files start loading (and not playing per �user did not interact� policy).

google docs suggest using audio.src=�� to cancel loading of sound files and it does seem to work. testing it now.

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
gotcha. came back home, found one tab using about 10% cpu. audio.src=�� is commented out, javascript console is full of this:

Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 234
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 234
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 234
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 235
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 235
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 235
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 236
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 236
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 236
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 237
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 237
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 237
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 238
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 238
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 238
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 239
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 239
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 239
the timestamp is exactly when I opened this tab. so confirmed, a whole bunch of audio files got queued, when I open the tab they all try to play. (there is no
actual sound, all tabs are muted).

now I uncomment audio.src=�� and see what happens.

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
looks good. an update to google-chrome came in and after installing, I no longer see midas tabs show high cpu usage or high memory use. I think the
audio.src=�� fix is it. I will be committing these fixes to midas. K.O.

Edit Pin to top Mark as spam Delete 2019-11-12
Stefan Ritt
The loop in mhttpd_fit_message() is there for a good reason: I want to display the message in a single line. If it�s too long, I want first to cut the time stamp
and then display it. If it�s still too long, I want to truncate the message and display ��� at the end. The problem is what �too long� is. Nobody can tell you how
much pixel a message on your browser take, because this depends on the installed fonts, the exact character spacing of your browser and so on. So the only
way I could make this happen is to add one char at a time, until we get close to the maximum allowed space. If course this requires a re-layout of the page for
10-20 times, but when your window is in the foreground this is not a problem, since a browser can do this with small CPU load. The �scope� application I use
does 70 frames per second at 30% CPU load. So one could make the loop a bit smarter, like binary search, which would drop the 10-20 iterations to log2(10-
20) ~ 4-5, but still there would be a loop.

How that the update of the messages in the background is suppressed with the hidden API, do you still have that problem or can we consider it fixed?

Stefan

Pin to top Mark as spam Delete 2019-11-18
dd1 REPORTER
see new behaviour - after many days, inactive page size is ~180 Mbytes. 0% CPU use (an improvement from before where there was large CPU use). activate
the tab, nothing much happens, 0% CPU use (again an improvement from before). after about 30 seconds, memory use drops down to the normal 50-70-80
Mbytes. I think what we see is the garbage collector is throttled down and does not keep up with our allocations. Stefan�s new fix reducing polling in inactive
pages from 1/sec to 1/10sec should help with this. K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-19
dd1 REPORTER
mhttpd_fit_message() - confirmed. I was confused about the function argument.

I thought it is passed an array of messages. no, it is one message string and the loop is over the message string length. The loop is done twice (second time,
with the time/date stamp removed). google-chrome debugger does show that this uses large amount of CPU, mainly to compute d.offsetWidth.

I think I will refactor these loops - instead of growing the message, I will shrink it.

K.O.

Edit Pin to top Mark as spam Delete 4 hours ago
dd1 REPORTER
rewrote mhttpd_fit_message() to reduce CPU use: try to fit complete message, if too long, try to fit message without timestamp, if too long, guess the desired
length assuming all chars have same width, then grow or shrink the message until the size is right. K.O.

Edit Pin to top Mark as spam Delete 17 minutes ago
dd1 REPORTER
changed status to resolved
The main fix is to set audio.src="" in the promise rejection. K.O.

Edit Pin to top Mark as spam Delete just now

What would you like to say?
Assignee
�
Type
bug
Priority
major
Status
resolved
Votes
0 Vote for this issue
Watchers
1 Stop watching
Dismiss this bannerJira Software: the preferred issue tracker for Bitbucket. Join the team!

28 Nov 2019, Konstantin Olchanski, Bug Report, midas alarm sound unreliable in google-chrome

I accidentally discovered a problem with the alarm sounds played by midas.

The javascript code is very simple: var audio=new Audio("alarm.mp3"); audio.play();

In the past, this reliably played a sound.

More recently, I started seeing javascript log messages about "unhandled exception" from audio.play(). (for me, they often
interrupt my javascript debugger sessions, in a very annoying way).

Adding an exception handler to audio.play() was not effective: it turns out, these days, audio.play() returns a Promise
and one must handle the Promise rejection case. (also it turns out that in the rejection handler
one *must* clear audio.src to avoid problems with excessive memory and cpu use).

But why does audio.play() throw a rejected promise?!?

This is the error message provided: "play() failed because the user didn't interact with the document first. https://goo.gl/xX8pDD"
name: "NotAllowedError".

The link takes us to https://developers.google.com/web/updates/2017/09/autoplay-policy-changes. This web page seems to explain
everything neatly, but most of the information turns out to be wrong and unhelpful:

a) "Muted autoplay is always allowed" - wrong - I see audio.play() rejection from muted tabs
b) "User has interacted with the domain (click, tap, etc.)." - wrong - I see rejection in the javascript debugger even as I debug the web page
c) "user's Media Engagement Index threshold has been crossed" - broken - playback count and MEI is always 0 (zero) for midas - per chrome://media-engagement/
d) "user has added the site to their home screen on mobile or installed the PWA on desktop" - what ?!?
e) chrome://flags/#autoplay-policy does not exist (was removed)

There may be a way to start google-chrome with special flags to globally enable autoplay, but it seems
to be aimed at "kiosk" type applications, not for general use browsers:
https://stackoverflow.com/questions/57455849/chrome-autoplay-policy-chrome-76

The bottom line:

there is no way to ensure that the alarm sound will always play

K.O.

P.S. I am tracking this problem here:
https://bitbucket.org/tmidas/midas/issues/191/exception-on-audioplay

28 Nov 2019, Stefan Ritt, Bug Report, midas alarm sound unreliable in google-chrome

The document

https://docs.google.com/document/d/1_278v_plodvgtXSgnEJ0yjZJLg14Ogf-ekAFNymAJoU/edit

says that the "... playback must be at lest 7 seconds long" to signal significant playback.

I attached a modified alarm sound file with 3 seconds of sound followed by 5 seconds of silence. Try that one and let me know if this changes anything. If successful, I can modify the 
other sound files as well.

Stefan

15 Nov 2019, Andreas Suter, Suggestion, javascript comunication

I am currently testing the new history system on the mhttpd side and stumbled over the following issue: typically our user open a lot of midas web-page tabs and keep them open. With the current version this leads after a night typically to a state where the browser is busy with itself and not reacting anymore.

One important reason seems to be that ALL tabs trying to communicate all the time which is totally unnecessary, since I think a hidden tab should stay in a sleeping mode. 

I was browsing if there is a way to find out if a tab is active or not, and found the following API which exactly does this:

https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API

Furthermore, the simple

document.hidden 

tag, could be used to find out if the page is currently active.

Wouldn't it a good idea to send all midas tabs which are not active into a sleep mode and only reactivate them if they come into focus?

I had a quick look at the JavaScript libs of midas, but I am not quite certain where to best inject this.

15 Nov 2019, Stefan Ritt, Suggestion, javascript comunication

Very good idea. And thanks for finding the document.hidden solution. I put it in, so give it a try.

Best,
Stefan


> I am currently testing the new history system on the mhttpd side and stumbled over the following issue: typically our user open a lot of midas web-page tabs and keep them open. With the current version this leads after a night typically to a state where the browser is busy with itself and not reacting anymore.
> 
> One important reason seems to be that ALL tabs trying to communicate all the time which is totally unnecessary, since I think a hidden tab should stay in a sleeping mode. 
> 
> I was browsing if there is a way to find out if a tab is active or not, and found the following API which exactly does this:
> 
> https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API
> 
> Furthermore, the simple
> 
> document.hidden 
> 
> tag, could be used to find out if the page is currently active.
> 
> Wouldn't it a good idea to send all midas tabs which are not active into a sleep mode and only reactivate them if they come into focus?
> 
> I had a quick look at the JavaScript libs of midas, but I am not quite certain where to best inject this.

17 Nov 2019, Konstantin Olchanski, Suggestion, javascript comunication

> Very good idea. And thanks for finding the document.hidden solution. I put it in, so give it a try.

Hi, Stefan - I did not look at your code, if all midas tabs are inactive, will the alarm sound still play?

K.O.


> 
> Best,
> Stefan
> 
> 
> > I am currently testing the new history system on the mhttpd side and stumbled over the following issue: typically our user open a lot of midas web-page tabs and keep them open. With the current version this leads after a night typically to a state where the browser is busy with itself and not reacting anymore.
> > 
> > One important reason seems to be that ALL tabs trying to communicate all the time which is totally unnecessary, since I think a hidden tab should stay in a sleeping mode. 
> > 
> > I was browsing if there is a way to find out if a tab is active or not, and found the following API which exactly does this:
> > 
> > https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API
> > 
> > Furthermore, the simple
> > 
> > document.hidden 
> > 
> > tag, could be used to find out if the page is currently active.
> > 
> > Wouldn't it a good idea to send all midas tabs which are not active into a sleep mode and only reactivate them if they come into focus?
> > 
> > I had a quick look at the JavaScript libs of midas, but I am not quite certain where to best inject this.

18 Nov 2019, Stefan Ritt, Suggestion, javascript comunication

> Hi, Stefan - I did not look at your code, if all midas tabs are inactive, will the alarm sound still play?

Nope. All updates are done in mhhtpd_refresh(), and I changed it such that nothing is updated if hidden.

I agree however that this is bad. You want to hear alarms always. So I added some code to do ONLY alarm updates if in the background. 
No GUI changes, only audio playing. Checking is reduced from 1 Hz to 0.1 Hz (once every 10 seconds). I don't have however a solution 
to your problem "not enough interaction with page" which I have never seen so far.

I wonder if we should try push notifications: https://developers.google.com/web/fundamentals/codelabs/push-notifications
which seems a bit complicated to me. It might also be too subtle if someone is sleeping in front of the computer.

Stefan

18 Nov 2019, Konstantin Olchanski, Suggestion, javascript comunication

> > Hi, Stefan - I did not look at your code, if all midas tabs are inactive, will the alarm sound still play?
> I added some code to do ONLY alarm updates if in the background (once every 10 seconds)

Ok, good.

> I don't have however a solution to your problem "not enough interaction with page" which I have never seen so far.

When it happens, you should see my messages about it in the javascript console. (rejected promise from audio.play()).

How to make it happen so one can see it and be sure it is handled correctly (audio.src has to be set to blank, at the least), I have no idea.

What I have been looking at for the huge memory and cpu use problem, I would call "a bad interaction between two kludges",
the first kludge is the throttling javascript in inactive tabs, the second kludge is the "not interacted" rejection of audio.play() promise.
(audio.play() returning a promise, and the strange sequencing between this promise success/reject and
the audio events loaded/canplay/done/etc smells of a kludge, too).

> I wonder if we should try push notifications: https://developers.google.com/web/fundamentals/codelabs/push-notifications
> which seems a bit complicated to me. It might also be too subtle if someone is sleeping in front of the computer.

I did not read all the explanation, but if it requires use of 3rd party services, I think we cannot use it, we do not want
to miss "experiment is on fire" alarms just because google is down.

K.O.

17 Nov 2019, Konstantin Olchanski, Suggestion, javascript comunication

> I am currently testing the new history system on the mhttpd side and stumbled over the following issue:
> typically our user open a lot of midas web-page tabs and keep them open. With the current version this leads after a night typically to a state where the browser is busy with itself and not reacting anymore.
> 
> One important reason seems to be that ALL tabs trying to communicate all the time which is totally unnecessary, since I think a hidden tab should stay in a sleeping mode. 

I am looking at two more problems with inactive tabs:

a) google chrome slows down the execution of javascript in inactive tabs, leading to trouble
with memory management - midas pages poll at 1/sec, each poll allocates memory for processing RPC messages,
and (until recently) allocates memory for new DOM objects to update the web page - but the garbage collector
gets slowed down and does not keep up - leading to huge memory use (up to 200 Mbytes) for inactive midas pages
that normally consume 50-ish Mbytes.

b) the playing of the alarm sound is throttled by "user has not interacted with the document" thing, but there is a bug -
instead of canceling the playing of the alarm sound, the sound file is still loaded, (but not played). (this is hard to debug
because I do not know how to manually trigger the "user has not interacted..." condition, I have to wait for many days for it.
Then, for inactive tabs, the loading of the sound files is slowed down, leading to many of them getting queued up,
and eventually they all try to load and play at the same time, again leading to huge memory and cpu use in inactive tabs.
(this sounds incredible because we play the alarm sound at most 1/minute, for sure the previous sound file must have
finished playing by then, but no, it is easy to see it happen - add a few console.log messages and wait for a few days).

>
> I was browsing if there is a way to find out if a tab is active or not, and found the following API which exactly does this:
> https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API
> 

From looking at the inactive tab business, I see that javascript in inactive tabs runs quite differently from javascript
in active tabs (i.e. timers do not work the same) and I see how the "visibility api" had to be invented to counter that.

> 
> Wouldn't it a good idea to send all midas tabs which are not active into a sleep mode and only reactivate them if they come into focus?
> I had a quick look at the JavaScript libs of midas, but I am not quite certain where to best inject this.
>

most midas web pages poll in two places - mhttpd_refresh() updates the current date timestamp, alarms, currently active midas.log message;
and each page has it's own loop for updating it's own data (i.e. "alarms" page, "programs" page).

we should be careful to not completely disable all polling as some experiments do use and do rely on the midas producing
loud alarm messages ("program logged is not running!!!", "program mhttpd aborted!!!"). Even if all midas tabs are inactive,
some javascript is some tab still has to run frequently enough to poll midas and to sound the alarm sounds (even though
I am not sure how to 100% reliably counteract the google-chrome not playing sound files because
of the "user did not interact with the site..." thing).

K.O.

18 Nov 2019, Stefan Ritt, Suggestion, javascript comunication

> a) google chrome slows down the execution of javascript in inactive tabs, leading to trouble
> with memory management - midas pages poll at 1/sec, each poll allocates memory for processing RPC messages,
> and (until recently) allocates memory for new DOM objects to update the web page - but the garbage collector
> gets slowed down and does not keep up - leading to huge memory use (up to 200 Mbytes) for inactive midas pages
> that normally consume 50-ish Mbytes.

Try my latest change, which drops any update if hidden, except the alarm sound. At the moment this only works on the main "status" page.

> most midas web pages poll in two places - mhttpd_refresh() updates the current date timestamp, alarms, currently active midas.log message;
> and each page has it's own loop for updating it's own data (i.e. "alarms" page, "programs" page).

That's correct. I changed mhttpd_refresh() now and the main loop for the "status" page. If that works for everybody, we can do that also for "programs" and other pages. The code is here:

      if (document.hidden) {
         // don't update page if hidden
         setTimeout(update_page, 500);
         return;
      }

where "update_page" has to be replaced with the proper function.

Stefan

08 Nov 2019, Pierre Gorel, Bug Report, Newly installed MIDAS on OSX: mhttpd crahes

Context: out of the box  MIDAS (using cmake) on OSX Mojave. 

Running with mongoose/opensslm installation following instruction here:
https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux

mhttpd crashing when midas webpage opened with Safari (12.1.2). Usually when opening the "chat" tab but sometimes also with the "message" tab.
mhttpd(11109,0x70000827a000) malloc: *** error for object 0x7f8669501ef0: pointer being freed was not allocated
mhttpd(11109,0x70000827a000) malloc: *** set a breakpoint in malloc_error_break to debug

No crash if using firefox (70.0.1 (64-bit))

12 Nov 2019, Konstantin Olchanski, Bug Report, Newly installed MIDAS on OSX: mhttpd crahes

> Context: out of the box  MIDAS (using cmake) on OSX Mojave. 
> 
> Running with mongoose/opensslm installation following instruction here:
> https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux
> 
> mhttpd crashing when midas webpage opened with Safari (12.1.2). Usually when opening the "chat" tab but sometimes also with the "message" tab.
> mhttpd(11109,0x70000827a000) malloc: *** error for object 0x7f8669501ef0: pointer being freed was not allocated
> mhttpd(11109,0x70000827a000) malloc: *** set a breakpoint in malloc_error_break to debug
> 
> No crash if using firefox (70.0.1 (64-bit))

I think we also have reports of mhttpd crash on macos with safari from the Dragon experiment,
but cannot reproduce the problem.

If you can reproduce this, can you capture the crash stack trace?

One way to do this is to enable core dumps in odb "/expt/enable core dumps" set to "y", restart mhttpd,
wait for the crash. I think macos writes core dumps into /cores/... Or you can run mhttpd inside lldb
and wait for the crash. the lldb command to show the stack trace is "bt", but you may need
to switch to different threads to see which one actually crashed. I forget what the command
for that is.

BTW, the mhttpd networking code has not changed in a long time, but an update
of mongoose web server library is overdue (to fix a memory leak, at least).

K.O.

15 Nov 2019, Pierre Gorel, Bug Report, Newly installed MIDAS on OSX: mhttpd crahes

It is reproducible alright.
Here are the core dump and the backtrace (I think  the former is more informative).



> > Context: out of the box  MIDAS (using cmake) on OSX Mojave. 
> > 
> > Running with mongoose/opensslm installation following instruction here:
> > https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux
> > 
> > mhttpd crashing when midas webpage opened with Safari (12.1.2). Usually when opening the "chat" tab but sometimes also with the "message" tab.
> > mhttpd(11109,0x70000827a000) malloc: *** error for object 0x7f8669501ef0: pointer being freed was not allocated
> > mhttpd(11109,0x70000827a000) malloc: *** set a breakpoint in malloc_error_break to debug
> > 
> > No crash if using firefox (70.0.1 (64-bit))
> 
> I think we also have reports of mhttpd crash on macos with safari from the Dragon experiment,
> but cannot reproduce the problem.
> 
> If you can reproduce this, can you capture the crash stack trace?
> 
> One way to do this is to enable core dumps in odb "/expt/enable core dumps" set to "y", restart mhttpd,
> wait for the crash. I think macos writes core dumps into /cores/... Or you can run mhttpd inside lldb
> and wait for the crash. the lldb command to show the stack trace is "bt", but you may need
> to switch to different threads to see which one actually crashed. I forget what the command
> for that is.
> 
> BTW, the mhttpd networking code has not changed in a long time, but an update
> of mongoose web server library is overdue (to fix a memory leak, at least).
> 
> K.O.

15 Nov 2019, Konstantin Olchanski, Bug Report, Newly installed MIDAS on OSX: mhttpd crahes

> It is reproducible alright.

Thanks. At first blush, a guess, read_passwords() is not thread-safe and is called from multiple threads, not protected by semaphore. Crash report shows 2 active threads 
(one made it is far as processing the mjson rpc, the other one crashed in read_passwords()).

K.O.


> Here are the core dump and the backtrace (I think  the former is more informative).
> 
> 
> 
> > > Context: out of the box  MIDAS (using cmake) on OSX Mojave. 
> > > 
> > > Running with mongoose/opensslm installation following instruction here:
> > > https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux
> > > 
> > > mhttpd crashing when midas webpage opened with Safari (12.1.2). Usually when opening the "chat" tab but sometimes also with the "message" tab.
> > > mhttpd(11109,0x70000827a000) malloc: *** error for object 0x7f8669501ef0: pointer being freed was not allocated
> > > mhttpd(11109,0x70000827a000) malloc: *** set a breakpoint in malloc_error_break to debug
> > > 
> > > No crash if using firefox (70.0.1 (64-bit))
> > 
> > I think we also have reports of mhttpd crash on macos with safari from the Dragon experiment,
> > but cannot reproduce the problem.
> > 
> > If you can reproduce this, can you capture the crash stack trace?
> > 
> > One way to do this is to enable core dumps in odb "/expt/enable core dumps" set to "y", restart mhttpd,
> > wait for the crash. I think macos writes core dumps into /cores/... Or you can run mhttpd inside lldb
> > and wait for the crash. the lldb command to show the stack trace is "bt", but you may need
> > to switch to different threads to see which one actually crashed. I forget what the command
> > for that is.
> > 
> > BTW, the mhttpd networking code has not changed in a long time, but an update
> > of mongoose web server library is overdue (to fix a memory leak, at least).
> > 
> > K.O.

20 Sep 2019, Frederik Wauters, Bug Report, lazylogger in cmake & max_event_size

compiling:
----------

The compile option -DHAVE_FTPLIB checked in mdsupport.cxx disappeared if you 
compile with cmake.

I added:

  add_compile_options(-DHAVE_FTPLIB)

to CMakeLists.txt to fix this. Can probably be done in a more elegant way 
(checking if the right libraries exist?).

running:
-------

Our MAX_EVENT_SIZE is set in the odb to 805306368. This number is also used in 

  INT lazy_copy(char *outfile, char *infile, int max_event_size)

this is to big when copying with ftp, causing a crash. Reducing it here with a 
factor 10 solves our problems.

27 Sep 2019, Konstantin Olchanski, Bug Report, lazylogger in cmake & max_event_size

> The compile option -DHAVE_FTPLIB checked in mdsupport.cxx disappeared if you 
> compile with cmake.

Hi, Stefan - do we still need to support FTP in the logger? In the lazylogger, special support for 
FTP is not needed, they can you the "script" method and do FTP without our help.

I move to remove FTP support from MIDAS. (second? other opinions?)

> Our MAX_EVENT_SIZE is set in the odb to 805306368. This number is also used in 
> this is to big when copying with ftp, causing a crash. Reducing it here with a 
> factor 10 solves our problems.

I am surprised that changing MAX_EVENT_SIZE (to a "too big" value) causes lazylogger to 
crash. More usually MAX_EVENT_SIZE has no effect until you try to write an event that is 
somehow "too big", then there is a crash. Perhaps there is a bug specifically in the FTP code.

Anyhow, I recommend the solution of using the "script" method. We have example lazylogger 
scripts in midas/progs/lazy*.perl (the scripts do not have to be in perl, python is ok). We do
not have any example that uses FTP because we do not use FTP for data storage. But you can
easily adapt lazy_test.perl and lazy_copy.perl to use scp and sftp, the secure versions of FTP.

K.O.

14 Oct 2019, Stefan Ritt, Bug Report, lazylogger in cmake & max_event_size

> > The compile option -DHAVE_FTPLIB checked in mdsupport.cxx disappeared if you 
> > compile with cmake.
> 
> Hi, Stefan - do we still need to support FTP in the logger? In the lazylogger, special support for 
> FTP is not needed, they can you the "script" method and do FTP without our help.
> 
> I move to remove FTP support from MIDAS. (second? other opinions?)

I oppose to remove FTP support from lazylogger. We still use it heavily at PSI. In comparison to the "script" method, it 
shows the current speed in MB/s which helps us to diagnose some network problem by writing this number into the 
history. The "script" method only give you an integral transfer speed after a file has be completely written.

I'm however not sure who FTP is used in lazylogger. It goes into mdsupport.cxx and I seem to remember that Pierre 
wrote the FTP code by hand, so no external library is necessary.

Stefan

24 Oct 2019, Konstantin Olchanski, Bug Report, lazylogger in cmake & max_event_size

> > > The compile option -DHAVE_FTPLIB checked in mdsupport.cxx disappeared if you 
> > > compile with cmake.
> > 
> > Hi, Stefan - do we still need to support FTP in the logger? In the lazylogger, special support for 
> > FTP is not needed, they can you the "script" method and do FTP without our help.
> > 
> > I move to remove FTP support from MIDAS. (second? other opinions?)
> 
> I oppose to remove FTP support from lazylogger.

Confirmed. FTP support in lazylogger stays.

K.O.

21 Oct 2019, Vinzenz Bildstein, Forum, Data for key truncated

I keep on getting messages like this:

16:25:35 [fecaen,ERROR] [odb.c:4567:db_get_data,ERROR] data for key
"/DAQ/params/VX1730/custom/Board 0/Channel 0/Input range" truncated

whenever I start my frontend. Input range is defined to be a BOOL and using
odbedit to read it shows:

Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
Input range                     BOOL    1     4     75h  0   RWD  y

without any error message. The entry is read using

         size = sizeof(fInputRange);
         db_get_data(hDb, hSubKey, &fInputRange, &size, TID_BOOL);

where fInputRange is a bool.

Where does this message come from and how can I resolve this?

23 Oct 2019, Konstantin Olchanski, Forum, Data for key truncated

> I keep on getting messages like this:
> 16:25:35 [fecaen,ERROR] [odb.c:4567:db_get_data,ERROR] data for key
> "/DAQ/params/VX1730/custom/Board 0/Channel 0/Input range" truncated
>
>  [  bool fInputRange... ]
>          size = sizeof(fInputRange);
>          db_get_data(hDb, hSubKey, &fInputRange, &size, TID_BOOL);
>

The error is correct. size of TID_BOOL is 4 byte (uint32_t) and you give is sizeof(bool) instead which is probably not 4.

Note that sizeof(bool) is not well defined, sometimes it is 1 (you need 4), sometimes something else, see
https://stackoverflow.com/questions/4897844/is-sizeofbool-defined-in-the-c-language-standard

A good fix would be to change fInputRange from bool to uint32_t (which is always 4 byte size).

#include <stdint.h>
...
uint32_t fInputRange;

K.O.

23 Sep 2019, Frederik Wauters, Suggestion, recover daq and hardware safety.

We have encountered a safety issue with our HPGe HV and it's midas frontend. Turning off or changing HV unknowingly has to be avoided at all costs.

Current safety protection

We use the DF_REPORT_STATUS flag to give the hardware settings precedence over odb settings. This all takes place in the init.

DAQ recovery Issue?

In the setup / development state, we sometimes have to remove the SHM files and reload an odb dump to recover the DAQ. When the FE is running, this can modify hardware settings. E.g. change a voltage

Question

Is there a way one can let the frontend know the "load" function is called in odbedit? Or other suggestions to build in this safety.

27 Sep 2019, Konstantin Olchanski, Suggestion, recover daq and hardware safety.

> We have encountered a safety issue with our HPGe HV and it's midas frontend.

At TRIUMF and other labs the words "safety issue" have very specific meaning and
we tend to follow this guidance: MIDAS is not certified for and is not intended for use with 
safety critical applications as defined here:
https://en.wikipedia.org/wiki/Safety-critical_system

> A safety-critical system ... malfunction may result in ... following outcomes:
> death or serious injury to people
> loss or severe damage to equipment/property
> environmental harm

If this is your case, you should use properly certified software *and hardware*. Safety 
officers at most institutions require certified hardware interlocks and other protections to 
prevent such undesirable outcomes. Use of certified PLCs is sometimes permitted.

But I suspect in your case, there is no "safety issue", you only want to protect some 
valuable but not critical equipment against accidental damage.

In this case, you can probably use midas, but if midas malfunction may result in destroying 
your experiment (i.e. accidentally set wrong voltage on 3000 phototubes), you should also 
have hardware based protections (hardware limits on max/min high voltage). Most HV 
power supplies implement such protections (screw-driver actuated max voltage limits).

If there is danger of destroying your experiment you should also have an independent 
review of your control system to avoid avoidable mistakes and obvious problems.

> Turning off or changing HV unknowingly has to be avoided at all costs

The function of changing high-voltage is implemented in your frontend program. Right in 
the place in this program where you transmit the voltage setting from ODB to the hardware 
is where you implement your protections (validate the voltage range, check that changing 
the voltage is permitted, etc). This protects you against unexpected/incorrect/erroneous
changes in ODB (wrong ODB is loaded, wrong values in ODB, ODB is corrupted, etc).

In addition, it is wise to set software based limits in the HV power supply (software 
controlled max high voltage, software controlled max current, etc). Most HV power supplies 
implement such functions.

To ensure high voltage cannot be changed at the wrong times, you can also implement 
procedural and hardware protections, such as unplug the power supply control connection 
(usually ethernet or serial or usb cable). This will prevent you from monitoring the high 
voltage currents and the only solution is to use a  power supply with a hardware "write 
protect" function (a key needs to be inserted and turned to allow changing anything).

All of this is generic and applies to any controls software, not just MIDAS.

Without at least some of these protections (especially protections in your frontend 
program), the questions you asked about loading ODB are insufficient.

K.O.

28 Sep 2019, Frederik Wauters, Suggestion, recover daq and hardware safety.

Dear Konstantin,

So let me retract the term "safety issue" then, it was more a request/question for this type of 
info between the fe and the odb.

We have most of what you mention:
* The HV hardware has current limits
* The Hardware has fixed ramping limits.

same for the software. 

The issue occurs when e.g. one channel can not be turned on and ramp for some temp/specific 
reason, and someone else is working on the daq and reloads the odb for e.g. 1h ago.  

> > We have encountered a safety issue with our HPGe HV and it's midas frontend.
> 
> At TRIUMF and other labs the words "safety issue" have very specific meaning and
> we tend to follow this guidance: MIDAS is not certified for and is not intended for use with 
> safety critical applications as defined here:
> https://en.wikipedia.org/wiki/Safety-critical_system
> 
> > A safety-critical system ... malfunction may result in ... following outcomes:
> > death or serious injury to people
> > loss or severe damage to equipment/property
> > environmental harm
> 
> If this is your case, you should use properly certified software *and hardware*. Safety 
> officers at most institutions require certified hardware interlocks and other protections to 
> prevent such undesirable outcomes. Use of certified PLCs is sometimes permitted.
> 
> But I suspect in your case, there is no "safety issue", you only want to protect some 
> valuable but not critical equipment against accidental damage.
> 
> In this case, you can probably use midas, but if midas malfunction may result in destroying 
> your experiment (i.e. accidentally set wrong voltage on 3000 phototubes), you should also 
> have hardware based protections (hardware limits on max/min high voltage). Most HV 
> power supplies implement such protections (screw-driver actuated max voltage limits).
> 
> If there is danger of destroying your experiment you should also have an independent 
> review of your control system to avoid avoidable mistakes and obvious problems.
> 
> > Turning off or changing HV unknowingly has to be avoided at all costs
> 
> The function of changing high-voltage is implemented in your frontend program. Right in 
> the place in this program where you transmit the voltage setting from ODB to the hardware 
> is where you implement your protections (validate the voltage range, check that changing 
> the voltage is permitted, etc). This protects you against unexpected/incorrect/erroneous
> changes in ODB (wrong ODB is loaded, wrong values in ODB, ODB is corrupted, etc).
> 
> In addition, it is wise to set software based limits in the HV power supply (software 
> controlled max high voltage, software controlled max current, etc). Most HV power supplies 
> implement such functions.
> 
> To ensure high voltage cannot be changed at the wrong times, you can also implement 
> procedural and hardware protections, such as unplug the power supply control connection 
> (usually ethernet or serial or usb cable). This will prevent you from monitoring the high 
> voltage currents and the only solution is to use a  power supply with a hardware "write 
> protect" function (a key needs to be inserted and turned to allow changing anything).
> 
> All of this is generic and applies to any controls software, not just MIDAS.
> 
> Without at least some of these protections (especially protections in your frontend 
> program), the questions you asked about loading ODB are insufficient.
> 
> K.O.

29 Sep 2019, Konstantin Olchanski, Suggestion, recover daq and hardware safety.

> 
> The issue occurs when e.g. one channel can not be turned on and ramp for some temp/specific 
> reason, and someone else is working on the daq and reloads the odb for e.g. 1h ago.  
> 

So you want to ensure that some HV channels are turned off and stay turned off. Yes?

Most effective solution will depend on the consequences of unwanted turning-on of your channels:

- if hardware is destroyed if turned on - I think you should have a hardware lock-out. (unplug the HV cable)
- if hardware malfunctions and will degrade if left turned on for long time (i.e. a hot phototube or sparking wire chamber) - your data 
monitoring software should detect the anomaly (it will show up as a hot channel, dead channel, etc) and the people running the 
experiment will realize the mistake and turn the channel back off. also hardware monitoring (HV currents, etc) should detect this, with 
same effect.
- if collected data becomes useless (the turned-off channel make big noise in all other channels), then same thing, your data 
monitoring should catch it.

The next consideration is what are you protecting against:

a) one person flags channel defective, turns it off, next person knows nothing, turns it back on - you need to work on documentation, 
shift hand-off and other human-level procedures
b) people running experiment load random odb files - same thing, from human-level procedures and documentation it should be made 
clear which odb files are correct and which should not be used
c) software malfunction (not human person) causes data change in odb causes turned-off channel to turn back on
d) hardware malfunction causes turned-off channel to turn back on (HV power supply hardware or firmware malfunctions and decides 
that all channels should be turned on at maximum high voltage)

In the experiments I am most familiar with, problem (b) is avoided by never loading/reloading odb files directly, most/all interaction
with the experiment is done through web pages, and these web pages are carefully coded to be safe against most user mistakes.

Cases (a), (b) and (c) you can protect against by changing the frontend code to refuse to turn on some channels:

int set_hv(int channel, int voltage) {
   if (channel == 35) return COMMAND_REFUSED;
   write_to_hardware(channel, voltage);
   return COMMAND_SUCCESS;
}

But in reality this solution only creates problem (e):

e) people running the experiment start random versions of the frontend program, make random changes to the frontend source code, 
multiple people working on the frontend have their own personal versions/copies of the source code, etc.

This is the worst-case scenario, meaning the experiment lost control of software configuration, and even basic software version 
control tools (like svn or git) are not being used. If your experiment gets that chaotic, all protections are likely to be ineffective - 
documentation will not work (people will ignore post-it notes "do not turn on!"), hardware protections will not work (unplugged cable 
labeled "do not plug in!" will be plugged back in and powered), etc. good luck, then.

K.O.

15 Oct 2019, Stefan Ritt, Suggestion, recover daq and hardware safety.

There is a not-so-well-known function in the ODB to write protect some keys. You can do

odbedit> chmod 1 /Equipment/HV/Demand

which will write protect your Demand values. You see that by doing

odbedit> ls -ls

where you only see a "R" at the right end instead a "RWD". I haven't tried it yet (so better do a dry run yourself), but that should prevent an odb load to overwrite your demand values. To change the values, put some logic on a custom page to unprotect the 
values, change them, and then protect them again.

Stefan

14 Oct 2019, Joseph McKenna, Forum, tmfe.cxx - Future frontend design

Hi,

I have been looking at the 2019 workshop slides, I am interested in the C++ future of MIDAS. 

I am quite interested in using the object oriented 


ALPHA will start data taking in 2021

06 Oct 2019, Nik Berger, Bug Report, History data size mismatch

Logging a list of variables to the history via links in the history ODB subtree,
we get messages as follows at every run start:

19:43:24.009 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes

19:43:24.008 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes

19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes

19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes

The history calculates the size of a record from the size of the individual variables, (history_schema.cxx, L2666 ff), whereas the ODB delivers the data aligned/padded to the size of the largest value in the record.
In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit), leading to a padded response from the ODB, 4 byte longer than the history expects.
Quick fix: Add another 32 bit dummy variable to the history. Gets rid of the error messages...
Should probably be fixed at a deeper level...

06 Oct 2019, Stefan Ritt, Bug Report, History data size mismatch

I wonder why do you this via ODB links. The "standard" way of writing to the history should be to create events for an equipment and flag this equipment as being written to the
history. All variables under /Equipment/<name>/Variables then automatically go into the history and you don't have to worry about ODB links. Only variables not fitting the
equipment/variables scheme should be dealt with via ODB links, like variables under equipment/statistics or parameters in another ODB tree. In a typical midas experiment, only
very few variables typically go into the 'System' event. This is however probably not a solution to your problem. If you have a similar structure (doubles plus an odd number of floats)
under 'variables', you might get the same error. I'n in contact with KO to fix this problem at the root level.

Stefan

> Logging a list of variables to the history via links in the history ODB subtree,
> we get messages as follows at every run start:
> 
> 19:43:24.009 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
> 
> 19:43:24.008 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
> 
> 19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
> 
> 19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
> 
> The history calculates the size of a record from the size of the individual variables, (history_schema.cxx, L2666 ff), whereas the ODB delivers the data aligned/padded to the size of the largest value in the record.
> In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit), leading to a padded response from the ODB, 4 byte longer than the history expects.
> Quick fix: Add another 32 bit dummy variable to the history. Gets rid of the error messages...
> Should probably be fixed at a deeper level...

10 Oct 2019, Konstantin Olchanski, Bug Report, History data size mismatch

>
> In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit)
>

Padding trouble, mixing "double" and "float" trouble. Ouch.

Best wisdom I received on this: never use "float", always use "double".

I was burned by "float" with following code, which produced the same result from
analyzing 100 files as from analyzing 1000 files. (why did we take data for 10 weeks
instead of 1 week?). Hint: "float" overflows way too quickly, after overflow sum+=1 does not change
the value of "sum". The actual code used ROOT TH1F. Lesson: always use TH1D.

float sum = 0; // should always be "double" !!!
foreach data_file {
    foreach data from current data file {
        sum += data;
    }   
}
print sum;

K.O.

10 Oct 2019, Nik Berger, Bug Report, History data size mismatch

>I wonder why do you this via ODB links. The "standard" way of writing to the history should be to create events for an equipment and flag this equipment as being written to the
>history. All variables under /Equipment/<name>/Variables then automatically go into the history and you don't have to worry about ODB links. Only variables not fitting the
>equipment/variables scheme should be dealt with via ODB links, like variables under equipment/statistics or parameters in another ODB tree. In a typical midas experiment, only
>very few variables typically go into the 'System' event. This is however probably not a solution to your problem. If you have a similar structure (doubles plus an odd number of floats)
>under 'variables', you might get the same error.

>
> In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit)
>

We do this in the MuX DAQ and mix things that come directly from MIDAS (the MIDAS trigger rate) and things from the
analyzer (rates in the self-triggering detectors) and some temperatures from yet somewhere else. Yes, we could have
kept that apart, yes, in this case a double would also work (and not break things), but a bug is a bug...
I could think of senisble use cases where doubles and ints are mixed and I also know quite a few areas where it makes
sense to use floats...

Nik

10 Oct 2019, Stefan Ritt, Bug Report, History data size mismatch

> Yes, we could have
> kept that apart, yes, in this case a double would also work (and not break things), but a bug is a bug...
> I could think of senisble use cases where doubles and ints are mixed and I also know quite a few areas where it makes
> sense to use floats...

I agree with Nik that we should fix this on the midas level. Since it happens in history_schema.cxx which was written by KO, maybe he can have a look.

Stefan

28 Sep 2019, Pintaudi Giorgio, Forum, MIDAS interface for WAGASCI online monitor

Hello!
This question is rather complex so please forgive me if I leave out some
details.

I am currently developing an online monitor to check the data quality for the
WAGASCI experiment. The online monitor would show (almost in real-time) the
gain, the dark noise, and the pedestal for all the channels, the 2D tracks
inside the detectors for each spill and so on. This is possible because we can
continuously calibrate the WAGASCI electronics even during a Physics run.

Anyway, as I said during the MIDAS workshop, right now, we do not use MIDAS as a
frontend DAQ to readout the Physics data from the electronics (we use Pyrame and
the BabyMIND DAQ for that). One day, we might have Pyrame and the BabyMIND DAQ
send the Physics data to MIDAS in the form of MIDAS events ... but we are still
far from it (mainly because of lack of man-power on the BabyMIND side). I do not
think we will ever achieve this goal in the lifetime of the experiment because
the BabyMIND people do not see any added value in using MIDAS as a DAQ. But this
is another issue so I am going to drop this argument for now.

The fact is that I have written and tested all the code to continuously read the
WAGASCI electronics in real-time. I now would like to display some histograms and
figures in a MIDAS custom page that would automatically refresh/update. I have
not written the visualization part yet, because I would like to hear your
feedback first.

So my questions are. Suppose you have some ROOT histograms updating in real
time, what is the best way to show them in a MIDAS custom page? Is the ROOT
HttpServer an option here? If not ROOT, is there a better way to display
histograms in a web page?

I could have avoided the long introduction and just asked the questions but I
wanted to give you a little background.

This is a cartoonist impression of what I would like to achieve.

Thank you
Giorgio

29 Sep 2019, Thomas Lindner, Forum, MIDAS interface for WAGASCI online monitor

Hi Pintaudi Giorgio,

I think that the ROOT THttpServer is an option. The ROOT tools are not perfect, but it is relatively easy to embed plots in custom MIDAS pages. I have a description of one way of doing this here:

https://midas.triumf.ca/MidasWiki/index.php/Rootana_javascript_displays

Thomas

29 Sep 2019, Konstantin Olchanski, Forum, MIDAS interface for WAGASCI online monitor

> online monitor would show (almost in real-time) the 
> gain, the dark noise, and the pedestal for all the channels, the 2D tracks 
> inside the detectors for each spill and so on.

Hmm... I now realize that the midas distribution does not include an example web page that 
integrates all these elements into one easy to understand html file.

I think an example page that would answer your questions and the questions from the other 
thread about starting/stopping runs, should include following elements:

- the general midas web page framework (the midas left-hand side menu, the top side 
status display, Stefan's new odb tags)
- buttons to start and stop runs (javascript code to call the run transition RPCs)
- embedded images for history display (old style gif and new style canvas)
- embedded images for ROOT histograms (via the ROOT http server and jsroot)
- code to live-update all these elements independantly from each other (to allow history 
plots and ROOT histograms to update at different frequencies).

As for the web page code for showing a mini-event-display, I think we do not know yet how 
to do - the event data lives inside the analyzer as C++ data structures, so somehow it 
needs to be encoded as json (this code is missing - but one can use the ROOT C++ to json 
encoder/streamer), needs to be transported to the web browser (we know how to do this) 
and at the end, plotting the json data on a canvas is the easy part.

I know some experiments have done all of this, and I think we should have such a pipeline 
available as part of the ROOTANA package. Maybe some day...

K.O.

29 Sep 2019, Pintaudi Giorgio, Forum, MIDAS interface for WAGASCI online monitor

Dear Thomas and Konstantin,

thank you very much for the feedback. I found the ROOTANA javascript display a good source of 
information and references.

As Thomas said, maybe the simplest thing would be to use the ROOT THttpServer. Honestly, I do 
not think that ROOT was ever meant to act as an online monitor due to its wacky memory 
management and abysmal multithread support. In other words, I think that by using ROOT we would 
inevitably lose some performance. 

Perhaps there are better ways of achieving the same goal. For example, I was leaning towards a 
plotly.js based approach where I would encode a series of vectors in base64 strings (for better 
transmission performance), send them to the client through the MJSONRPC mechanism, decode them 
and then feed them to plotly.js. But in this case, I should study many new libraries 
(plotly.js, the library for the base64 encoding, the Gaussian fitting, etc...) and I do not 
have the time to do that now: "beam is coming".

So ROOT it is. I will use the ROOTANA javascript display as a reference. Do you happen to know 
who wrote that part? 

In that example, you have some "static" histograms that you keep always in memory, while in our 
case the number of channels is so big that we have to dynamically generate the histograms only 
when needed (when the user select a single channel).

Best regards
Giorgio

30 Sep 2019, Konstantin Olchanski, Forum, MIDAS interface for WAGASCI online monitor

> 
> As Thomas said, maybe the simplest thing would be to use the ROOT THttpServer. Honestly, I do 
> not think that ROOT was ever meant to act as an online monitor due to its wacky memory 
> management and abysmal multithread support. In other words, I think that by using ROOT we would 
> inevitably lose some performance.
>

Yes. The previous data analysis frameworks - PAW/PAW++/CERNlib (CERN), NOVA (TRIUMF) - certainly had
support for online monitoring. In CERNlib/PAW the histograms were stored in shared memory,
the analyzer running in the background was filling them, the PAW/PAW++ display was displaying
them "live". I was very surprised to find this function removed/not implemented in ROOT, given
that the same people were behind both projects (Rene Brun & co).

We tried to roll our own implementation of this in ROOTANA/ROODY, with mixed success.

I am glad the JSROOT project finally gained traction and web based "I can see the data" is now
available in ROOT.

> 
> Perhaps there are better ways of achieving the same goal. For example, I was leaning towards a 
> plotly.js based approach where 
>

There are many web/javascript graphics libraries out there, all have the weak spot - how do you
get your data into them?

Going forward, I see us standardizing on JSROOT: https://root.cern.ch/js/

>
> ... send them to the client through the MJSONRPC mechanism ...
>

I am not sure JSROOT have any support for interacting from the web page to the back-end analyzer. Perhaps
we can use the MIDAS MJSONRPC library for this. Hmm... (Note that the ROOT HTTP server is a derivative
of the mongoose web server library, which we use in mhttpd, so I already know how to work it)

>
> I would encode a series of vectors in base64 strings (for better transmission performance)
>

We looked into this when deciding on the data encoding for the midas history data. There is a tradeoff
between network use and cpu use - to save on the network, you try to reduce the data size by using
compressed binary data - to save on the CPU you try to minimize data encoding.

For history data, we gave up on binary json (extra decoding needed), gave up on text json (extra decoding 
needed), gave up on compression (extra cpu use for decompression) and use javascript native binary processing
("arraybuffer").

Our thinking is that network bandwidth is usually quite big and is getting bigger, but cpu resource is limited
and is expensive. (mobile devices seems to be stuck with ~2 GHz CPUs; cpu use means battery use and
battery capacity is limited, not improving quickly)

> 
> So ROOT it is. I will use the ROOTANA javascript display as a reference. Do you happen to know 
> who wrote that part? 
> 

Yes. See "Contact" at https://root.cern.ch/js/

>
> In that example, you have some "static" histograms that you keep always in memory, while in our 
> case the number of channels is so big that we have to dynamically generate the histograms only 
> when needed (when the user select a single channel).
> 

This requires interaction with the analyzer, requires some kind of RPC mechanism. I am now curious what jsroot 
have, also it would not be too hard to add the mjsonrpc library to rootana. Cooperation from ROOT multithreading
is not required: I can queue the RPC requests in a separate (thread safe, non-ROOT) buffer, then process
them in the ROOT main event loop (this is how the ROOTANA histogram server worked in the days when
ROOT had no multithread support at all).

K.O.

17 Sep 2019, Richard Longland, Forum, mhttpd start and stop redirect to Transition page

I recently upgraded to MIDAS version midas-2019-06-b. I had to make a few changes 
to get our custom page running again, but am a little confused on starting and 
stopping runs. When I click on my "Start" button, it now redirects to a 
Transition page rather than reloading the status page. The standard MIDAS status 
page does the same. Could someone explain the reasoning for the current behavior?

Furthermore my "Stop" button is now broken with the following error:
Error: Invalid URL "CS/EngeRun&" or query "cmd=Stop&redir=EngeRun%26" or command 
"Stop"

I looked through the mhttpd.js code and managed to get the start button to load 
the status page, at least, but the stop button seems to be written differently. 
For example, start calls:
location.search = "cmd=Transition";
whereas stop does:
mhttpd_goto_page("Transition"); // DOES NOT RETURN

Can anyone offer any insights or advice? I can change the former to "cmd=Status", but 
the latter doesn't allow it.

27 Sep 2019, Konstantin Olchanski, Forum, mhttpd start and stop redirect to Transition page

> I recently upgraded to MIDAS version midas-2019-06-b. I had to make a few changes 
> to get our custom page running again, but am a little confused on starting and 
> stopping runs.

So far so good.

> When I click on my "Start" button, it now redirects to a 
> Transition page rather than reloading the status page.

Are you sure? The "start" button redirects to the "start" page (start.html) which redirects
to the "transition" page (transition.html), which does not redirect anywhere so you can see
the result of the transition.

> Could someone explain the reasoning for the current behavior?

It's been like this for years now. Stefan suggest that we implement the "start" page
and the "transition" page as overlays on top of the status page, but it did not happen yet.

> Furthermore my "Stop" button is now broken with the following error:
> Error: Invalid URL "CS/EngeRun&" or query "cmd=Stop&redir=EngeRun%26" or command  "Stop"

I grep for "EngeRun" and I do not see it anywhere in the midas sources. Can you grep for it
to see if it is coming from one of your pages?

If you want to start/stop runs from your custom page, look at start.html and transition.html - you will
need to make the run transition RPC calls (cut-and-paste the code to your page) and (obviously)
you will not have any redirects to some strange pages.

> For example, start calls:
> location.search = "cmd=Transition";
> whereas stop does:
> mhttpd_goto_page("Transition"); // DOES NOT RETURN

It's the same thing, look at mhttpd_goto_page().

> Can anyone offer any insights or advice? I can change the former to "cmd=Status", but 
> the latter doesn't allow it.

I am not sure what you are trying to do. If you need the "start" button on the status page
to do something different from what it does now, just hack status.html until it does so.
If you need some specific help with that, I am happy to help. I think I answered all questions
you asked so far.

K.O.

06 Sep 2019, Pintaudi Giorgio, Forum, Open a hotlink to a single element in an ODB array

Hello!
Just a little question about the ODB hotlinks. Is it possible to open a hotlink
to a single element in and ODB array?
I have searched through the documentation and I have taken a look at the source
code but I could not find any piece of code to use as a reference (maybe I have
not searched deeply enough).

This is more or less what I would like to achieve (without error checking):

    for (INT i = 0; i < hv_info->num_channels; i++) {
      char element[HKEY_STRING_LENGTH];
      snprintf(element, HKEY_STRING_LENGTH, "%s[%d]", path, i);
      if (db_find_key(hDB, hv_info->hKeyRoot, element, &hKey) == DB_SUCCESS) {
        if ((hv_info->driver[i]->flags & flag) == 0) 
          db_open_record(hDB, hKey, &array[i], sizeof(double), MODE_READ, callback, pequipment);
        else
          db_open_record(hDB, hKey, &array[i], sizeof(double), MODE_READ, NULL, NULL);
      } else {
        cm_msg(MERROR, __func__, "Key %s not found", element);
      }
    }

But it is not working because the key is not found ...
Thank you
Giorgio

16 Sep 2019, Konstantin Olchanski, Forum, Open a hotlink to a single element in an ODB array

> Is it possible to open a hotlink to a single element in an ODB array?

Not possible.

> sprintf(element, "%s[%d]", path, i);
> db_find_key(hDB, hv_info->hKeyRoot, element, &hKey);

There is some confusion and inconsistency between db_xxx() functions,
some of them accept the array index "a[10]" syntax, some do not.

db_find_key() and db_watch()/db_open_record() do not operate on array elements
and do not accept the "a[10]" array index syntax.

K.O.

26 Sep 2019, Stefan Ritt, Forum, Open a hotlink to a single element in an ODB array

Pintaudi Giorgio wrote:

Hello!
Just a little question about the ODB hotlinks. Is it possible to open a hotlink
to a single element in and ODB array?

Yes it is with the now preferred function db_watch(). Following program will open a hot link to the /Experiment/Run number:

#include <stdio.h>
#include "midas.h"

int run_number;

void run_number_changed(HNDLE hDB, HNDLE hKey, int i, void *info)
{
int run_number, size;

/* get run number */
size = sizeof(run_number);
db_get_data(hDB, hKey, &run_number, &size, TID_INT);
printf("Run number is %d\n", run_number);
}

main()
{
HNDLE hKey;

/* connect to experiment */
cm_connect_experiment("", "", "ODB Test", NULL);

/* open hot link to run number */
db_find_key(1, 0, "/runinfo/run number", &hKey);
db_watch(1, hKey, run_number_changed, NULL);

/* enter idle loop */
while (cm_yield(1000); != RPC_SHUTDOWN);

cm_disconnect_experiment();
return 1;
}

27 Sep 2019, Pintaudi Giorgio, Forum, Open a hotlink to a single element in an ODB array

Thank you for the feedback.
I will try to use the db_watch function in the future.
I tried to look for more info about the db_watch function in the Wiki but I could not find much.
The Doxygen documentation website (http://ladd00.triumf.ca/~daqweb/doc/midas-devel/doc/html) seems to be down: no html folder.
Giorgio

Stefan Ritt wrote:

Pintaudi Giorgio wrote:

Hello!
Just a little question about the ODB hotlinks. Is it possible to open a hotlink
to a single element in and ODB array?

27 Sep 2019, Konstantin Olchanski, Forum, Open a hotlink to a single element in an ODB array

> I will try to use the db_watch function in the future.

Note that db_watch() and db_open_record() work exactly the same way, both only allow 
watching "whole" odb entries, you cannot watch individual array elements.

The db_watch() callback function gives you the array index of the array element that was 
changed and that fired the notification. 

*but*

If you change many array elements quickly you will not necessary receive notifications for 
all and each of of them (underlying transport is UDP allows notification packet loss).

If you are watching 1 array element change at a slow rate (1/sec), db_watch() will work well.

Otherwise, you can watch the whole array, in the db_watch() callback, read the new array 
contents, compare it with your saved copy of pervious array contents, identify which array 
elements have changed and dance from here. (this method does not work if you do not 
actually change the array element values: change from "1" to "1", this is an old weakness in 
the midas hot link mechanism).

If you are not sure how to use db_watch(), look inside midas/progs/odbedit.cxx search for 
db_watch() and search for the db_watch() callback function.

K.O.

14 Aug 2019, Stefan Ritt, Info, New history plot facility

During my visit at TRIUMF we rewrote the history plotting functionality of midas. Instead of
static GIF images, we have now interactive JavaScript panels where we can scroll, zoom,
inspect values and much more (example is attached). We are now in a state where this is still
work in progress, but already at this stage it might be useful for others to report any
feedback.

Simply upgrade the the newest develop branch of midas, and you will see two menu items
"OldHistory" which is the old system and "History" which is the new system. In the new
system, you can drag with the mouse to scroll, use the mouse wheel to zoom in and out the
time axis, and hover with your mouse over data points to see its value. If you zoom out,
old data is loaded automatically in the background.

Following items are planned, but not yet implemented:

- Printing of run markers as in the old history

- Delete old data in the buffer to limit memory consumption if the browser window is
open for very long (weeks)

- Implement time interval selector (clock icon, select "last day", "last 8 hours" etc.)

- New settings dialog as a floating dialog box. At the moment, the setting page of the
old history system is used

- Export / Printing / Sending to ELOG any history plot

- Implement a formula for plotting data, such as "y = 12 * (x-14) +32". This will replace
the old "offset" and "factor" and is more flexible. The formula can be passed directly
to the JavaScript engine and will be executed on the web page. It should be also
possible to combine different channels, like the difference of two history values.

- Determine the number of digits for variable display from the axis limits. Like if a value
changes between 520001 and 520002 only, we need more digits than the usual 6.

Many of these things will be implemented in the next weeks. If you have any more idea
or find some bugs, please report back to me.

Best,
Stefan for the midas team

06 Sep 2019, Andreas Suter, Info, New history plot facility

I like the new history system very much, but I stumbled over a couple of issues.
I used the version "Thu Aug 29 08:24:29 2019 +0200 -
midas-2019-06-b-244-gdd6585bb on branch develop":

1) it would be nice to have an option to format the label output (see attachment 1)

2) the background of a history plot is very handy if you only show one measure.
If you have multiple ones (see attachment 2), this is not the case anymore. It
would be nice if the background could be enabled/disabled.

> During my visit at TRIUMF we rewrote the history plotting functionality of
midas. Instead of 
> static GIF images, we have now interactive JavaScript panels where we can
scroll, zoom, 
> inspect values and much more (example is attached). We are now in a state
where this is still 
> work in progress, but already at this stage it might be useful for others to
report any 
> feedback.
> 
> Simply upgrade the the newest develop branch of midas, and you will see two
menu items 
> "OldHistory" which is the old system and "History" which is the new system. In
the new 
> system, you can drag with the mouse to scroll, use the mouse wheel to zoom in
and out the 
> time axis, and hover with your mouse over data points to see its value. If you
zoom out, 
> old data is loaded automatically in the background.
> 
> Following items are planned, but not yet implemented:
> 
> - Printing of run markers as in the old history
> 
> - Delete old data in the buffer to limit memory consumption if the browser
window is 
>    open for very long (weeks)
> 
> - Implement time interval selector (clock icon, select "last day", "last 8
hours" etc.)
> 
> - New settings dialog as a floating dialog box. At the moment, the setting
page of the 
>    old history system is used
> 
> - Export / Printing / Sending to ELOG any history plot
> 
> - Implement a formula for plotting data, such as "y = 12 * (x-14) +32". This
will replace 
>    the old "offset" and "factor" and is more flexible. The formula can be
passed directly 
>    to the JavaScript engine and will be executed on the web page. It should be
also 
>    possible to combine different channels, like the difference of two history
values.
> 
> - Determine the number of digits for variable display from the axis limits.
Like if a value 
>    changes between 520001 and 520002 only, we need more digits than the usual 6.
> 
> Many of these things will be implemented in the next weeks. If you have any
more idea 
> or find some bugs, please report back to me.
> 
> Best,
> Stefan for the midas team

06 Sep 2019, Stefan Ritt, Info, New history plot facility

> 1) it would be nice to have an option to format the label output (see attachment 1)

That's clearly a bug, I will fix it.

 
> 2) the background of a history plot is very handy if you only show one measure.
> If you have multiple ones (see attachment 2), this is not the case anymore. It
> would be nice if the background could be enabled/disabled.

Looking at your plot, even without the background things look messy. Please note
that you can display only a single curve by double clicking on it (and back with Escape).
If all curves are on top of each other, you can get them apart a bit by zooming
in to the vertical axis, then double click. Let ma know if that works for you.

Best regards,
Stefan

06 Sep 2019, Andreas Suter, Info, New history plot facility

> > 2) the background of a history plot is very handy if you only show one measure.
> > If you have multiple ones (see attachment 2), this is not the case anymore. It
> > would be nice if the background could be enabled/disabled.
> 
> Looking at your plot, even without the background things look messy. Please note
> that you can display only a single curve by double clicking on it (and back with Escape).
> If all curves are on top of each other, you can get them apart a bit by zooming
> in to the vertical axis, then double click. Let ma know if that works for you.

This I found out, yet the attachment here shows another case where it would be useful to be
able to disable the background, namely if you have positive and negative measures in one
plot. Somehow it suggests that CH1 and CH2 show very different values, whereas it is only a
difference in the sign of this variables.

It's not all the important but I would like to mention this is the early stage before
everything is fully frozen.

07 Sep 2019, Stefan Ritt, Info, New history plot facility

> This I found out, yet the attachment here shows another case where it would be useful to be
> able to disable the background, namely if you have positive and negative measures in one
> plot. Somehow it suggests that CH1 and CH2 show very different values, whereas it is only a
> difference in the sign of this variables.

Ok, I added an option which lets you switch off the background. 

I also changed the background drawing such that it only goes to the y=0 axis, not the bottom of the screen. 
That should help displaying negative values.

Stefan

08 Sep 2019, Stefan Ritt, Info, New history plot facility

> 1) it would be nice to have an option to format the label output (see attachment 1)

I fixed that in the current version.

Stefan

10 Sep 2019, Andreas Suter, Info, New history plot facility

Our typical use case is that a lot of people are connected to the experiment
having some history tabs open most of the time. Hence, I setup a test system and
connect to it from all kind of systems/browsers. What I see currently quite
often is the error hs_read_arraybuffer (see the attachement).

For firefox 60.8.0esr this can result into a full freeze of the tab and only
closing it is possible.

For chromium based browsers you eventually get a popup informing that it is not
responsive anymore.

The worst though is safari 12.1.2 which not only freezes the tab, but
reproducibly crashes the mhttpd on the server side.

Are there ways to get a log which would document where the problems start?

16 Sep 2019, Konstantin Olchanski, Info, New history plot facility

> I see currently quite often is the error hs_read_arraybuffer (see the 
attachement).
> Are there ways to get a log which would document where the problems 
start?
> [also crash of mhttpd]

We can debug it from both ends, javascript and mhttpd:

On the web page, the error message says "see javascript console", do you see 
anything there?

Or the tab is so hung-up that you cannot even access the console? In this 
case, can you open the console before running your test?

In some browsers (firefox, google-chrome) this will also activate the javascript 
debugger and as likely as not will make the bug go away (ouch!)

On the mhttpd side, please capture the stack trace from the crash: enable 
core dumps (ODB "/experiment/enable core dumps" set to "y", after the crash, 
run "ls -l core.*; gdb mhttpd core.9999") or run mhttpd inside gdb or attach 
gdb to a running mhttpd (gdb -p 9999). Once in gdb, run "info thr" to list all 
threads, "thr 0; bt", "thr 1; bt", etc to get stack traces from all threads, only 
one of them contains the crash (tedious!).

Email me the stack trace (or post here), in case we want to look at values
of any variables from the crash, keep the core dump and do not rebuild 
mhttpd.

K.O.

17 Sep 2019, Andreas Suter, Info, New history plot facility

> On the mhttpd side, please capture the stack trace from the crash: enable 
> core dumps (ODB "/experiment/enable core dumps" set to "y", after the crash, 
> run "ls -l core.*; gdb mhttpd core.9999") or run mhttpd inside gdb or attach 
> gdb to a running mhttpd (gdb -p 9999). Once in gdb, run "info thr" to list all 
> threads, "thr 0; bt", "thr 1; bt", etc to get stack traces from all threads, only 
> one of them contains the crash (tedious!).
> 
> Email me the stack trace (or post here), in case we want to look at values
> of any variables from the crash, keep the core dump and do not rebuild 
> mhttpd.
> 
> K.O.

here comes the stack trace (only happens when using safari 12.1.2 macOS 10.14.6):

(gdb) thr 1
[Switching to thread 1 (Thread 0x7f57ceffd700 (LWP 3538))]
#0  0x00007f57f29fe377 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f57f29fe377 in raise () from /lib64/libc.so.6
#1  0x00007f57f29ffa68 in abort () from /lib64/libc.so.6
#2  0x00007f57f330e7d5 in __gnu_cxx::__verbose_terminate_handler() () from
/lib64/libstdc++.so.6
#3  0x00007f57f330c746 in ?? () from /lib64/libstdc++.so.6
#4  0x00007f57f330c773 in std::terminate() () from /lib64/libstdc++.so.6
#5  0x00007f57f330c993 in __cxa_throw () from /lib64/libstdc++.so.6
#6  0x00007f57f330cf2d in operator new(unsigned long) () from /lib64/libstdc++.so.6
#7  0x00007f57f336ba19 in std::string::_Rep::_S_create(unsigned long, unsigned long,
std::allocator<char> const&)
    () from /lib64/libstdc++.so.6
#8  0x00007f57f336c62b in std::string::_Rep::_M_clone(std::allocator<char> const&,
unsigned long) ()
   from /lib64/libstdc++.so.6
#9  0x00007f57f336ccfc in std::basic_string<char, std::char_traits<char>,
std::allocator<char> >::basic_string(std::string const&) () from /lib64/libstdc++.so.6
#10 0x000000000041ce0f in check_digest_auth (hm=hm@entry=0x7f57ceffc520, auth=0x74b060
<auth_mg>)
    at /home/nemu/nemu/tmidas/midas/progs/mhttpd.cxx:17143
#11 0x0000000000452a61 in handle_http_message (msg=0x7f57ceffc520, nc=0x2019ca0,
this=<optimized out>, 
    this=<optimized out>, this=<optimized out>) at
/home/nemu/nemu/tmidas/midas/progs/mhttpd.cxx:17703
#12 handle_http_event_mg (nc=nc@entry=0x2019ca0, ev=ev@entry=100,
ev_data=ev_data@entry=0x7f57ceffc520)
    at /home/nemu/nemu/tmidas/midas/progs/mhttpd.cxx:17753
#13 0x0000000000464c4b in mg_call (nc=nc@entry=0x2019ca0, 
    ev_handler=0x4521f0 <handle_http_event_mg(mg_connection*, int, void*)>, ev=100, 
    ev_data=ev_data@entry=0x7f57ceffc520) at
/home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:2120
#14 0x000000000046790e in mg_http_call_endpoint_handler (nc=nc@entry=0x2019ca0,
ev=<optimized out>, 
    hm=hm@entry=0x7f57ceffc520) at /home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:4946
#15 0x0000000000467e3f in mg_http_handler (nc=nc@entry=0x2019ca0, ev=ev@entry=3, 
    ev_data=ev_data@entry=0x7f57ceffcb2c) at
/home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:5139
#16 0x0000000000464c4b in mg_call (nc=nc@entry=0x2019ca0, 
    ev_handler=0x467a20 <mg_http_handler(mg_connection*, int, void*)>,
ev_handler@entry=0x0, ev=ev@entry=3, 
    ev_data=ev_data@entry=0x7f57ceffcb2c) at
/home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:2120
#17 0x0000000000464fb7 in mg_recv_common (nc=nc@entry=0x2019ca0,
buf=buf@entry=0x7f57c0000cd0, len=len@entry=279)
    at /home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:2676
#18 0x00000000004659c8 in mg_if_recv_tcp_cb (len=279, buf=0x7f57c0000cd0, nc=0x2019ca0)
    at /home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:2680
#19 mg_read_from_socket (conn=0x2019ca0) at
/home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:3378
#20 mg_mgr_handle_conn (nc=0x2019ca0, fd_flags=1, now=now@entry=1568705761.3290441)
    at /home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:3511
#21 0x0000000000465ee0 in mg_mgr_poll (mgr=mgr@entry=0x7f57ceffcda0,
timeout_ms=timeout_ms@entry=1000)
    at /home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:3687
#22 0x0000000000466085 in per_connection_thread_function (param=0x2019ca0)
    at /home/nemu/nemu/tmidas/midas/progs/mongoose6.cxx:3805
#23 0x00007f57f39c7ea5 in start_thread () from /lib64/libpthread.so.0
#24 0x00007f57f2ac68cd in clone () from /lib64/libc.so.6

17 Sep 2019, Konstantin Olchanski, Info, New history plot facility

> > On the mhttpd side, please capture the stack trace from the crash
> 
> here comes the stack trace (only happens when using safari 12.1.2 macOS 10.14.6):
> 
> #10 0x000000000041ce0f in check_digest_auth ...
>

The crash is in check_digest_auth() which checks the mongoose web server password (if not using 
password protection from the https proxy i.e. apache httpd).

If so you should see this crash on all pages, not just when you access history pages, yes?

Ok, I just checked, my safari is "Version 12.1.2 (13607.3.10)" and I see no immediate crash, even on 
history pages.

But I am macos 10.13.6, maybe that makes a difference.

If you see the safari crash on all pages, then it is not history-specific.

In this case, I would like you to file a bug report on bitbucket "mhttpd crash with safari" and we follow up 
on it there.

K.O.

16 Sep 2019, Konstantin Olchanski, Info, New history plot facility

> During my visit at TRIUMF we rewrote the history plotting functionality of midas.

This is a most amazing achievement. We wanted to do this "for years" and I think we have
benefitted greatly from the delay - tools available for building interactive web graphics
have improved so much so recently.

For example, delivering binary data from mhttpd to javascript (avoiding json encoding and decoding
saves tons of CPU cycles) went from "how do I do this?!?" to "I did it in only 3 hours!".

> We are now in a state where this is still work in progress, but already at this stage it might
> be useful for others to report any feedback.

The old gif-based history plots took a lot of effort and a long time to get where they work well
for most experiments and where we are happy with them.

From the TRIUMF side of things, lots of polishing of the graphics and of the user interface came
through use at our bigger experiments - TWIST (TRIUMF), ALPHA (CERN), T2K/ND280 (Japan).

So, much improvement and polishing of the new graphics is still ahead for us.

> Simply upgrade the the newest develop branch of midas, and you will see two menu items 
> "OldHistory" which is the old system and "History" which is the new system.

I hope to start the new release branch for midas-2019-09 soon. For the release, we will try
to have both the old and the new history graphics to integrate smoothly. The old graphics
still has to work well, as some users may prefer the old graphics and the old user interface.

Also the new system is still incomplete, i.e. there is no trivial way to save a history plot into a file:

> Following items are planned, but not yet implemented:
> - Printing of run markers as in the old history
> - Export / Printing / Sending to ELOG any history plot

K.O.

16 Sep 2019, Stefan Ritt, Info, New history plot facility

>  Also the new system is still incomplete, i.e. there is no trivial way to save a history plot into a file:

That has been implemented in meantime. Just click on the download arrow and you can save the current window in CSV or PNG format.

Stefan

08 Sep 2019, Vinzenz Bildstein, Bug Report, https redirect and ODB access

I'm not sure if these issues are related or not, but I'm getting an error
message when I want to access the root of the ODB via the webserver:
[mhttpd,ERROR] [mhttpd.cxx:563:rread,ERROR] Cannot read file '/root', read of
4096 returned -1, errno 21 (Is a directory)

I also tried turning the re-direct from http to https off, but this does not
seem to work. I also noticed that the redirect changes the localhost into a
hostname. Where does mongoose take this hostname from?

EDIT: Seems that the change of the hostname is due to a setting in /etc/hosts,
i.e. all my fault ...

EDIT: I think there was some issue with the mhttpd. When I checked the output (I
used screen to run it), it was full of these messages:

ss_semaphore_wait_for: semop/semtimedop(2588679) returned -1, errno 43
(Identifier removed)
al_check: Something is wrong with our semaphore, ss_semaphore_wait_for()
returned 408, aborting.
al_check: Cannot abort - this will lock you out of odb. From this point, MIDAS
will not work correctly. Please read the discussion at
https://midas.triumf.ca/elog/Midas/945

Restarted it and it stopped redirecting. So accessing the root of the ODB via
the webserver is the only issue now.

16 Sep 2019, Konstantin Olchanski, Bug Report, https redirect and ODB access

> I'm not sure if these issues are related or not, but I'm getting an error
> message when I want to access the root of the ODB via the webserver:
> [mhttpd,ERROR] [mhttpd.cxx:563:rread,ERROR] Cannot read file '/root', read of
> 4096 returned -1, errno 21 (Is a directory)

This is an old bug. It was part of the "custom path" confusion. Fixed (I think) in all midas-2019 
releases.

To confirm, which version are you using (run "odbedit ver" or look on the mhttpd "help" page)?

If you have an older version, I recommend that you update to midas-2019-03 (cd midas; git pull; 
git checkout midas-2019-03; make clean; make).

If you feel adventurous, you can also update to the head of the development version
and see all the new features (cmake, c++11, new history pages).

If you do not feel adventurous, wait until we have midas-2019-09 ready, use midas-2019-03 
until then.

K.O.

07 Feb 2019, Stefan Ritt, Info, History panels in custom pages

A new tag has been implemented to display history panels in custom pages, integrated in the 
new custom page design from 2017. The full documentation can be found at 

https://midas.triumf.ca/MidasWiki/index.php/New_Custom_Pages_(2017)#mhistory

Attached is a simple example of such a panel and the result. You have to rename triggerrate.txt to triggerrate.html if you want to use it (can't do it here since otherwise the browser wants to render it which does not work outside of midas).

Stefan

08 Feb 2019, Thomas Lindner, Info, History panels in custom pages

> A new tag has been implemented to display history panels in custom pages, integrated in the 
> new custom page design from 2017. The full documentation can be found at 
> 

As part of consolidating/cleaning the MIDAS Wiki documentation, the "New Custom Pages" was folded into the main "Custom Page".  So to see a
description of Stefan's new functionality please go to 

https://midas.triumf.ca/MidasWiki/index.php/Custom_Page#mhistory

12 Sep 2019, Pintaudi Giorgio, Info, History panels in custom pages

> > A new tag has been implemented to display history panels in custom pages, integrated in the
> > new custom page design from 2017. The full documentation can be found at
> >
>
> As part of consolidating/cleaning the MIDAS Wiki documentation, the "New Custom Pages" was folded into the main "Custom Page". So to see a
> description of Stefan's new functionality please go to
>
> https://midas.triumf.ca/MidasWiki/index.php/Custom_Page#mhistory

Hello!

I am trying to use the new mhistory panels in the WAGASCI slow control custom page, but I cannot get them to work.
All I get is an empty frame. Anyway, in the History tab I can see the history plots correctly.

Here is a minimal example:

<html>
<head>
   <title>Test</title>
   <link rel="stylesheet" href="midas.css">
   <script src="controls.js"></script>
   <script src="midas.js"></script>
   <script src="mhttpd.js"></script>
</head>
<body class="mcss" onload="mhttpd_init('Test');">

<div id="mheader"></div>
<div id="msidenav"></div>

<div id="mmain">
  <div name="mhistory" data-group="Test" data-panel="Test" data-scale="1m" style="width:600px;border:1px solid black;"></div>
</div>
</body>
</html>

Of course, the "Test" group and "Test" panel exist in the ODB and are correctly shown in the History tab. No error is shown in the console of the web browser.
I am using the latest version of MIDAS as of September 12.

Can you confirm that this feature is working in the latest MIDAS? If yes, how can I troubleshoot the problem?

Regards
Giorgio

12 Sep 2019, Stefan Ritt, Info, History panels in custom pages

Indeed there was a bug in some JavaScript code, which I fixed here: https://bitbucket.org/tmidas/midas/commits/d2b1a783240e252820c622001e15c09c5d7798c0

Note that your code will bring you the "old style" history panels (with GIF images). If you want the new style (interactive canvas panels), you need the following:

1) Add

<script src="mhistory.js"></Script>

to the top of your custom page

2) Add "mhistory_init();" to the "onload" function of your page, like

<body class="mcss" onloas="mhttpd_init('Example');mhistory_init();">

3) Change the class of the panel from "mhistory" to "mjhistory", like

<div class="mjshistory" data-group=...>

Best regards,
Stefan

13 Sep 2019, Pintaudi Giorgio, Info, History panels in custom pages

Dear Stefan,
thank you very much for the prompt reply. Your suggestions worked wonderfully. Now I can display all the plots that I want where I want.
The new JavaScript history plots are really a huge improvement over the old ones.
Thank you again
Giorgio

Stefan Ritt wrote:

07 Aug 2019, Paolo Baesso, Bug Report, ROOTANA bug?

Hi,

I posted on the ROOTANA elog but there seems to be little activity there...

Could someone confirm if this is a bug?
https://midas.triumf.ca/elog/Rootana/14

Another user replied that they are encountering the same issue, so I think it is unlikely it is just our installation.

While ROOTANA is unusable for us, I tried to use the example Frontend and Analyzer (under the Experiment source folder). The analyzer does not seem to do much though. A root file is produced but nothing is placed into it. Is that normal?

Any help would be welcome.

07 Aug 2019, Thomas Lindner, Bug Report, ROOTANA bug?

Hi Paolo,

Sorry for the slow response.  We were discussing this with Konstantin yesterday.  He is aware of the problem now and will be working on a solution soon.

In the short term I found that it works if you just comment out the offending line:

indnerlt:rootana lindner$ git diff libMidasInterface/TMidasOnline.cxx
diff --git a/libMidasInterface/TMidasOnline.cxx b/libMidasInterface/TMidasOnline.cxx
index 92eb3e9..67da613 100644
--- a/libMidasInterface/TMidasOnline.cxx
+++ b/libMidasInterface/TMidasOnline.cxx
@@ -191,7 +191,7 @@ bool TMidasOnline::sleep(int mdelay)
   #ifdef CH_IPC
   ss_suspend_set_dispatch(CH_IPC, 0, NULL);
   #else
-  ss_suspend_set_dispatch_ipc(NULL);
+  //  ss_suspend_set_dispatch_ipc(NULL);
   #endif
  int status = ss_suspend(mdelay, 0);
   if (status == SS_SUCCESS)

This compiles and at least runs for me; so maybe that is helpful for you.  But Konstantin will provide a longer term solution.



> Hi,
> 
> I posted on the ROOTANA elog but there seems to be little activity there...
> 
> Could someone confirm if this is a bug?
> https://midas.triumf.ca/elog/Rootana/14
> 
> Another user replied that they are encountering the same issue, so I think it is unlikely it is just our installation.
> 
> While ROOTANA is unusable for us, I tried to use the example Frontend and Analyzer (under the Experiment source folder). The analyzer does not seem to do much though. A root file is produced but nothing is placed into it. Is that normal?
> 
> Any help would be welcome.

08 Aug 2019, Lauren Manton, Bug Report, ROOTANA bug?

Hi,

Thank you, commenting out the line worked and we can now compile the code. However, when we try to run ana.exe or anaDisplay.exe, we get the following errors:

Error in <TCling::RegisterModule>: cannot find dictionary module TMainDisplayWindowDict_rdict.pcm
Error in <TCling::RegisterModule>: cannot find dictionary module TRootanaDisplayDict_rdict.pcm
Error in <TCling::RegisterModule>: cannot find dictionary module TFancyHistogramCanvasDict_rdict.pcm
 

We see that the files are in /rootana/obj but we cannot find a way to point the compiler to them.

Could you please advise how to proceed,

Many thanks

> Hi Paolo,
> 
> Sorry for the slow response.  We were discussing this with Konstantin yesterday.  He is aware of the problem now and will be working on a solution soon.
> 
> In the short term I found that it works if you just comment out the offending line:
> 
> indnerlt:rootana lindner$ git diff libMidasInterface/TMidasOnline.cxx
> diff --git a/libMidasInterface/TMidasOnline.cxx b/libMidasInterface/TMidasOnline.cxx
> index 92eb3e9..67da613 100644
> --- a/libMidasInterface/TMidasOnline.cxx
> +++ b/libMidasInterface/TMidasOnline.cxx
> @@ -191,7 +191,7 @@ bool TMidasOnline::sleep(int mdelay)
>    #ifdef CH_IPC
>    ss_suspend_set_dispatch(CH_IPC, 0, NULL);
>    #else
> -  ss_suspend_set_dispatch_ipc(NULL);
> +  //  ss_suspend_set_dispatch_ipc(NULL);
>    #endif
>   int status = ss_suspend(mdelay, 0);
>    if (status == SS_SUCCESS)
> 
> This compiles and at least runs for me; so maybe that is helpful for you.  But Konstantin will provide a longer term solution.
> 
> 
> 
> > Hi,
> > 
> > I posted on the ROOTANA elog but there seems to be little activity there...
> > 
> > Could someone confirm if this is a bug?
> > https://midas.triumf.ca/elog/Rootana/14
> > 
> > Another user replied that they are encountering the same issue, so I think it is unlikely it is just our installation.
> > 
> > While ROOTANA is unusable for us, I tried to use the example Frontend and Analyzer (under the Experiment source folder). The analyzer does not seem to do much though. A root file is produced but nothing is placed into it. Is that normal?
> > 
> > Any help would be welcome.

08 Aug 2019, Konstantin Olchanski, Bug Report, ROOTANA bug?

> indnerlt:rootana lindner$ git diff libMidasInterface/TMidasOnline.cxx
> diff --git a/libMidasInterface/TMidasOnline.cxx b/libMidasInterface/TMidasOnline.cxx
> index 92eb3e9..67da613 100644
> --- a/libMidasInterface/TMidasOnline.cxx
> +++ b/libMidasInterface/TMidasOnline.cxx
> @@ -191,7 +191,7 @@ bool TMidasOnline::sleep(int mdelay)
>    #ifdef CH_IPC
>    ss_suspend_set_dispatch(CH_IPC, 0, NULL);
>    #else
> -  ss_suspend_set_dispatch_ipc(NULL);
> +  //  ss_suspend_set_dispatch_ipc(NULL);
>    #endif
>   int status = ss_suspend(mdelay, 0);
>    if (status == SS_SUCCESS)
> 
> This compiles and at least runs for me; so maybe that is helpful for you.  But Konstantin will provide a longer term solution.


This is a problem with the latest development version of MIDAS. ss_suspend overrides have been removed from system.cxx
and there is no way for rootana to avoid the problem of recursive call to the event handler.

I recommend that instead of using the latest development version of MIDAS you use one of the recent released versions (use "git tag -l", midas-2019-06-b is the latest release).

All the released versions of midas have the ss_suspend overrides implemented and rootana will work correctly. For the next release of midas
I will restore the ss_suspend override and update the rootana code.


K.O.

14 Aug 2019, Konstantin Olchanski, Bug Report, ROOTANA bug?

> -  ss_suspend_set_dispatch_ipc(NULL);
> +  //  ss_suspend_set_dispatch_ipc(NULL);
> 
> This compiles and at least runs for me; so maybe that is helpful for you.  But Konstantin will provide a longer term solution.

I now understand why this fix worked. Around December 2018 timeframe, I reworked the MIDAS event buffer code
and one improvement was to only send UDP buffer notifications if somebody is waiting for them. This probably
reduced to zero the probability of recursive calls to the user event handler - the problem originally fixed by the monkey
work against the midas ipc handler.

After looking at it, I now understand that the correct solution is to call ss_suspend(MSG_BM), but it turns out
inside MIDAS, handling of MSG_BM was incomplete and recursive calls to the user event handler were still
possible. (but most likely not actually happening anymore because of those changes to the event buffer code).

So.

a) ss_suspend(MSG_BM) inside midas now works correctly, recursive call to the user event handler will not happen.
b) TMidasOnline::sleep() now calls ss_suspend(MSG_BM), monkey business with ss_suspend_set_dispatch_ipc() is removed.

The problem of recursive call to the analyzer event handler is now fixed, both rootana and manalyzer (both use the same TMidasOnline code).

Read more about this here:
https://midas.triumf.ca/elog/Midas/1663

K.O.

14 Aug 2019, Konstantin Olchanski, Bug Fix, incorrect recursion in ss_suspend() via the user event handler

The ROOTANA midas analyzer uncovered a problem with recursive use of ss_suspend().

When running in graphical mode, the ROOT graphics main event loop was calling 
ss_suspend(0) (no MSG_BM) which would sometimes call the user event handler (if a new 
event arrives into the midas event buffer). Because this loop was already running in the 
user event handler, there was a crash.

This is the calling sequence leading to the incorrect recursion: (from system.cxx comments 
to ss_suspend())

analyzer ->
     -> cm_yield() in the main loop
     -> ss_suspend(0)
     -> MSG_BM message arrives in the UDP socket
     -> ss_suspend_process_ipc(0)
     -> cm_dispatch_ipc()
     -> bm_push_event()
     -> bm_push_buffer()
     -> bm_read_buffer()
     -> bm_dispatch_event()
     -> user event handler
     -> user event handler ROOT graphics main loop needs to sleep
     -> ss_suspend(0) <--- should be ss_suspend(MSG_BM)!!!     
     -> MSG_BM message arrives in the UDP socket
     -> ss_suspend_process_ipc(0) <- should be ss_suspend_process_ipc(MSG_BM)!!!
     -> cm_dispatch_ipc() <- without MSG_BM, calling cm_dispatch_ipc() again
     -> bm_push_event()
     -> bm_push_buffer()
     -> bm_read_buffer()
     -> bm_dispatch_event()
     -> user event handler <---- called recursively, very bad!

The proper fix for this is to always call ss_suspend(MSG_BM) from the user event handler 
and ss_suspend(MSG_ODB) from the user db_watch handler.

In this second case, ss_suspend(MSG_OBM) will lose/ignore/discard db_watch notifications, 
if you do not want that, call ss_suspend(0) and be ready for recursive calls to your 
db_watch handler. (this can happen in a frontend program that acts on changes in ODB and 
where some of these actions may require sleeping via ss_suspend()).

ss_suspend(MSG_BM) will discard MSG_BM messages, which is not a problem as 
bm_wait_for_events() and cm_yield() will immediately poll the event buffer and there will be 
no delay in receiving new events.

Until commit 99d6e90 there were problems in ss_suspend(MSG_BM) and recursive calls to 
the user event handler were still possible. It is now fixed in this and the previous commits.

K.O.

05 Aug 2019, Stefan Ritt, Info, Precedence of equipment/common structure

Today I fixed a long-annoying problem. We have in each front-end an equipment structure 
which defined the event id, event type, readout frequency etc. This is mapped to the ODB 
subtree

/Equipment/<name>/Common

In the past, the ODB setting took precedence over the frontend structure. We defined this 
like 25 years ago and I forgot what the exact reason was. It causes however many people 
(including myself) to fall into this trap: You change something in the front-end EQUIPMENT 
structure, you restart the front-end, but the new setting does not take effect since the 
(old) ODB value took precedence. After some debugging you find out that you have to both 
change the EQUIPMENT structure (which defines the default value for a fresh ODB) and the 
ODB value itself.

So I changed it in the current develop tree that the front-end structure takes precedence. 
You still have a hot-link, so if you want to change anything while the front-end is running 
(like the readout period), you can do that in the ODB and it takes effect immediately. But 
when you start the front-end the next time, the value from the EQUIPMENT structure is 
taken again. So please be aware of this new feature.

Happy BC day,
Stefan

06 Aug 2019, Thomas Lindner, Info, Precedence of equipment/common structure

Hi Stefan,

This change does not sound like a good idea to me.  I think that this change will cause just as much confusion as before; probably more since you are changing established behaviour.

It is common that MIDAS frontends usually have a Settings directory in the ODB where details about the frontend behaviour are set.  The Settings directory might get initialized from strings in the frontend code, but after initialization the Settings in the ODB have precedence and define how the frontend will behave.  Indeed, most of my custom webpages are designed to control my frontend programs through their Settings ODB tree.

So you have created a situation where Frontend/Settings in the ODB has precedence and is the main place for changing frontend behaviour; but Frontend/Common in the ODB is essentially meaningless and will get overwritten the next time the frontend restarts.  That seems likely to confuse people. 

If you really want to make this change I suggest that you delete the Frontend/Common directory entirely; or make it read-only so that people aren't fooled into changing it.

Thomas 



> Today I fixed a long-annoying problem. We have in each front-end an equipment structure 
> which defined the event id, event type, readout frequency etc. This is mapped to the ODB 
> subtree
> 
> /Equipment/<name>/Common
> 
> In the past, the ODB setting took precedence over the frontend structure. We defined this 
> like 25 years ago and I forgot what the exact reason was. It causes however many people 
> (including myself) to fall into this trap: You change something in the front-end EQUIPMENT 
> structure, you restart the front-end, but the new setting does not take effect since the 
> (old) ODB value took precedence. After some debugging you find out that you have to both 
> change the EQUIPMENT structure (which defines the default value for a fresh ODB) and the 
> ODB value itself.
> 
> So I changed it in the current develop tree that the front-end structure takes precedence. 
> You still have a hot-link, so if you want to change anything while the front-end is running 
> (like the readout period), you can do that in the ODB and it takes effect immediately. But 
> when you start the front-end the next time, the value from the EQUIPMENT structure is 
> taken again. So please be aware of this new feature.
> 
> Happy BC day,
> Stefan

06 Aug 2019, Stefan Ritt, Info, Precedence of equipment/common structure

Hi Thomas,

the change only affects Eqipment/<name>/common not the Equipment/<name>/Settings. 

The Common subtree is still hot-linked into the frontend, so when running things can be changed if needed. This mainly concerns the readout period of periodic events. 
Sometimes you want to change this quickly without restarting the frontend. Changing the other settings are kind of dangerous. If you change the ID of an event on the fly
you won't be able to analyze your data. So having this read-only in the ODB might be a good idea (you still need it in the ODB for the status page), except for the values
you want to change (like the readout period). 

Let's see what other people have to say.

Stefan

> Hi Stefan,
> 
> This change does not sound like a good idea to me.  I think that this change will cause just as much confusion as before; probably more since you are changing established behaviour.
> 
> It is common that MIDAS frontends usually have a Settings directory in the ODB where details about the frontend behaviour are set.  The Settings directory might get initialized from strings in the frontend code, but after initialization the Settings in the ODB have precedence and define how the frontend will behave.  Indeed, most of my custom webpages are designed to control my frontend programs through their Settings ODB tree.
> 
> So you have created a situation where Frontend/Settings in the ODB has precedence and is the main place for changing frontend behaviour; but Frontend/Common in the ODB is essentially meaningless and will get overwritten the next time the frontend restarts.  That seems likely to confuse people. 
> 
> If you really want to make this change I suggest that you delete the Frontend/Common directory entirely; or make it read-only so that people aren't fooled into changing it.
> 
> Thomas 
> 
> 
> 
> > Today I fixed a long-annoying problem. We have in each front-end an equipment structure 
> > which defined the event id, event type, readout frequency etc. This is mapped to the ODB 
> > subtree
> > 
> > /Equipment/<name>/Common
> > 
> > In the past, the ODB setting took precedence over the frontend structure. We defined this 
> > like 25 years ago and I forgot what the exact reason was. It causes however many people 
> > (including myself) to fall into this trap: You change something in the front-end EQUIPMENT 
> > structure, you restart the front-end, but the new setting does not take effect since the 
> > (old) ODB value took precedence. After some debugging you find out that you have to both 
> > change the EQUIPMENT structure (which defines the default value for a fresh ODB) and the 
> > ODB value itself.
> > 
> > So I changed it in the current develop tree that the front-end structure takes precedence. 
> > You still have a hot-link, so if you want to change anything while the front-end is running 
> > (like the readout period), you can do that in the ODB and it takes effect immediately. But 
> > when you start the front-end the next time, the value from the EQUIPMENT structure is 
> > taken again. So please be aware of this new feature.
> > 
> > Happy BC day,
> > Stefan

06 Aug 2019, Stefan Ritt, Info, Precedence of equipment/common structure

After some internal discussion, I decided to undo my previous change again, in order not to break existing habits. Instead, I created a new function

set_odb_equipment_common(equipment, name);

which should be called from frontend_init() which explicitly copies all data from the equipment structure in the front-end into the ODB.

Stefan

09 Aug 2019, Konstantin Olchanski, Info, Precedence of equipment/common structure

> Today I fixed a long-annoying problem. ...
> /Equipment/<name>/Common
> In the past, the ODB setting took precedence over the frontend structure...
> We defined this like 25 years ago and I forgot what the exact reason was.
> It causes however many people (including myself) to fall into this trap: ...

There is good number of confusions regarding entries in /eq/xxx/common:

- for some of them, the frontend code settings take precedence and overwrite settings in odb ("frontend file name")
- for some of them, ODB takes precedence and frontend code values are ignored ("read on" and "period")
- for some of them, changes in ODB take effect immediately (via db_watch) ("period")
- for some of them, frontend restart is required for changes to take effect (output event buffer name "buffer")
- some of them continuously update the odb values ("status", "status color")

I do not think there is a simple way to improve on this.

(One solution would replace the single "common" with several subdirectories, "per function",
one would have items where the code takes precedence, one would have items where odb takes
precedence (in effect, "standard settings"), one will have items that the frontend always updates
and that should not be changes via odb ("frontend name", etc). I am not sure this one solution
is necessarily an "improvement").

Lacking any ideas for improvements, I vote for the status quo. (plus a review of the documentation to ensure we have clearly
written up what each entry in "common" does and whether the user is permitted to edit it in odb).

K.O.

13 Aug 2019, Stefan Ritt, Info, Precedence of equipment/common structure

> Lacking any ideas for improvements, I vote for the status quo. (plus a review of the documentation to ensure we have clearly
> written up what each entry in "common" does and whether the user is permitted to edit it in odb).

I agree with that.

Stefan

22 Jul 2019, Hassan, Bug Report, Fetest History Plot

Hi,

We've been trying to run Fetest in the attempt of plotting the sine wave data on
the history page on the web server. However each time we've tried running a new
plot we have come across the error of 'no data' from the variables. In the
status page we are clearly obtaining data from the frontend and it is updating
the variable as expected in SLOW.

When setting up MIDAS we managed to produce a history plot from Fetest but are
unable to do so any longer. We did have a go at modifying the Fetest code but
created a backup before doing so and are now running the original backup.

What could be causing the Fetest data not to be showing in the history plot?

24 Jul 2019, Pierre-Andre Amaudruz, Bug Report, Fetest History Plot

> Hi,
> 
> We've been trying to run Fetest in the attempt of plotting the sine wave data on
> the history page on the web server. However each time we've tried running a new
> plot we have come across the error of 'no data' from the variables. In the
> status page we are clearly obtaining data from the frontend and it is updating
> the variable as expected in SLOW.
> 
> When setting up MIDAS we managed to produce a history plot from Fetest but are
> unable to do so any longer. We did have a go at modifying the Fetest code but
> created a backup before doing so and are now running the original backup.
> 
> What could be causing the Fetest data not to be showing in the history plot?

Is the logger running? (this application is handling the history data).
If yes: Did you change the variable names? If yes: restart the logger.

26 Jul 2019, Hassan, Bug Report, Fetest History Plot

Hi, our logger was running. I have tried restarting mlogger (even though we haven't
changed variable names). We ran the following commands one after another and still no
luck with history plot. Is there anything else that could be causing these problems?

Kind regards,
Hassan 

==================================================================================

[lm17773@it038146 ~]$ cd /opt/midas_software/midas/bin/
[lm17773@it038146 bin]$ mhttpd
[mhttpd,ERROR] [odb.cxx:1646:db_open_database,ERROR] Removed ODB client 'mhttpd',
index 0 because process pid 20094 does not exists
[mhttpd,ERROR] [odb.cxx:1646:db_open_database,ERROR] Removed ODB client 'Logger',
index 1 because process pid 20214 does not exists
[mhttpd,INFO] Removed open record flag from "/Experiment/Security/RPC hosts/Allowed hosts"
[mhttpd,INFO] Removed exclusive access mode from "/Experiment/Security/RPC
hosts/Allowed hosts"
[mhttpd,INFO] Removed open record flag from "/Experiment/Security/mhttpd hosts/Allowed
hosts"
[mhttpd,INFO] Removed exclusive access mode from "/Experiment/Security/mhttpd
hosts/Allowed hosts"
[mhttpd,INFO] Removed open record flag from "/Logger/History"
[mhttpd,INFO] Removed exclusive access mode from "/Logger/History"
[mhttpd,INFO] Removed open record flag from "/Sequencer/State"
[mhttpd,INFO] Removed exclusive access mode from "/Sequencer/State"
[mhttpd,INFO] Removed open record flag from "/History/LoggerHistoryChannel"
[mhttpd,INFO] Removed exclusive access mode from "/History/LoggerHistoryChannel"
[mhttpd,INFO] Removed open record flag from "/Equipment/slow/Variables"
[mhttpd,INFO] Removed exclusive access mode from "/Equipment/slow/Variables"
[mhttpd,INFO] Removed open record flag from "/Equipment/Trigger/Statistics/Events per
sec."
[mhttpd,INFO] Removed exclusive access mode from "/Equipment/Trigger/Statistics/Events
per sec."
[mhttpd,INFO] Removed open record flag from "/Equipment/Trigger/Statistics/kBytes per
sec."
[mhttpd,INFO] Removed exclusive access mode from "/Equipment/Trigger/Statistics/kBytes
per sec."
[mhttpd,INFO] Removed open record flag from "/Equipment/Periodic/Variables"
[mhttpd,INFO] Removed exclusive access mode from "/Equipment/Periodic/Variables"
[mhttpd,INFO] Removed open record flag from "/Equipment/Scaler/Variables"
[mhttpd,INFO] Removed exclusive access mode from "/Equipment/Scaler/Variables"
[mhttpd,INFO] Corrected 10 ODB entries
[mhttpd,INFO] Deleted entry '/System/Clients/20094' for client 'mhttpd' because it is
not connected to ODB
Mongoose web server will use SSL certificate file "/home/lm17773/online/ssl_cert.pem"
Mongoose web server will use authentication realm "sampleexpt", password file
"/home/lm17773/online/htpasswd.txt"
mongoose web server is redirecting HTTP port 8080 to
https://it038146.users.bris.ac.uk:8443
mongoose web server is listening on the HTTP port 8080
mongoose web server is listening on the HTTPS port 8443
====================================================================================

[lm17773@it038146 bin]$ mlogger
[Logger,INFO] Deleted entry '/System/Clients/20214' for client 'Logger' because it is
not connected to ODB
Log     directory is /home/lm17773/online/
Data    directory is same as Log unless specified in /Logger/channels/
History directory is same as Log unless specified in /Logger/history/
ELog    directory is same as Log
SQL     database is localhost/sampleexpt/Runlog
MIDAS logger started. Stop with "!"
====================================================================================
[lm17773@it038146 bin]$ fetest
Frontend name          :     fetest
Event buffer size      :     10485760
User max event size    :     4194304
User max frag. size    :     4194304
# of events per buffer :     2

Connect to experiment sampleexpt...
OK
Init hardware...frontend_init!
Event size set to 10240 bytes
Ring buffer wait sleep 1 ms
OK
time 1564131394, data 97.814758
time 1564131395, data 96.592583
time 1564131396, data 95.105652
time 1564131397, data 93.358040
time 1564131398, data 91.354546
time 1564131399, data 89.100655
time 1564131400, data 86.602539
time 1564131401, data 83.867058
time 1564131402, data 80.901703
time 1564131403, data 77.714592
Warning: bank RND4 has zero size
time 1564131404, data 74.314484
time 1564131405, data 70.710678
time 1564131406, data 66.913063
time 1564131407, data 62.932041
====================================================================================






> > Hi,
> > 
> > We've been trying to run Fetest in the attempt of plotting the sine wave data on
> > the history page on the web server. However each time we've tried running a new
> > plot we have come across the error of 'no data' from the variables. In the
> > status page we are clearly obtaining data from the frontend and it is updating
> > the variable as expected in SLOW.
> > 
> > When setting up MIDAS we managed to produce a history plot from Fetest but are
> > unable to do so any longer. We did have a go at modifying the Fetest code but
> > created a backup before doing so and are now running the original backup.
> > 
> > What could be causing the Fetest data not to be showing in the history plot?
> 
> Is the logger running? (this application is handling the history data).
> If yes: Did you change the variable names? If yes: restart the logger.

09 Aug 2019, Konstantin Olchanski, Bug Report, Fetest History Plot

> Hi, our logger was running.


One more thing is to check that the history files are actually being written to. Be default
the history files are written into the same directory where you have ODB, etc and
the file names are "*.hst".

Second thing, you can run "mlogger -v" and it will report that it is writing history events
into the history.

If all these things are happening,

Third, you can run "mhist" to see the data directly from the data files.

If all of these work, but you still get nothing in mhttpd, that would be very strange. If I remember
right, to see the sine wave, the history variable to plot is equipment "slow", variable "slow".


K.O.



> I have tried restarting mlogger (even though we haven't
> changed variable names). We ran the following commands one after another and still no
> luck with history plot. Is there anything else that could be causing these problems?
> 
> Kind regards,
> Hassan 
> 
> ==================================================================================
> 
> [lm17773@it038146 ~]$ cd /opt/midas_software/midas/bin/
> [lm17773@it038146 bin]$ mhttpd
> [mhttpd,ERROR] [odb.cxx:1646:db_open_database,ERROR] Removed ODB client 'mhttpd',
> index 0 because process pid 20094 does not exists
> [mhttpd,ERROR] [odb.cxx:1646:db_open_database,ERROR] Removed ODB client 'Logger',
> index 1 because process pid 20214 does not exists
> [mhttpd,INFO] Removed open record flag from "/Experiment/Security/RPC hosts/Allowed hosts"
> [mhttpd,INFO] Removed exclusive access mode from "/Experiment/Security/RPC
> hosts/Allowed hosts"
> [mhttpd,INFO] Removed open record flag from "/Experiment/Security/mhttpd hosts/Allowed
> hosts"
> [mhttpd,INFO] Removed exclusive access mode from "/Experiment/Security/mhttpd
> hosts/Allowed hosts"
> [mhttpd,INFO] Removed open record flag from "/Logger/History"
> [mhttpd,INFO] Removed exclusive access mode from "/Logger/History"
> [mhttpd,INFO] Removed open record flag from "/Sequencer/State"
> [mhttpd,INFO] Removed exclusive access mode from "/Sequencer/State"
> [mhttpd,INFO] Removed open record flag from "/History/LoggerHistoryChannel"
> [mhttpd,INFO] Removed exclusive access mode from "/History/LoggerHistoryChannel"
> [mhttpd,INFO] Removed open record flag from "/Equipment/slow/Variables"
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/slow/Variables"
> [mhttpd,INFO] Removed open record flag from "/Equipment/Trigger/Statistics/Events per
> sec."
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/Trigger/Statistics/Events
> per sec."
> [mhttpd,INFO] Removed open record flag from "/Equipment/Trigger/Statistics/kBytes per
> sec."
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/Trigger/Statistics/kBytes
> per sec."
> [mhttpd,INFO] Removed open record flag from "/Equipment/Periodic/Variables"
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/Periodic/Variables"
> [mhttpd,INFO] Removed open record flag from "/Equipment/Scaler/Variables"
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/Scaler/Variables"
> [mhttpd,INFO] Corrected 10 ODB entries
> [mhttpd,INFO] Deleted entry '/System/Clients/20094' for client 'mhttpd' because it is
> not connected to ODB
> Mongoose web server will use SSL certificate file "/home/lm17773/online/ssl_cert.pem"
> Mongoose web server will use authentication realm "sampleexpt", password file
> "/home/lm17773/online/htpasswd.txt"
> mongoose web server is redirecting HTTP port 8080 to
> https://it038146.users.bris.ac.uk:8443
> mongoose web server is listening on the HTTP port 8080
> mongoose web server is listening on the HTTPS port 8443
> 
====================================================================================
> 
> [lm17773@it038146 bin]$ mlogger
> [Logger,INFO] Deleted entry '/System/Clients/20214' for client 'Logger' because it is
> not connected to ODB
> Log     directory is /home/lm17773/online/
> Data    directory is same as Log unless specified in /Logger/channels/
> History directory is same as Log unless specified in /Logger/history/
> ELog    directory is same as Log
> SQL     database is localhost/sampleexpt/Runlog
> MIDAS logger started. Stop with "!"
> 
====================================================================================
> [lm17773@it038146 bin]$ fetest
> Frontend name          :     fetest
> Event buffer size      :     10485760
> User max event size    :     4194304
> User max frag. size    :     4194304
> # of events per buffer :     2
> 
> Connect to experiment sampleexpt...
> OK
> Init hardware...frontend_init!
> Event size set to 10240 bytes
> Ring buffer wait sleep 1 ms
> OK
> time 1564131394, data 97.814758
> time 1564131395, data 96.592583
> time 1564131396, data 95.105652
> time 1564131397, data 93.358040
> time 1564131398, data 91.354546
> time 1564131399, data 89.100655
> time 1564131400, data 86.602539
> time 1564131401, data 83.867058
> time 1564131402, data 80.901703
> time 1564131403, data 77.714592
> Warning: bank RND4 has zero size
> time 1564131404, data 74.314484
> time 1564131405, data 70.710678
> time 1564131406, data 66.913063
> time 1564131407, data 62.932041
> 
====================================================================================
> 
> 
> 
> 
> 
> 
> > > Hi,
> > > 
> > > We've been trying to run Fetest in the attempt of plotting the sine wave data on
> > > the history page on the web server. However each time we've tried running a new
> > > plot we have come across the error of 'no data' from the variables. In the
> > > status page we are clearly obtaining data from the frontend and it is updating
> > > the variable as expected in SLOW.
> > > 
> > > When setting up MIDAS we managed to produce a history plot from Fetest but are
> > > unable to do so any longer. We did have a go at modifying the Fetest code but
> > > created a backup before doing so and are now running the original backup.
> > > 
> > > What could be causing the Fetest data not to be showing in the history plot?
> > 
> > Is the logger running? (this application is handling the history data).
> > If yes: Did you change the variable names? If yes: restart the logger.

09 Aug 2019, Konstantin Olchanski, Bug Report, Fetest History Plot

> Hi, our logger was running.

Please do these simple tests:

- run "mlogger -v", it should report that it is writing slow/slow data into the history with rate 1 Hz (fetest 
should be running at this point, yes?)
- normally the history files are written into the experiment directory (where ODB is, etc) and have file names 
"*.hst". Observe that the files are growing. Use "ls -ltr". (mlogger and fetest should be running at this point, 
yes?)
- if all if this is happening, you can try to run "mhist" to see the history data

If all of the above works, but you still get nothing from the history plots in mhttpd, then we probably have a 
bug in midas and we would like very much to fix it. For this we will need some more information from you. I 
hope you have some time available to help us with this.

Hmm... the fetest history plots are not defined automatically, you have to create the history plot manually,
maybe this is where the problem happens. One thing to check here, the correct variable to plot
is "slow/slow", if I remember right.


K.O.


> I have tried restarting mlogger (even though we haven't
> changed variable names). We ran the following commands one after another and still no
> luck with history plot. Is there anything else that could be causing these problems?
> 
> Kind regards,
> Hassan 
> 
> 
================================================================================
==
> 
> [lm17773@it038146 ~]$ cd /opt/midas_software/midas/bin/
> [lm17773@it038146 bin]$ mhttpd
> [mhttpd,ERROR] [odb.cxx:1646:db_open_database,ERROR] Removed ODB client 'mhttpd',
> index 0 because process pid 20094 does not exists
> [mhttpd,ERROR] [odb.cxx:1646:db_open_database,ERROR] Removed ODB client 'Logger',
> index 1 because process pid 20214 does not exists
> [mhttpd,INFO] Removed open record flag from "/Experiment/Security/RPC hosts/Allowed hosts"
> [mhttpd,INFO] Removed exclusive access mode from "/Experiment/Security/RPC
> hosts/Allowed hosts"
> [mhttpd,INFO] Removed open record flag from "/Experiment/Security/mhttpd hosts/Allowed
> hosts"
> [mhttpd,INFO] Removed exclusive access mode from "/Experiment/Security/mhttpd
> hosts/Allowed hosts"
> [mhttpd,INFO] Removed open record flag from "/Logger/History"
> [mhttpd,INFO] Removed exclusive access mode from "/Logger/History"
> [mhttpd,INFO] Removed open record flag from "/Sequencer/State"
> [mhttpd,INFO] Removed exclusive access mode from "/Sequencer/State"
> [mhttpd,INFO] Removed open record flag from "/History/LoggerHistoryChannel"
> [mhttpd,INFO] Removed exclusive access mode from "/History/LoggerHistoryChannel"
> [mhttpd,INFO] Removed open record flag from "/Equipment/slow/Variables"
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/slow/Variables"
> [mhttpd,INFO] Removed open record flag from "/Equipment/Trigger/Statistics/Events per
> sec."
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/Trigger/Statistics/Events
> per sec."
> [mhttpd,INFO] Removed open record flag from "/Equipment/Trigger/Statistics/kBytes per
> sec."
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/Trigger/Statistics/kBytes
> per sec."
> [mhttpd,INFO] Removed open record flag from "/Equipment/Periodic/Variables"
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/Periodic/Variables"
> [mhttpd,INFO] Removed open record flag from "/Equipment/Scaler/Variables"
> [mhttpd,INFO] Removed exclusive access mode from "/Equipment/Scaler/Variables"
> [mhttpd,INFO] Corrected 10 ODB entries
> [mhttpd,INFO] Deleted entry '/System/Clients/20094' for client 'mhttpd' because it is
> not connected to ODB
> Mongoose web server will use SSL certificate file "/home/lm17773/online/ssl_cert.pem"
> Mongoose web server will use authentication realm "sampleexpt", password file
> "/home/lm17773/online/htpasswd.txt"
> mongoose web server is redirecting HTTP port 8080 to
> https://it038146.users.bris.ac.uk:8443
> mongoose web server is listening on the HTTP port 8080
> mongoose web server is listening on the HTTPS port 8443
> 
================================================================================
====
> 
> [lm17773@it038146 bin]$ mlogger
> [Logger,INFO] Deleted entry '/System/Clients/20214' for client 'Logger' because it is
> not connected to ODB
> Log     directory is /home/lm17773/online/
> Data    directory is same as Log unless specified in /Logger/channels/
> History directory is same as Log unless specified in /Logger/history/
> ELog    directory is same as Log
> SQL     database is localhost/sampleexpt/Runlog
> MIDAS logger started. Stop with "!"
> 
================================================================================
====
> [lm17773@it038146 bin]$ fetest
> Frontend name          :     fetest
> Event buffer size      :     10485760
> User max event size    :     4194304
> User max frag. size    :     4194304
> # of events per buffer :     2
> 
> Connect to experiment sampleexpt...
> OK
> Init hardware...frontend_init!
> Event size set to 10240 bytes
> Ring buffer wait sleep 1 ms
> OK
> time 1564131394, data 97.814758
> time 1564131395, data 96.592583
> time 1564131396, data 95.105652
> time 1564131397, data 93.358040
> time 1564131398, data 91.354546
> time 1564131399, data 89.100655
> time 1564131400, data 86.602539
> time 1564131401, data 83.867058
> time 1564131402, data 80.901703
> time 1564131403, data 77.714592
> Warning: bank RND4 has zero size
> time 1564131404, data 74.314484
> time 1564131405, data 70.710678
> time 1564131406, data 66.913063
> time 1564131407, data 62.932041
> 
================================================================================
====
> 
> 
> 
> 
> 
> 
> > > Hi,
> > > 
> > > We've been trying to run Fetest in the attempt of plotting the sine wave data on
> > > the history page on the web server. However each time we've tried running a new
> > > plot we have come across the error of 'no data' from the variables. In the
> > > status page we are clearly obtaining data from the frontend and it is updating
> > > the variable as expected in SLOW.
> > > 
> > > When setting up MIDAS we managed to produce a history plot from Fetest but are
> > > unable to do so any longer. We did have a go at modifying the Fetest code but
> > > created a backup before doing so and are now running the original backup.
> > > 
> > > What could be causing the Fetest data not to be showing in the history plot?
> > 
> > Is the logger running? (this application is handling the history data).
> > If yes: Did you change the variable names? If yes: restart the logger.

08 Aug 2019, Art Olin, Suggestion, midas cmake migration

I want to report a bug in the ROOT build process that might be relevant to the midas implementation. I had an annoying failure to build root 6.18 (current pro version) with a misleading error message about a fault in the root code. It turned out this was a cmake problem, and the error was from my cmake version being older than 3.14, which is quite recent. Took a bit of searching to find this.

I recommend when the cmake version is distributed that the instructions include the required cmake version. Developers are generally working well ahead of what is available in the older OS's.

08 Aug 2019, Stefan Ritt, Suggestion, midas cmake migration

Each CMakeLists.txt should specify which version of CMake it requires. The MIDAS CMakeLists.txt requires CMake 3.1 or later. 
We deliberately stayed away from fancy cutting edge CMake features in order to make midas easier to compile. On top of that,
midas is a much simpler package compared to root, so things are not so complicated.

Stefan

08 Aug 2019, Konstantin Olchanski, Suggestion, midas cmake migration

> Each CMakeLists.txt should specify which version of CMake it requires. The MIDAS CMakeLists.txt requires CMake 3.1 or later. 
> We deliberately stayed away from fancy cutting edge CMake features in order to make midas easier to compile. On top of that,
> midas is a much simpler package compared to root, so things are not so complicated.

The oldest cmake I actually used is 3.6.1 (on SL6), so I do not know if cmake versions between 3.1 and 3.6 actually work for us. Perhaps we should set 
the CMakefile requirement to 3.6.1 to match the oldest version we know works. If somebody has an older cmake, they have a choice of updating it or 
trying it as as and reporting success/failure to the midas forum here.

K.O.

08 Aug 2019, Stefan Ritt, Suggestion, midas cmake migration

I just tried CMake 3.1.0 and it worked with midas. So I believe all versions between 3.1.0 and 3.6.1 are ok.

Actually playing around with different versions I realized that 3.0.0 is also ok, so I changed the requirement of midas down to 3.0

Stefan

Goto page Previous 1, 2, 3 ... 20, 21, 22 ... 50, 51, 52 Next

ELOG V3.1.6-083448f7