Back Midas Rome Roody Rootana
  Midas DAQ System, Page 44 of 157  Not logged in ELOG logo
    Reply  17 Nov 2019, Konstantin Olchanski, Suggestion, javascript comunication 
> Very good idea. And thanks for finding the document.hidden solution. I put it in, so give it a try.

Hi, Stefan - I did not look at your code, if all midas tabs are inactive, will the alarm sound still play?

K.O.


> 
> Best,
> Stefan
> 
> 
> > I am currently testing the new history system on the mhttpd side and stumbled over the following issue: typically our user open a lot of midas web-page tabs and keep them open. With the current version this leads after a night typically to a state where the browser is busy with itself and not reacting anymore.
> > 
> > One important reason seems to be that ALL tabs trying to communicate all the time which is totally unnecessary, since I think a hidden tab should stay in a sleeping mode. 
> > 
> > I was browsing if there is a way to find out if a tab is active or not, and found the following API which exactly does this:
> > 
> > https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API
> > 
> > Furthermore, the simple
> > 
> > document.hidden 
> > 
> > tag, could be used to find out if the page is currently active.
> > 
> > Wouldn't it a good idea to send all midas tabs which are not active into a sleep mode and only reactivate them if they come into focus?
> > 
> > I had a quick look at the JavaScript libs of midas, but I am not quite certain where to best inject this. 
    Reply  18 Nov 2019, Stefan Ritt, Suggestion, javascript comunication 
> Hi, Stefan - I did not look at your code, if all midas tabs are inactive, will the alarm sound still play?

Nope. All updates are done in mhhtpd_refresh(), and I changed it such that nothing is updated if hidden.

I agree however that this is bad. You want to hear alarms always. So I added some code to do ONLY alarm updates if in the background. 
No GUI changes, only audio playing. Checking is reduced from 1 Hz to 0.1 Hz (once every 10 seconds). I don't have however a solution 
to your problem "not enough interaction with page" which I have never seen so far.

I wonder if we should try push notifications: https://developers.google.com/web/fundamentals/codelabs/push-notifications
which seems a bit complicated to me. It might also be too subtle if someone is sleeping in front of the computer.

Stefan
    Reply  18 Nov 2019, Stefan Ritt, Suggestion, javascript comunication 
> a) google chrome slows down the execution of javascript in inactive tabs, leading to trouble
> with memory management - midas pages poll at 1/sec, each poll allocates memory for processing RPC messages,
> and (until recently) allocates memory for new DOM objects to update the web page - but the garbage collector
> gets slowed down and does not keep up - leading to huge memory use (up to 200 Mbytes) for inactive midas pages
> that normally consume 50-ish Mbytes.

Try my latest change, which drops any update if hidden, except the alarm sound. At the moment this only works on the main "status" page.

> most midas web pages poll in two places - mhttpd_refresh() updates the current date timestamp, alarms, currently active midas.log message;
> and each page has it's own loop for updating it's own data (i.e. "alarms" page, "programs" page).

That's correct. I changed mhttpd_refresh() now and the main loop for the "status" page. If that works for everybody, we can do that also for "programs" and other pages. The code is here:

      if (document.hidden) {
         // don't update page if hidden
         setTimeout(update_page, 500);
         return;
      }

where "update_page" has to be replaced with the proper function.

Stefan
    Reply  18 Nov 2019, Konstantin Olchanski, Suggestion, javascript comunication 
> > Hi, Stefan - I did not look at your code, if all midas tabs are inactive, will the alarm sound still play?
> I added some code to do ONLY alarm updates if in the background (once every 10 seconds)

Ok, good.

> I don't have however a solution to your problem "not enough interaction with page" which I have never seen so far.

When it happens, you should see my messages about it in the javascript console. (rejected promise from audio.play()).

How to make it happen so one can see it and be sure it is handled correctly (audio.src has to be set to blank, at the least), I have no idea.

What I have been looking at for the huge memory and cpu use problem, I would call "a bad interaction between two kludges",
the first kludge is the throttling javascript in inactive tabs, the second kludge is the "not interacted" rejection of audio.play() promise.
(audio.play() returning a promise, and the strange sequencing between this promise success/reject and
the audio events loaded/canplay/done/etc smells of a kludge, too).

> I wonder if we should try push notifications: https://developers.google.com/web/fundamentals/codelabs/push-notifications
> which seems a bit complicated to me. It might also be too subtle if someone is sleeping in front of the computer.

I did not read all the explanation, but if it requires use of 3rd party services, I think we cannot use it, we do not want
to miss "experiment is on fire" alarms just because google is down.

K.O.
Entry  14 Jul 2004, Exaos Lee, , install problem of Makefile on MacOS X (Darwin 7.4.0, gcc 3.3) 
I have compiled the sources on Darwin 7.4.0 with gcc 3.3. After the compilation of source codes, I 
try to execute "gmake install". I got the following message:
-------
Nothing to be done for "install".
-------

The install target could not be executed. Then I add the following line to the Makefile:
------
.PHONY: install
------

The install target can be executed. But when it is tring to copy "dio" to the proper directory, it 
cannot find the file. Then I found that the "utils" target isn't built. 
I try to build the target: darwin/bin/dio, I got the following error:
-------
cc -g -O2 -Wall -Iinclude -Idrivers -Ldarwin/lib -DINCLUDE_FTPLIB   -DOS_LINUX -DOS_DARWIN 
-DHAVE_STRLCPY -fPIC -Wno-unused-function -o darwin/bin/dio utils/dio.c
utils/dio.c:39:20: sys/io.h: No such file or directory
utils/dio.c: In function `main':
utils/dio.c:46: warning: implicit declaration of function `iopl'
gmake: *** [darwin/bin/dio] Error 1
--------
So, the include file "sys/io.h" may be changed under Darwin. I don't know how. I will try later. I 
hope somebody can notice this. 
Best regards.
    Reply  14 Jul 2004, Exaos Lee, , install problem of Makefile on MacOS X (Darwin 7.4.0, gcc 3.3) 
There are not such a file "io.h" inside my MacOS X. In fact, I didn't find any file containing function iopl().
So what is the equivalent function of iopl() under MacOS X? The utility dio should be modified in order to be compiled under Darwin-
gcc platform. 
    Reply  14 Jul 2004, Konstantin Olchanski, , install problem of Makefile on MacOS X (Darwin 7.4.0, gcc 3.3) 
> There are not such a file "io.h" inside my MacOS X. In fact, I didn't find any file containing function iopl().
> So what is the equivalent function of iopl() under MacOS X? The utility dio should be modified in order to be compiled under Darwin-
> gcc platform. 

"dio" is not supported under MacOSX. It is used to grant user programs access to PCI and ISA cards (usually CAMAC interfaces). We have
no MacOSX hardware with PCI or ISA slots so we cannot test and support this functionality.

The MacOSX Makefile should not try to build "dio". I will accept a patch to fix this Makefile bug.

K.O.
Entry  15 Feb 2017, NguyenMinhTruong, Bug Report, increase event buffer size 
Dear all,

I have problem in event buffer size.

When run MIDAS, I got error "total event size (1307072) larger than buffer size
(1048576)", so I guess that the EVENT_BUFFER_SIZE is small.

I change EVENT_BUFFER_SIZE in midas.h from 0x100000 to 0x200000. After compiling
and run MIDAS, I got other error "Shared memory segment with key 0x4d040761
already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" in
system.C

I check the shmget() function in system.C and it is said that error come from
Shared memory segments larger than 16,773,120 bytes and create teraspace shared
memory segments

Anyone has this problem before?
Thanks for your help

M.T
Entry  15 Feb 2017, NguyenMinhTruong, Bug Report, increase event buffer size 
Dear all,

I have problem in event buffer size.

When run MIDAS, I got error "total event size (1307072) larger than buffer size
(1048576)", so I guess that the EVENT_BUFFER_SIZE is small.

I change EVENT_BUFFER_SIZE in midas.h from 0x100000 to 0x200000. After compiling
and run MIDAS, I got other error "Shared memory segment with key 0x4d040761
already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" in
system.C

I check the shmget() function in system.C and it is said that error come from
Shared memory segments larger than 16,773,120 bytes and create teraspace shared
memory segments

Anyone has this problem before?
Thanks for your help

M.T
    Reply  16 Feb 2017, Konstantin Olchanski, Bug Report, increase event buffer size 
> I have problem in event buffer size.
> 
> When run MIDAS, I got error "total event size (1307072) larger than buffer size
> (1048576)", so I guess that the EVENT_BUFFER_SIZE is small.
>

Correct. You have a choice of sending smaller events or increasing the buffer size.

Increasing the buffer size consumes computer memory, how much memory do you have on your machine?

> 
> I change EVENT_BUFFER_SIZE in midas.h from 0x100000 to 0x200000. After compiling
> and run MIDAS, I got other error "Shared memory segment with key 0x4d040761
> already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" in
> system.C
> 

This is not normal. In recent versions of MIDAS (for the last few years)

a) buffer size is changed via ODB "/Experiment/buffer sizes", no need to edit midas.h
b) shared memory was switched from SYSV shared memory to POSIX shared memory, and you should not see any references to 
SYSV shared memory functions like "ipcrm", "shmget" and "segment key".

Are you using a very old version of MIDAS? Or maybe you have a MIDAS installation that still uses SYSV shared memory. Check 
the contents of .SHM_TYPE.TXT (in the same directory as .ODB.SHM), if would normally say "POSIXv2_SHM". If it says 
something else, it is best to convert to POSIX SHM. Simplest way is to stop everything, save odb to text file, delete 
.SHM_TYPE.TXT, restart odb with odbedit, reload from text file. Now check that .SHM_TYPE.TXT says "POSIXv2_SHM".

>
> I check the shmget() function in system.C and it is said that error come from
> Shared memory segments larger than 16,773,120 bytes and create teraspace shared
> memory segments
> 

What teraspace?!? You changed the size from 1 Mbyte to 2 Mbyte (0x200000), this is still below even the value you have above 
(16,773,120).

At the end, it is not clear what your problem is. After changing the shared memory size (via odb or via midas.h),
the midas *will* complain about the mismatch in size (existing vs expected) and will tell you how to fix it, (run "ipcrm").
After does this, is there still an error? Normally everything will just work. (you might also have to erase .SYSTEM.SHM,
midas will tell you to do so if it is needed).

So what is your final error? (After running ipcrm?)

K.O.
    Reply  19 Feb 2017, NguyenMinhTruong, Bug Report, increase event buffer size 

I am sorry for my late reply memory in my PC is 16 GB I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM". But there is no buffer sizes in "/Experiment" After run "ipcrm -M 0x4d040761 size0x204a3c", remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T

    Reply  20 Feb 2017, NguyenMinhTruong, Bug Report, increase event buffer size 
I am sorry for my late reply 

memory in my PC is 16 GB 

I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM". 
But there is no buffer sizes in "/Experiment" 

After run "ipcrm -M 0x4d040761 size0x204a3c", remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment
with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T
    Reply  20 Feb 2017, Konstantin Olchanski, Bug Report, increase event buffer size 
> memory in my PC is 16 GB 

You can safely go to buffer size 100 Mbytes or more.

> I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".

Good.

> But there is no buffer sizes in "/Experiment" 

This is strange. How old is your midas? What does it say on the "help" page in "Revision"?

> After run "ipcrm -M 0x4d040761 size0x204a3c"

This command is wrong. It probably gave you an error instead of removing the shared memory, that's why
nothing worked afterwards.

My copy of system.c reads this:
cm_msg(MERROR, "ss_shm_open", "Shared memory segment with key 0x%x already exists, please remove it manually: ipcrm -M 0x%x", key, key);

Note how there is no text "size0x..." in my copy? What does your copy say? Did somebody change it?

> remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment
> with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T

Yes, that's because the ipcrm command is wrong and did not work,
it should read "ipcrm -M 0x4d040761" without the spurious "size..." text.

K.O.
    Reply  20 Feb 2017, Konstantin Olchanski, Bug Report, increase event buffer size 
> memory in my PC is 16 GB 

You can safely go to buffer size 100 Mbytes or more.

> I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".

Good.

> But there is no buffer sizes in "/Experiment" 

This is strange. How old is your midas? What does it say on the "help" page in "Revision"?

> After run "ipcrm -M 0x4d040761 size0x204a3c"

This command is wrong. It probably gave you an error instead of removing the shared memory, that's why
nothing worked afterwards.

My copy of system.c reads this:
cm_msg(MERROR, "ss_shm_open", "Shared memory segment with key 0x%x already exists, please remove it manually: ipcrm -M 0x%x", key, 
key);

Note how there is no text "size0x..." in my copy? What does your copy say? Did somebody change it?

> remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment
> with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T

Yes, that's because the ipcrm command is wrong and did not work,
it should read "ipcrm -M 0x4d040761" without the spurious "size..." text.

K.O.
    Reply  20 Feb 2017, Konstantin Olchanski, Bug Report, increase event buffer size 
> > memory in my PC is 16 GB 
> 
> You can safely go to buffer size 100 Mbytes or more.
> 
> > I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".
> 
> Good.


No, wait, this is all wrong. If it says POSIX shared memory, how come it later
complains about SYSV shared memory and tells you to run SYSV shared memory
commands like ipcrm?!?


> > But there is no buffer sizes in "/Experiment" 


Now this kind of makes sense - you are probably running a strange mixture
of very old and recently new MIDAS. Probably you current version is so old
that it does not use .SHM_TYPE.TXT and can only do SYSV shared memory
and so old it does not have "/Experiment/buffer sizes".

But at some point you must have run a recent version of midas, or you would
not have the file .SHM_TYPE.TXT in your experiment directory.

I say:

a) run the correct ipcrm command (without the spurious "size..." text)
b) review your computer contents to identify all the versions of midas
   and to make sure you are using the midas you want to use (old or new,
   whatever), but not some wrong version by accident (incorrect PATH setting, etc)

As MIDAS developers, we usually recommend that you use the latest version of MIDAS,
certainly latest version is simpler to debug.

K.O.
Entry  14 Aug 2019, Konstantin Olchanski, Bug Fix, incorrect recursion in ss_suspend() via the user event handler 
The ROOTANA midas analyzer uncovered a problem with recursive use of ss_suspend().

When running in graphical mode, the ROOT graphics main event loop was calling 
ss_suspend(0) (no MSG_BM) which would sometimes call the user event handler (if a new 
event arrives into the midas event buffer). Because this loop was already running in the 
user event handler, there was a crash.

This is the calling sequence leading to the incorrect recursion: (from system.cxx comments 
to ss_suspend())

analyzer ->
     -> cm_yield() in the main loop
     -> ss_suspend(0)
     -> MSG_BM message arrives in the UDP socket
     -> ss_suspend_process_ipc(0)
     -> cm_dispatch_ipc()
     -> bm_push_event()
     -> bm_push_buffer()
     -> bm_read_buffer()
     -> bm_dispatch_event()
     -> user event handler
     -> user event handler ROOT graphics main loop needs to sleep
     -> ss_suspend(0) <--- should be ss_suspend(MSG_BM)!!!     
     -> MSG_BM message arrives in the UDP socket
     -> ss_suspend_process_ipc(0) <- should be ss_suspend_process_ipc(MSG_BM)!!!
     -> cm_dispatch_ipc() <- without MSG_BM, calling cm_dispatch_ipc() again
     -> bm_push_event()
     -> bm_push_buffer()
     -> bm_read_buffer()
     -> bm_dispatch_event()
     -> user event handler <---- called recursively, very bad!

The proper fix for this is to always call ss_suspend(MSG_BM) from the user event handler 
and ss_suspend(MSG_ODB) from the user db_watch handler.

In this second case, ss_suspend(MSG_OBM) will lose/ignore/discard db_watch notifications, 
if you do not want that, call ss_suspend(0) and be ready for recursive calls to your 
db_watch handler. (this can happen in a frontend program that acts on changes in ODB and 
where some of these actions may require sleeping via ss_suspend()).

ss_suspend(MSG_BM) will discard MSG_BM messages, which is not a problem as 
bm_wait_for_events() and cm_yield() will immediately poll the event buffer and there will be 
no delay in receiving new events.

Until commit 99d6e90 there were problems in ss_suspend(MSG_BM) and recursive calls to 
the user event handler were still possible. It is now fixed in this and the previous commits.

K.O.
Entry  18 Mar 2016, William Page, Bug Report, incomplete copy using odbedit copy 
Hi,

Attempting to copy a subtree to a new location in the ODB using odbedit with "copy <src> <dest>" is 
occasionally not copying the entire <src> subtree.

I am experiencing this issue consistently when trying to copy subtrees from the "/Equipment" ODB tree to 
a new location.  The first 2-3 variables/directories of the <src> subtree will be copied to <dest> but the 
full subtree will not be copied over.
    Reply  22 Mar 2016, Stefan Ritt, Bug Report, incomplete copy using odbedit copy 
> Hi,
> 
> Attempting to copy a subtree to a new location in the ODB using odbedit with "copy <src> <dest>" is 
> occasionally not copying the entire <src> subtree.
> 
> I am experiencing this issue consistently when trying to copy subtrees from the "/Equipment" ODB tree to 
> a new location.  The first 2-3 variables/directories of the <src> subtree will be copied to <dest> but the 
> full subtree will not be copied over.

I just tried myself and could successfully copy of even large trees in the /Equipment subtree. Need to reproduce the problem to fix 
it. Maybe close-to-full ODB?

Stefan
    Reply  27 Sep 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use (alarm sound) 
> I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser 
> "task manager" and by other tools.
> 
> For example, size of the "status" page was observe to reach 200, 600 and even 900 Mbytes.
> [this was fixed by using set_if_changed(e, v);
> 
> Also I observed the midas web pages consume an unusual amount of CPU - 5-10-15% - all in inactive tabs in minimized 
> windows.
> 

The case of high CPU use turned out to be quite nasty.

The symptoms:
- the "programs" page in an inactive tab in a minimized window sits "doing nothing" for a day or two.
- uses about 0 to 0.1 to 1% CPU and 40-50-60 Mbytes of RAM (after the previous improvements)
- suddenly I see it use 10-15-20% CPU, continuously, non stop
- I open this tab
- suddenly, CPU use goes to 100%, memory use quickly grows from 40-50-60 Mbytes to 100-200 Mbytes.
- after a few seconds everything settles down, CPU use is back to 0-0.1-1%, but memory use does not go down.
- WTH?!?

The culprit turned out to be the playing of the alarm sound. (I have all tabs "muted" by default, also speakers usually powered down).

If I comment-out the playing of the alarm sound, this problem goes away completely. Pretty conclusive, I think.

After adding lots of debug console.log() calls, I think I identified the problem: audio objects were being created,
but they were not starting to play their sound files. When I opened the tab, all of them (about 400) at the same time
loaded the mp3 file (resulting in memory use going from 50 Mbytes to 190 Mbytes, typical) and started playing
(as seen on the audio event activity in the cpu profile traces from the google-chrome "performance" tool).

I think I am looking at an unexpected interaction between audio objects and google-chrome throttling of inactive tabs.

To muddy the waters some more, google-chrome periodically fails audio.play() with an exception to the effect of
"we will not play audio because user is not interacting with this page enough". See
https://bitbucket.org/tmidas/midas/issues/191/exception-on-audioplay

Now I think I have this sort of fixed. I have to handle the audio.play() failure (which is not a normal exception,
but a rejected promise, the handler is quite different), and I do not allow creating new audio objects if previous
audio object did not finish playing.

(note the "normal" timing: periodic update every 1 sec, playing of alarm sound event 60 seconds, length of alarm sound file is 3 sec,
two sound files should never overlap. now a console.log message is printed if overlap is detected)

This leaves us with the problem of alarm sound not playing "because the user didn't interact with the document first",
and I think there is nothing I can do about that.

K.O.

P.S. Another quirk is I discovered: go to the "config" page and press the new buttons "play test sound" and "speak test message". In muted
tabs, the test sound will not sound, but the test message will be shouted out loudly. This seems inconsistent to me. Unwanted audio/video ads
are blocked but loud shouting of "shave with burma-shave" is permitted. I also wonder if speaking is subject to this
"user did not interact" business. If not, we could replace the playing of our relaxing alarm beep with the yelling of "alarm! alarm! alarm!".

K.O.
    Reply  28 Nov 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use (alarm sound and fit_message) 
> > I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser 
> > "task manager" and by other tools.

The final fix is in. (plus a fix from Stefan).

When the audio.play() promise is rejected, one must clear audio.src, otherwise the browser will continue
with loading the audio file (but will not play it at the end).

Normally, this should not be a problem, but in inactive tabs, all activity is throttled down, and it so happens
that these audio objects accumulate (they are in the state of "we are trying to load the sound file, but
browser slows us down so much!"), consume huge amounts of memory (page memory use goes from ~50 Mbytes
to ~100-200 Mbytes) and consume huge amounts of CPU (not clear how, probably it's the firing of "loading", "canplay", etc
event handlers).

It does not help that mhttpd_fit_message() had a performance bug and consumed large amounts of CPU causing even
more slowing down by the be browser.

After adding audio.src="", all this is gone. I see no special CPU use, and I do not see any strange large memory use.

I still sometimes see inactive tabs grow from ~50 Mbytes to amount ~100 Mbytes. After I open them (activate them),
they quickly shrink back to ~50 Mbytes. I conclude that the browser is slowing down the garbage collector in inactive
tabs so much that it does not keep up with our 1/sec data polling.

So Stefan's fix to reduce polling from 1/sec to 1/10sec should help with this, too. (plus reduction of CPU use by fit_message() should
leave more time for the garbage collector to run).

P.S. General rules for browser slow down of inactive tabs seem to be written here:
https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API

Slowdown of timers is written here:
https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/setTimeout#Notes

K.O.
ELOG V3.1.4-2e1708b5