Back Midas Rome Roody Rootana
  Midas DAQ System, Page 39 of 138  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
ID Date Author Topic Subjectdown
  1737   17 Nov 2019 Konstantin OlchanskiSuggestionjavascript comunication
> I am currently testing the new history system on the mhttpd side and stumbled over the following issue:
> typically our user open a lot of midas web-page tabs and keep them open. With the current version this leads after a night typically to a state where the browser is busy with itself and not reacting anymore.
> 
> One important reason seems to be that ALL tabs trying to communicate all the time which is totally unnecessary, since I think a hidden tab should stay in a sleeping mode. 

I am looking at two more problems with inactive tabs:

a) google chrome slows down the execution of javascript in inactive tabs, leading to trouble
with memory management - midas pages poll at 1/sec, each poll allocates memory for processing RPC messages,
and (until recently) allocates memory for new DOM objects to update the web page - but the garbage collector
gets slowed down and does not keep up - leading to huge memory use (up to 200 Mbytes) for inactive midas pages
that normally consume 50-ish Mbytes.

b) the playing of the alarm sound is throttled by "user has not interacted with the document" thing, but there is a bug -
instead of canceling the playing of the alarm sound, the sound file is still loaded, (but not played). (this is hard to debug
because I do not know how to manually trigger the "user has not interacted..." condition, I have to wait for many days for it.
Then, for inactive tabs, the loading of the sound files is slowed down, leading to many of them getting queued up,
and eventually they all try to load and play at the same time, again leading to huge memory and cpu use in inactive tabs.
(this sounds incredible because we play the alarm sound at most 1/minute, for sure the previous sound file must have
finished playing by then, but no, it is easy to see it happen - add a few console.log messages and wait for a few days).

>
> I was browsing if there is a way to find out if a tab is active or not, and found the following API which exactly does this:
> https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API
> 

From looking at the inactive tab business, I see that javascript in inactive tabs runs quite differently from javascript
in active tabs (i.e. timers do not work the same) and I see how the "visibility api" had to be invented to counter that.

> 
> Wouldn't it a good idea to send all midas tabs which are not active into a sleep mode and only reactivate them if they come into focus?
> I had a quick look at the JavaScript libs of midas, but I am not quite certain where to best inject this.
>

most midas web pages poll in two places - mhttpd_refresh() updates the current date timestamp, alarms, currently active midas.log message;
and each page has it's own loop for updating it's own data (i.e. "alarms" page, "programs" page).

we should be careful to not completely disable all polling as some experiments do use and do rely on the midas producing
loud alarm messages ("program logged is not running!!!", "program mhttpd aborted!!!"). Even if all midas tabs are inactive,
some javascript is some tab still has to run frequently enough to poll midas and to sound the alarm sounds (even though
I am not sure how to 100% reliably counteract the google-chrome not playing sound files because
of the "user did not interact with the site..." thing).

K.O.
  1738   17 Nov 2019 Konstantin OlchanskiSuggestionjavascript comunication
> Very good idea. And thanks for finding the document.hidden solution. I put it in, so give it a try.

Hi, Stefan - I did not look at your code, if all midas tabs are inactive, will the alarm sound still play?

K.O.


> 
> Best,
> Stefan
> 
> 
> > I am currently testing the new history system on the mhttpd side and stumbled over the following issue: typically our user open a lot of midas web-page tabs and keep them open. With the current version this leads after a night typically to a state where the browser is busy with itself and not reacting anymore.
> > 
> > One important reason seems to be that ALL tabs trying to communicate all the time which is totally unnecessary, since I think a hidden tab should stay in a sleeping mode. 
> > 
> > I was browsing if there is a way to find out if a tab is active or not, and found the following API which exactly does this:
> > 
> > https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API
> > 
> > Furthermore, the simple
> > 
> > document.hidden 
> > 
> > tag, could be used to find out if the page is currently active.
> > 
> > Wouldn't it a good idea to send all midas tabs which are not active into a sleep mode and only reactivate them if they come into focus?
> > 
> > I had a quick look at the JavaScript libs of midas, but I am not quite certain where to best inject this. 
  1739   18 Nov 2019 Stefan RittSuggestionjavascript comunication
> Hi, Stefan - I did not look at your code, if all midas tabs are inactive, will the alarm sound still play?

Nope. All updates are done in mhhtpd_refresh(), and I changed it such that nothing is updated if hidden.

I agree however that this is bad. You want to hear alarms always. So I added some code to do ONLY alarm updates if in the background. 
No GUI changes, only audio playing. Checking is reduced from 1 Hz to 0.1 Hz (once every 10 seconds). I don't have however a solution 
to your problem "not enough interaction with page" which I have never seen so far.

I wonder if we should try push notifications: https://developers.google.com/web/fundamentals/codelabs/push-notifications
which seems a bit complicated to me. It might also be too subtle if someone is sleeping in front of the computer.

Stefan
  1740   18 Nov 2019 Stefan RittSuggestionjavascript comunication
> a) google chrome slows down the execution of javascript in inactive tabs, leading to trouble
> with memory management - midas pages poll at 1/sec, each poll allocates memory for processing RPC messages,
> and (until recently) allocates memory for new DOM objects to update the web page - but the garbage collector
> gets slowed down and does not keep up - leading to huge memory use (up to 200 Mbytes) for inactive midas pages
> that normally consume 50-ish Mbytes.

Try my latest change, which drops any update if hidden, except the alarm sound. At the moment this only works on the main "status" page.

> most midas web pages poll in two places - mhttpd_refresh() updates the current date timestamp, alarms, currently active midas.log message;
> and each page has it's own loop for updating it's own data (i.e. "alarms" page, "programs" page).

That's correct. I changed mhttpd_refresh() now and the main loop for the "status" page. If that works for everybody, we can do that also for "programs" and other pages. The code is here:

      if (document.hidden) {
         // don't update page if hidden
         setTimeout(update_page, 500);
         return;
      }

where "update_page" has to be replaced with the proper function.

Stefan
  1741   18 Nov 2019 Konstantin OlchanskiSuggestionjavascript comunication
> > Hi, Stefan - I did not look at your code, if all midas tabs are inactive, will the alarm sound still play?
> I added some code to do ONLY alarm updates if in the background (once every 10 seconds)

Ok, good.

> I don't have however a solution to your problem "not enough interaction with page" which I have never seen so far.

When it happens, you should see my messages about it in the javascript console. (rejected promise from audio.play()).

How to make it happen so one can see it and be sure it is handled correctly (audio.src has to be set to blank, at the least), I have no idea.

What I have been looking at for the huge memory and cpu use problem, I would call "a bad interaction between two kludges",
the first kludge is the throttling javascript in inactive tabs, the second kludge is the "not interacted" rejection of audio.play() promise.
(audio.play() returning a promise, and the strange sequencing between this promise success/reject and
the audio events loaded/canplay/done/etc smells of a kludge, too).

> I wonder if we should try push notifications: https://developers.google.com/web/fundamentals/codelabs/push-notifications
> which seems a bit complicated to me. It might also be too subtle if someone is sleeping in front of the computer.

I did not read all the explanation, but if it requires use of 3rd party services, I think we cannot use it, we do not want
to miss "experiment is on fire" alarms just because google is down.

K.O.
  46   14 Jul 2004 Exaos Lee install problem of Makefile on MacOS X (Darwin 7.4.0, gcc 3.3)
I have compiled the sources on Darwin 7.4.0 with gcc 3.3. After the compilation of source codes, I 
try to execute "gmake install". I got the following message:
-------
Nothing to be done for "install".
-------

The install target could not be executed. Then I add the following line to the Makefile:
------
.PHONY: install
------

The install target can be executed. But when it is tring to copy "dio" to the proper directory, it 
cannot find the file. Then I found that the "utils" target isn't built. 
I try to build the target: darwin/bin/dio, I got the following error:
-------
cc -g -O2 -Wall -Iinclude -Idrivers -Ldarwin/lib -DINCLUDE_FTPLIB   -DOS_LINUX -DOS_DARWIN 
-DHAVE_STRLCPY -fPIC -Wno-unused-function -o darwin/bin/dio utils/dio.c
utils/dio.c:39:20: sys/io.h: No such file or directory
utils/dio.c: In function `main':
utils/dio.c:46: warning: implicit declaration of function `iopl'
gmake: *** [darwin/bin/dio] Error 1
--------
So, the include file "sys/io.h" may be changed under Darwin. I don't know how. I will try later. I 
hope somebody can notice this. 
Best regards.
  47   14 Jul 2004 Exaos Lee install problem of Makefile on MacOS X (Darwin 7.4.0, gcc 3.3)
There are not such a file "io.h" inside my MacOS X. In fact, I didn't find any file containing function iopl().
So what is the equivalent function of iopl() under MacOS X? The utility dio should be modified in order to be compiled under Darwin-
gcc platform. 
  48   14 Jul 2004 Konstantin Olchanski install problem of Makefile on MacOS X (Darwin 7.4.0, gcc 3.3)
> There are not such a file "io.h" inside my MacOS X. In fact, I didn't find any file containing function iopl().
> So what is the equivalent function of iopl() under MacOS X? The utility dio should be modified in order to be compiled under Darwin-
> gcc platform. 

"dio" is not supported under MacOSX. It is used to grant user programs access to PCI and ISA cards (usually CAMAC interfaces). We have
no MacOSX hardware with PCI or ISA slots so we cannot test and support this functionality.

The MacOSX Makefile should not try to build "dio". I will accept a patch to fix this Makefile bug.

K.O.
  1237   15 Feb 2017 NguyenMinhTruongBug Reportincrease event buffer size
Dear all,

I have problem in event buffer size.

When run MIDAS, I got error "total event size (1307072) larger than buffer size
(1048576)", so I guess that the EVENT_BUFFER_SIZE is small.

I change EVENT_BUFFER_SIZE in midas.h from 0x100000 to 0x200000. After compiling
and run MIDAS, I got other error "Shared memory segment with key 0x4d040761
already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" in
system.C

I check the shmget() function in system.C and it is said that error come from
Shared memory segments larger than 16,773,120 bytes and create teraspace shared
memory segments

Anyone has this problem before?
Thanks for your help

M.T
  1238   15 Feb 2017 NguyenMinhTruongBug Reportincrease event buffer size
Dear all,

I have problem in event buffer size.

When run MIDAS, I got error "total event size (1307072) larger than buffer size
(1048576)", so I guess that the EVENT_BUFFER_SIZE is small.

I change EVENT_BUFFER_SIZE in midas.h from 0x100000 to 0x200000. After compiling
and run MIDAS, I got other error "Shared memory segment with key 0x4d040761
already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" in
system.C

I check the shmget() function in system.C and it is said that error come from
Shared memory segments larger than 16,773,120 bytes and create teraspace shared
memory segments

Anyone has this problem before?
Thanks for your help

M.T
  1239   16 Feb 2017 Konstantin OlchanskiBug Reportincrease event buffer size
> I have problem in event buffer size.
> 
> When run MIDAS, I got error "total event size (1307072) larger than buffer size
> (1048576)", so I guess that the EVENT_BUFFER_SIZE is small.
>

Correct. You have a choice of sending smaller events or increasing the buffer size.

Increasing the buffer size consumes computer memory, how much memory do you have on your machine?

> 
> I change EVENT_BUFFER_SIZE in midas.h from 0x100000 to 0x200000. After compiling
> and run MIDAS, I got other error "Shared memory segment with key 0x4d040761
> already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" in
> system.C
> 

This is not normal. In recent versions of MIDAS (for the last few years)

a) buffer size is changed via ODB "/Experiment/buffer sizes", no need to edit midas.h
b) shared memory was switched from SYSV shared memory to POSIX shared memory, and you should not see any references to 
SYSV shared memory functions like "ipcrm", "shmget" and "segment key".

Are you using a very old version of MIDAS? Or maybe you have a MIDAS installation that still uses SYSV shared memory. Check 
the contents of .SHM_TYPE.TXT (in the same directory as .ODB.SHM), if would normally say "POSIXv2_SHM". If it says 
something else, it is best to convert to POSIX SHM. Simplest way is to stop everything, save odb to text file, delete 
.SHM_TYPE.TXT, restart odb with odbedit, reload from text file. Now check that .SHM_TYPE.TXT says "POSIXv2_SHM".

>
> I check the shmget() function in system.C and it is said that error come from
> Shared memory segments larger than 16,773,120 bytes and create teraspace shared
> memory segments
> 

What teraspace?!? You changed the size from 1 Mbyte to 2 Mbyte (0x200000), this is still below even the value you have above 
(16,773,120).

At the end, it is not clear what your problem is. After changing the shared memory size (via odb or via midas.h),
the midas *will* complain about the mismatch in size (existing vs expected) and will tell you how to fix it, (run "ipcrm").
After does this, is there still an error? Normally everything will just work. (you might also have to erase .SYSTEM.SHM,
midas will tell you to do so if it is needed).

So what is your final error? (After running ipcrm?)

K.O.
  Draft   19 Feb 2017 NguyenMinhTruongBug Reportincrease event buffer size

I am sorry for my late reply memory in my PC is 16 GB I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM". But there is no buffer sizes in "/Experiment" After run "ipcrm -M 0x4d040761 size0x204a3c", remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T

  1241   20 Feb 2017 NguyenMinhTruongBug Reportincrease event buffer size
I am sorry for my late reply 

memory in my PC is 16 GB 

I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM". 
But there is no buffer sizes in "/Experiment" 

After run "ipcrm -M 0x4d040761 size0x204a3c", remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment
with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T
  Draft   20 Feb 2017 Konstantin OlchanskiBug Reportincrease event buffer size
> memory in my PC is 16 GB 

You can safely go to buffer size 100 Mbytes or more.

> I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".

Good.

> But there is no buffer sizes in "/Experiment" 

This is strange. How old is your midas? What does it say on the "help" page in "Revision"?

> After run "ipcrm -M 0x4d040761 size0x204a3c"

This command is wrong. It probably gave you an error instead of removing the shared memory, that's why
nothing worked afterwards.

My copy of system.c reads this:
cm_msg(MERROR, "ss_shm_open", "Shared memory segment with key 0x%x already exists, please remove it manually: ipcrm -M 0x%x", key, key);

Note how there is no text "size0x..." in my copy? What does your copy say? Did somebody change it?

> remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment
> with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T

Yes, that's because the ipcrm command is wrong and did not work,
it should read "ipcrm -M 0x4d040761" without the spurious "size..." text.

K.O.
  1243   20 Feb 2017 Konstantin OlchanskiBug Reportincrease event buffer size
> memory in my PC is 16 GB 

You can safely go to buffer size 100 Mbytes or more.

> I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".

Good.

> But there is no buffer sizes in "/Experiment" 

This is strange. How old is your midas? What does it say on the "help" page in "Revision"?

> After run "ipcrm -M 0x4d040761 size0x204a3c"

This command is wrong. It probably gave you an error instead of removing the shared memory, that's why
nothing worked afterwards.

My copy of system.c reads this:
cm_msg(MERROR, "ss_shm_open", "Shared memory segment with key 0x%x already exists, please remove it manually: ipcrm -M 0x%x", key, 
key);

Note how there is no text "size0x..." in my copy? What does your copy say? Did somebody change it?

> remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment
> with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T

Yes, that's because the ipcrm command is wrong and did not work,
it should read "ipcrm -M 0x4d040761" without the spurious "size..." text.

K.O.
  1244   20 Feb 2017 Konstantin OlchanskiBug Reportincrease event buffer size
> > memory in my PC is 16 GB 
> 
> You can safely go to buffer size 100 Mbytes or more.
> 
> > I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".
> 
> Good.


No, wait, this is all wrong. If it says POSIX shared memory, how come it later
complains about SYSV shared memory and tells you to run SYSV shared memory
commands like ipcrm?!?


> > But there is no buffer sizes in "/Experiment" 


Now this kind of makes sense - you are probably running a strange mixture
of very old and recently new MIDAS. Probably you current version is so old
that it does not use .SHM_TYPE.TXT and can only do SYSV shared memory
and so old it does not have "/Experiment/buffer sizes".

But at some point you must have run a recent version of midas, or you would
not have the file .SHM_TYPE.TXT in your experiment directory.

I say:

a) run the correct ipcrm command (without the spurious "size..." text)
b) review your computer contents to identify all the versions of midas
   and to make sure you are using the midas you want to use (old or new,
   whatever), but not some wrong version by accident (incorrect PATH setting, etc)

As MIDAS developers, we usually recommend that you use the latest version of MIDAS,
certainly latest version is simpler to debug.

K.O.
  1663   14 Aug 2019 Konstantin OlchanskiBug Fixincorrect recursion in ss_suspend() via the user event handler
The ROOTANA midas analyzer uncovered a problem with recursive use of ss_suspend().

When running in graphical mode, the ROOT graphics main event loop was calling 
ss_suspend(0) (no MSG_BM) which would sometimes call the user event handler (if a new 
event arrives into the midas event buffer). Because this loop was already running in the 
user event handler, there was a crash.

This is the calling sequence leading to the incorrect recursion: (from system.cxx comments 
to ss_suspend())

analyzer ->
     -> cm_yield() in the main loop
     -> ss_suspend(0)
     -> MSG_BM message arrives in the UDP socket
     -> ss_suspend_process_ipc(0)
     -> cm_dispatch_ipc()
     -> bm_push_event()
     -> bm_push_buffer()
     -> bm_read_buffer()
     -> bm_dispatch_event()
     -> user event handler
     -> user event handler ROOT graphics main loop needs to sleep
     -> ss_suspend(0) <--- should be ss_suspend(MSG_BM)!!!     
     -> MSG_BM message arrives in the UDP socket
     -> ss_suspend_process_ipc(0) <- should be ss_suspend_process_ipc(MSG_BM)!!!
     -> cm_dispatch_ipc() <- without MSG_BM, calling cm_dispatch_ipc() again
     -> bm_push_event()
     -> bm_push_buffer()
     -> bm_read_buffer()
     -> bm_dispatch_event()
     -> user event handler <---- called recursively, very bad!

The proper fix for this is to always call ss_suspend(MSG_BM) from the user event handler 
and ss_suspend(MSG_ODB) from the user db_watch handler.

In this second case, ss_suspend(MSG_OBM) will lose/ignore/discard db_watch notifications, 
if you do not want that, call ss_suspend(0) and be ready for recursive calls to your 
db_watch handler. (this can happen in a frontend program that acts on changes in ODB and 
where some of these actions may require sleeping via ss_suspend()).

ss_suspend(MSG_BM) will discard MSG_BM messages, which is not a problem as 
bm_wait_for_events() and cm_yield() will immediately poll the event buffer and there will be 
no delay in receiving new events.

Until commit 99d6e90 there were problems in ss_suspend(MSG_BM) and recursive calls to 
the user event handler were still possible. It is now fixed in this and the previous commits.

K.O.
  1170   18 Mar 2016 William PageBug Reportincomplete copy using odbedit copy
Hi,

Attempting to copy a subtree to a new location in the ODB using odbedit with "copy <src> <dest>" is 
occasionally not copying the entire <src> subtree.

I am experiencing this issue consistently when trying to copy subtrees from the "/Equipment" ODB tree to 
a new location.  The first 2-3 variables/directories of the <src> subtree will be copied to <dest> but the 
full subtree will not be copied over.
  1171   22 Mar 2016 Stefan RittBug Reportincomplete copy using odbedit copy
> Hi,
> 
> Attempting to copy a subtree to a new location in the ODB using odbedit with "copy <src> <dest>" is 
> occasionally not copying the entire <src> subtree.
> 
> I am experiencing this issue consistently when trying to copy subtrees from the "/Equipment" ODB tree to 
> a new location.  The first 2-3 variables/directories of the <src> subtree will be copied to <dest> but the 
> full subtree will not be copied over.

I just tried myself and could successfully copy of even large trees in the /Equipment subtree. Need to reproduce the problem to fix 
it. Maybe close-to-full ODB?

Stefan
  1705   27 Sep 2019 Konstantin OlchanskiBug Fiximprovement for midas web page resource use (alarm sound)
> I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser 
> "task manager" and by other tools.
> 
> For example, size of the "status" page was observe to reach 200, 600 and even 900 Mbytes.
> [this was fixed by using set_if_changed(e, v);
> 
> Also I observed the midas web pages consume an unusual amount of CPU - 5-10-15% - all in inactive tabs in minimized 
> windows.
> 

The case of high CPU use turned out to be quite nasty.

The symptoms:
- the "programs" page in an inactive tab in a minimized window sits "doing nothing" for a day or two.
- uses about 0 to 0.1 to 1% CPU and 40-50-60 Mbytes of RAM (after the previous improvements)
- suddenly I see it use 10-15-20% CPU, continuously, non stop
- I open this tab
- suddenly, CPU use goes to 100%, memory use quickly grows from 40-50-60 Mbytes to 100-200 Mbytes.
- after a few seconds everything settles down, CPU use is back to 0-0.1-1%, but memory use does not go down.
- WTH?!?

The culprit turned out to be the playing of the alarm sound. (I have all tabs "muted" by default, also speakers usually powered down).

If I comment-out the playing of the alarm sound, this problem goes away completely. Pretty conclusive, I think.

After adding lots of debug console.log() calls, I think I identified the problem: audio objects were being created,
but they were not starting to play their sound files. When I opened the tab, all of them (about 400) at the same time
loaded the mp3 file (resulting in memory use going from 50 Mbytes to 190 Mbytes, typical) and started playing
(as seen on the audio event activity in the cpu profile traces from the google-chrome "performance" tool).

I think I am looking at an unexpected interaction between audio objects and google-chrome throttling of inactive tabs.

To muddy the waters some more, google-chrome periodically fails audio.play() with an exception to the effect of
"we will not play audio because user is not interacting with this page enough". See
https://bitbucket.org/tmidas/midas/issues/191/exception-on-audioplay

Now I think I have this sort of fixed. I have to handle the audio.play() failure (which is not a normal exception,
but a rejected promise, the handler is quite different), and I do not allow creating new audio objects if previous
audio object did not finish playing.

(note the "normal" timing: periodic update every 1 sec, playing of alarm sound event 60 seconds, length of alarm sound file is 3 sec,
two sound files should never overlap. now a console.log message is printed if overlap is detected)

This leaves us with the problem of alarm sound not playing "because the user didn't interact with the document first",
and I think there is nothing I can do about that.

K.O.

P.S. Another quirk is I discovered: go to the "config" page and press the new buttons "play test sound" and "speak test message". In muted
tabs, the test sound will not sound, but the test message will be shouted out loudly. This seems inconsistent to me. Unwanted audio/video ads
are blocked but loud shouting of "shave with burma-shave" is permitted. I also wonder if speaking is subject to this
"user did not interact" business. If not, we could replace the playing of our relaxing alarm beep with the yelling of "alarm! alarm! alarm!".

K.O.
ELOG V3.1.4-2e1708b5