ELOG Midas

Back Midas Rome Roody Rootana

Midas DAQ System, Page 72 of 159

Not logged in

Find | Login | Help

Full | Summary | Threaded | Hide attachments

3169 Entries

Goto page Previous 1, 2, 3 ... 71, 72, 73 ... 157, 158, 159 Next

ODB dump format: json - events 0x8000 and 0x8001 missing

(Please post messages in "plain" mode, they are much easier to answer)

Thank you for reporting this problem. I will try to reproduce it.

In addition, I will say a few words about your version of midas:

> GIT revision: midas-2019-09-i-1-gd93944ce-dirty on branch develop

I recommend that for production systems one used the tagged release versions of midas. 
(i.e. see https://midas.triumf.ca/elog/Midas/1750).

(Your midas is "1 commit after the latest tag" - the "-1" in the git revision).

I apply bug fixes to both the release branch and the develop branch, but for you to get 
these fixes, on the develop branch you will also "get" all the unrelated changes that may 
come with new bugs. On the release branch, you will only get the bug fixes.

In your midas version it says "-dirty" which means that you have local modifications to the 
midas sources. With luck those changes are not related to the bug that you see. (but I 
cannot tell). You can do "git status" and "git diff" to see what the local changes are.

It is much better if bugs are reported against "clean" builds of MIDAS (no "-dirty").


K.O.


<p>&nbsp;</p>

<table align="center" cellspacing="1" style="border:1px solid #486090; width:98%">
	<tbody>
		<tr>
			<td style="background-color:#486090">Peter Kunz wrote:</td>
		</tr>
		<tr>
			<td style="background-color:#FFFFB0">
			<p>MIDAS version: &nbsp; &nbsp; &nbsp;2.1<br />
			GIT revision: &nbsp; &nbsp; &nbsp; Tue Dec 31 17:40:14 2019 +0100 - 
midas-2019-09-i-1-gd93944ce-dirty on branch develop</p>

			<p>/Logger/Channels/0/Settings</p>

			<table border="3" cellpadding="1" class="dialogTable">
				<tbody>
					<tr>
						<td>ODB dump</td>
						<td>y</td>
					</tr>
					<tr>
						<td>ODB dump format</td>
						<td>json</td>
					</tr>
				</tbody>
			</table>

			<p>With the settings above the file last.json generated for a new run is 
empty and the events 0x8000 and 0x8001 are missing in the .mid file.</p>

			<p>When setting &quot;ODB dump format&quot; to &quot;xml&quot;, events 
0x8000 and 0x8001 are included in the .mid file, however, the file last.xml is not created.
</p>

			<p>&nbsp;</p>

			<p>&nbsp;</p>

			<p>&nbsp;</p>
			</td>
		</tr>
	</tbody>
</table>

<p>&nbsp;</p>

(please post messages in "plain" mode, they are much easier to answer)

- nvidia problems - this code was contributed by Joseph (I think?), with luck he will look into 
this problem.

- ROOT problem - it looks like the error is thrown by the ROOT header files, has nothing to do 
with MIDAS?

So what ROOT are you using? I recommend installing ROOT by following instructions at 
root.cern.ch.

Perhaps you used the ROOT packages from the EPEL repository? I have seen trouble with 
those packages before (miscompiled; important optional features turned off; very old 
versions; etc).

P.S.

Historically, ROOT has caused so many reports of "cannot build midas" that I consistently 
vote to "remove ROOT support from MIDAS". But Stefan's code for writing MIDAS data into 
ROOT files is so neat, cannot throw it away. And some people do use it. So at the latest MIDAS 
bash this Summer we decided to keep it.

(Only build targets to use ROOT are the rmlogger executable and the rmana.o object file (and 
it's one-man-army library)).

But.

In the past, one could use "make -k" to get past the errors caused by ROOT, everything will 
get built and installed, except for the code that failed to build.

Now with cmake, it is "all or nothing", if there is any compilation error, nothing gets installed 
into the "bin" directory. So one must discover and use "NO_ROOT=1" (which becomes sticky 
until the next "make cclean". Some people are not used to sticky "make" options, I just got 
burned by this very thing last week).

Perhaps there is a way to tell cmake to ignore compile errors for rmlogger and rmana.


K.O.



<p>&nbsp;</p>

<table align="center" cellspacing="1" style="border:1px solid #486090; width:98%">
	<tbody>
		<tr>
			<td style="background-color:#486090">Peter Kunz wrote:</td>
		</tr>
		<tr>
			<td style="background-color:#FFFFB0">
			<p>While upgrading to the latest MIDAS version</p>

			<p>MIDAS version: 2.1 GIT revision: Tue Dec 31 17:40:14 2019 +0100 - midas-
2019-09-i-1-gd93944ce-dirty on branch develop</p>

			<p>ODB version: 3</p>

			<p>I encountered two issues using cmake</p>

			<p>1. on machines with NVIDIA drivers:</p>

			<div style="background:#eee;border:1px solid #ccc;padding:5px 
10px;">nvml.h: no such file or directory</div>

			<p>(nvml.h doesn&#39;t seem to be part of the standard nvidia driver 
package.)</p>

			<p>2. Complile including ROOT throws an error with ROOT 6.12/06 on Centos7 
and with ROOT 6.18/04 on Fedora 31:</p>

			<div style="background:#eee;border:1px solid #ccc;padding:5px 10px;">[ 29%] 
Building CXX object CMakeFiles/rmana.dir/src/mana.cxx.o In file included from 
/usr/include/root/TString.h:28, from /usr/include/root/TCollection.h:29, from 
/usr/include/root/TSeqCollection.h:25, from /usr/include/root/TList.h:25, from 
/usr/include/root/TQObject.h:40, from /usr/include/root/TApplication.h:30, from 
/home/pkunz/packages/midas/src/mana.cxx:60:</div>

			<div style="background:#eee;border:1px solid #ccc;padding:5px 
10px;">/usr/include/root/ROOT/RStringView.hxx:32:37: error: &lsquo;experimental&rsquo; in 
namespace &lsquo;std&rsquo; does not name a type 32 | using basic_string_view = 
::std::experimental::basic_string_view&lt;_CharT,_Traits&gt;;</div>

			<p>A workaround (which works for me) is to compile with</p>

			<p><tt>cmake .. -DNO_ROOT=1 -DNO_NVIDIA=1</tt></p>
			</td>
		</tr>
	</tbody>
</table>

<p>&nbsp;</p>

frontend issues with midas-2019-09

After upgrading to the lastes MIDAS version I got the DAQ frontend of my application running by changing all compiler directives from cc to g++ and using

#include "mfe.h"

extern HNDLE hDB

extern "C" {
#include <CAENComm.h>
}

With these changes everything seems to work fine.

However, I'm having trouble with a slow control frontend using a tcpip driver. It compiled well with the older MIDAS version. Even though all the functions in question are defined in the frontend code, the following error comes up:

g++ -o feMotor -DOS_LINUX -Dextname -g -O2 -fPIC -Wall -Wuninitialized -fpermissive -I/home/pkunz/packages/midas/include -I. -I/home/pkunz/packages/midas/drivers/bus /home/pkunz/packages/midas/lib/libmidas.a feMotor.o /home/pkunz/packages/midas/drivers/bus/tcpip.o cd_Galil.o /home/pkunz/packages/midas/lib/libmidas.a /home/pkunz/packages/midas/lib/mfe.o -lm -lz -lutil -lnsl -lpthread -lrt
/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `initialize_equipment()':
/home/pkunz/packages/midas/src/mfe.cxx:687: undefined reference to `interrupt_configure(int, int, long)'
/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `readout_enable(unsigned int)':
/home/pkunz/packages/midas/src/mfe.cxx:1236: undefined reference to `interrupt_configure(int, int, long)'
/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:1238: undefined reference to `interrupt_configure(int, int, long)'
/usr/bin/ld: /home/pkunz/packages/midas/lib/mfe.o: in function `main':
/home/pkunz/packages/midas/src/mfe.cxx:2791: undefined reference to `interrupt_configure(int, int, long)'
/usr/bin/ld: /home/pkunz/packages/midas/src/mfe.cxx:2792: undefined reference to `interrupt_configure(int, int, long)'
collect2: error: ld returned 1 exit status
make: *** [Makefile:36: feMotor] Error 1

I guess the the aforementioned DAQ frontend compiles because its equipment definitions don't call on the function `initialize_equipment()', but I can't figure out why it doesn't work. Help is appreciated. P.K.

ODB dump format: json - events 0x8000 and 0x8001 missing

MIDAS version: 2.1
GIT revision: Tue Dec 31 17:40:14 2019 +0100 - midas-2019-09-i-1-gd93944ce-dirty on branch develop

/Logger/Channels/0/Settings

ODB dump	y
ODB dump format	json

With the settings above the file last.json generated for a new run is empty and the events 0x8000 and 0x8001 are missing in the .mid file.

When setting "ODB dump format" to "xml", events 0x8000 and 0x8001 are included in the .mid file, however, the file last.xml is not created.

While upgrading to the latest MIDAS version

MIDAS version: 2.1 GIT revision: Tue Dec 31 17:40:14 2019 +0100 - midas-2019-09-i-1-gd93944ce-dirty on branch develop

ODB version: 3

I encountered two issues using cmake

1. on machines with NVIDIA drivers:

nvml.h: no such file or directory

(nvml.h doesn't seem to be part of the standard nvidia driver package.)

2. Complile including ROOT throws an error with ROOT 6.12/06 on Centos7 and with ROOT 6.18/04 on Fedora 31:

[ 29%] Building CXX object CMakeFiles/rmana.dir/src/mana.cxx.o In file included from /usr/include/root/TString.h:28, from /usr/include/root/TCollection.h:29, from /usr/include/root/TSeqCollection.h:25, from /usr/include/root/TList.h:25, from /usr/include/root/TQObject.h:40, from /usr/include/root/TApplication.h:30, from /home/pkunz/packages/midas/src/mana.cxx:60:

/usr/include/root/ROOT/RStringView.hxx:32:37: error: ‘experimental’ in namespace ‘std’ does not name a type 32 | using basic_string_view = ::std::experimental::basic_string_view<_CharT,_Traits>;

A workaround (which works for me) is to compile with

cmake .. -DNO_ROOT=1 -DNO_NVIDIA=1

midas on centos-8 status

I now have a centos-8 computer and I tried midas on it:

- the develop and midas-2019-09 branches build, mhttpd runs
- there are compiler warnings about use of strncpy() that need to be looked into, but see https://stackoverflow.com/questions/50198319/gcc-8-wstringop-truncation-what-is-
the-good-practice
- mhttpd built-in https support does not seem to work (see the other forum thread)
- apache httpd proxy for https can be made to work, but there are problems with certbot.

K.O.

SSL_ERROR_NO_CYPHER_OVERLAP

> > The Centos-7 instructions for setting up apache httpd proxy are here, with luck they work on centos-8:
> > https://daq.triumf.ca/DaqWiki/index.php/SLinstall#Configure_HTTPS_server_.28CentOS7.29

I now have a centos-8 computer, I followed my instructions and they generally worked.

There is a number of problems with the certbot package that prevent me from writing coherent production quality instructions for centos-8.

But at the end I was successful, httpd runs, gets "A+" rating from SSLlabs, forwards the requests to mhttpd, I can access the midas status page etc.

With luck the certbot packages for centos-8 will be sorted out soon (the apache plugin seems to be missing, this causes the automatic
certificate renewal to not work) and I will update my instructions to include centos-8.

Until then, I recommend that people continue to use centos-7 or the current Ubuntu LTS release.

K.O.

SSL_ERROR_NO_CYPHER_OVERLAP

> I am using CentOS 8 [and]
> Firefox 71.0

I now have a centos-8 machine, I successfully built midas and I confirm that there is a problem.

But I get different errors from you:

- google chrome - does not connect at all (without any useful error message: "This site can�t be reached. The 
connection was reset.")
- firefox complains about the self-signed certificate, but connects ok, I see the midas status page and it works. "page 
info" reports connection is TLS 1.3, encryption TLS_AES_128_GCM_SHA256. However, the function "view certificate" 
does not work (without any useful error message).

I tried to run the SSLlabs tool to get some more information from mhttpd, but it does not want to run against mhttpd on 
port 8443... I do have a port redirect program somewhere... need to find it...

K.O.

SSL_ERROR_NO_CYPHER_OVERLAP

Hi,
As, the link suggests, I perform "yum install -y mod_ssl certwatch crypto-utils" but it complains as:
No match for argument: certwatch
No match for argument: crypto-utils

You may have a look on this link: https://blog.cloudware.bg/en/whats-new-in-centos-linux-8/
What�s gone?
In with the new, out with the old. CentOS 8 also says goodbye to some features. The OS removes several security functionalities. Among them is the Clevis HTTP pin, Coolkey and crypto-utils.
Cent OS 8 comes with securetty disabled by default. The configuration file is no longer included. You can add it back, but you will have to do it yourself. Another change is that shadow-utils no longer allow all-numeric user and group names.

Thanks
Mehran

> Hi, I have not run midas on Centos-8 yet. Maybe there is a problem with the openssl library there. The Centos-7 
> instructions for setting up apache httpd proxy are here, with luck they work on centos-8:
> https://daq.triumf.ca/DaqWiki/index.php/SLinstall#Configure_HTTPS_server_.28CentOS7.29
> 
> K.O.
>

SSL_ERROR_NO_CYPHER_OVERLAP

Hi, I have not run midas on Centos-8 yet. Maybe there is a problem with the openssl library there. The Centos-7 
instructions for setting up apache httpd proxy are here, with luck they work on centos-8:
https://daq.triumf.ca/DaqWiki/index.php/SLinstall#Configure_HTTPS_server_.28CentOS7.29

K.O.


> Hi Konstantin,
> Thanks for your reply, 
> 
> > What Linux? (on most linuxes, run "lsb_release -a")
> > What version of midas? (run odbedit "ver" command)
> I am using CentOS 8
> 
> > What version of firefox? (from the "about firefox" menu)
> Firefox 71.0
> 
> Thanks 
> Mehran
> 
> > No you cannot fix it from inside firefox. The issue is that the overlap of encryption methods
> > supported by your firefox and by your openssl library (used by mhttpd) is an empty set.
> > No common language, so to say, communication is impossible.
> > 
> > So either you have a very old openssl but very new firefox, or a very new openssl but very old 
> > firefox. Both very old or both very new can talk to each other, difficulties start with greater  
> > difference in age, as new (better) encryption methods are added and old (no-longer-secure) 
> > methods are banished.
> > 
> > BTW, for good security we recommend using apache httpd as the https proxy (instead of built-in 
> > https support in mhttpd). (I am not sure what it says in the current documentation). (But apache 
> > httpd will use the same openssl library, so this may not solve your problem. Let's see what 
> > versions of software you are using, per questions above, first).
> > 
> > K.O.

SSL_ERROR_NO_CYPHER_OVERLAP

Hi Konstantin,
Thanks for your reply, 

> What Linux? (on most linuxes, run "lsb_release -a")
> What version of midas? (run odbedit "ver" command)
I am using CentOS 8

> What version of firefox? (from the "about firefox" menu)
Firefox 71.0

Thanks 
Mehran

> No you cannot fix it from inside firefox. The issue is that the overlap of encryption methods
> supported by your firefox and by your openssl library (used by mhttpd) is an empty set.
> No common language, so to say, communication is impossible.
> 
> So either you have a very old openssl but very new firefox, or a very new openssl but very old 
> firefox. Both very old or both very new can talk to each other, difficulties start with greater  
> difference in age, as new (better) encryption methods are added and old (no-longer-secure) 
> methods are banished.
> 
> BTW, for good security we recommend using apache httpd as the https proxy (instead of built-in 
> https support in mhttpd). (I am not sure what it says in the current documentation). (But apache 
> httpd will use the same openssl library, so this may not solve your problem. Let's see what 
> versions of software you are using, per questions above, first).
> 
> K.O.

SSL_ERROR_NO_CYPHER_OVERLAP

> I am quite new in both Linux and MIDAS.
> I have install MIDAS on my desktop by going through this link: 
> https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux 
> 
> in the last step when I send "mhttpd" command and try to open the link 
> https://localhost:8443 (of course, changing the localhost with my host name), it 
> failed to connect and shows this error: SSL_ERROR_NO_CYPHER_OVERLAP (please see 
> attached file includes a screenshot of the error).

What Linux? (on most linuxes, run "lsb_release -a")
What version of midas? (run odbedit "ver" command)
What version of firefox? (from the "about firefox" menu)

> I have tried many ways to solve this problem: In Firefox: going to option/privacy 
> and security/ security and uncheck the option "Block dangerous and deceptive 
> content". but it does not help.

No you cannot fix it from inside firefox. The issue is that the overlap of encryption methods
supported by your firefox and by your openssl library (used by mhttpd) is an empty set.
No common language, so to say, communication is impossible.

So either you have a very old openssl but very new firefox, or a very new openssl but very old 
firefox. Both very old or both very new can talk to each other, difficulties start with greater  
difference in age, as new (better) encryption methods are added and old (no-longer-secure) 
methods are banished.

BTW, for good security we recommend using apache httpd as the https proxy (instead of built-in 
https support in mhttpd). (I am not sure what it says in the current documentation). (But apache 
httpd will use the same openssl library, so this may not solve your problem. Let's see what 
versions of software you are using, per questions above, first).

K.O.

SSL_ERROR_NO_CYPHER_OVERLAP

Hello,

I am quite new in both Linux and MIDAS.
I have install MIDAS on my desktop by going through this link: 
https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux 

in the last step when I send "mhttpd" command and try to open the link 
https://localhost:8443 (of course, changing the localhost with my host name), it 
failed to connect and shows this error: SSL_ERROR_NO_CYPHER_OVERLAP (please see 
attached file includes a screenshot of the error).

I have tried many ways to solve this problem: In Firefox: going to option/privacy 
and security/ security and uncheck the option "Block dangerous and deceptive 
content". but it does not help.

Looking forward your help
Thanks
Mehran

Attachment 1: MIDAS_SSL_ERROR.png

midas-2019-09-i is here.

- the new sequencer web pages written in html+javascript (NewSequencer), the old c-generated sequencer pages still work (Sequencer)
- python-client from Ben Smith merged in, see documentation at https://bitbucket.org/tmidas/midas/src/develop/python/

To obtain this release, either checkout the top of branch feature/midas-2019-09 (recommended)
or checkout the tag midas-2019-09-i.

K.O.

midas-2019-09-g is here.

- the last bug in the new history plots is fixed, please try them out, plus
- "<<" and "<<<" buttons now work for going back to the old data
- "+" and "-" buttons are added for zooming in and out.

- the new sequencer web pages have been activated, the old sequencer page moved to "OldSequencer"

To obtain this release, either checkout the top of branch feature/midas-2019-09 (recommended)
or checkout the tag midas-2019-09-g.

K.O.


> > I created the release branch for midas-2019-09 and tag midas-2019-09-a.
> > Since the previous release midas-2019-06, some news:
> > 
> > - new history graphics (Stefan)
> > - c++ frontend framework mvodb.h and tmfe.h merged from ALPHA-g (K.O.)
> > - we think we have all the fallout from switching to cmake and to c++11 sorted out
> > 
> 
> midas-2019-09-e is here.
> 
> - the new history plots now work both for Stefan *and* for me, please try them out!
> - no new problems with cmake and c++11.
> - fixes for some reported bugs
> - some bugs remain to be fixed, so with luck, there will by a midas-2019-09-f.
> 
> > add all the pending missing code (the new sequencer web pages, the "m" analyzer, etc
> 
> pending for midas-2019-12:
> 
> - new sequencer web pages
> - the "m" analyzer merge (from rootana)
> - python-client branch merge (thanks to Ben!)
> - simplified odb settings for mlogger and mhttpd configuration
> - mhttpd update to mongoose 6.16
> 
> To obtain this release, either checkout the top of branch feature/midas-2019-09 (recommended)
> or checkout the tag midas-2019-09-e.
> 
> K.O.

c++11 for RHEL/SL/CentOS-6

> The default el6 (RHEL/SL/CentOS-6) compiler is gcc-4.4.7, it does not support c++11, not even a little bit.

The previously posted instructions are incomplete - one cannot cross-compile 32-bit executables (i.e. for running on 32-bit VME 
processors) because 64-bit packages are missing 4 files for the 32-bit C++ standard library (libstdc++_nonshared.a).

After a bit of searching I found the missing files, i.e. here:
https://copr-be.cloud.fedoraproject.org/results/mayeut/devtoolset-8/epel-6-i386/01045166-devtoolset-8-gcc/

There are 2 options:

a) install the 32-bit development package:
rpm -vh --install https://ladd00.triumf.ca/~olchansk/devtoolset-8/devtoolset-8-libstdc++-devel-8.3.1-3.1.el6.i686.rpm

b) install just the 4 missing files from here:
https://ladd00.triumf.ca/~olchansk/devtoolset-8/i686-redhat-linux/8/
into
/opt/rh/devtoolset-8/root/usr/lib/gcc/i686-redhat-linux/8/

After doing this, "make linux32" builds. (requires latest midas-2019-09 for minor Makefile fixes)

K.O.


> 
> Do this to install newer c++ compilers and build MIDAS with c++11:
> 
> ssh root@sl6machine
> # yum install centos-release-scl-rh
> # yum install devtoolset-8
> # yum install cmake3
> # scl -l
> devtoolset-8
> ...
> 
> $ ssh user@sl6machine
> $ scl enable devtoolset-8 bash
> $ gcc -v
> COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-8/root/usr/libexec/gcc/x86_64-redhat-linux/8/lto-wrapper
> gcc version 8.3.1 20190311 (Red Hat 8.3.1-3) (GCC) 
> $ cd git/midas
> $ make cclean
> $ make cmake3
> $ ls -l bin/odbedit
> 
> K.O.

> I created the release branch for midas-2019-09 and tag midas-2019-09-a.
> Since the previous release midas-2019-06, some news:
> 
> - new history graphics (Stefan)
> - c++ frontend framework mvodb.h and tmfe.h merged from ALPHA-g (K.O.)
> - we think we have all the fallout from switching to cmake and to c++11 sorted out
> 

midas-2019-09-e is here.

- the new history plots now work both for Stefan *and* for me, please try them out!
- no new problems with cmake and c++11.
- fixes for some reported bugs
- some bugs remain to be fixed, so with luck, there will by a midas-2019-09-f.

> add all the pending missing code (the new sequencer web pages, the "m" analyzer, etc

pending for midas-2019-12:

- new sequencer web pages
- the "m" analyzer merge (from rootana)
- python-client branch merge (thanks to Ben!)
- simplified odb settings for mlogger and mhttpd configuration
- mhttpd update to mongoose 6.16

To obtain this release, either checkout the top of branch feature/midas-2019-09 (recommended)
or checkout the tag midas-2019-09-e.

K.O.

mfe.c: MIDAS frontend's 'Equipment name' can embed hostname, determined at run-time

A little advertised feature of the modifications needed support the msysmon program is 
that MIDAS equipment names can support the injecting of the hostname of the system 
running the frontend at runtime (register_equipment(void)).

https://midas.triumf.ca/MidasWiki/index.php/Equipment_List_Parameters#Equipment_Name

A special string ${HOSTNAME} can be put in any position in the equipment name. It will 
be replaced with the hostname of the computer running the frontend at run-time. Note, 
the frontend_name string will be trimmed down to 32 characters.
Example usage: msysmon


EQUIPMENT equipment[] = {

  { "${HOSTNAME}_msysmon",   /* equipment name */    {
      EVID_MONITOR, 0,      /* event ID, trigger mask */
      "SYSTEM",             /* event buffer */
      EQ_PERIODIC,          /* equipment type */
      0,                    /* event source */
      "MIDAS",              /* format */
      TRUE,                 /* enabled */
      RO_ALWAYS,            /* Read when running */
      10000,                /* poll every so milliseconds */
      0,                    /* stop run after this event limit */
      0,                    /* number of sub events */
      1,                    /* history period */
      "", "", ""
    },
    read_system_load,/* readout routine */
  },
  { "" }
};

improvement for midas web page resource use (alarm sound and fit_message)

> > > I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser
> > > "task manager" and by other tools.

The work on this problem has been blogged in the bitbucket issue tracker:
https://bitbucket.org/tmidas/midas/issues/158/midas-status-page-memory-leak

K.O.

Below is a dump of the issue for posterity ---

Team Midas MIDAS related packages midas Issues
midas status page memory leak
Create issue
Issue #158 RESOLVEDOpen Workflow More Edit
dd1 created an issue 2018-12-25
I have the midas status page (https://daq16.triumf.ca/) open in macos google chrome 71.0.3578.98 and I watch in the "task manager" how the memory use is
246 Mbytes and growing at around 1 Mbyte every 2-3 seconds. CPU use is around 3-5%, network use is 47 kBytes/sec. The slowly growing memory use
indicates that we have a memory leak. (Note that javascript uses "automatic garbage collection" memory management, which does not eliminate memory
leaks. Only capability to explicitly free unused memory is eliminated). K.O.

Comments (35)
dd1 REPORTER
Actual memory use goes up to around 250-something MBytes, then drops down to 240-something, them slowly grows back up, drops down, rinse, repeat.
This is the javascript garbage collection in action. So there is no memory leak on the status page, but still why do we generate around 1 Mbyte/sec of
javascript memory allocations? As comparison, the NYTimes front page consumes 270 Mbytes. One would expect the midas front page to be much more light
weight... K.O.

Edit Pin to top Mark as spam Delete 2018-12-27
dd1 REPORTER
Then there is a question of memory use by the "message" page. This page does grow infinitely large by design - as new messages are added to midas.log -
as as the user keeps scrolling the messages back in time. Perhaps we should somehow limit the total memory use there... K.O.

Edit Pin to top Mark as spam Delete 2018-12-27
Stefan Ritt
changed status to closed
I see the same behaviour. The relatively large memory allocation by Chrome probably comes from some bitmap caching. The browser prints the page
contents into some temporary bitmap and then flushes it to the screen. That can easily take a few MB. I monitor such behaviour since several years now (for
other processes) and concluded that I don't need to worry about JavaScript memory consumption.

Concerning the messages page: One line takes about 100 Bytes. If you scroll really fast, you can do maybe 30 lines per second, thus 3kB. If we allow the
browser to consume another 100 MB (should be easily possible these days), you have to continuously scroll for 100000kb/3kb=30000 seconds or eight
hours. Good luck!

Closing this topic if no complaints.

Pin to top Mark as spam Delete 2019-01-08
dd1 REPORTER
changed status to open
still see high memory use by midas pages. K.O.

Edit Pin to top Mark as spam Delete 2019-09-15
dd1 REPORTER
See high memory use from long running (days-weeks) web pages:

status page of my test experiment - 953 MB - 155 MB after reload
odb editor - 661 MB - 80 MB after reload
programs page - 602 MB - 64 MB after reload
sequencer - 253 MB - 151 MB after reload
sequencer - ??? (very big) - reloaded before I wrote it down
I think we are leaking memory somewhere. Or causing unnecessary allocations that the javascript garbage collector does not keep up with or does not
cleanup correctly. K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-09-15
dd1 REPORTER
I am suspicious of memory use trouble from periodic-update code that keeps setting innerHTML to the same value as it was before, unnecessarily. (this also
causes other problems - cannot cut-and-paste affected parts of the web page, high cpu use to redraw the (unchanged) page). K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-09-15
Stefan Ritt
For setting innerHTML we should always use

if (text !== control.inner HTML)

control.innerHTML = text;

&#8204;

I thought I caught most of the cases, but I might have missed some. Please add as needed.

&#8204;

Stefan

Pin to top Mark as spam Delete 2019-09-15
dd1 REPORTER
Strange things continue. Just say huge CPU usage from 3 midas web pages (odb editor, programs page and the new sequencer page). All 3 pages are tabs in
an iconized browser window. Suddenly machine feels slow, and I see all 3 use 25% CPU each (by the chrome-browser task manager window). Opened the
browser window, sent to the offending tabs, nothing looks amiss, CPU usage went back to 0%. WTH? (all 3 pages have 100 Mbyte memory use, all 3 pages
update at 1 Hz). K.O.

Edit Pin to top Mark as spam Delete 2019-09-16
dd1 REPORTER
looked at the �programs� page. learned how to use the google-chrome �performance� tool. I was definitely leaking html nodes. The leak was in an
unexpected place - innerHTML with a link was miscomparing because of unexpected string transformation:

xbad: "<a href='?cmd=odb&odb_path=System/Clients/" + key + "'>" + host + "</a>";
good: "<a href=\"?cmd=odb&amp;odb_path=System/Clients/" + key + "\">" + host + "</a>";
Now node leak from my periodic update went from 35 nodes to 2 nodes per update. The performance tool fails to identify where these last 2 nodes are
coming from.

K.O.

Edit Pin to top Mark as spam Delete 2019-09-17
dd1 REPORTER
Forgot to add - the periodic update from mhttpd_init() is also leaking nodes. I will look at it some other time. K.O.

Edit Pin to top Mark as spam Delete 2019-09-17
dd1 REPORTER
after improvement to the �programs� page, the tab is staying at 50-60 Mbytes. promising� K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
Fixed node leak in mhttpd_refresh(): the alarm display was setting e.innerHTML even if it did not change.

There only remains an unavoidable node leak with �mheader_last_updated� where we set the current time every 1 second. If I comment this out, there is no
node leak on the �programs� page.

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
�programs� page memory use now sits around 40 Mbytes. K.O.

Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
Stefan points me to the use of e.firstChild.data instead of e.innerHTML, per https://medium.com/@ok.bayat/fixing-memory-leak-problem-in-javascript-
application-ed3a2d9d92df

K.O.

Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
implemented this for the timestamp update and (i.e.) the �programs� page now leaks 0 nodes. memory use for all pages sits around 40-60 Mbytes. K.O.

Edit Pin to top Mark as spam Delete 2019-09-26
dd1 REPORTER
see problem of high cpu usage again, after google-chrome restarted after an update to latest version. for example, �program� page is 65 Mbytes, uses 20%
CPU. (in an inactive tab). If I open this tab, for maybe 10 seconds, it goes to 100+ Mbytes with big CPU usage (>100%), then drops down to 90 Mbytes, 0%
CPU usage. I do not see any other web pages or tabs doing this. Only our midas pages. WTH!?! K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-10-13
dd1 REPORTER
figured out high cpu usage reported as �rendering�. Open �devtools�, goto �performance�, press Command-Shift-P, start typing �rendering�, select �fps
meter�. A black square will open in top-left, showing graphics activity (frame rate, GPU usage, etc).

Now wait for new message to appear in the top status bar. It will be �yellow� at first, that it will fade to �gray�. During this fading, GPU use is 100% during
about 1 second, FPS is about 50 frames/sec.

K.O.

Edit Pin to top Mark as spam Delete 2019-10-14
dd1 REPORTER
quick google search shows much discussion about css animations using �too much CPU�, i.e. google �css pulsing background�, but no clear way to tell the
browser to slow down. It looks to me like the background-color animation tries to run at maximum possible frame-rate, as if electricity is free. (Since I am
debugging high-cpu and high-memory use of inactive tabs, there is nobody looking at these animations). K.O.

Edit Pin to top Mark as spam Delete 2019-10-14
Stefan Ritt
New messages are displayed with a yellow background and fade to grey after 5 seconds. This is handeled in mhttpd.js around line 2144. You can try to
remove the lines

d.style.setProperty("-webkit-transition", "background-color 3s", "");
d.style.setProperty("transition", "background-color 3s", "");
and see if the CPU load goes down.

Pin to top Mark as spam Delete 2019-10-14
dd1 REPORTER
captured another trace of midas page using 20% cpu in an inactive tab, iconized browser window. capturing is difficult, requires very fast mousing to: select
the right tab, right-click to �inspect�, select �performance� tab, click on �start capture�, and hope that by this time the web page activity does not complete.
this time I got the last 200 ms or so.

what I see is again is �media activity� (only identified as �task�), GPU activity (only identified as �GPU activity�) and main thread activity (identified as an
infinitely repeating sequence of �receive response 206 audio/mpeg�, �receive data 39287 bytes�, �finish loading�, then the same sequence again. 39287 is
the file size of resources/beep.mp3. There is no corresponding network activity, so the loading of beep.mp3 must be coming from cache. On the javascript
console, there are the usual �not allowed to play audio because user did not interact� messages repeating about every 1-2-3 minutes.

I read this as: for reasons unknown, a huge number of audio requests becomes queued (the tab was inactive/iconized for many days) then they start trying to
play (load beep.mp3, do not play it because �not allowed�, move on to the next audio object, load � etc). This is consistent with the cpu use, with the
captured traces and with the quick growth in memory size (beep.mp3 objects are created, consume memory, cannot be free�d until garbage collector runs
later. much later).

The above scenario is impossible with how the current audio playing code is written (only one audio object can exist at a time, new audio object can only be
created after the previous one finished playing).

Two possible explanations: (a) the code running in the web page is not the same code as in mhttpd.js (running an old version from cache) or (b) the code
�one audio object at time� is not working correctly if javascript code is throttled /delayed/stopped in inactive tabs.

Following code will have this problem:

var only_one = null;
function foo() { /* runs from periodic timer */
if (!only_one) {
a = new Audio(�beep.mp3�);
/* throttled/suspended/delayed here */
/* multiple Audio objects created because "only_one" is still null */
only_one = a;
}
}
K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-10-17
dd1 REPORTER
fading background - yes, I found the code. pretty neat. I moved it around to remove the timer - I am suspicious of how the timers run in inactive tabs. but no
time to study it.

but the current problem is clearly with audio objects, and the only audio we have is the periodic playing of beep.mp3. who knew there will be so much trouble.

there is still the unexplained use of GPU, but maybe playing/decoding mp3 files uses the GPU.

I am also puzzled why the status page from midas-2019-03 does not show any of these problems. it just sits there using no memory (50 Mbytes) and no
CPU. perhaps we changed something in the playing of audio files since last March (when midas-2019-03 was tagged).

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-10-17
dd1 REPORTER
For the first time I saw my message �mhttpd_alarm_play: Cannot play alarm sound: previous alarm sound did not finish playing yet� reported on the javascript
console. This confirms my guess that playing of audio is actually delayed and indeed we need to check that the previous audio finished playing before
creating new audio objects. But the check in the current code has a race condition. If the delay/stall is inside �new Audio()�, we will create multiple audio
objects as �last_audio� is still in the �finished playing� state, we only change it after the return from �new Audio()�. K.O.

Edit Pin to top Mark as spam Delete 2019-10-28
dd1 REPORTER
see big improvement. now inactive tabs grow from 50-ish Mbytes to 170-ish Mbytes, then when I open them, there is some cpu use (GC, I guess) and
memory use drops back to 50-ish Mbytes. So we are not leaking any memory anymore. Looking at the console messages, I see that my fixes are helping -
there is messages about attempts to create new Audio() when previous one did not finish yet. K.O.

Edit Pin to top Mark as spam Delete 2019-11-04
dd1 REPORTER
I guess, inactive tabs are throttled by google-chrome so much that their GC (memory garbage collection) does not keep up with our 1/sec data updates. I do
not think we need to keep updating inactive tabs at this high frequency, but I am not sure how to detect if we are active or inactive. Maybe I can detect the
throttling instead. K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-04
dd1 REPORTER
see consistent behaviour from google-chrome:

have all these midas tabs open, inactive, window iconized, typical tab size is 50-ish Mbytes.
google-chrome update arrives
update is installed, all windows and tabs automatically closed, then reopened.
the midas tabs are still inactive, window is iconized
after a few days, see behaviour as described before:
midas tabs use 20-30% CPU, size is 100-ish Mbytes
if I open one of these tabs, it�s cpu usage goes up to 160%, size grows to 250-ish Mbytes, then within 5-10 seconds drops to 100 Mbytes, CPU usage goes
from 160% to zero.
when looking at this, if I am quick enough, I can right-click �inspect�, go to the �performance� tab, and press the �start collecting data� button and I capture
the very tail end of all this strange activity. This is the traces I have been describing so far.
K.O.

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
see big blob of activity:

timer activation
mhttpd_message()
first call to mhttpd_fit_message()
a long cycle of (maybe 10-20) �recalculate size�, �layout�, �parse html�
second call to mhttpd_fit_message()
same long cycle of �
The way I understand this, mhttpd_fit_message() changes the size of some html element that causes the whole window to be re-layed-out.

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
trying to figure out what triggers a long run of the �rasterizer� thread. see a very strange call sequence

timer fires
mhttpd_alarm_play()
mhttpdConfig()
�Layout�
mhttpd_alarm_play() calls mhttpdConfig() (3 times) to find out if alarm sound is enabled, the period, the file name, etc. so far so good. but mhttpdConfig()
does not touch any DOM objects, so why is it shown as calling �Layout�?!?

other than this trace, I see nothing else that would trigger the rasterizer thread�

(note) this time, mhttpd_alarm_play() does not call mhttpd_alarm_play_now(), so �new Audio� and stuff does not enter this picture.

K.O.

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
in the early part of the trace, where I think the meat of �tab is using cpu and memory� is, I see the audio events firing in rapid sequence: loadeddata, canplay,
canplaythrough, rinse, repeat.

It turns out that promise rejection from audio.play() does not stop the loading of the sound file. This is easy to see by attaching the event handlers to these
events and by observing these event handlers print something to the javascript console.

If that is what is happening, it explains what I see: all my previous attempts to prevent the piling up of sound files are unsuccessful, and when I open the
previously inactive tab, all the queued sound files start loading (and not playing per �user did not interact� policy).

google docs suggest using audio.src=�� to cancel loading of sound files and it does seem to work. testing it now.

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
gotcha. came back home, found one tab using about 10% cpu. audio.src=�� is commented out, javascript console is full of this:

Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 234
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 234
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 234
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 235
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 235
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 235
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 236
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 236
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 236
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 237
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 237
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 237
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 238
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 238
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 238
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 239
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 239
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 239
the timestamp is exactly when I opened this tab. so confirmed, a whole bunch of audio files got queued, when I open the tab they all try to play. (there is no
actual sound, all tabs are muted).

now I uncomment audio.src=�� and see what happens.

K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
looks good. an update to google-chrome came in and after installing, I no longer see midas tabs show high cpu usage or high memory use. I think the
audio.src=�� fix is it. I will be committing these fixes to midas. K.O.

Edit Pin to top Mark as spam Delete 2019-11-12
Stefan Ritt
The loop in mhttpd_fit_message() is there for a good reason: I want to display the message in a single line. If it�s too long, I want first to cut the time stamp
and then display it. If it�s still too long, I want to truncate the message and display ��� at the end. The problem is what �too long� is. Nobody can tell you how
much pixel a message on your browser take, because this depends on the installed fonts, the exact character spacing of your browser and so on. So the only
way I could make this happen is to add one char at a time, until we get close to the maximum allowed space. If course this requires a re-layout of the page for
10-20 times, but when your window is in the foreground this is not a problem, since a browser can do this with small CPU load. The �scope� application I use
does 70 frames per second at 30% CPU load. So one could make the loop a bit smarter, like binary search, which would drop the 10-20 iterations to log2(10-
20) ~ 4-5, but still there would be a loop.

How that the update of the messages in the background is suppressed with the hidden API, do you still have that problem or can we consider it fixed?

Stefan

Pin to top Mark as spam Delete 2019-11-18
dd1 REPORTER
see new behaviour - after many days, inactive page size is ~180 Mbytes. 0% CPU use (an improvement from before where there was large CPU use). activate
the tab, nothing much happens, 0% CPU use (again an improvement from before). after about 30 seconds, memory use drops down to the normal 50-70-80
Mbytes. I think what we see is the garbage collector is throttled down and does not keep up with our allocations. Stefan�s new fix reducing polling in inactive
pages from 1/sec to 1/10sec should help with this. K.O.

&#8204;

Edit Pin to top Mark as spam Delete 2019-11-19
dd1 REPORTER
mhttpd_fit_message() - confirmed. I was confused about the function argument.

I thought it is passed an array of messages. no, it is one message string and the loop is over the message string length. The loop is done twice (second time,
with the time/date stamp removed). google-chrome debugger does show that this uses large amount of CPU, mainly to compute d.offsetWidth.

I think I will refactor these loops - instead of growing the message, I will shrink it.

K.O.

Edit Pin to top Mark as spam Delete 4 hours ago
dd1 REPORTER
rewrote mhttpd_fit_message() to reduce CPU use: try to fit complete message, if too long, try to fit message without timestamp, if too long, guess the desired
length assuming all chars have same width, then grow or shrink the message until the size is right. K.O.

Edit Pin to top Mark as spam Delete 17 minutes ago
dd1 REPORTER
changed status to resolved
The main fix is to set audio.src="" in the promise rejection. K.O.

Edit Pin to top Mark as spam Delete just now

What would you like to say?
Assignee
�
Type
bug
Priority
major
Status
resolved
Votes
0 Vote for this issue
Watchers
1 Stop watching
Dismiss this bannerJira Software: the preferred issue tracker for Bitbucket. Join the team!

improvement for midas web page resource use (alarm sound and fit_message)

> > I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser 
> > "task manager" and by other tools.

The final fix is in. (plus a fix from Stefan).

When the audio.play() promise is rejected, one must clear audio.src, otherwise the browser will continue
with loading the audio file (but will not play it at the end).

Normally, this should not be a problem, but in inactive tabs, all activity is throttled down, and it so happens
that these audio objects accumulate (they are in the state of "we are trying to load the sound file, but
browser slows us down so much!"), consume huge amounts of memory (page memory use goes from ~50 Mbytes
to ~100-200 Mbytes) and consume huge amounts of CPU (not clear how, probably it's the firing of "loading", "canplay", etc
event handlers).

It does not help that mhttpd_fit_message() had a performance bug and consumed large amounts of CPU causing even
more slowing down by the be browser.

After adding audio.src="", all this is gone. I see no special CPU use, and I do not see any strange large memory use.

I still sometimes see inactive tabs grow from ~50 Mbytes to amount ~100 Mbytes. After I open them (activate them),
they quickly shrink back to ~50 Mbytes. I conclude that the browser is slowing down the garbage collector in inactive
tabs so much that it does not keep up with our 1/sec data polling.

So Stefan's fix to reduce polling from 1/sec to 1/10sec should help with this, too. (plus reduction of CPU use by fit_message() should
leave more time for the garbage collector to run).

P.S. General rules for browser slow down of inactive tabs seem to be written here:
https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API

Slowdown of timers is written here:
https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/setTimeout#Notes

K.O.

Goto page Previous 1, 2, 3 ... 71, 72, 73 ... 157, 158, 159 Next

ELOG V3.1.4-2e1708b5