ID |
Date |
Author |
Topic |
Subject |
1736
|
15 Nov 2019 |
Konstantin Olchanski | Bug Report | Newly installed MIDAS on OSX: mhttpd crahes | > It is reproducible alright.
Thanks. At first blush, a guess, read_passwords() is not thread-safe and is called from multiple threads, not protected by semaphore. Crash report shows 2 active threads
(one made it is far as processing the mjson rpc, the other one crashed in read_passwords()).
K.O.
> Here are the core dump and the backtrace (I think the former is more informative).
>
>
>
> > > Context: out of the box MIDAS (using cmake) on OSX Mojave.
> > >
> > > Running with mongoose/opensslm installation following instruction here:
> > > https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux
> > >
> > > mhttpd crashing when midas webpage opened with Safari (12.1.2). Usually when opening the "chat" tab but sometimes also with the "message" tab.
> > > mhttpd(11109,0x70000827a000) malloc: *** error for object 0x7f8669501ef0: pointer being freed was not allocated
> > > mhttpd(11109,0x70000827a000) malloc: *** set a breakpoint in malloc_error_break to debug
> > >
> > > No crash if using firefox (70.0.1 (64-bit))
> >
> > I think we also have reports of mhttpd crash on macos with safari from the Dragon experiment,
> > but cannot reproduce the problem.
> >
> > If you can reproduce this, can you capture the crash stack trace?
> >
> > One way to do this is to enable core dumps in odb "/expt/enable core dumps" set to "y", restart mhttpd,
> > wait for the crash. I think macos writes core dumps into /cores/... Or you can run mhttpd inside lldb
> > and wait for the crash. the lldb command to show the stack trace is "bt", but you may need
> > to switch to different threads to see which one actually crashed. I forget what the command
> > for that is.
> >
> > BTW, the mhttpd networking code has not changed in a long time, but an update
> > of mongoose web server library is overdue (to fix a memory leak, at least).
> >
> > K.O. |
1735
|
15 Nov 2019 |
Stefan Ritt | Suggestion | javascript comunication | Very good idea. And thanks for finding the document.hidden solution. I put it in, so give it a try.
Best,
Stefan
> I am currently testing the new history system on the mhttpd side and stumbled over the following issue: typically our user open a lot of midas web-page tabs and keep them open. With the current version this leads after a night typically to a state where the browser is busy with itself and not reacting anymore.
>
> One important reason seems to be that ALL tabs trying to communicate all the time which is totally unnecessary, since I think a hidden tab should stay in a sleeping mode.
>
> I was browsing if there is a way to find out if a tab is active or not, and found the following API which exactly does this:
>
> https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API
>
> Furthermore, the simple
>
> document.hidden
>
> tag, could be used to find out if the page is currently active.
>
> Wouldn't it a good idea to send all midas tabs which are not active into a sleep mode and only reactivate them if they come into focus?
>
> I had a quick look at the JavaScript libs of midas, but I am not quite certain where to best inject this. |
1734
|
15 Nov 2019 |
Pierre Gorel | Bug Report | Newly installed MIDAS on OSX: mhttpd crahes | It is reproducible alright.
Here are the core dump and the backtrace (I think the former is more informative).
> > Context: out of the box MIDAS (using cmake) on OSX Mojave.
> >
> > Running with mongoose/opensslm installation following instruction here:
> > https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux
> >
> > mhttpd crashing when midas webpage opened with Safari (12.1.2). Usually when opening the "chat" tab but sometimes also with the "message" tab.
> > mhttpd(11109,0x70000827a000) malloc: *** error for object 0x7f8669501ef0: pointer being freed was not allocated
> > mhttpd(11109,0x70000827a000) malloc: *** set a breakpoint in malloc_error_break to debug
> >
> > No crash if using firefox (70.0.1 (64-bit))
>
> I think we also have reports of mhttpd crash on macos with safari from the Dragon experiment,
> but cannot reproduce the problem.
>
> If you can reproduce this, can you capture the crash stack trace?
>
> One way to do this is to enable core dumps in odb "/expt/enable core dumps" set to "y", restart mhttpd,
> wait for the crash. I think macos writes core dumps into /cores/... Or you can run mhttpd inside lldb
> and wait for the crash. the lldb command to show the stack trace is "bt", but you may need
> to switch to different threads to see which one actually crashed. I forget what the command
> for that is.
>
> BTW, the mhttpd networking code has not changed in a long time, but an update
> of mongoose web server library is overdue (to fix a memory leak, at least).
>
> K.O. |
1733
|
15 Nov 2019 |
Andreas Suter | Suggestion | javascript comunication | I am currently testing the new history system on the mhttpd side and stumbled over the following issue: typically our user open a lot of midas web-page tabs and keep them open. With the current version this leads after a night typically to a state where the browser is busy with itself and not reacting anymore.
One important reason seems to be that ALL tabs trying to communicate all the time which is totally unnecessary, since I think a hidden tab should stay in a sleeping mode.
I was browsing if there is a way to find out if a tab is active or not, and found the following API which exactly does this:
https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API
Furthermore, the simple
document.hidden
tag, could be used to find out if the page is currently active.
Wouldn't it a good idea to send all midas tabs which are not active into a sleep mode and only reactivate them if they come into focus?
I had a quick look at the JavaScript libs of midas, but I am not quite certain where to best inject this. |
1732
|
12 Nov 2019 |
Konstantin Olchanski | Bug Report | Newly installed MIDAS on OSX: mhttpd crahes | > Context: out of the box MIDAS (using cmake) on OSX Mojave.
>
> Running with mongoose/opensslm installation following instruction here:
> https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux
>
> mhttpd crashing when midas webpage opened with Safari (12.1.2). Usually when opening the "chat" tab but sometimes also with the "message" tab.
> mhttpd(11109,0x70000827a000) malloc: *** error for object 0x7f8669501ef0: pointer being freed was not allocated
> mhttpd(11109,0x70000827a000) malloc: *** set a breakpoint in malloc_error_break to debug
>
> No crash if using firefox (70.0.1 (64-bit))
I think we also have reports of mhttpd crash on macos with safari from the Dragon experiment,
but cannot reproduce the problem.
If you can reproduce this, can you capture the crash stack trace?
One way to do this is to enable core dumps in odb "/expt/enable core dumps" set to "y", restart mhttpd,
wait for the crash. I think macos writes core dumps into /cores/... Or you can run mhttpd inside lldb
and wait for the crash. the lldb command to show the stack trace is "bt", but you may need
to switch to different threads to see which one actually crashed. I forget what the command
for that is.
BTW, the mhttpd networking code has not changed in a long time, but an update
of mongoose web server library is overdue (to fix a memory leak, at least).
K.O. |
1731
|
08 Nov 2019 |
Pierre Gorel | Bug Report | Newly installed MIDAS on OSX: mhttpd crahes | Context: out of the box MIDAS (using cmake) on OSX Mojave.
Running with mongoose/opensslm installation following instruction here:
https://midas.triumf.ca/MidasWiki/index.php/Quickstart_Linux
mhttpd crashing when midas webpage opened with Safari (12.1.2). Usually when opening the "chat" tab but sometimes also with the "message" tab.
mhttpd(11109,0x70000827a000) malloc: *** error for object 0x7f8669501ef0: pointer being freed was not allocated
mhttpd(11109,0x70000827a000) malloc: *** set a breakpoint in malloc_error_break to debug
No crash if using firefox (70.0.1 (64-bit)) |
1730
|
24 Oct 2019 |
Konstantin Olchanski | Bug Report | lazylogger in cmake & max_event_size | > > > The compile option -DHAVE_FTPLIB checked in mdsupport.cxx disappeared if you
> > > compile with cmake.
> >
> > Hi, Stefan - do we still need to support FTP in the logger? In the lazylogger, special support for
> > FTP is not needed, they can you the "script" method and do FTP without our help.
> >
> > I move to remove FTP support from MIDAS. (second? other opinions?)
>
> I oppose to remove FTP support from lazylogger.
Confirmed. FTP support in lazylogger stays.
K.O. |
1729
|
23 Oct 2019 |
Konstantin Olchanski | Forum | Data for key truncated | > I keep on getting messages like this:
> 16:25:35 [fecaen,ERROR] [odb.c:4567:db_get_data,ERROR] data for key
> "/DAQ/params/VX1730/custom/Board 0/Channel 0/Input range" truncated
>
> [ bool fInputRange... ]
> size = sizeof(fInputRange);
> db_get_data(hDb, hSubKey, &fInputRange, &size, TID_BOOL);
>
The error is correct. size of TID_BOOL is 4 byte (uint32_t) and you give is sizeof(bool) instead which is probably not 4.
Note that sizeof(bool) is not well defined, sometimes it is 1 (you need 4), sometimes something else, see
https://stackoverflow.com/questions/4897844/is-sizeofbool-defined-in-the-c-language-standard
A good fix would be to change fInputRange from bool to uint32_t (which is always 4 byte size).
#include <stdint.h>
...
uint32_t fInputRange;
K.O. |
1728
|
21 Oct 2019 |
Vinzenz Bildstein | Forum | Data for key truncated | I keep on getting messages like this:
16:25:35 [fecaen,ERROR] [odb.c:4567:db_get_data,ERROR] data for key
"/DAQ/params/VX1730/custom/Board 0/Channel 0/Input range" truncated
whenever I start my frontend. Input range is defined to be a BOOL and using
odbedit to read it shows:
Key name Type #Val Size Last Opn Mode Value
---------------------------------------------------------------------------
Input range BOOL 1 4 75h 0 RWD y
without any error message. The entry is read using
size = sizeof(fInputRange);
db_get_data(hDb, hSubKey, &fInputRange, &size, TID_BOOL);
where fInputRange is a bool.
Where does this message come from and how can I resolve this? |
1727
|
18 Oct 2019 |
Joseph McKenna | Info | sysmon: New system monitor and performance logging frontend added to MIDAS |
I have written a system monitor tool for MIDAS, that has been merged in the develop branch today: sysmon
https://bitbucket.org/tmidas/midas/pull-requests/8/system-monitoring-a-new-frontend-to-log/diff
To use it, simply run the new program
sysmon
on any host that you want to monitor, no configuring required.
The program is a frontend for MIDAS, there is no need for configuration, as upon initialisation it builds a history display for you. Simply run one instance per machine you want to monitor. By default, it only logs once per 10 seconds.
The equipment name is derived from the hostname, so multiple instances can be run across multiple machines without conflict. A new history display will be created for each host.
sysmon uses the /proc pseudo-filesystem, so unfortunately only linux is supported. It does however work with multiple architectures, so x86 and ARM processors are supported.
If the build machine has NVIDIA drivers installed, there is an additional version of sysmon that gets built: sysmon-nvidia. This will log the GPU temperature and usage, as well as CPU, memory and swap. A host should only run either sysmon or sysmon-nvidia
elog:1727/1 shows the History Display generated by sysmon-nvidia. sysmon would only generate the first two displays (sysmon/localhost and sysmon/localhost-CPU) |
1726
|
15 Oct 2019 |
Stefan Ritt | Suggestion | recover daq and hardware safety. | There is a not-so-well-known function in the ODB to write protect some keys. You can do
odbedit> chmod 1 /Equipment/HV/Demand
which will write protect your Demand values. You see that by doing
odbedit> ls -ls
where you only see a "R" at the right end instead a "RWD". I haven't tried it yet (so better do a dry run yourself), but that should prevent an odb load to overwrite your demand values. To change the values, put some logic on a custom page to unprotect the
values, change them, and then protect them again.
Stefan |
Draft
|
14 Oct 2019 |
Joseph McKenna | Forum | tmfe.cxx - Future frontend design | Hi,
I have been looking at the 2019 workshop slides, I am interested in the C++ future of MIDAS.
I am quite interested in using the object oriented
ALPHA will start data taking in 2021 |
1724
|
14 Oct 2019 |
Stefan Ritt | Bug Report | lazylogger in cmake & max_event_size | > > The compile option -DHAVE_FTPLIB checked in mdsupport.cxx disappeared if you
> > compile with cmake.
>
> Hi, Stefan - do we still need to support FTP in the logger? In the lazylogger, special support for
> FTP is not needed, they can you the "script" method and do FTP without our help.
>
> I move to remove FTP support from MIDAS. (second? other opinions?)
I oppose to remove FTP support from lazylogger. We still use it heavily at PSI. In comparison to the "script" method, it
shows the current speed in MB/s which helps us to diagnose some network problem by writing this number into the
history. The "script" method only give you an integral transfer speed after a file has be completely written.
I'm however not sure who FTP is used in lazylogger. It goes into mdsupport.cxx and I seem to remember that Pierre
wrote the FTP code by hand, so no external library is necessary.
Stefan |
1723
|
10 Oct 2019 |
Stefan Ritt | Bug Report | History data size mismatch | > Yes, we could have
> kept that apart, yes, in this case a double would also work (and not break things), but a bug is a bug...
> I could think of senisble use cases where doubles and ints are mixed and I also know quite a few areas where it makes
> sense to use floats...
I agree with Nik that we should fix this on the midas level. Since it happens in history_schema.cxx which was written by KO, maybe he can have a look.
Stefan |
1722
|
10 Oct 2019 |
Nik Berger | Bug Report | History data size mismatch | >I wonder why do you this via ODB links. The "standard" way of writing to the history should be to create events for an equipment and flag this equipment as being written to the
>history. All variables under /Equipment/<name>/Variables then automatically go into the history and you don't have to worry about ODB links. Only variables not fitting the
>equipment/variables scheme should be dealt with via ODB links, like variables under equipment/statistics or parameters in another ODB tree. In a typical midas experiment, only
>very few variables typically go into the 'System' event. This is however probably not a solution to your problem. If you have a similar structure (doubles plus an odd number of floats)
>under 'variables', you might get the same error.
>
> In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit)
>
We do this in the MuX DAQ and mix things that come directly from MIDAS (the MIDAS trigger rate) and things from the
analyzer (rates in the self-triggering detectors) and some temperatures from yet somewhere else. Yes, we could have
kept that apart, yes, in this case a double would also work (and not break things), but a bug is a bug...
I could think of senisble use cases where doubles and ints are mixed and I also know quite a few areas where it makes
sense to use floats...
Nik |
1721
|
10 Oct 2019 |
Konstantin Olchanski | Bug Report | History data size mismatch | >
> In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit)
>
Padding trouble, mixing "double" and "float" trouble. Ouch.
Best wisdom I received on this: never use "float", always use "double".
I was burned by "float" with following code, which produced the same result from
analyzing 100 files as from analyzing 1000 files. (why did we take data for 10 weeks
instead of 1 week?). Hint: "float" overflows way too quickly, after overflow sum+=1 does not change
the value of "sum". The actual code used ROOT TH1F. Lesson: always use TH1D.
float sum = 0; // should always be "double" !!!
foreach data_file {
foreach data from current data file {
sum += data;
}
}
print sum;
K.O. |
1720
|
06 Oct 2019 |
Stefan Ritt | Bug Report | History data size mismatch | I wonder why do you this via ODB links. The "standard" way of writing to the history should be to create events for an equipment and flag this equipment as being written to the
history. All variables under /Equipment/<name>/Variables then automatically go into the history and you don't have to worry about ODB links. Only variables not fitting the
equipment/variables scheme should be dealt with via ODB links, like variables under equipment/statistics or parameters in another ODB tree. In a typical midas experiment, only
very few variables typically go into the 'System' event. This is however probably not a solution to your problem. If you have a similar structure (doubles plus an odd number of floats)
under 'variables', you might get the same error. I'n in contact with KO to fix this problem at the root level.
Stefan
> Logging a list of variables to the history via links in the history ODB subtree,
> we get messages as follows at every run start:
>
> 19:43:24.009 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
>
> 19:43:24.008 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
>
> 19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
>
> 19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
>
> The history calculates the size of a record from the size of the individual variables, (history_schema.cxx, L2666 ff), whereas the ODB delivers the data aligned/padded to the size of the largest value in the record.
> In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit), leading to a padded response from the ODB, 4 byte longer than the history expects.
> Quick fix: Add another 32 bit dummy variable to the history. Gets rid of the error messages...
> Should probably be fixed at a deeper level... |
1719
|
06 Oct 2019 |
Nik Berger | Bug Report | History data size mismatch | Logging a list of variables to the history via links in the history ODB subtree,
we get messages as follows at every run start:
19:43:24.009 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
19:43:24.008 2019/10/06 [Logger,ERROR] [history_schema.cxx:2676:hs_write_event,ERROR] Event 'System' data size mismatch: expected 412 bytes, got 416 bytes
19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
19:43:23.850 2019/10/06 [Logger,ERROR] [history_schema.cxx:455:hs_write_event,ERROR] Event 'System' data size mismatch count: 25, expected 412 bytes, hs_write_event() called with as much as 416 bytes
The history calculates the size of a record from the size of the individual variables, (history_schema.cxx, L2666 ff), whereas the ODB delivers the data aligned/padded to the size of the largest value in the record.
In our history, a long list of doubles (64 Bit) fas followed by three floats (32 bit), leading to a padded response from the ODB, 4 byte longer than the history expects.
Quick fix: Add another 32 bit dummy variable to the history. Gets rid of the error messages...
Should probably be fixed at a deeper level... |
1718
|
30 Sep 2019 |
Konstantin Olchanski | Forum | MIDAS interface for WAGASCI online monitor | >
> As Thomas said, maybe the simplest thing would be to use the ROOT THttpServer. Honestly, I do
> not think that ROOT was ever meant to act as an online monitor due to its wacky memory
> management and abysmal multithread support. In other words, I think that by using ROOT we would
> inevitably lose some performance.
>
Yes. The previous data analysis frameworks - PAW/PAW++/CERNlib (CERN), NOVA (TRIUMF) - certainly had
support for online monitoring. In CERNlib/PAW the histograms were stored in shared memory,
the analyzer running in the background was filling them, the PAW/PAW++ display was displaying
them "live". I was very surprised to find this function removed/not implemented in ROOT, given
that the same people were behind both projects (Rene Brun & co).
We tried to roll our own implementation of this in ROOTANA/ROODY, with mixed success.
I am glad the JSROOT project finally gained traction and web based "I can see the data" is now
available in ROOT.
>
> Perhaps there are better ways of achieving the same goal. For example, I was leaning towards a
> plotly.js based approach where
>
There are many web/javascript graphics libraries out there, all have the weak spot - how do you
get your data into them?
Going forward, I see us standardizing on JSROOT: https://root.cern.ch/js/
>
> ... send them to the client through the MJSONRPC mechanism ...
>
I am not sure JSROOT have any support for interacting from the web page to the back-end analyzer. Perhaps
we can use the MIDAS MJSONRPC library for this. Hmm... (Note that the ROOT HTTP server is a derivative
of the mongoose web server library, which we use in mhttpd, so I already know how to work it)
>
> I would encode a series of vectors in base64 strings (for better transmission performance)
>
We looked into this when deciding on the data encoding for the midas history data. There is a tradeoff
between network use and cpu use - to save on the network, you try to reduce the data size by using
compressed binary data - to save on the CPU you try to minimize data encoding.
For history data, we gave up on binary json (extra decoding needed), gave up on text json (extra decoding
needed), gave up on compression (extra cpu use for decompression) and use javascript native binary processing
("arraybuffer").
Our thinking is that network bandwidth is usually quite big and is getting bigger, but cpu resource is limited
and is expensive. (mobile devices seems to be stuck with ~2 GHz CPUs; cpu use means battery use and
battery capacity is limited, not improving quickly)
>
> So ROOT it is. I will use the ROOTANA javascript display as a reference. Do you happen to know
> who wrote that part?
>
Yes. See "Contact" at https://root.cern.ch/js/
>
> In that example, you have some "static" histograms that you keep always in memory, while in our
> case the number of channels is so big that we have to dynamically generate the histograms only
> when needed (when the user select a single channel).
>
This requires interaction with the analyzer, requires some kind of RPC mechanism. I am now curious what jsroot
have, also it would not be too hard to add the mjsonrpc library to rootana. Cooperation from ROOT multithreading
is not required: I can queue the RPC requests in a separate (thread safe, non-ROOT) buffer, then process
them in the ROOT main event loop (this is how the ROOTANA histogram server worked in the days when
ROOT had no multithread support at all).
K.O. |
1717
|
29 Sep 2019 |
Pintaudi Giorgio | Forum | MIDAS interface for WAGASCI online monitor | Dear Thomas and Konstantin,
thank you very much for the feedback. I found the ROOTANA javascript display a good source of
information and references.
As Thomas said, maybe the simplest thing would be to use the ROOT THttpServer. Honestly, I do
not think that ROOT was ever meant to act as an online monitor due to its wacky memory
management and abysmal multithread support. In other words, I think that by using ROOT we would
inevitably lose some performance.
Perhaps there are better ways of achieving the same goal. For example, I was leaning towards a
plotly.js based approach where I would encode a series of vectors in base64 strings (for better
transmission performance), send them to the client through the MJSONRPC mechanism, decode them
and then feed them to plotly.js. But in this case, I should study many new libraries
(plotly.js, the library for the base64 encoding, the Gaussian fitting, etc...) and I do not
have the time to do that now: "beam is coming".
So ROOT it is. I will use the ROOTANA javascript display as a reference. Do you happen to know
who wrote that part?
In that example, you have some "static" histograms that you keep always in memory, while in our
case the number of channels is so big that we have to dynamically generate the histograms only
when needed (when the user select a single channel).
Best regards
Giorgio |
|