Back Midas Rome Roody Rootana
  Midas DAQ System, Page 112 of 138  Not logged in ELOG logo
ID Date Author Topic Subjectdown
  1669   29 Aug 2019 Ben SmithForumHistory plot problems for frontend with multiple indicies
Hi Nick,

I confirm that this issue appears when using the MIDAS history driver. The issue does not appear when using the MYSQL history driver.

One solution is to give each frontend instance a different Event ID (see example code below for doing this in frontend_init). The history system did still seem to be confused by the existing FeDummy equipments/events even after making this change. However, after changing EQ_NAME from FeDummy to FeDum (i.e. starting from a clean state history-wise) things behaved normally.

I will note that some experiments definitely have a need for the "-i" method, especially those that run on distributed clusters.

Ben


```
INT frontend_init()
{
   sprintf(eq_name, "%s%02d", EQ_NAME, get_frontend_index());

   // Ensure each FE gets a different Event ID in the history system (951, 952 etc)
   char keyname[128];
   HNDLE hkey;
   int status;
   sprintf(keyname, "/Equipment/%s/Common/Event ID", eq_name);
   status = db_find_key(hDB, 0, keyname, &hkey);
   if (status != DB_SUCCESS) abort();

   WORD new_ev_id = 950 + get_frontend_index();
   status = db_set_data_index(hDB, hkey, &new_ev_id, 2, 0, TID_WORD);
   if (status != DB_SUCCESS) abort();
   return SUCCESS;
}
```
  Draft   29 Aug 2019 Nick HastingsForumHistory plot problems for frontend with multiple indicies
Hi LCP,

thanks for the suggestion and link. Unfortunatly I don't think this explains it.

Nick.
  Draft   29 Aug 2019 Nick HastingsForumHistory plot problems for frontend with multiple indicies
Hi Ben,

thanks for your reply. I can confirm that your suggested workaround does indeed
make the problem dissapear.

I guess this issue hasn't been seen at T2K since we use MYSQL for the history.

Thanks,

Nick.
  1672   01 Sep 2019 Nick HastingsForumHistory plot problems for frontend with multiple indicies
Hi Ben,

thanks for your reply. I can confirm that your suggested workaround does indeed
make the problem dissapear.

I guess this issue hasn't been seen at T2K since we use MYSQL for the history.

Thanks,

Nick.
  1689   16 Sep 2019 Konstantin OlchanskiForumHistory plot problems for frontend with multiple indicies
> My first question would be why are you using several font-ends at all? That makes things more 
> complicated than needed. In the normal FE framework, you can define either several equipment 
> served by one frontend, or even one equipment linked to several devices.

I am the culprit here, as I wrote the original code for T2K/ND280 that Nick is looking at.

At the time, we needed to control multiple units of identical equipment. Most of these equipments
needed to be controlled independently from each other, so we could not/did not want to use
one single frontend executable to control all of them at the same time. For example, for equipment
not in use, we can stop the corresponding frontend. In case of trouble, we can restart
the corresponding frontend without disrupting the frontends for the other equipments.

The successful operation of the T2K/ND280 experiment is sufficient defence for the validity of this approach.

One lesson learned was that the MIDAS frontend framework did not make it easy to have multiple identical frontends 
for controlling multiple identical equipments. (typical use is control of 2-3 Wiener power supplies, 1-2-3 UPS 
devices, etc). At the time (and today), only the "i NNN" flag is available to tell the frontend "who am I?". To make it 
work, one has to use the hard to "%02d" stuff in the equipment name, and there are other complications. For my 
"next generation" of frontends, I tried to specialize the frontend executables at compile time using C/C++ 
preprocessor defines (-Dwiener01, -Dwiener02, etc), this worked better, but still not super happy.

My current solution as implemented by the tmfe frontend framework is to give the user full control
over the command line arguments (mfe.c did not permit any "user arguments" and did not allow
access to argc/argv) and full control over the equipment names (mfe.c equipment names are fixed at compile time).

K.O.
  1691   16 Sep 2019 Konstantin OlchanskiForumHistory plot problems for frontend with multiple indicies
> it's probably better to run a multi-threaded setup, than individual frontends.

I recommend against using multiple threads if at all possible and unless absolutely required.

Only for one reason: multithreaded c++ programs are notoriously hard to debug.

In addition, one has to face several classes of bugs absent in single-threaded applications:

a) which thread "owns" which object
b) locking of all shared data
c) huge overheads from locking at high data rates (a performance bug)
d) correct locking order, dead locks, live locks
e) incomprehensible core dumps and stack traces
f) race conditions

To control 2 power supplies, run 2 frontend programs, 1 per power supply.

To control 64 frontend cards, run 1 frontend with many threads: 64 (per device) + 1 (main thread) + 1 (RPC handler) + 1 
(watchdog) + 1 (common event generator/data transmitter) + 1 (odb/web page status update). You *will* bump into each 
and every one of the problems (a) to (f) above.

K.O.
  1692   16 Sep 2019 Konstantin OlchanskiForumHistory plot problems for frontend with multiple indicies
> thanks for your reply. I can confirm that your suggested workaround does indeed
> make the problem dissapear.
> I guess this issue hasn't been seen at T2K since we use MYSQL for the history.

I think you found the source of the problem, confused event id assignments. To confirm,
can you email me (or post here) the output of odbedit "ls -l /History/Events".

If that's the problem, you can avoid it completely by switching to a history storage method
that does not rely on magic mapping between equipment names and numeric event id's:
try the "FILE" method (set odb /Logger/History/FILE/Active to "y", restart the logger) or
the "MYSQL" method (you will need to setup a mysql database). You tell mhttpd and mhist which 
history data to read by setting ODB /History/LoggerHistoryChannel to one of the channel names 
from /logger/history/, restart mhttpd. (mhttpd and mhist used to print a message "reading history 
data from channel XXX", but somebody removed this message).

K.O.
  1693   16 Sep 2019 Nick HastingsForumHistory plot problems for frontend with multiple indicies
Hi Konstantin,

thanks for your reply.

> > thanks for your reply. I can confirm that your suggested workaround does indeed
> > make the problem dissapear.
> > I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
> 
> I think you found the source of the problem, confused event id assignments. To confirm,
> can you email me (or post here) the output of odbedit "ls -l /History/Events".

Sorry, do you want this for after I've applied the fix suggested by Ben or with the original code 
that I posted.

With the original code it only shows one fe even though both are running:

[local:e666:S]History>ls -l /History/Events
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
1                               STRING  1     10    2m   0   RWD  FeDummy02
0                               STRING  1     16    2m   0   RWD  Run transitions

[local:e666:S]History> scl
Name                Host
mhttpd              localhost           
fedummy01           localhost           
fedummy02           localhost           
ODBEdit             localhost           
Logger              localhost           
[local:e666:S]History>ls "/History/Display/Default/Dummy/
Timescale                       1h
Zero ylow                       n
Show run markers                y
Show values                     y
Sort Vars                       n
Log axis                        n
Minimum                         0
Maximum                         0
Variables
                                FeDummy01:Data
                                FeDummy02:Data
Label
                                
                                
Colour
                                #00AAFF
                                #FF9000
Factor
                                0
                                0
Offset
                                0
                                0
Buttons
                                10m
                                1h
                                3h
                                12h
                                24h
                                3d
                                7d
Formula
                                
                                
Show old vars                   n

> If that's the problem, you can avoid it completely by switching to a history storage method
> that does not rely on magic mapping between equipment names and numeric event id's:
> try the "FILE" method (set odb /Logger/History/FILE/Active to "y", restart the logger) or
> the "MYSQL" method (you will need to setup a mysql database). You tell mhttpd and mhist which 
> history data to read by setting ODB /History/LoggerHistoryChannel to one of the channel names 
> from /logger/history/, restart mhttpd. (mhttpd and mhist used to print a message "reading history 
> data from channel XXX", but somebody removed this message).

Using the orginal code I posted and switching from MIDAS history to FILE history did not seem to 
change the random behaviour in the history plots.

Regards,

Nick.
  1696   17 Sep 2019 Konstantin OlchanskiForumHistory plot problems for frontend with multiple indicies
> [local:e666:S]History>ls -l /History/Events
> Key name                        Type    #Val  Size  Last Opn Mode Value
> ---------------------------------------------------------------------------
> 1                               STRING  1     10    2m   0   RWD  FeDummy02
> 0                               STRING  1     16    2m   0   RWD  Run transitions

Something is very broken. There should be more entries here, at least
there should be entries for "FeDummy01" and usually there is also an entry
for "FeDummy" because one invariably runs fedummy without "-i" at least once.

The fact that changing from "midas" storage to "file" storage makes no difference
also indicates that something is very broken.

I want to debug this.

Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" files
from the "file" storage.

K.O.
  Draft   18 Sep 2019 Nick HastingsForumHistory plot problems for frontend with multiple indicies
> > [local:e666:S]History>ls -l /History/Events
> > Key name                        Type    #Val  Size  Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > 1                               STRING  1     10    2m   0   RWD  FeDummy02
> > 0                               STRING  1     16    2m   0   RWD  Run transitions
> 
> Something is very broken. There should be more entries here, at least
> there should be entries for "FeDummy01" and usually there is also an entry
> for "FeDummy" because one invariably runs fedummy without "-i" at least once.
> 
> The fact that changing from "midas" storage to "file" storage makes no difference
> also indicates that something is very broken.
> 
> I want to debug this.
> 
> Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" files
> from the "file" storage.


> 
> K.O.
  1699   18 Sep 2019 Nick HastingsForumHistory plot problems for frontend with multiple indicies
Hi Konstantin,

> > [local:e666:S]History>ls -l /History/Events
> > Key name                        Type    #Val  Size  Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > 1                               STRING  1     10    2m   0   RWD  FeDummy02
> > 0                               STRING  1     16    2m   0   RWD  Run transitions
> 
> Something is very broken. There should be more entries here, at least
> there should be entries for "FeDummy01" and usually there is also an entry
> for "FeDummy" because one invariably runs fedummy without "-i" at least once.

This is a fresh experiment that I started just to test this this issue, that is why there are not many 
entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
 
> The fact that changing from "midas" storage to "file" storage makes no difference
> also indicates that something is very broken.
> 
> I want to debug this.
> 
> Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" 
files
> from the "file" storage.

When I started this experiment yesterday(?) I disabled the Midas history when I enbled the file 
history. Jsut now I reenabled the Midas history, so they are currently both active.

% ls -l work/online/{*.hst,mhf*.dat}
-rw-r--r-- 1 hastings hastings  14996 Sep 17 10:21 work/online/190917.hst
-rw-r--r-- 1 hastings hastings   3292 Sep 18 16:29 work/online/190918.hst
-rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
-rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
-rw-r--r-- 1 hastings hastings    166 Sep 17 10:17 
work/online/mhf_1568683062_20190917_run_transitions.dat

And again, just as a sanity check:

% odbedit -c 'ls -l /History/Events'
Key name                        Type    #Val  Size  Last Opn Mode Value
---------------------------------------------------------------------------
1                               STRING  1     10    1m   0   RWD  FeDummy02
0                               STRING  1     16    1m   0   RWD  Run transitions

Regards,

Nick.
  1710   27 Sep 2019 Konstantin OlchanskiForumHistory plot problems for frontend with multiple indicies
We should fix this for midas-2019-10.

https://bitbucket.org/tmidas/midas/issues/193/confusion-in-history-event-ids

K.O.





> Hi Konstantin,
> 
> > > [local:e666:S]History>ls -l /History/Events
> > > Key name                        Type    #Val  Size  Last Opn Mode Value
> > > ---------------------------------------------------------------------------
> > > 1                               STRING  1     10    2m   0   RWD  FeDummy02
> > > 0                               STRING  1     16    2m   0   RWD  Run transitions
> > 
> > Something is very broken. There should be more entries here, at least
> > there should be entries for "FeDummy01" and usually there is also an entry
> > for "FeDummy" because one invariably runs fedummy without "-i" at least once.
> 
> This is a fresh experiment that I started just to test this this issue, that is why there are not many 
> entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
>  
> > The fact that changing from "midas" storage to "file" storage makes no difference
> > also indicates that something is very broken.
> > 
> > I want to debug this.
> > 
> > Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> > with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" 
> files
> > from the "file" storage.
> 
> When I started this experiment yesterday(?) I disabled the Midas history when I enbled the file 
> history. Jsut now I reenabled the Midas history, so they are currently both active.
> 
> % ls -l work/online/{*.hst,mhf*.dat}
> -rw-r--r-- 1 hastings hastings  14996 Sep 17 10:21 work/online/190917.hst
> -rw-r--r-- 1 hastings hastings   3292 Sep 18 16:29 work/online/190918.hst
> -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
> -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
> -rw-r--r-- 1 hastings hastings    166 Sep 17 10:17 
> work/online/mhf_1568683062_20190917_run_transitions.dat
> 
> And again, just as a sanity check:
> 
> % odbedit -c 'ls -l /History/Events'
> Key name                        Type    #Val  Size  Last Opn Mode Value
> ---------------------------------------------------------------------------
> 1                               STRING  1     10    1m   0   RWD  FeDummy02
> 0                               STRING  1     16    1m   0   RWD  Run transitions
> 
> Regards,
> 
> Nick.
  1986   24 Aug 2020 Konstantin OlchanskiForumHistory plot problems for frontend with multiple indicies
This turned out to be a tricky problem. I am adding a warning about it in mlogger. This should go into midas-
2020-07. Closing bug #193. K.O.


> We should fix this for midas-2019-10.
> 
> https://bitbucket.org/tmidas/midas/issues/193/confusion-in-history-event-ids
> 
> K.O.
> 
> 
> 
> 
> 
> > Hi Konstantin,
> > 
> > > > [local:e666:S]History>ls -l /History/Events
> > > > Key name                        Type    #Val  Size  Last Opn Mode Value
> > > > ---------------------------------------------------------------------------
> > > > 1                               STRING  1     10    2m   0   RWD  FeDummy02
> > > > 0                               STRING  1     16    2m   0   RWD  Run transitions
> > > 
> > > Something is very broken. There should be more entries here, at least
> > > there should be entries for "FeDummy01" and usually there is also an entry
> > > for "FeDummy" because one invariably runs fedummy without "-i" at least once.
> > 
> > This is a fresh experiment that I started just to test this this issue, that is why there are not many 
> > entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
> >  
> > > The fact that changing from "midas" storage to "file" storage makes no difference
> > > also indicates that something is very broken.
> > > 
> > > I want to debug this.
> > > 
> > > Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> > > with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" 
> > files
> > > from the "file" storage.
> > 
> > When I started this experiment yesterday(?) I disabled the Midas history when I enbled the file 
> > history. Jsut now I reenabled the Midas history, so they are currently both active.
> > 
> > % ls -l work/online/{*.hst,mhf*.dat}
> > -rw-r--r-- 1 hastings hastings  14996 Sep 17 10:21 work/online/190917.hst
> > -rw-r--r-- 1 hastings hastings   3292 Sep 18 16:29 work/online/190918.hst
> > -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
> > -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
> > -rw-r--r-- 1 hastings hastings    166 Sep 17 10:17 
> > work/online/mhf_1568683062_20190917_run_transitions.dat
> > 
> > And again, just as a sanity check:
> > 
> > % odbedit -c 'ls -l /History/Events'
> > Key name                        Type    #Val  Size  Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > 1                               STRING  1     10    1m   0   RWD  FeDummy02
> > 0                               STRING  1     16    1m   0   RWD  Run transitions
> > 
> > Regards,
> > 
> > Nick.
  2015   19 Nov 2020 Joseph McKennaForumHistory plot consuming too much memory

A user reported an issue that if they were to plot some history data from 
2019 (a range of one day), the plot would spend ~4 minutes loading then 
crash the browser tab. This seems to effect chrome (under default settings) 
and not firefox

I can reproduce the issue, "Data Being Loaded" shows, then the page and 
canvas loads, then all variables get a correct "last data" timestamp, then 
the 'Updating data ...' status shows... then the tab crashes (chrome)


It seems that the browser is loading all data until the present day (maybe 4 
Gb of data in this case). In chrome the tab then crashes. In firefox, I do 
not suffer the same crash, but I can see the single tab is using ~3.5 Gb of 
RAM

Tested with midas-2020-08-a up until the HEAD of develop

I could propose the user use firefox, or increase the memory limit in 
chrome, however are there plans to limit the data loaded when specifically 
plotting between two dates?
  2016   19 Nov 2020 Stefan RittForumHistory plot consuming too much memory
The history code is right now programmes in such a way that when you request
an old time window, then all data from that window until the present date
gets loaded. When we implemented that, this worked fine for data ranges of 
several years with a delay of just a few seconds. Of course one can only
load that specific window, but when the user then scrolls right, one has to
append new data to the "right side" of the array stored in the browser. If the
user jumps to another location, then the browser has to keep track of which 
windows are loaded and which windows not, making the history code much more 
complicated. Therefore I'm only willing to spend a few days of solid work
if this really becomes a problem. 

Are you sure that the delay comes from the browser or actually from mhttpd
digging through GBytes of history data? I realized that you need solid state
disks to get a real quick response.

Stefan
  2017   20 Nov 2020 Joseph McKennaForumHistory plot consuming too much memory
Poking at the behavior of this, its fairly clear the slow response is from the data 
being loaded off an HDD, when we upgrade this system we will allocate enough SSD 
storage for the histories.

Using Firefox has resolved this issue for the user's project here

Taking this down a tangent, I have a mild concern that a user could temporarily 
flood our gigabit network if we do have faster disks to read the history data. Have 
there been any plans or thoughts on limiting the bandwidth users can pull from 
mhttpd? I do not see this as a critical item as I can plan the future network 
infrastructure at the same time as the next system upgrade (putting critical data 
taking traffic on a separate physical network).

> Of course one can only
> load that specific window, but when the user then scrolls right, one has to
> append new data to the "right side" of the array stored in the browser. If the
> user jumps to another location, then the browser has to keep track of which 
> windows are loaded and which windows not, making the history code much more 
> complicated. Therefore I'm only willing to spend a few days of solid work
> if this really becomes a problem. 

For now the user here has retrieved all the data they need, and I can direct others 
towards mhist in the near future. Being able to load just a specific window would be 
very useful in the future, but I comprehend how it would be a spike in complexity.
  2018   20 Nov 2020 Stefan RittForumHistory plot consuming too much memory
 > Taking this down a tangent, I have a mild concern that a user could temporarily 
> flood our gigabit network if we do have faster disks to read the history data. Have 
> there been any plans or thoughts on limiting the bandwidth users can pull from 
> mhttpd?

I guess this will not be network limiting but CPU limiting of the mhttpd process. But I'm 
not 100% sure, depends on the actual hardware. But even if we improve the history 
retrieval to "window only", the user could request all data form 2010 to 2020. So one 
would need some code which estimates the amount of data, then tell the user "do you really 
want that?". But still, a novice user can simply click "yes" without much of a thought. So 
in conclusion I believe proper user training is better than software limits. Like the 
other guy "I did 'rm -rf /', and now nothing works any more, can you help?".

Stefan
  2028   27 Nov 2020 Konstantin OlchanskiForumHistory plot consuming too much memory
> 
> Tested with midas-2020-08-a up until the HEAD of develop
> 

Just so you know, it took myself and Stefan quite a bit of effort
to improve memory and data handling in the new history plots
to be able to plot 1 year of data without bogging down too much. I got
to learn the google-chrome javascript cpu profiler, memory profiler
and the intricacies of javascript shift() and unshift() operators.

Before midas-2020-08-a, pressing the zoom-out button you would never
reach the javascript memory limit, the code would go into "100% cpu use"
and the browser tab will become progressively unresponsive well before
running out of memory. With the original code, our alpha-g history plots
could go back a few weeks at most, with the current code, we can go back
about 11 months. Compared to the old "C" history plots that can
do "last 10 years", no problem.

Loading all the history data into the browser is a design choice.

It has benefits and downsides.

The main benefit is that looking at immediate live data is much easier.

The main downside is that "plot last 10 years" becomes impossible.

As they say "appetite comes during eating", we have learned about these
downsides as we developed the new system. When we started, we did not
know much about javascript memory limits, cpu limits, etc. We did learn
a lot, though.

With the current code, we are limited to loading history data up to 50% of
the javascript memory limit. I know how to change the code to get up to 100%,
but I think it is not worth it, it still does not get as to plot "last 10 year".

We think the solution to recovering "last 10 years" capability is to use
binned data (which the history system can already deliver to javascript).
With binned data, the data volume in Mbytes remains constant, javascript
memory use has an upper-bound (we never use more memory than X Mbytes)
and data movement over the network is reduced.

Another way to look at this - typical display has only 1000-4000 vertical pixels,
it cannot physically display a bigger number of data points (no more
then 1 data point per pixel). So why load 1000000 data points when we only
can plot 1000-4000 of them?

So all the infrastructure for plotting binned data is already there,
but the javascript code still needs to be written. I think the biggest
challenge will be in blending or combining binned and unbinned data
on the same plot or in seamlessly switching the plot between binned and
unbinned data.

K.O.
  2029   27 Nov 2020 Konstantin OlchanskiForumHistory plot consuming too much memory
>
> With the current code, we are limited to loading history data up to 50% of
> the javascript memory limit.
>

The javascript memory limit itself seems to be a moving target. (google javascript 
memory limit, and good luck!).

Historically, javascript did not have any memory or cpu use limits, but with
the raise of abusive web sites, bitcoin miners, etc, I see browsers clamp down
on allowed/allocated CPU use (inactive tabs are throttled down). memory use
is already clamped down severely, on a 64 GB computer, a browser tab
can only allocate a handful of GBs.

This throttling of browser tabs is already intrusive enough that we need
to be careful in programming midas web pages. for examples throttled events
are not firing at the same rate or in the same order as in active tabs.

One logical conclusion of these restrictions could be that, eventually,
google-chrome permits only just enough cpu and memory to run gmail.

K.O.
  2030   27 Nov 2020 Konstantin OlchanskiForumHistory plot consuming too much memory
> 
> Are you sure that the delay comes from the browser or actually from mhttpd
> digging through GBytes of history data?
>

I think we will need to address this question "head-on". The history plot
will need to display the following information:

"time to load data from disk: N seconds, time to transfer data to javascript: M 
seconds, time to make the plot: Q seconds".

The second and third items are already available, the first one will need
to be computed in mhttpd and passed to javascript.

K.O.
ELOG V3.1.4-2e1708b5