ID |
Date |
Author |
Topic |
Subject |
1667
|
28 Aug 2019 |
Nick Hastings | Forum | History plot problems for frontend with multiple indicies | Hi Stefan,
thanks for you quick reply.
> My first question would be why are you using several font-ends at all?
Becuase I was following the model used for many of the frontends for the ND280 FGD.
> That makes things more
> complicated than needed. In the normal FE framework, you can define either several equipment
> served by one frontend, or even one equipment linked to several devices. In the MEG experiment
> we have one slow control frontend controlling ~100 devices without problem. In the old days
there
> was a problem that some slow devices could throttle the readout, but since the invention of
multi-
> threaded slow control equipment, each device gets its own thread so they don't block each
other.
Perhaps things have changed in the 10 years since the FGD SC code was written. I can do it
differently but doing it that way seemed naturual since around 90% of the frontend code that I
have see does it that way.
Nick. |
1668
|
28 Aug 2019 |
lcp | Forum | History plot problems for frontend with multiple indicies | hi,
> > That makes things more
> > complicated than needed. In the normal FE framework, you can define either several equipment
> > served by one frontend, or even one equipment linked to several devices. In the MEG experiment
> > we have one slow control frontend controlling ~100 devices without problem. In the old days
> there
> > was a problem that some slow devices could throttle the readout, but since the invention of
> multi-
> > threaded slow control equipment, each device gets its own thread so they don't block each
> other.
>
I agree with Stefan, that it's probably better to run a multi-threaded setup, than individual frontends.
The only place I've ever used the frontend index on startup is when I was testing and building
an eventbuilder.
https://midas.triumf.ca/MidasWiki/index.php/Event_Builder#Example
This might explain, why your history is swapping between frontends, as in the event builder, it gets
reconstructed.
Maybe this helps...
LCP
> Perhaps things have changed in the 10 years since the FGD SC code was written. I can do it
> differently but doing it that way seemed naturual since around 90% of the frontend code that I
> have see does it that way. |
1669
|
29 Aug 2019 |
Ben Smith | Forum | History plot problems for frontend with multiple indicies | Hi Nick,
I confirm that this issue appears when using the MIDAS history driver. The issue does not appear when using the MYSQL history driver.
One solution is to give each frontend instance a different Event ID (see example code below for doing this in frontend_init). The history system did still seem to be confused by the existing FeDummy equipments/events even after making this change. However, after changing EQ_NAME from FeDummy to FeDum (i.e. starting from a clean state history-wise) things behaved normally.
I will note that some experiments definitely have a need for the "-i" method, especially those that run on distributed clusters.
Ben
```
INT frontend_init()
{
sprintf(eq_name, "%s%02d", EQ_NAME, get_frontend_index());
// Ensure each FE gets a different Event ID in the history system (951, 952 etc)
char keyname[128];
HNDLE hkey;
int status;
sprintf(keyname, "/Equipment/%s/Common/Event ID", eq_name);
status = db_find_key(hDB, 0, keyname, &hkey);
if (status != DB_SUCCESS) abort();
WORD new_ev_id = 950 + get_frontend_index();
status = db_set_data_index(hDB, hkey, &new_ev_id, 2, 0, TID_WORD);
if (status != DB_SUCCESS) abort();
return SUCCESS;
}
``` |
Draft
|
29 Aug 2019 |
Nick Hastings | Forum | History plot problems for frontend with multiple indicies | Hi LCP,
thanks for the suggestion and link. Unfortunatly I don't think this explains it.
Nick. |
Draft
|
29 Aug 2019 |
Nick Hastings | Forum | History plot problems for frontend with multiple indicies | Hi Ben,
thanks for your reply. I can confirm that your suggested workaround does indeed
make the problem dissapear.
I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
Thanks,
Nick. |
1672
|
01 Sep 2019 |
Nick Hastings | Forum | History plot problems for frontend with multiple indicies | Hi Ben,
thanks for your reply. I can confirm that your suggested workaround does indeed
make the problem dissapear.
I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
Thanks,
Nick. |
1689
|
16 Sep 2019 |
Konstantin Olchanski | Forum | History plot problems for frontend with multiple indicies | > My first question would be why are you using several font-ends at all? That makes things more
> complicated than needed. In the normal FE framework, you can define either several equipment
> served by one frontend, or even one equipment linked to several devices.
I am the culprit here, as I wrote the original code for T2K/ND280 that Nick is looking at.
At the time, we needed to control multiple units of identical equipment. Most of these equipments
needed to be controlled independently from each other, so we could not/did not want to use
one single frontend executable to control all of them at the same time. For example, for equipment
not in use, we can stop the corresponding frontend. In case of trouble, we can restart
the corresponding frontend without disrupting the frontends for the other equipments.
The successful operation of the T2K/ND280 experiment is sufficient defence for the validity of this approach.
One lesson learned was that the MIDAS frontend framework did not make it easy to have multiple identical frontends
for controlling multiple identical equipments. (typical use is control of 2-3 Wiener power supplies, 1-2-3 UPS
devices, etc). At the time (and today), only the "i NNN" flag is available to tell the frontend "who am I?". To make it
work, one has to use the hard to "%02d" stuff in the equipment name, and there are other complications. For my
"next generation" of frontends, I tried to specialize the frontend executables at compile time using C/C++
preprocessor defines (-Dwiener01, -Dwiener02, etc), this worked better, but still not super happy.
My current solution as implemented by the tmfe frontend framework is to give the user full control
over the command line arguments (mfe.c did not permit any "user arguments" and did not allow
access to argc/argv) and full control over the equipment names (mfe.c equipment names are fixed at compile time).
K.O. |
1691
|
16 Sep 2019 |
Konstantin Olchanski | Forum | History plot problems for frontend with multiple indicies | > it's probably better to run a multi-threaded setup, than individual frontends.
I recommend against using multiple threads if at all possible and unless absolutely required.
Only for one reason: multithreaded c++ programs are notoriously hard to debug.
In addition, one has to face several classes of bugs absent in single-threaded applications:
a) which thread "owns" which object
b) locking of all shared data
c) huge overheads from locking at high data rates (a performance bug)
d) correct locking order, dead locks, live locks
e) incomprehensible core dumps and stack traces
f) race conditions
To control 2 power supplies, run 2 frontend programs, 1 per power supply.
To control 64 frontend cards, run 1 frontend with many threads: 64 (per device) + 1 (main thread) + 1 (RPC handler) + 1
(watchdog) + 1 (common event generator/data transmitter) + 1 (odb/web page status update). You *will* bump into each
and every one of the problems (a) to (f) above.
K.O. |
1692
|
16 Sep 2019 |
Konstantin Olchanski | Forum | History plot problems for frontend with multiple indicies | > thanks for your reply. I can confirm that your suggested workaround does indeed
> make the problem dissapear.
> I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
I think you found the source of the problem, confused event id assignments. To confirm,
can you email me (or post here) the output of odbedit "ls -l /History/Events".
If that's the problem, you can avoid it completely by switching to a history storage method
that does not rely on magic mapping between equipment names and numeric event id's:
try the "FILE" method (set odb /Logger/History/FILE/Active to "y", restart the logger) or
the "MYSQL" method (you will need to setup a mysql database). You tell mhttpd and mhist which
history data to read by setting ODB /History/LoggerHistoryChannel to one of the channel names
from /logger/history/, restart mhttpd. (mhttpd and mhist used to print a message "reading history
data from channel XXX", but somebody removed this message).
K.O. |
1693
|
16 Sep 2019 |
Nick Hastings | Forum | History plot problems for frontend with multiple indicies | Hi Konstantin,
thanks for your reply.
> > thanks for your reply. I can confirm that your suggested workaround does indeed
> > make the problem dissapear.
> > I guess this issue hasn't been seen at T2K since we use MYSQL for the history.
>
> I think you found the source of the problem, confused event id assignments. To confirm,
> can you email me (or post here) the output of odbedit "ls -l /History/Events".
Sorry, do you want this for after I've applied the fix suggested by Ben or with the original code
that I posted.
With the original code it only shows one fe even though both are running:
[local:e666:S]History>ls -l /History/Events
Key name Type #Val Size Last Opn Mode Value
---------------------------------------------------------------------------
1 STRING 1 10 2m 0 RWD FeDummy02
0 STRING 1 16 2m 0 RWD Run transitions
[local:e666:S]History> scl
Name Host
mhttpd localhost
fedummy01 localhost
fedummy02 localhost
ODBEdit localhost
Logger localhost
[local:e666:S]History>ls "/History/Display/Default/Dummy/
Timescale 1h
Zero ylow n
Show run markers y
Show values y
Sort Vars n
Log axis n
Minimum 0
Maximum 0
Variables
FeDummy01:Data
FeDummy02:Data
Label
Colour
#00AAFF
#FF9000
Factor
0
0
Offset
0
0
Buttons
10m
1h
3h
12h
24h
3d
7d
Formula
Show old vars n
> If that's the problem, you can avoid it completely by switching to a history storage method
> that does not rely on magic mapping between equipment names and numeric event id's:
> try the "FILE" method (set odb /Logger/History/FILE/Active to "y", restart the logger) or
> the "MYSQL" method (you will need to setup a mysql database). You tell mhttpd and mhist which
> history data to read by setting ODB /History/LoggerHistoryChannel to one of the channel names
> from /logger/history/, restart mhttpd. (mhttpd and mhist used to print a message "reading history
> data from channel XXX", but somebody removed this message).
Using the orginal code I posted and switching from MIDAS history to FILE history did not seem to
change the random behaviour in the history plots.
Regards,
Nick. |
1696
|
17 Sep 2019 |
Konstantin Olchanski | Forum | History plot problems for frontend with multiple indicies | > [local:e666:S]History>ls -l /History/Events
> Key name Type #Val Size Last Opn Mode Value
> ---------------------------------------------------------------------------
> 1 STRING 1 10 2m 0 RWD FeDummy02
> 0 STRING 1 16 2m 0 RWD Run transitions
Something is very broken. There should be more entries here, at least
there should be entries for "FeDummy01" and usually there is also an entry
for "FeDummy" because one invariably runs fedummy without "-i" at least once.
The fact that changing from "midas" storage to "file" storage makes no difference
also indicates that something is very broken.
I want to debug this.
Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" files
from the "file" storage.
K.O. |
Draft
|
18 Sep 2019 |
Nick Hastings | Forum | History plot problems for frontend with multiple indicies | > > [local:e666:S]History>ls -l /History/Events
> > Key name Type #Val Size Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > 1 STRING 1 10 2m 0 RWD FeDummy02
> > 0 STRING 1 16 2m 0 RWD Run transitions
>
> Something is very broken. There should be more entries here, at least
> there should be entries for "FeDummy01" and usually there is also an entry
> for "FeDummy" because one invariably runs fedummy without "-i" at least once.
>
> The fact that changing from "midas" storage to "file" storage makes no difference
> also indicates that something is very broken.
>
> I want to debug this.
>
> Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat" files
> from the "file" storage.
>
> K.O. |
1699
|
18 Sep 2019 |
Nick Hastings | Forum | History plot problems for frontend with multiple indicies | Hi Konstantin,
> > [local:e666:S]History>ls -l /History/Events
> > Key name Type #Val Size Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > 1 STRING 1 10 2m 0 RWD FeDummy02
> > 0 STRING 1 16 2m 0 RWD Run transitions
>
> Something is very broken. There should be more entries here, at least
> there should be entries for "FeDummy01" and usually there is also an entry
> for "FeDummy" because one invariably runs fedummy without "-i" at least once.
This is a fresh experiment that I started just to test this this issue, that is why there are not many
entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
> The fact that changing from "midas" storage to "file" storage makes no difference
> also indicates that something is very broken.
>
> I want to debug this.
>
> Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat"
files
> from the "file" storage.
When I started this experiment yesterday(?) I disabled the Midas history when I enbled the file
history. Jsut now I reenabled the Midas history, so they are currently both active.
% ls -l work/online/{*.hst,mhf*.dat}
-rw-r--r-- 1 hastings hastings 14996 Sep 17 10:21 work/online/190917.hst
-rw-r--r-- 1 hastings hastings 3292 Sep 18 16:29 work/online/190918.hst
-rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
-rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
-rw-r--r-- 1 hastings hastings 166 Sep 17 10:17
work/online/mhf_1568683062_20190917_run_transitions.dat
And again, just as a sanity check:
% odbedit -c 'ls -l /History/Events'
Key name Type #Val Size Last Opn Mode Value
---------------------------------------------------------------------------
1 STRING 1 10 1m 0 RWD FeDummy02
0 STRING 1 16 1m 0 RWD Run transitions
Regards,
Nick. |
1710
|
27 Sep 2019 |
Konstantin Olchanski | Forum | History plot problems for frontend with multiple indicies | We should fix this for midas-2019-10.
https://bitbucket.org/tmidas/midas/issues/193/confusion-in-history-event-ids
K.O.
> Hi Konstantin,
>
> > > [local:e666:S]History>ls -l /History/Events
> > > Key name Type #Val Size Last Opn Mode Value
> > > ---------------------------------------------------------------------------
> > > 1 STRING 1 10 2m 0 RWD FeDummy02
> > > 0 STRING 1 16 2m 0 RWD Run transitions
> >
> > Something is very broken. There should be more entries here, at least
> > there should be entries for "FeDummy01" and usually there is also an entry
> > for "FeDummy" because one invariably runs fedummy without "-i" at least once.
>
> This is a fresh experiment that I started just to test this this issue, that is why there are not many
> entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
>
> > The fact that changing from "midas" storage to "file" storage makes no difference
> > also indicates that something is very broken.
> >
> > I want to debug this.
> >
> > Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> > with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat"
> files
> > from the "file" storage.
>
> When I started this experiment yesterday(?) I disabled the Midas history when I enbled the file
> history. Jsut now I reenabled the Midas history, so they are currently both active.
>
> % ls -l work/online/{*.hst,mhf*.dat}
> -rw-r--r-- 1 hastings hastings 14996 Sep 17 10:21 work/online/190917.hst
> -rw-r--r-- 1 hastings hastings 3292 Sep 18 16:29 work/online/190918.hst
> -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
> -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
> -rw-r--r-- 1 hastings hastings 166 Sep 17 10:17
> work/online/mhf_1568683062_20190917_run_transitions.dat
>
> And again, just as a sanity check:
>
> % odbedit -c 'ls -l /History/Events'
> Key name Type #Val Size Last Opn Mode Value
> ---------------------------------------------------------------------------
> 1 STRING 1 10 1m 0 RWD FeDummy02
> 0 STRING 1 16 1m 0 RWD Run transitions
>
> Regards,
>
> Nick. |
1986
|
24 Aug 2020 |
Konstantin Olchanski | Forum | History plot problems for frontend with multiple indicies | This turned out to be a tricky problem. I am adding a warning about it in mlogger. This should go into midas-
2020-07. Closing bug #193. K.O.
> We should fix this for midas-2019-10.
>
> https://bitbucket.org/tmidas/midas/issues/193/confusion-in-history-event-ids
>
> K.O.
>
>
>
>
>
> > Hi Konstantin,
> >
> > > > [local:e666:S]History>ls -l /History/Events
> > > > Key name Type #Val Size Last Opn Mode Value
> > > > ---------------------------------------------------------------------------
> > > > 1 STRING 1 10 2m 0 RWD FeDummy02
> > > > 0 STRING 1 16 2m 0 RWD Run transitions
> > >
> > > Something is very broken. There should be more entries here, at least
> > > there should be entries for "FeDummy01" and usually there is also an entry
> > > for "FeDummy" because one invariably runs fedummy without "-i" at least once.
> >
> > This is a fresh experiment that I started just to test this this issue, that is why there are not many
> > entries in /History/Events. I agree though that we should expect to see a FeDummy01 entry.
> >
> > > The fact that changing from "midas" storage to "file" storage makes no difference
> > > also indicates that something is very broken.
> > >
> > > I want to debug this.
> > >
> > > Since you tried the "file" storage, can you send me the output of "ls -l mhf*.dat" in the directory
> > > with the history files? (it should have the "*.hst" files from the "midas" storage and "mhf*.dat"
> > files
> > > from the "file" storage.
> >
> > When I started this experiment yesterday(?) I disabled the Midas history when I enbled the file
> > history. Jsut now I reenabled the Midas history, so they are currently both active.
> >
> > % ls -l work/online/{*.hst,mhf*.dat}
> > -rw-r--r-- 1 hastings hastings 14996 Sep 17 10:21 work/online/190917.hst
> > -rw-r--r-- 1 hastings hastings 3292 Sep 18 16:29 work/online/190918.hst
> > -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy01.dat
> > -rw-r--r-- 1 hastings hastings 867288 Sep 18 16:29 work/online/mhf_1568683062_20190917_fedummy02.dat
> > -rw-r--r-- 1 hastings hastings 166 Sep 17 10:17
> > work/online/mhf_1568683062_20190917_run_transitions.dat
> >
> > And again, just as a sanity check:
> >
> > % odbedit -c 'ls -l /History/Events'
> > Key name Type #Val Size Last Opn Mode Value
> > ---------------------------------------------------------------------------
> > 1 STRING 1 10 1m 0 RWD FeDummy02
> > 0 STRING 1 16 1m 0 RWD Run transitions
> >
> > Regards,
> >
> > Nick. |
2015
|
19 Nov 2020 |
Joseph McKenna | Forum | History plot consuming too much memory |
A user reported an issue that if they were to plot some history data from
2019 (a range of one day), the plot would spend ~4 minutes loading then
crash the browser tab. This seems to effect chrome (under default settings)
and not firefox
I can reproduce the issue, "Data Being Loaded" shows, then the page and
canvas loads, then all variables get a correct "last data" timestamp, then
the 'Updating data ...' status shows... then the tab crashes (chrome)
It seems that the browser is loading all data until the present day (maybe 4
Gb of data in this case). In chrome the tab then crashes. In firefox, I do
not suffer the same crash, but I can see the single tab is using ~3.5 Gb of
RAM
Tested with midas-2020-08-a up until the HEAD of develop
I could propose the user use firefox, or increase the memory limit in
chrome, however are there plans to limit the data loaded when specifically
plotting between two dates? |
2016
|
19 Nov 2020 |
Stefan Ritt | Forum | History plot consuming too much memory | The history code is right now programmes in such a way that when you request
an old time window, then all data from that window until the present date
gets loaded. When we implemented that, this worked fine for data ranges of
several years with a delay of just a few seconds. Of course one can only
load that specific window, but when the user then scrolls right, one has to
append new data to the "right side" of the array stored in the browser. If the
user jumps to another location, then the browser has to keep track of which
windows are loaded and which windows not, making the history code much more
complicated. Therefore I'm only willing to spend a few days of solid work
if this really becomes a problem.
Are you sure that the delay comes from the browser or actually from mhttpd
digging through GBytes of history data? I realized that you need solid state
disks to get a real quick response.
Stefan |
2017
|
20 Nov 2020 |
Joseph McKenna | Forum | History plot consuming too much memory | Poking at the behavior of this, its fairly clear the slow response is from the data
being loaded off an HDD, when we upgrade this system we will allocate enough SSD
storage for the histories.
Using Firefox has resolved this issue for the user's project here
Taking this down a tangent, I have a mild concern that a user could temporarily
flood our gigabit network if we do have faster disks to read the history data. Have
there been any plans or thoughts on limiting the bandwidth users can pull from
mhttpd? I do not see this as a critical item as I can plan the future network
infrastructure at the same time as the next system upgrade (putting critical data
taking traffic on a separate physical network).
> Of course one can only
> load that specific window, but when the user then scrolls right, one has to
> append new data to the "right side" of the array stored in the browser. If the
> user jumps to another location, then the browser has to keep track of which
> windows are loaded and which windows not, making the history code much more
> complicated. Therefore I'm only willing to spend a few days of solid work
> if this really becomes a problem.
For now the user here has retrieved all the data they need, and I can direct others
towards mhist in the near future. Being able to load just a specific window would be
very useful in the future, but I comprehend how it would be a spike in complexity. |
2018
|
20 Nov 2020 |
Stefan Ritt | Forum | History plot consuming too much memory | > Taking this down a tangent, I have a mild concern that a user could temporarily
> flood our gigabit network if we do have faster disks to read the history data. Have
> there been any plans or thoughts on limiting the bandwidth users can pull from
> mhttpd?
I guess this will not be network limiting but CPU limiting of the mhttpd process. But I'm
not 100% sure, depends on the actual hardware. But even if we improve the history
retrieval to "window only", the user could request all data form 2010 to 2020. So one
would need some code which estimates the amount of data, then tell the user "do you really
want that?". But still, a novice user can simply click "yes" without much of a thought. So
in conclusion I believe proper user training is better than software limits. Like the
other guy "I did 'rm -rf /', and now nothing works any more, can you help?".
Stefan |
2028
|
27 Nov 2020 |
Konstantin Olchanski | Forum | History plot consuming too much memory | >
> Tested with midas-2020-08-a up until the HEAD of develop
>
Just so you know, it took myself and Stefan quite a bit of effort
to improve memory and data handling in the new history plots
to be able to plot 1 year of data without bogging down too much. I got
to learn the google-chrome javascript cpu profiler, memory profiler
and the intricacies of javascript shift() and unshift() operators.
Before midas-2020-08-a, pressing the zoom-out button you would never
reach the javascript memory limit, the code would go into "100% cpu use"
and the browser tab will become progressively unresponsive well before
running out of memory. With the original code, our alpha-g history plots
could go back a few weeks at most, with the current code, we can go back
about 11 months. Compared to the old "C" history plots that can
do "last 10 years", no problem.
Loading all the history data into the browser is a design choice.
It has benefits and downsides.
The main benefit is that looking at immediate live data is much easier.
The main downside is that "plot last 10 years" becomes impossible.
As they say "appetite comes during eating", we have learned about these
downsides as we developed the new system. When we started, we did not
know much about javascript memory limits, cpu limits, etc. We did learn
a lot, though.
With the current code, we are limited to loading history data up to 50% of
the javascript memory limit. I know how to change the code to get up to 100%,
but I think it is not worth it, it still does not get as to plot "last 10 year".
We think the solution to recovering "last 10 years" capability is to use
binned data (which the history system can already deliver to javascript).
With binned data, the data volume in Mbytes remains constant, javascript
memory use has an upper-bound (we never use more memory than X Mbytes)
and data movement over the network is reduced.
Another way to look at this - typical display has only 1000-4000 vertical pixels,
it cannot physically display a bigger number of data points (no more
then 1 data point per pixel). So why load 1000000 data points when we only
can plot 1000-4000 of them?
So all the infrastructure for plotting binned data is already there,
but the javascript code still needs to be written. I think the biggest
challenge will be in blending or combining binned and unbinned data
on the same plot or in seamlessly switching the plot between binned and
unbinned data.
K.O. |
|