17 Jun 2019, Konstantin Olchanski, Bug Fix, removed modbset() from mhttpd.js
|
If it's a function intended for general use, it should be in midas.js.
The documentation for such a function should be made very clear that:
a) it does not actually write to ODB, instead it queues a request for writing, this request is executed at a later (undefined) time.
b) the following javascript code results in undefined behaviour:
modbset("/foo/bar", 1); // queue rpc request
modbset("/foo/bar", 2); // queue rpc request
// is ODB /foo/bar set to 1 or 2?
Why? The best I know javascript does not require for RPCs to execute in-order, so the second RPC may be issued
before the first one. More likely both RPCs are started roughly at the same time (i.e. in different RPC worker
threads), in which case we do not know in which order they will be processed by mhttpd, which is also
multithreaded and does not necessarily execute requests in the same order as (i.e.) they connect to the rpc port 8080.
To answer the question "1 or 2", the answer is neither, as at that point in the code, the RPC requests
probably have not started executing yet, and even if they did, mhttpd most likely did not write anything
into odb yet, as processing RPC requests takes much longer than executing a few lines of javascript.
So to ensure correct sequence of writes and to ensure that something was actually written to odb,
one has to roll out the full ladder of promise event handlers.
Is correct sequence of writes important? Maybe yes, maybe no.
But if you use modbset() without being aware of these issues, you will write code
for ramping high voltage like this:
modbset("/eq/hv/set/voltage", 0); // set voltage to zero
modbset("/eq/hv/set/hv_enable", 1); // enable high voltage
modbset("/eq/hv/set/voltage", 50); // start slow
modbset("/eq/hv/set/voltage", 1000); // ramp to half-way
modbset("/eq/hv/set/voltage", 1900); // stop a bit before the final voltage
modbset("/eq/hv/set/voltage", 2000); // !!! ramping from 0 to 2000 should never be done in one step !!!
And as the author of all this RPC code, I promise that some day you will see the voltage
on the detector go directly from 0 to 2000 (then up and down to 50, 1000 and 1900).
To me, this makes helpful helper functions actually dangerous to use.
Now that I have experience with sync RPC (from C/C++) and async RPC (with Promises in javascript),
I would say that synchronous C/C++ style programming is much easier
and much less verbose and much easier to read and to modify/adjust compared to the javascript-style
ladder of nested event handlers.
I now see that synchronous requests are again permitted if one uses the "Web Worker API". Maybe we should revisit
the MIDAS javascript RPC code and see if we can use this web worker stuff to officially support synchronous RPC requests,
again.
About the sync rpc deprecation, read more here:
https://stackoverflow.com/questions/30876093/will-chrome-and-other-browsers-drop-support-for-synchronous-xmlhttprequest
K.O.
> I disagree. The modbset() function is used in many custom pages at PSI because people are tired of typing mjsonrpc_db_paste([path],[value]) vs. modbset(path, value). We need to keep
> modbset() which is well documented at
>
> https://midas.triumf.ca/MidasWiki/index.php/Custom_Page#modbset
>
> Since modbset() does call the underlying mjsonrpc_db_paste(), it is as good or bad as that function. Plus it adds standard error handling to avoid the need of catching errors for each and
> every mjsonrpc_db_paste() call. If it is believed that modbset() has a problem, then this should be fixed in the source code of modbset(). Removing that function is not an option.
>
> Stefan |
17 Jun 2019, Konstantin Olchanski, Bug Fix, restored modbset() in midas.js
|
> The modbset() function is used in many custom pages at PSI ...
I restored this function in midas.js with a documentation blurb warning about it's asynchronous nature and about the possibility of out-of-order writes.
The more I think about this, the more it looks to me that we should look at this web worker api business to support
synchronous communications for MIDAS web pages.
K.O. |
17 Jun 2019, Stefan Ritt, Bug Fix, removed modbset() from mhttpd.js
|
A ladder of promise event handlers is certainly one possibility to enforce the order of ODB writes, but I wonder if we could so something simpler:
- modbset creates an object remembering the status of the RPC request. Initially, this object receives the status "open request"
- when the rpc call got executed successfully, the callback sets the state of the above object to "request succeeded" or "request failed" (in case of error)
- if a new modbset comes BEFORE the previous one has completed, the function queues the new request in a data field of the above object
- if a rpc call finishes, and a queued new rpc request is present, it gets executed
This would be relatively easy to be implemented and keep the order of the rpc calls. Does that make sense?
Best,
Stefan |
18 Jun 2019, Konstantin Olchanski, Bug Fix, removed modbset() from mhttpd.js
|
> A ladder of promise event handlers is certainly one possibility to enforce the order of ODB writes, but I wonder if we could so something simpler:
>
> - modbset creates an object remembering the status of the RPC request. Initially, this object receives the status "open request"
> - when the rpc call got executed successfully, the callback sets the state of the above object to "request succeeded" or "request failed" (in case of error)
> - if a new modbset comes BEFORE the previous one has completed, the function queues the new request in a data field of the above object
> - if a rpc call finishes, and a queued new rpc request is present, it gets executed
>
> This would be relatively easy to be implemented and keep the order of the rpc calls. Does that make sense?
>
Yes, this is a neat idea, I am really happy with how a complete rpc request can be held by one object, and we can make queues of them, etc.
Anyhow here is the proof of the pudding. I added a test to example.html, there are two buttons, one makes 5 modbset() calls, second has a ladder of 5 db_paste calls. Then I watch
the result in odbedit. 1, 2, 3, 4, 5 is the modbset(), 6, 7, 8, 9, 10 is the ladder of db_paste calls:
$ odbedit
[local:javascript1:S]/>watch Example/int
Watch key "/Example/int" to be modified, abort with any key
/Example/int = 1
/Example/int = 2
/Example/int = 3
/Example/int = 4
/Example/int = 5
/Example/int = 1
/Example/int = 5 <== fault
/Example/int = 5
/Example/int = 5
/Example/int = 5
/Example/int = 1
/Example/int = 2
/Example/int = 3
/Example/int = 5 <== 4 and 5 reversed
/Example/int = 4 <== 4 and 5 reversed
/Example/int = 6
/Example/int = 7
/Example/int = 8
/Example/int = 9
/Example/int = 10
/Example/int = 6
/Example/int = 8 <== should be 7
/Example/int = 8
/Example/int = 9
/Example/int = 10
/Example/int = 6
/Example/int = 7
/Example/int = 8
/Example/int = 9
/Example/int = 10
I immediately notice that we have a race condition between the RPCs, db_watch notifications and db_get_value() in the watch handler:
there are 5 rpcs, 5 watch notifications, 5 calls to db_get_value() in the watch handler, but sometimes the handler is too slow
and the data in odb changes before it reads it, thus duplicate values (missing "7" above). (The old db_open_record() had a "hidden"
db_get_value() inside it, while db_watch() requires an explicit db_get_value() call, making it obvious why we get
the wrong (newer) data sometimes).
Possible fixes for this is to slow down the RPCs (the race condition is still there, probability is reduced) or send the changed
data as part of the notification. If this were C/C++, a "sleep(1)" between modbset() calls would have fixed it,
but there is no sleeping and waiting in javascript. (I guess one could use a ladder of timers).
Other than that, I am surprised how easy it was to see that indeed out-of-order RPCs can happen, see the case
of out-of-order 4 and 5 above. It only took me maybe 5-10 clicks on the button to see that. I expected that I would
need to try several browsers or use a slow network connection, but here it is, on my home mac, localhost network,
google chrome browser.
Below is the test code. I do NOT vote that everybody should use ladders of db_paste calls.
function test_modbset() {
modbset("/example/int", 1);
modbset("/example/int", 2);
modbset("/example/int", 3);
modbset("/example/int", 4);
modbset("/example/int", 5);
}
function test_chained_db_paste() {
var paths = [ "/example/int" ];
mjsonrpc_db_paste(paths,[6]).then(function(rpc) {
mjsonrpc_db_paste(paths,[7]).then(function(rpc) {
mjsonrpc_db_paste(paths,[8]).then(function(rpc) {
mjsonrpc_db_paste(paths,[9]).then(function(rpc) {
mjsonrpc_db_paste(paths,[10]).then(function(rpc) {
// nothing
}).catch(function(error){mjsonrpc_error_alert(error);});
}).catch(function(error){mjsonrpc_error_alert(error);});
}).catch(function(error){mjsonrpc_error_alert(error);});
}).catch(function(error){mjsonrpc_error_alert(error);});
}).catch(function(error){mjsonrpc_error_alert(error);});
}
</script>
<input type=button value='test modbset()' onClick='test_modbset();'></input>
<input type=button value='test chained db_paste()' onClick='test_chained_db_paste();'></input>
K.O. |
18 Jun 2019, Stefan Ritt, Bug Fix, removed modbset() from mhttpd.js
|
Just to make this point clear: The "write-to-odb-read-via-hotlink" was never meant to guarantee the receiving side to see each change. If changes happen too often, updates might get lost. If one relies on the
sequence of updates, one should use direct RPC calls to the frontend or use a midas buffer and encode updates in events.
Stefan |
18 Jun 2019, Konstantin Olchanski, Bug Fix, removed modbset() from mhttpd.js
|
> Just to make this point clear: The "write-to-odb-read-via-hotlink" was never meant to guarantee the receiving side to see each change. If changes happen too often, updates might get lost. If one relies on the
> sequence of updates, one should use direct RPC calls to the frontend or use a midas buffer and encode updates in events.
I recommend that people use the jrpc mechanism that does an RPC directly from javascript into the frontend.
It passes 2 strings as arguments (command and data value). Arbitrary objects can be passed by encoding
the data in json (use mjson.h to decode it in the frontend). A string is returned back to javascript (again, encode
arbitrary data as json, use the mjson.h library).
Call sequence:
javascript -> (http) -> mhttpd -> (MIDAS RPC call) -> frontend -> (write, read, frob hardware) -> frontend -> (MIDAS RPC reply) -> mhttpd -> (http reply) -> javascript
Example of all this is in example.html and fetest.cxx:
javascript side code: mjsonrpc_call("jrpc", { "client_name":"fetest", "cmd":"xxx", "args":"xxx" })
frontend side code:
INT rpc_callback(INT index, void *prpc_param[])
{
const char* cmd = CSTRING(0);
const char* args = CSTRING(1);
char* return_buf = CSTRING(2);
int return_max_length = CINT(3);
cm_msg(MINFO, "rpc_callback", "--------> rpc_callback: index %d, max_length %d, cmd [%s], args [%s]", index, return_max_length, cmd, args);
... do stuff ... put result into string "tmp"
strlcpy(return_buf, tmp, return_max_length);
return RPC_SUCCESS;
}
... somewhere in frontend_init(), register the RPC:
#ifdef RPC_JRPC
status = cm_register_function(RPC_JRPC, rpc_callback);
assert(status == SUCCESS);
#endif
K.O. |
14 Aug 2019, Konstantin Olchanski, Bug Fix, incorrect recursion in ss_suspend() via the user event handler
|
The ROOTANA midas analyzer uncovered a problem with recursive use of ss_suspend().
When running in graphical mode, the ROOT graphics main event loop was calling
ss_suspend(0) (no MSG_BM) which would sometimes call the user event handler (if a new
event arrives into the midas event buffer). Because this loop was already running in the
user event handler, there was a crash.
This is the calling sequence leading to the incorrect recursion: (from system.cxx comments
to ss_suspend())
analyzer ->
-> cm_yield() in the main loop
-> ss_suspend(0)
-> MSG_BM message arrives in the UDP socket
-> ss_suspend_process_ipc(0)
-> cm_dispatch_ipc()
-> bm_push_event()
-> bm_push_buffer()
-> bm_read_buffer()
-> bm_dispatch_event()
-> user event handler
-> user event handler ROOT graphics main loop needs to sleep
-> ss_suspend(0) <--- should be ss_suspend(MSG_BM)!!!
-> MSG_BM message arrives in the UDP socket
-> ss_suspend_process_ipc(0) <- should be ss_suspend_process_ipc(MSG_BM)!!!
-> cm_dispatch_ipc() <- without MSG_BM, calling cm_dispatch_ipc() again
-> bm_push_event()
-> bm_push_buffer()
-> bm_read_buffer()
-> bm_dispatch_event()
-> user event handler <---- called recursively, very bad!
The proper fix for this is to always call ss_suspend(MSG_BM) from the user event handler
and ss_suspend(MSG_ODB) from the user db_watch handler.
In this second case, ss_suspend(MSG_OBM) will lose/ignore/discard db_watch notifications,
if you do not want that, call ss_suspend(0) and be ready for recursive calls to your
db_watch handler. (this can happen in a frontend program that acts on changes in ODB and
where some of these actions may require sleeping via ss_suspend()).
ss_suspend(MSG_BM) will discard MSG_BM messages, which is not a problem as
bm_wait_for_events() and cm_yield() will immediately poll the event buffer and there will be
no delay in receiving new events.
Until commit 99d6e90 there were problems in ss_suspend(MSG_BM) and recursive calls to
the user event handler were still possible. It is now fixed in this and the previous commits.
K.O. |
27 Sep 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use
|
I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser
"task manager" and by other tools.
For example, size of the "status" page was observe to reach 200, 600 and even 900 Mbytes. The "programs" page (which
does not have nearly as much stuff as the status page), was observed to reach 200-600 Mbytes. This is comparable to the
New York Times front page, which has much more stuff, but usually runs at about 200 Mbytes. (they do force a periodic full
page reload, to deal with exactly this same type of trouble, I suspect).
Also I observed the midas web pages consume an unusual amount of CPU - 5-10-15% - all in inactive tabs in minimized
windows.
All this was quite noticeable in my oldish mac laptop with only 8 GBytes of RAM.
Using the google-chrome performance analyzer I was able to identify the reason of high memory use - our 1/sec periodic
updates leak "too many" DOM "nodes" and I suspect that due to throttling of inactive tabs, the garbage collector simply
does not keep up with us.
(Note that javascript features automatic memory management with garbage collection. In practice in means that where in
C/C++ we have malloc() and free(), in javascript we only have malloc() and no free(), and cannot explicitly release memory
we know we no longer need. In the C/C++ sense, all memory allocations are leaked, and one relies on a janitor to "clean it all
up" eventually, later).
The source of node leakage was unexpected (unexpected to me). It turns out that each assignment to e.innerHTML creates
a new node, even if the new contents is the same as the old contents. (also the html parser has to run, consuming extra cpu
cycles).
Obvious solution is to write code like this:
if (v !== e.innerHTML) { e.innerHTML = v };
This helped quite a bit on the "programs" page, but not as much as expected, and hardly at all on the "status" page.
It turns out, that read of innerHTML does not necessarily return the same string as it was written into it.
For example, if "v" is "a&b", e.innerHTML will return "a&b" and the comparison will misfire.
There is more cases like this, see the section "Test set and get e.innerHTML" on the "example" midas page.
To help dealing with this, I suggest that instead of "inline" comparison (as above), one writes this:
mhttpd_set_if_changed(e, v);
Then to check that the comparison is effective, go to mhttpd.js and uncomment the console.log() call in
mhttpd_set_if_changed(), reload the page and look at the javascript console to see all calls that result
in assignment of innerHTML (and leakage of DOM nodes).
This done, after replacing many "&" with "&" and many "\'" with "\"", node leakage on the "programs" page was reduced
to 1 node per 1/sec update: the unavoidable change to the timestamp on the top-right of the page.
Luckily, Stefan pointed me to the solution for this: use of e.firstChild.data instead of e.innerHTML. The only quirk is that the
node should not be empty, which was easy to arrange by setting the initial value of the timestamp to a dummy value.
With these changes, the "programs" page (and most other pages) now leak 0 nodes (from the 1/sec periodic updates).
There is still some small memory leakage from making the RPC requests and from receiving the RPC replies, but the
garbage collector seems to have no trouble with them.
Typical memory use for all midas pages is now 50-60 Mbytes (down from 100-200 Mbytes).
The "status" page took a bit more work to fix due to it's curious coding, but it, too now uses 50-60 Mbytes as well. It still
leaks quite a few nodes (to be fixed!), but the garbage collector seems to keep up with the allocations.
K.O. |
27 Sep 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use (alarm sound)
|
> I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser
> "task manager" and by other tools.
>
> For example, size of the "status" page was observe to reach 200, 600 and even 900 Mbytes.
> [this was fixed by using set_if_changed(e, v);
>
> Also I observed the midas web pages consume an unusual amount of CPU - 5-10-15% - all in inactive tabs in minimized
> windows.
>
The case of high CPU use turned out to be quite nasty.
The symptoms:
- the "programs" page in an inactive tab in a minimized window sits "doing nothing" for a day or two.
- uses about 0 to 0.1 to 1% CPU and 40-50-60 Mbytes of RAM (after the previous improvements)
- suddenly I see it use 10-15-20% CPU, continuously, non stop
- I open this tab
- suddenly, CPU use goes to 100%, memory use quickly grows from 40-50-60 Mbytes to 100-200 Mbytes.
- after a few seconds everything settles down, CPU use is back to 0-0.1-1%, but memory use does not go down.
- WTH?!?
The culprit turned out to be the playing of the alarm sound. (I have all tabs "muted" by default, also speakers usually powered down).
If I comment-out the playing of the alarm sound, this problem goes away completely. Pretty conclusive, I think.
After adding lots of debug console.log() calls, I think I identified the problem: audio objects were being created,
but they were not starting to play their sound files. When I opened the tab, all of them (about 400) at the same time
loaded the mp3 file (resulting in memory use going from 50 Mbytes to 190 Mbytes, typical) and started playing
(as seen on the audio event activity in the cpu profile traces from the google-chrome "performance" tool).
I think I am looking at an unexpected interaction between audio objects and google-chrome throttling of inactive tabs.
To muddy the waters some more, google-chrome periodically fails audio.play() with an exception to the effect of
"we will not play audio because user is not interacting with this page enough". See
https://bitbucket.org/tmidas/midas/issues/191/exception-on-audioplay
Now I think I have this sort of fixed. I have to handle the audio.play() failure (which is not a normal exception,
but a rejected promise, the handler is quite different), and I do not allow creating new audio objects if previous
audio object did not finish playing.
(note the "normal" timing: periodic update every 1 sec, playing of alarm sound event 60 seconds, length of alarm sound file is 3 sec,
two sound files should never overlap. now a console.log message is printed if overlap is detected)
This leaves us with the problem of alarm sound not playing "because the user didn't interact with the document first",
and I think there is nothing I can do about that.
K.O.
P.S. Another quirk is I discovered: go to the "config" page and press the new buttons "play test sound" and "speak test message". In muted
tabs, the test sound will not sound, but the test message will be shouted out loudly. This seems inconsistent to me. Unwanted audio/video ads
are blocked but loud shouting of "shave with burma-shave" is permitted. I also wonder if speaking is subject to this
"user did not interact" business. If not, we could replace the playing of our relaxing alarm beep with the yelling of "alarm! alarm! alarm!".
K.O. |
28 Nov 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use (alarm sound and fit_message)
|
> > I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser
> > "task manager" and by other tools.
The final fix is in. (plus a fix from Stefan).
When the audio.play() promise is rejected, one must clear audio.src, otherwise the browser will continue
with loading the audio file (but will not play it at the end).
Normally, this should not be a problem, but in inactive tabs, all activity is throttled down, and it so happens
that these audio objects accumulate (they are in the state of "we are trying to load the sound file, but
browser slows us down so much!"), consume huge amounts of memory (page memory use goes from ~50 Mbytes
to ~100-200 Mbytes) and consume huge amounts of CPU (not clear how, probably it's the firing of "loading", "canplay", etc
event handlers).
It does not help that mhttpd_fit_message() had a performance bug and consumed large amounts of CPU causing even
more slowing down by the be browser.
After adding audio.src="", all this is gone. I see no special CPU use, and I do not see any strange large memory use.
I still sometimes see inactive tabs grow from ~50 Mbytes to amount ~100 Mbytes. After I open them (activate them),
they quickly shrink back to ~50 Mbytes. I conclude that the browser is slowing down the garbage collector in inactive
tabs so much that it does not keep up with our 1/sec data polling.
So Stefan's fix to reduce polling from 1/sec to 1/10sec should help with this, too. (plus reduction of CPU use by fit_message() should
leave more time for the garbage collector to run).
P.S. General rules for browser slow down of inactive tabs seem to be written here:
https://developer.mozilla.org/en-US/docs/Web/API/Page_Visibility_API
Slowdown of timers is written here:
https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/setTimeout#Notes
K.O. |
28 Nov 2019, Konstantin Olchanski, Bug Fix, improvement for midas web page resource use (alarm sound and fit_message)
|
> > > I noticed that midas web pages consume unexpectedly large amount of resources, as observed by the chrome browser
> > > "task manager" and by other tools.
The work on this problem has been blogged in the bitbucket issue tracker:
https://bitbucket.org/tmidas/midas/issues/158/midas-status-page-memory-leak
K.O.
Below is a dump of the issue for posterity ---
Team Midas MIDAS related packages midas Issues
midas status page memory leak
Create issue
Issue #158 RESOLVEDOpen Workflow More Edit
dd1 created an issue 2018-12-25
I have the midas status page (https://daq16.triumf.ca/) open in macos google chrome 71.0.3578.98 and I watch in the "task manager" how the memory use is
246 Mbytes and growing at around 1 Mbyte every 2-3 seconds. CPU use is around 3-5%, network use is 47 kBytes/sec. The slowly growing memory use
indicates that we have a memory leak. (Note that javascript uses "automatic garbage collection" memory management, which does not eliminate memory
leaks. Only capability to explicitly free unused memory is eliminated). K.O.
Comments (35)
dd1 REPORTER
Actual memory use goes up to around 250-something MBytes, then drops down to 240-something, them slowly grows back up, drops down, rinse, repeat.
This is the javascript garbage collection in action. So there is no memory leak on the status page, but still why do we generate around 1 Mbyte/sec of
javascript memory allocations? As comparison, the NYTimes front page consumes 270 Mbytes. One would expect the midas front page to be much more light
weight... K.O.
Edit Pin to top Mark as spam Delete 2018-12-27
dd1 REPORTER
Then there is a question of memory use by the "message" page. This page does grow infinitely large by design - as new messages are added to midas.log -
as as the user keeps scrolling the messages back in time. Perhaps we should somehow limit the total memory use there... K.O.
Edit Pin to top Mark as spam Delete 2018-12-27
Stefan Ritt
changed status to closed
I see the same behaviour. The relatively large memory allocation by Chrome probably comes from some bitmap caching. The browser prints the page
contents into some temporary bitmap and then flushes it to the screen. That can easily take a few MB. I monitor such behaviour since several years now (for
other processes) and concluded that I don't need to worry about JavaScript memory consumption.
Concerning the messages page: One line takes about 100 Bytes. If you scroll really fast, you can do maybe 30 lines per second, thus 3kB. If we allow the
browser to consume another 100 MB (should be easily possible these days), you have to continuously scroll for 100000kb/3kb=30000 seconds or eight
hours. Good luck!
Closing this topic if no complaints.
Pin to top Mark as spam Delete 2019-01-08
dd1 REPORTER
changed status to open
still see high memory use by midas pages. K.O.
Edit Pin to top Mark as spam Delete 2019-09-15
dd1 REPORTER
See high memory use from long running (days-weeks) web pages:
status page of my test experiment - 953 MB - 155 MB after reload
odb editor - 661 MB - 80 MB after reload
programs page - 602 MB - 64 MB after reload
sequencer - 253 MB - 151 MB after reload
sequencer - ??? (very big) - reloaded before I wrote it down
I think we are leaking memory somewhere. Or causing unnecessary allocations that the javascript garbage collector does not keep up with or does not
cleanup correctly. K.O.
‌
‌
Edit Pin to top Mark as spam Delete 2019-09-15
dd1 REPORTER
I am suspicious of memory use trouble from periodic-update code that keeps setting innerHTML to the same value as it was before, unnecessarily. (this also
causes other problems - cannot cut-and-paste affected parts of the web page, high cpu use to redraw the (unchanged) page). K.O.
‌
Edit Pin to top Mark as spam Delete 2019-09-15
Stefan Ritt
For setting innerHTML we should always use
if (text !== control.inner HTML)
control.innerHTML = text;
‌
I thought I caught most of the cases, but I might have missed some. Please add as needed.
‌
Stefan
Pin to top Mark as spam Delete 2019-09-15
dd1 REPORTER
Strange things continue. Just say huge CPU usage from 3 midas web pages (odb editor, programs page and the new sequencer page). All 3 pages are tabs in
an iconized browser window. Suddenly machine feels slow, and I see all 3 use 25% CPU each (by the chrome-browser task manager window). Opened the
browser window, sent to the offending tabs, nothing looks amiss, CPU usage went back to 0%. WTH? (all 3 pages have 100 Mbyte memory use, all 3 pages
update at 1 Hz). K.O.
Edit Pin to top Mark as spam Delete 2019-09-16
dd1 REPORTER
looked at the “programs” page. learned how to use the google-chrome “performance” tool. I was definitely leaking html nodes. The leak was in an
unexpected place - innerHTML with a link was miscomparing because of unexpected string transformation:
xbad: "<a href='?cmd=odb&odb_path=System/Clients/" + key + "'>" + host + "</a>";
good: "<a href=\"?cmd=odb&odb_path=System/Clients/" + key + "\">" + host + "</a>";
Now node leak from my periodic update went from 35 nodes to 2 nodes per update. The performance tool fails to identify where these last 2 nodes are
coming from.
K.O.
Edit Pin to top Mark as spam Delete 2019-09-17
dd1 REPORTER
Forgot to add - the periodic update from mhttpd_init() is also leaking nodes. I will look at it some other time. K.O.
Edit Pin to top Mark as spam Delete 2019-09-17
dd1 REPORTER
after improvement to the “programs” page, the tab is staying at 50-60 Mbytes. promising… K.O.
‌
Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
Fixed node leak in mhttpd_refresh(): the alarm display was setting e.innerHTML even if it did not change.
There only remains an unavoidable node leak with “mheader_last_updated” where we set the current time every 1 second. If I comment this out, there is no
node leak on the “programs” page.
K.O.
‌
Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
“programs” page memory use now sits around 40 Mbytes. K.O.
Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
Stefan points me to the use of e.firstChild.data instead of e.innerHTML, per https://medium.com/@ok.bayat/fixing-memory-leak-problem-in-javascript-
application-ed3a2d9d92df
K.O.
Edit Pin to top Mark as spam Delete 2019-09-18
dd1 REPORTER
implemented this for the timestamp update and (i.e.) the “programs” page now leaks 0 nodes. memory use for all pages sits around 40-60 Mbytes. K.O.
Edit Pin to top Mark as spam Delete 2019-09-26
dd1 REPORTER
see problem of high cpu usage again, after google-chrome restarted after an update to latest version. for example, “program” page is 65 Mbytes, uses 20%
CPU. (in an inactive tab). If I open this tab, for maybe 10 seconds, it goes to 100+ Mbytes with big CPU usage (>100%), then drops down to 90 Mbytes, 0%
CPU usage. I do not see any other web pages or tabs doing this. Only our midas pages. WTH!?! K.O.
‌
Edit Pin to top Mark as spam Delete 2019-10-13
dd1 REPORTER
figured out high cpu usage reported as “rendering”. Open “devtools”, goto “performance”, press Command-Shift-P, start typing “rendering”, select “fps
meter”. A black square will open in top-left, showing graphics activity (frame rate, GPU usage, etc).
Now wait for new message to appear in the top status bar. It will be “yellow” at first, that it will fade to “gray”. During this fading, GPU use is 100% during
about 1 second, FPS is about 50 frames/sec.
K.O.
Edit Pin to top Mark as spam Delete 2019-10-14
dd1 REPORTER
quick google search shows much discussion about css animations using “too much CPU”, i.e. google “css pulsing background”, but no clear way to tell the
browser to slow down. It looks to me like the background-color animation tries to run at maximum possible frame-rate, as if electricity is free. (Since I am
debugging high-cpu and high-memory use of inactive tabs, there is nobody looking at these animations). K.O.
Edit Pin to top Mark as spam Delete 2019-10-14
Stefan Ritt
New messages are displayed with a yellow background and fade to grey after 5 seconds. This is handeled in mhttpd.js around line 2144. You can try to
remove the lines
d.style.setProperty("-webkit-transition", "background-color 3s", "");
d.style.setProperty("transition", "background-color 3s", "");
and see if the CPU load goes down.
Pin to top Mark as spam Delete 2019-10-14
dd1 REPORTER
captured another trace of midas page using 20% cpu in an inactive tab, iconized browser window. capturing is difficult, requires very fast mousing to: select
the right tab, right-click to “inspect”, select “performance” tab, click on “start capture”, and hope that by this time the web page activity does not complete.
this time I got the last 200 ms or so.
what I see is again is “media activity” (only identified as “task”), GPU activity (only identified as “GPU activity”) and main thread activity (identified as an
infinitely repeating sequence of “receive response 206 audio/mpeg”, “receive data 39287 bytes”, “finish loading”, then the same sequence again. 39287 is
the file size of resources/beep.mp3. There is no corresponding network activity, so the loading of beep.mp3 must be coming from cache. On the javascript
console, there are the usual “not allowed to play audio because user did not interact” messages repeating about every 1-2-3 minutes.
I read this as: for reasons unknown, a huge number of audio requests becomes queued (the tab was inactive/iconized for many days) then they start trying to
play (load beep.mp3, do not play it because “not allowed”, move on to the next audio object, load … etc). This is consistent with the cpu use, with the
captured traces and with the quick growth in memory size (beep.mp3 objects are created, consume memory, cannot be free’d until garbage collector runs
later. much later).
The above scenario is impossible with how the current audio playing code is written (only one audio object can exist at a time, new audio object can only be
created after the previous one finished playing).
Two possible explanations: (a) the code running in the web page is not the same code as in mhttpd.js (running an old version from cache) or (b) the code
“one audio object at time” is not working correctly if javascript code is throttled /delayed/stopped in inactive tabs.
Following code will have this problem:
var only_one = null;
function foo() { /* runs from periodic timer */
if (!only_one) {
a = new Audio(“beep.mp3”);
/* throttled/suspended/delayed here */
/* multiple Audio objects created because "only_one" is still null */
only_one = a;
}
}
K.O.
‌
Edit Pin to top Mark as spam Delete 2019-10-17
dd1 REPORTER
fading background - yes, I found the code. pretty neat. I moved it around to remove the timer - I am suspicious of how the timers run in inactive tabs. but no
time to study it.
but the current problem is clearly with audio objects, and the only audio we have is the periodic playing of beep.mp3. who knew there will be so much trouble.
there is still the unexplained use of GPU, but maybe playing/decoding mp3 files uses the GPU.
I am also puzzled why the status page from midas-2019-03 does not show any of these problems. it just sits there using no memory (50 Mbytes) and no
CPU. perhaps we changed something in the playing of audio files since last March (when midas-2019-03 was tagged).
K.O.
‌
Edit Pin to top Mark as spam Delete 2019-10-17
dd1 REPORTER
For the first time I saw my message “mhttpd_alarm_play: Cannot play alarm sound: previous alarm sound did not finish playing yet” reported on the javascript
console. This confirms my guess that playing of audio is actually delayed and indeed we need to check that the previous audio finished playing before
creating new audio objects. But the check in the current code has a race condition. If the delay/stall is inside “new Audio()”, we will create multiple audio
objects as “last_audio” is still in the “finished playing” state, we only change it after the return from “new Audio()”. K.O.
Edit Pin to top Mark as spam Delete 2019-10-28
dd1 REPORTER
see big improvement. now inactive tabs grow from 50-ish Mbytes to 170-ish Mbytes, then when I open them, there is some cpu use (GC, I guess) and
memory use drops back to 50-ish Mbytes. So we are not leaking any memory anymore. Looking at the console messages, I see that my fixes are helping -
there is messages about attempts to create new Audio() when previous one did not finish yet. K.O.
Edit Pin to top Mark as spam Delete 2019-11-04
dd1 REPORTER
I guess, inactive tabs are throttled by google-chrome so much that their GC (memory garbage collection) does not keep up with our 1/sec data updates. I do
not think we need to keep updating inactive tabs at this high frequency, but I am not sure how to detect if we are active or inactive. Maybe I can detect the
throttling instead. K.O.
‌
Edit Pin to top Mark as spam Delete 2019-11-04
dd1 REPORTER
see consistent behaviour from google-chrome:
have all these midas tabs open, inactive, window iconized, typical tab size is 50-ish Mbytes.
google-chrome update arrives
update is installed, all windows and tabs automatically closed, then reopened.
the midas tabs are still inactive, window is iconized
after a few days, see behaviour as described before:
midas tabs use 20-30% CPU, size is 100-ish Mbytes
if I open one of these tabs, it’s cpu usage goes up to 160%, size grows to 250-ish Mbytes, then within 5-10 seconds drops to 100 Mbytes, CPU usage goes
from 160% to zero.
when looking at this, if I am quick enough, I can right-click “inspect”, go to the “performance” tab, and press the “start collecting data” button and I capture
the very tail end of all this strange activity. This is the traces I have been describing so far.
K.O.
Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
see big blob of activity:
timer activation
mhttpd_message()
first call to mhttpd_fit_message()
a long cycle of (maybe 10-20) “recalculate size”, “layout”, “parse html”
second call to mhttpd_fit_message()
same long cycle of …
The way I understand this, mhttpd_fit_message() changes the size of some html element that causes the whole window to be re-layed-out.
K.O.
‌
Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
trying to figure out what triggers a long run of the “rasterizer” thread. see a very strange call sequence
timer fires
mhttpd_alarm_play()
mhttpdConfig()
“Layout”
mhttpd_alarm_play() calls mhttpdConfig() (3 times) to find out if alarm sound is enabled, the period, the file name, etc. so far so good. but mhttpdConfig()
does not touch any DOM objects, so why is it shown as calling “Layout”?!?
other than this trace, I see nothing else that would trigger the rasterizer thread…
(note) this time, mhttpd_alarm_play() does not call mhttpd_alarm_play_now(), so “new Audio” and stuff does not enter this picture.
K.O.
Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
in the early part of the trace, where I think the meat of “tab is using cpu and memory” is, I see the audio events firing in rapid sequence: loadeddata, canplay,
canplaythrough, rinse, repeat.
It turns out that promise rejection from audio.play() does not stop the loading of the sound file. This is easy to see by attaching the event handlers to these
events and by observing these event handlers print something to the javascript console.
If that is what is happening, it explains what I see: all my previous attempts to prevent the piling up of sound files are unsuccessful, and when I open the
previously inactive tab, all the queued sound files start loading (and not playing per “user did not interact” policy).
google docs suggest using audio.src=”” to cancel loading of sound files and it does seem to work. testing it now.
K.O.
‌
Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
gotcha. came back home, found one tab using about 10% cpu. audio.src=”” is commented out, javascript console is full of this:
Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 234
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 234
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 234
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 235
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 235
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 235
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 236
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 236
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 236
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 237
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 237
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 237
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 238
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 238
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 238
mhttpd.js:2759 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_loadeddata: counter 239
mhttpd.js:2763 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplay: counter 239
mhttpd.js:2767 Thu Nov 07 2019 22:17:30 GMT-0800 (Pacific Standard Time): mhttpd_audio_canplaythrough: counter 239
the timestamp is exactly when I opened this tab. so confirmed, a whole bunch of audio files got queued, when I open the tab they all try to play. (there is no
actual sound, all tabs are muted).
now I uncomment audio.src=”” and see what happens.
K.O.
‌
Edit Pin to top Mark as spam Delete 2019-11-07
dd1 REPORTER
looks good. an update to google-chrome came in and after installing, I no longer see midas tabs show high cpu usage or high memory use. I think the
audio.src=”” fix is it. I will be committing these fixes to midas. K.O.
Edit Pin to top Mark as spam Delete 2019-11-12
Stefan Ritt
The loop in mhttpd_fit_message() is there for a good reason: I want to display the message in a single line. If it’s too long, I want first to cut the time stamp
and then display it. If it’s still too long, I want to truncate the message and display “…” at the end. The problem is what “too long” is. Nobody can tell you how
much pixel a message on your browser take, because this depends on the installed fonts, the exact character spacing of your browser and so on. So the only
way I could make this happen is to add one char at a time, until we get close to the maximum allowed space. If course this requires a re-layout of the page for
10-20 times, but when your window is in the foreground this is not a problem, since a browser can do this with small CPU load. The “scope” application I use
does 70 frames per second at 30% CPU load. So one could make the loop a bit smarter, like binary search, which would drop the 10-20 iterations to log2(10-
20) ~ 4-5, but still there would be a loop.
How that the update of the messages in the background is suppressed with the hidden API, do you still have that problem or can we consider it fixed?
Stefan
Pin to top Mark as spam Delete 2019-11-18
dd1 REPORTER
see new behaviour - after many days, inactive page size is ~180 Mbytes. 0% CPU use (an improvement from before where there was large CPU use). activate
the tab, nothing much happens, 0% CPU use (again an improvement from before). after about 30 seconds, memory use drops down to the normal 50-70-80
Mbytes. I think what we see is the garbage collector is throttled down and does not keep up with our allocations. Stefan’s new fix reducing polling in inactive
pages from 1/sec to 1/10sec should help with this. K.O.
‌
Edit Pin to top Mark as spam Delete 2019-11-19
dd1 REPORTER
mhttpd_fit_message() - confirmed. I was confused about the function argument.
I thought it is passed an array of messages. no, it is one message string and the loop is over the message string length. The loop is done twice (second time,
with the time/date stamp removed). google-chrome debugger does show that this uses large amount of CPU, mainly to compute d.offsetWidth.
I think I will refactor these loops - instead of growing the message, I will shrink it.
K.O.
Edit Pin to top Mark as spam Delete 4 hours ago
dd1 REPORTER
rewrote mhttpd_fit_message() to reduce CPU use: try to fit complete message, if too long, try to fit message without timestamp, if too long, guess the desired
length assuming all chars have same width, then grow or shrink the message until the size is right. K.O.
Edit Pin to top Mark as spam Delete 17 minutes ago
dd1 REPORTER
changed status to resolved
The main fix is to set audio.src="" in the promise rejection. K.O.
Edit Pin to top Mark as spam Delete just now
What would you like to say?
Assignee
–
Type
bug
Priority
major
Status
resolved
Votes
0 Vote for this issue
Watchers
1 Stop watching
Dismiss this bannerJira Software: the preferred issue tracker for Bitbucket. Join the team! |
24 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
I am updating the history plots. Main changes:
- the old history display code should again be easily usable (use the "open in old history display" checkbox)
- the history plot editor has an "edit in ODB" button that takes as to the plot definition in ODB (sometimes it is
easier to editing things in the ODB editor)
- error in history plot editor that created "formula" entry of incorrect size should be fixed
- "reorder" (and "delete entry") functions in the history plot editor should work again (plus added explanation text)
- "factor" and "offset" restored in the history plot editor
- added the long desired "voffset" to simplify plot scaling and positioning
- (factor, offset and voffset do not yet work in the new history plots, TBI ASAP)
- history plot editor and generate_hist_graph() now use the same code to read plot definitions from ODB. There should
be no more confusion about content of history plot entries in ODB and what each entry is supposed to do.
These changes have been precipitated by our inability to plot high voltage voltage and current on the same plot,
see bug https://bitbucket.org/tmidas/midas/issues/308/history-plot-formula-cannot-be-used-to
Voltage is in the range 0..1000 (volts) and current is in the range 0..50 and 0..0.100, autoscaling on voltage
makes the currents invisible at the zero line. In the past, we used the "factor" setting to scale
the graphs so we can see both voltage and currents at the same time (currents scaled up by factor 25 and 600,
as example).
The new "formula" feature was supposed to replace (and improve upon) the "factor" and "offset". But if I use
the formula "x*25", suddenly the plot is telling us that current values are not 50 uA, but 1250 uA (50*25),
and this is just wrong. We do not want to scale the micro-amps, we want to better position the plot on the graph,
like the old "factor" and "offset" allowed us to do.
So the idea is to use this computation:
y_position_on_plot = offset + factor*(formula(history_value) - voffset)
- "formula" is to transform history values into physical values (i.e. pressure meter reports bars, but we want atm, or
voltmeter is reading in discrete units of 0.125V, we want to see volts)
- "factor" and "offset" is to position the graphs on the plot for best visual presentation of data
- I also added is the much desired "voffset", you only know it is needed if you have a non-zero "offset" and you need
to change the "factor", surprise, "offset" has ot be changed, too, and good luck recalculating it correctly in one
try.
The way to use this stuff:
- adjust "voffset" to bring the graph to around y=0
- increase the "factor" to zoom-in on features and stuff
- adjust "offset" to move the graph up and down relative to all the other graphs on the plot
- now one can zoom in and out as needed by changing the "factor" and the plot will stay roughly in the right place
without having to readjust the offsets.
K.O. |
24 Jun 2021, Stefan Ritt, Bug Fix, changes in history plots
|
I disagree with the proposed change to scale the HV current for a "nice" display. If values are scaled, the axis should be
scaled in the same way. Otherwise people might read the current from the plot, look at the axis, and again get the wrong
value (the factor of 25x you mention). Sure you can hover with the cursor over the graph, and see the right value, but think
of taking a screen shot, putting this into a publication, and get complaints from the reviewer.
The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a
new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For
the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.
Stefan |
25 Jun 2021, Stefan Ritt, Bug Fix, changes in history plots
|
A general warning: With the recent history changes implemented in the develop branch, starting from a fresh ODB and editing
any history panel, on gets tons of errors and debug output from mhttpd:
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Log axis" returned status 312
MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
Load from ODB History/Display/Default/Trigger rate: hist plot: 2 variables
timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, show_run_markers: 1, show_values: 1,
show_fill: 1
var[0] event [System][Trigger per sec.] formula [], colour [#00AAFF] label [] factor 1.000000 offset 0.000000 voffset
0.000000 order 10
var[1] event [System][Trigger kB per sec.] formula [], colour [#FF9000] label [] factor 1.000000 offset 0.000000 voffset
0.000000 order 20
This has to be fixed by the original author. I strongly recommend to make such modifications on a separate branch not to
break running experiments.
Stefan |
25 Jun 2021, Marco Francesconi, Bug Fix, changes in history plots
|
We are using the new history formula as a quick way to convert signals from sensors to actual physical values (for example Voltage->Temperature, Voltage->relative humidity
...), so it is great that the shown voltage is the calculated one.
I would like to add a point to this discussion.
In our collaboration people attach images of history plots to elogs, meeting presentation and/or physical logbooks.
The proposed scaling formula may work fine online using the cursors, but, once an image is created, I do not understand how it is possible to extract the value for a scaled
variables.
Suppose you see a graph in a presentation with a current increase by some PSU and the current was scaled to be in the same plot of the voltage.
Looking at the delta in the image, how can you judge the current increase without any axis/grid to refer to?
So I support Stefan proposal for a secondary axis, as long as it is clear which value belong to which axis.
Maybe marking the channels in the description or using different line styles/thickness?
Best,
Marco |
25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
> A general warning: With the recent history changes implemented in the develop branch, starting from a fresh ODB and editing
> any history panel, on gets tons of errors and debug output from mhttpd: ...
This is the reason most projects have separate development and production branches.
I recommend everybody to use the released tagged versions of midas for production.
> I strongly recommend to make such modifications on a separate branch not to
> break running experiments.
Is there something that does not work anymore? Did I break something? The debug messages I am still
tuning.
K.O.
>
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Log axis" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> Load from ODB History/Display/Default/Trigger rate: hist plot: 2 variables
> timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, show_run_markers: 1, show_values: 1,
> show_fill: 1
> var[0] event [System][Trigger per sec.] formula [], colour [#00AAFF] label [] factor 1.000000 offset 0.000000 voffset
> 0.000000 order 10
> var[1] event [System][Trigger kB per sec.] formula [], colour [#FF9000] label [] factor 1.000000 offset 0.000000 voffset
> 0.000000 order 20
>
>
>
> This has to be fixed by the original author. I strongly recommend to make such modifications on a separate branch not to
> break running experiments.
>
> Stefan |
25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
> I disagree ...
I am happy with disagreement and differences of opinions. Zest of life, driver of progress and improvements, etc.
I am even more happy with solutions to problems. The current problem is that the offset and factor feature
of history plots has been removed without much discussion.
I stress, we have been using this feature to run experiments for the last 20 years.
I do not understand objections to it being restored. If you do not want to use it, do not use it.
K.O.
> with the proposed change to scale the HV current for a "nice" display. If values are scaled, the axis should be
> scaled in the same way. Otherwise people might read the current from the plot, look at the axis, and again get the wrong
> value (the factor of 25x you mention). Sure you can hover with the cursor over the graph, and see the right value, but think
> of taking a screen shot, putting this into a publication, and get complaints from the reviewer.
>
> The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a
> new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For
> the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.
>
> Stefan |
25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
> > The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a
> > new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For
> > the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.
In the past, we have done some useful plots with maybe 10 variables plotted
at the same time with different scaling and positioning on the graph.
Having 2 vertical axis is maybe useful for the specific case of plotting high voltages,
but not in the general case.
Actually, just 2 vertical axis will not work to plot high voltages in ALPHA-g, because
we have anode currents on the scale 0..0.1 uA and cathode currents on the scale 50..60 uA.
K.O. |
25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
I will have to post an example of a scaled plot. I figure everybody forgot how they look like.
K.O.
> We are using the new history formula as a quick way to convert signals from sensors to actual physical values (for example Voltage->Temperature, Voltage->relative humidity
> ...), so it is great that the shown voltage is the calculated one.
>
> I would like to add a point to this discussion.
> In our collaboration people attach images of history plots to elogs, meeting presentation and/or physical logbooks.
> The proposed scaling formula may work fine online using the cursors, but, once an image is created, I do not understand how it is possible to extract the value for a scaled
> variables.
> Suppose you see a graph in a presentation with a current increase by some PSU and the current was scaled to be in the same plot of the voltage.
> Looking at the delta in the image, how can you judge the current increase without any axis/grid to refer to?
>
> So I support Stefan proposal for a secondary axis, as long as it is clear which value belong to which axis.
> Maybe marking the channels in the description or using different line styles/thickness?
>
> Best,
> Marco |
30 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
> I am updating the history plots.
> So the idea is to use this computation:
> y_position_on_plot = offset + factor*(formula(history_value) - voffset)
Stefan and myself did some brain storming on zoom. Writing it down the way I remember it.
- we distilled the gist of the problem - the numerical values we show in the plot labels and in hover-over-the-graph
are before formula is applied or after the formula is applied?
- I suggested a universal solution using a double formula: use formula1 for one case;
use formula2 for the other case;
use formula1 for "physics calibration", use formula2 for factor and offset for composite plots:
numeric_value = formula1(history_value)
plotted_value = formula2(numeric_value)
- we agree that this is way too complicated, difficult to explain and difficult to coherently present in the history editor
- Stefan suggested a simple solution, a checkbox labeled "show raw value" next to each history variable. by default, the
value after the formula is plotted and displayed. if checked, the raw value (before the formula) is displayed, and the
value after the formula is plotted. (so this works the same as the factor and offset on the old history plots).
- if "show raw value" is enabled, the numerical values shown will be inconsistent against the labels on the vertical axis.
Our solution it to turn the axis labels off. (for composite plots, like oscillator frequency in Hz vs oscillator
temperature in degC, both scaled to see their correlation, the vertical axis is unit-less "arbitrary units", of course)
- to simplify migration of old history plots that use custom factor and offset settings, we think in the direction of
automatically moving them to the "formula". (factor=2, offset=10 automatically populates formula with "2*x+10", "show raw
value" checked/enabled). Thus we can avoid implementing factor and offset in the new history code (an unwelcome
complication).
- I think this covers all the use cases I have seen in the past, so we will move in this direction.
K.O. |
|