26 Apr 2017, Francesco Renga, Forum, Problem with logger at run start
|
Dear experts,
we have a problem when trying to run a MIDAS DAQ which worked in the past on the same PC (but on a different
network). We get the following error messages when starting a new run:
Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:9106:rpc_client_connect,ERROR] cannot connect to host "scarlett", port 44858: connect() returned -1, errno 113 (No route to host)
Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:3539:cm_transition_call,ERROR] cannot connect to client "Logger" on host scarlett, port 44858, status 503
(scarlett is indeed the hostname of the PC). The error occurs even if the PC is disconnected from the network.
Any suggestion?
Best Regards,
Francesco |
26 Apr 2017, Stefan Ritt, Forum, Problem with logger at run start
|
Dear Francesco,
Your error (No route to host) typically means that you have a network problem outside of MIDAS. Your computer has to "find itself" and
this is probably broken. Try to do a "ping scarlett" or "nslookup scarlett" and you will see that the DNS server can't be reached or is
wrongly configured. Sometimes it helps to put scarlett explicitly into /etc/hosts.
Stefan |
26 Apr 2017, Francesco Renga, Forum, Problem with logger at run start
|
Dear Stefan,
thank you very much for your reply. We could finally fix the problem by replacing "scarlett" with "scarlett.localdomain" in our
hostname configuration file /etc/hostname (under debian).
Best Regards,
Francesco |
02 May 2017, Konstantin Olchanski, Forum, Problem with logger at run start
|
> Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:9106:rpc_client_connect,ERROR] cannot connect to host "scarlett", port 44858: connect() returned -1, errno 113 (No route to host)
Forgot to reply to this: if you read the error messages, you will see the actual problem is "no route to host". Next step
is to ping the same hostname or try "telnet hostname 22" (cut-and-paste the hostname from the error message
to avoid the common pitfall of not seeing a typo, i.e. ping host00 works while midas connect to hostOO does not (zero vs capital-o)).
In your case you had the wrong hostname ("foo" and "foo.localdomain" resolve to different IP addresses, one works the other
one does not). You can also try to use the IP address instead of hostname, this will avoid hostname resolution problems
(inconsistency between /etc/hosts and hostnames in DNS is very easy to have when using self-made private networks).
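The "foo" versus "foo.localdomain" check can also be scripted. The following is a minimal sketch, not part of MIDAS: the helper names resolve_names and all_same_address are made up for illustration, and it only looks at IPv4 resolution through the system resolver (which consults /etc/hosts and DNS in whatever order the machine is configured for).

```python
import socket

def resolve_names(names, resolver=socket.gethostbyname):
    """Map each name to its resolved IPv4 address, or to None on failure."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results

def all_same_address(results):
    """True only if every name resolved, and all to one and the same address."""
    addresses = set(results.values())
    return None not in addresses and len(addresses) == 1
```

Calling resolve_names(["scarlett", "scarlett.localdomain"]) on the affected machine and printing the result would show at a glance whether the two names disagree or one of them does not resolve at all.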
K.O. |
02 May 2017, Konstantin Olchanski, Info, mhttpd inline-editor change
|
I changed the mhttpd odb inline editor to use the json-rpc interface. Good things:
- browser no longer complains about obsolete synchronous ajax calls
- can edit strings of arbitrary length (was limited to the max URL length)
- funny characters " (quote), > and < (angle brackets) are correctly escaped.
- after editing, the actual value from odb is loaded and displayed (confirming that the edit "took").
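To illustrate why sending the value in a JSON request body avoids both the URL length limit and the escaping problems, here is a small Python sketch. It is not the mhttpd code: the method name "db_paste" and the "paths"/"values" parameter layout are assumptions made for the example, and build_request is a made-up helper.

```python
import html
import json

def build_request(method, params, request_id=1):
    """Serialize a JSON-RPC 2.0 request; json.dumps escapes quotes itself."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "method": method, "params": params})

# A value with quotes and angle brackets, far longer than a typical URL limit.
value = 'say "hi" <b>now</b> ' + "x" * 5000

# Method and parameter names are assumed for illustration only.
body = build_request("db_paste",
                     {"paths": ["/Experiment/Name"], "values": [value]})

# The value survives the round trip through the request body unchanged.
round_trip = json.loads(body)["params"]["values"][0]

# For display in a page, the markup characters are escaped separately.
shown = html.escape(round_trip)
```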
K.O. |
17 Mar 2017, Pierre Gorel, Bug Report, badly managed case in history_schema.cxx: dat file empty
|
For an unknown reason, Logger died a few days ago while writing the history. The
file mhf_1489577446_20170315_system.dat was created, but was empty.
When trying to restart Logger, I would get a seg fault without any special error
message.
I tracked the issue to the "read_file_schema" function in history_schema.cxx
* L4731, a pointer to HsFileSchema *s is declared.
* L4747, We enter a while(1) loop.
* L4749, get char on the filename.
In our case, the file was empty, so the variable "b" gets NULL and the loop breaks.
Problem: the memory allocation for "s" is later in the loop, L4768.
Upon exiting the loop, L4854, we try to access record_size on a NULL pointer ==>
SegFault.
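The pattern can be shown in a few lines. This is a Python analogue of the situation, not the actual history_schema.cxx code: the "name size" line format and the function read_schema are invented for the sketch, and None plays the role of the NULL pointer.

```python
import io
import sys

def read_schema(stream, name="<stream>"):
    """Return a schema dict, or None (with a message) if nothing was read."""
    schema = None                      # like "HsFileSchema *s" declared before the loop
    while True:
        line = stream.readline()
        if not line:                   # end of file; an empty file lands here at once
            break
        if not line.strip():
            continue
        if schema is None:             # the allocation only happens inside the loop
            schema = {"fields": [], "record_size": 0}
        field, _, size = line.strip().partition(" ")
        schema["fields"].append(field)
        schema["record_size"] += int(size)
    if schema is None:                 # guard before touching record_size
        print(f"{name}: no schema found (empty file?), skipping", file=sys.stderr)
        return None
    return schema

empty = read_schema(io.StringIO(""), "mhf_empty.dat")
filled = read_schema(io.StringIO("run 4\ntemperature 8\n"))
```

Without the final check, the code after the loop would use the schema unconditionally, which is the crash described above.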
It would be nice to at least have a message before breaking the loop... |
15 Apr 2017, Konstantin Olchanski, Bug Report, badly managed case in history_schema.cxx: dat file empty
|
> For an unknown reason, Logger died few days ago while writing the history. The
> file mhf_1489577446_20170315_system.dat was created, but was empty.
I ran into same problem installing new midas in the alpha experiment at cern. It should be fixed now:
https://bitbucket.org/tmidas/midas/commits/788021d9cb39a348a40e36f1b35b1440e06aa744
K.O. |
14 Apr 2017, Wes Gohn, Forum, mhttpd lag
|
Hi everyone,
We have recently been experiencing a lot of lag with our midas control webpage,
which is making it very frustrating to use. Has anyone experienced this, and do
you have any advice to speed it up? Are there particular web browsers that work
better than others, or certain settings that can make it respond more quickly?
Thanks!
Wes |
14 Apr 2017, Pierre Gorel, Forum, mhttpd lag
|
> Hi everyone,
>
> We have recently been experiencing a lot of lag with our midas control webpage,
> which is making it very frustrating to use. Has anyone experienced this, and do
> you have any advice to speed it up? Are there particular web browsers that work
> better than others, or certain settings that can make respond more quickly?
>
> Thanks!
> Wes
We saw this happening as well. In our case, we could track this down to mhttpd
taking a lot of CPU. A kill/restart of mhttpd is usually doing the trick (without
disturbing data taking). We did not find an obvious reason for this happening. |
15 Apr 2017, Konstantin Olchanski, Forum, mhttpd lag
|
> > Hi everyone,
> >
> > We have recently been experiencing a lot of lag with our midas control webpage,
> > which is making it very frustrating to use. Has anyone experienced this, and do
> > you have any advice to speed it up? Are there particular web browsers that work
> > better than others, or certain settings that can make respond more quickly?
> >
> > Thanks!
> > Wes
>
> We saw this happening as well. In our case, we could track this down to mhttpd
> taking a lot of CPU. A kill/restart of mhttpd is usually doing the trick (without
> disturbing data taking). We did not find an obvious reason for this happening.
One place where mhttpd can stall (and even go into an infinite loop) is making history plots.
If you ask for a history plot of 10 variables across 1 year, nobody can access any midas web page
until mhttpd finishes grinding through the history data (with the old .hst history format it was exceedingly
slow; with the new "file" format it is pretty quick, but everybody still has to wait). If you leave this page
open, it will autorefresh every so many minutes, ensuring continuing delays for other mhttpd users.
The other place for stalling mhttpd was in the run transitions (mhttpd was unresponsive while executing a
run transition); this was fixed by the multithreaded transitions.
To fix the unresponsive history requests, you can try to set up a separate "history mhttpd": run a second
mhttpd on a different port (with "-H" if desired) and put the URL of this mhttpd in ODB "/history/url" (if you
are using my instructions for setting up the apache httpd proxy, you can see provisions for this;
/history/url will be set to "https://proxy.host.net/history/").
If neither of the above applies, there are the usual culprits of bad networking somewhere, etc.
Best way to test if delays are in midas or elsewhere is to stand in front of your midas computer, run a
current version of google-chrome or firefox right on it, there should be no delays.
K.O. |
15 Apr 2017, Konstantin Olchanski, Forum, mhttpd lag, which browser
|
> > >
> > > We have recently been experiencing a lot of lag with our midas control webpage,
> > > which is making it very frustrating to use. Has anyone experienced this, and do
> > > you have any advice to speed it up? Are there particular web browsers that work
> > > better than others, or certain settings that can make respond more quickly?
> > >
Wes, you provided excessive information. Who is "we", what is your location (internet in africa is different from internet in canada),
what is your computer (rpi3 is different from mac mini), what is your os (fedora-1 is different from centos-7), what
is your browser (netscape is different from google-chrome).
As to "what browser should work", on MacOS, google-chrome and firefox should be ok (that's what I test), on Linux,
stock firefox (usually an oldish esr version) should work, on el7 and ubuntu google-chrome works. On windows, google-chrome
and firefox should be ok. microsoft browsers probably not ok (no testing). cellphone browsers also not tested (but google-chrome and firefox
should be ok).
K.O. |
05 Apr 2017, Andreas Suter, Bug Report, Equipment Expand doesn't work anymore
|
I liked very much the possibility to hide away Equipment on the main page. It
is also nice to have the '+' to get it quickly back when needed. However, this
seems not to work anymore (git c9d9d604803). Is this a feature or something went
wrong? |
10 Apr 2017, Stefan Ritt, Bug Report, Equipment Expand doesn't work anymore
|
> I'd liked very much the possibility to hide away Equipment on the main page. It
> is also nice to have the '+' to get it quickly back when needed. However, this
> seems not to work anymore (git c9d9d604803). Is this a feature or something went
> wrong?
The expansion of the equipment list is handled by a Cookie ("expeq" being 1 or 0). When Konstantin
implemented the mongoose server instead of the internal mhttp server, he neglected to evaluate
this cookie. I fixed this now (also renamed the cookie to "midas_expeq") in the current development
branch. Please check if it's working.
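As an illustration of what "evaluating the cookie" amounts to, here is a minimal Python sketch using the standard library cookie parser. It is not the mhttpd implementation: the function equipment_expanded is hypothetical, and falling back to the old "expeq" name is an assumption added for the example.

```python
from http.cookies import SimpleCookie

def equipment_expanded(cookie_header, default=False):
    """Decide from a Cookie header whether the equipment list is expanded."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    # Prefer the renamed cookie; accepting the old name is an assumption.
    for name in ("midas_expeq", "expeq"):
        if name in cookies:
            return cookies[name].value == "1"
    return default
```

A request carrying "Cookie: midas_expeq=1" would then render the expanded list, while a missing cookie falls back to the collapsed default.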
Stefan |
10 Apr 2017, Andreas Suter, Bug Report, Equipment Expand doesn't work anymore
|
> > I'd liked very much the possibility to hide away Equipment on the main page. It
> > is also nice to have the '+' to get it quickly back when needed. However, this
> > seems not to work anymore (git c9d9d604803). Is this a feature or something went
> > wrong?
>
> The expansion of the equipment list is handled by a Cookie ("expeq" being 1 or 0). When Konstantin
> implemented the mongoose server instead of the internal mhttp server, he neglected to evaluate
> this cookie. I fixed this now (also renamed the cookie to "midas_expeq") in the current development
> branch. Please check if it's working.
>
> Stefan
Tested it on two machines and expansion is back and working! Thanks a lot!
Andreas |
15 Apr 2017, Konstantin Olchanski, Bug Report, Equipment Expand doesn't work anymore
|
> > > I'd liked very much the possibility to hide away Equipment on the main page. It
> > > is also nice to have the '+' to get it quickly back when needed. However, this
> > > seems not to work anymore (git c9d9d604803). Is this a feature or something went
> > > wrong?
> >
> > The expansion of the equipment list is handled by a Cookie ("expeq" being 1 or 0). When Konstantin
> > implemented the mongoose server instead of the internal mhttp server, he neglected to evaluate
> > this cookie. I fixed this now (also renamed the cookie to "midas_expeq") in the current development
> > branch. Please check if it's working.
> >
> > Stefan
>
> Tested it on two machines and expansion is back and working! Thanks a lot!
>
Confirmed fixed. Thanks. Not sure how this got lost.
K.O. |
05 Apr 2017, Andreas Suter, Suggestion, nicer header?!
|
We use the customHeader to display some useful information. Currently I do not
like its style. What about making it look more like the footer?
I just changed in resources/mhttpd.css
diff --git a/resources/mhttpd.css b/resources/mhttpd.css
index fb0070d..f3264c8 100644
--- a/resources/mhttpd.css
+++ b/resources/mhttpd.css
@@ -280,6 +280,15 @@ table.headerTable td{
border: none;
}
+div.headerDiv{
+ background-color: #6F6F6F;
+ text-align: center;
+ padding:1em;
+ color:#EEEEEE;
+ border-bottom:1px solid #000000;
+ height:3em;
+}
+
div.footerDiv{
background-color: #808080;
text-align: center;
and
diff --git a/resources/mhttpd.js b/resources/mhttpd.js
index de8bc6c..972c261 100644
--- a/resources/mhttpd.js
+++ b/resources/mhttpd.js
@@ -172,7 +172,7 @@ function mhttpd_goto_page(page) {
function mhttpd_navigation_bar(current_page, path)
{
- document.write("<div id=\"customHeader\">\n");
+ document.write("<div class=\"headerDiv\" id=\"customHeader\">\n");
document.write("</div>\n");
document.write("<div class=\"mnavcss\">\n");
What do you think? |
05 Apr 2017, Stefan Ritt, Suggestion, nicer header?!
|
In my opinion this makes sense. If KO agrees, you should commit your change.
Stefan |
15 Apr 2017, Konstantin Olchanski, Suggestion, nicer header?!
|
> In my opinion this makes sense. If KO agrees, you should commit your change.
Please go ahead (sorry for slow reply). I have no idea what this change does. A screenshot of "before"
and "after" would be nice. The reason I ask is:
note that I am getting rid of the css hell in mhttpd.css. all the new pages will be using the simplified css
rules in midas.css.
the main change is: the new css rules only change the appearance of html elements that request the
"midas look" and one can still use the normal html formatting if desired. The old css changed all (and I
do mean *all*) html elements, making it impossible to write custom web pages using common examples
from the web - the insane formatting from mhttpd.css was applied to everything indiscriminately, i.e. h1,
h2, h3 all look the same.
K.O. |
14 Mar 2017, Andreas Suter, Bug Report, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
I think there sneaked in a little bug in the mhttpd: when starting an experiment
from scratch and starting the mhttpd, the Menu Buttons are missing and,
correctly, I get periodic error messages. I expected that the default ODB entry
for the Menu Buttons is created if it doesn't exist. As far as I see this happens
now since the default creation of the 'Menu Buttons' is now tagged as an obsolete
feature. In case this is not a bug but a feature, it should be documented. |
14 Mar 2017, Konstantin Olchanski, Bug Report, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
> I think there sneaked in a little bug in the mhttpd: when starting an experiment
> from scratch and starting the mhttpd, the Menu Buttons are missing and,
> correctly, I get periodic error messages. I expected that the default ODB entry
> for the Menu Buttons is created if it doesn't exist. As far as I see this happens
> now since the default creation of the 'Menu Buttons' is now tagged as an obsolete
> feature. In case this is not a bug but a feature, it should be documented.
I think you are right. Will fix.
K.O. |
16 Mar 2017, Konstantin Olchanski, Bug Report, Replaced with /experiment/menu, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
> > I think there sneaked in a little bug in the mhttpd: when starting an experiment
> > from scratch and starting the mhttpd, the Menu Buttons are missing
Ok, the original problem was a small bug in the javascript code for the menu buttons (fixed now),
but I was moved to implement something I wanted to do for a long time.
The menu configuration is now done through a subdirectory /experiment/menu. Each entry corresponds to
one menu button. Set to "y" to show it, set to "n" to hide it.
Buttons are displayed in the same order as they are in ODB, to change the order of buttons,
change their order in ODB (odbedit command "move").
This fixes the long standing problem with adding new midas pages - they were not automatically added to
the existing "menu buttons" lists. So for example when the "chat" page was added, I did not know about it
for a long time (and some people still do not know about its existence) because it was not included in
my "/experiment/menu buttons" list in all my already existing experiments. When the "start" and
"transition" pages were added, probably nobody knows that they exist.
Now new buttons for new pages are automatically added to the list (via mhttpd.cxx::init_menu_buttons()),
the users have an option to hide them by setting their values to "n".
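The idea can be sketched with an ordered mapping. This is an illustrative Python model, not the code in mhttpd.cxx::init_menu_buttons(): the helper names and button names are made up, and defaulting newly added buttons to "y" is an assumption based on the description above.

```python
def init_menu(existing, known_buttons):
    """Add missing buttons to the menu, keeping the user's order and choices."""
    menu = dict(existing)              # dicts preserve insertion order (Python 3.7+)
    for name in known_buttons:
        menu.setdefault(name, "y")     # assumed default: new pages are shown
    return menu

def visible_buttons(menu):
    """Buttons to draw, in the order they are stored."""
    return [name for name, flag in menu.items() if flag == "y"]

# Hypothetical state: the user hid "MSCB" before the "Chat" page existed.
old_menu = {"Status": "y", "ODB": "y", "MSCB": "n"}
new_menu = init_menu(old_menu, ["Status", "ODB", "MSCB", "Chat"])
```

Here visible_buttons(new_menu) keeps "MSCB" hidden and picks up "Chat" without the user having to know that the page was added.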
K.O. |
16 Mar 2017, Thomas Lindner, Bug Report, Replaced with /experiment/menu, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
> > > I think there sneaked in a little bug in the mhttpd: when starting an experiment
> > > from scratch and starting the mhttpd, the Menu Buttons are missing
>
> Ok, the original problem with a small bug in the javascript code for the menu buttons (fixed now),
> but I was moved to implement something I wanted to do for a long time.
>
Is this change backwards compatible with an old ODB? I.e., if I upgrade MIDAS, will it notice that I have the old-style key "/Experiment/Menu Buttons"
and replace it with equivalently set keys in /Experiment/Menu? Or will it just continue to use the old-style ODB key? |
28 Mar 2017, Konstantin Olchanski, Bug Report, Replaced with /experiment/menu, mhttpd - /Experiment/Menu Buttons - git-sha a350e8db11
|
> > > > I think there sneaked in a little bug in the mhttpd: when starting an experiment
> > > > from scratch and starting the mhttpd, the Menu Buttons are missing
> >
> > Ok, the original problem with a small bug in the javascript code for the menu buttons (fixed now),
> > but I was moved to implement something I wanted to do for a long time.
> >
>
> Is this change back-wards compatible with an old ODB? Ie, if I upgrade MIDAS, will it notice that I have the old-style key "/Experiment/Menu Buttons"
> and replace it equivalently set keys in /Experiment/Menu? Or will it just continue to use the old-style ODB key?
I am trying to keep some compatibility between the web pages and mhttpd. I think in most cases, old mhttpd should continue to work
against new web pages (assuming matching mhttpd.js & co). But old web pages would probably break against new mhttpd, mostly due
to the rapid pace of their development.
Anyhow, the midas web page forms menu buttons in this order:
/Experiment/Menu, if it does not exist, then:
/Experiment/menu buttons, if it does not exist, then
built in list of menu buttons, which includes all possible buttons, hardcoded in mhttpd.js.
In cooperation with mhttpd: new mhttpd
- will automatically create the tree /experiment/menu with all buttons disabled
- will complain about the existence of /experiment/menu buttons, instruct user to delete it.
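The lookup order on the web page side can be written down as a short Python sketch. This is not the mhttpd.js code: the ODB is modelled as a plain dict, the legacy key is assumed to hold a comma-separated string, and the built-in list shown here is a placeholder.

```python
BUILTIN_BUTTONS = ["Status", "ODB", "Messages", "Chat"]  # placeholder list

def menu_buttons(odb):
    """Pick the menu buttons using the three-step fallback described above."""
    menu = odb.get("/Experiment/Menu")
    if menu is not None:                       # 1) new-style subdirectory
        return [name for name, flag in menu.items() if flag == "y"]
    legacy = odb.get("/Experiment/Menu Buttons")
    if legacy is not None:                     # 2) old-style key (format assumed)
        return [b.strip() for b in legacy.split(",") if b.strip()]
    return list(BUILTIN_BUTTONS)               # 3) hardcoded list
```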
So to answer the question:
after git pull, make, restart mhttpd, you will see all possible menu buttons and you will have to go
into the odb editor to disable the buttons you do not want to see (i.e. the mscb button).
I did it this way on purpose, to give old-time midas users an opportunity to discover
some of the newly added buttons and pages, like the "chat" page, or the "example" page. If I migrated
the existing "menu buttons" verbatim, to the new tree, I would not even today know
that the "chat" page exists (I do not think it was ever announced or described on this forum
or anywhere in the documentation).
K.O. |
14 May 2015, Konstantin Olchanski, Suggestion, checksums for midas data files
|
I am adding LZ4 and LZO compression to the mlogger and as part of this work, I would like to add
computation of checksums for the midas files.
On one side, such checksums help me confirm that uncompressed data contents is the same as original
data (compression/decompression is okay).
On the other side, such checksums can confirm to the end user that today's contents of the midas file is
the same as originally written by mlogger (maybe years ago) - there was no bit rot, no file corruption, no
accidental or intentional modification of contents.
There are several choices of checksums available:
crc32 - as implemented by zlib (already written inside mid.gz files)
crc32c - improved and hardware accelerated version of CRC32 (http://tools.ietf.org/html/rfc3309)
md5 - cryptographically strong checksum, but obsolete
sha1 - same, also obsolete
sha256 - currently considered to be cryptographically strong
Of these checksums, only sha256 (sha512, etc) are presently considered to be cryptographically strong,
meaning that they can detect intentional file modifications. As opposed to (for example) crc32 where
it is easy to construct 2 files with different contents but the same checksum. Both md5 and sha1 are
presently considered to be similarly cryptographically broken. But all of them are still usable
as checksums - as they will detect non-intentional data modifications (bit rot, etc) with
very high probability.
(Of course the strongest checksum is also the most expensive to compute).
I will probably implement crc32 (already in zlib), crc32c (easy to find hardware-accelerated
implementations) and sha256 (cryptographically strong).
I can write the computed checksums into midas.log, or into runNNN.crc32, runNNN.sha256, etc files. (or
both).
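For reference, computing two of these checksums over a file in one pass looks roughly like this in Python. It is only a sketch of the idea, not what mlogger does: the helper names are made up, and the sidecar line format (digest, two spaces, file name, as used by the sha256sum tool) is an assumption.

```python
import hashlib
import zlib

def file_checksums(path, chunk_size=1 << 20):
    """Compute CRC32 and SHA-256 of a file in a single streaming pass."""
    crc = 0
    sha = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)   # running CRC32, same algorithm as zlib/gzip
            sha.update(chunk)
            size += len(chunk)
    return {"crc32": f"{crc:08x}", "sha256": sha.hexdigest(), "size": size}

def write_sidecar(path, digest, suffix):
    """Write 'digest  filename' next to the data file, e.g. run00007.mid.sha256."""
    with open(path + suffix, "w") as out:
        out.write(f"{digest}  {path}\n")
```

CRC32C is left out here because the Python standard library has no implementation of it.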
Any thoughts on this?
K.O. |
14 May 2015, Stefan Ritt, Suggestion, checksums for midas data files
|
> Any thoughts on this?
We use binary midas files now for ~20 years and never felt the necessity to put any checksums or even encryption on these files. The reason for that is the following: Data on
modern hard disks is already protected by CRC code or even ECC on the lower level, so it's very unlikely that single bytes change. If something happens, then it's a
corruption of the file system, so a few sectors of a file are missing or wrong. In that case a CRC won't help you much, just tells you that the files are corrupt. But you see that
also in the midas event structure. Each event has a header with the size of the event, so you can follow the file event by event. If something is missing, the next event header
is no event header but something in the middle of the data, and you recognise this immediately since the header does not make any sense (date is off by many years, event ID
is arbitrary, event size is very different). So this redundancy in the midas event structure helps you to identify any corrupt files as good in my opinion as a CRC code will. I
would not want to waste a single CPU cycle on lengthy CRC or even SHA algorithms, unless I see single bytes change inside events. But in this case this can even happen at
the network level between frontend and backends. So we should add the CRC/SHA code at the frontend level. This could increase the dead time of the experiment which is
bad. And what about VME transfer? While hard disks and Ethernet networks have already built-in CRC checks, VME transfer doesn't. So how can you be sure that no bits
get corrupt between your ADC and your frontend computer?
If people insist of having CRC or SHA protection/encryption for some reason I do not understand yet, we should make this optional, so that I can turn it off, since I don't
need it.
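The header-walking check described above can be sketched as follows. This is an illustration, not a MIDAS tool: it assumes a 16-byte little-endian event header (event id, trigger mask, serial number, time stamp, data size) and uses made-up plausibility limits; walk_events is a hypothetical helper.

```python
import struct

# Assumed layout: uint16 event_id, uint16 trigger_mask,
# uint32 serial_number, uint32 time_stamp, uint32 data_size.
HEADER = struct.Struct("<HHIII")

def walk_events(buf, min_time, max_time, max_size):
    """Follow the file event by event; return (events_ok, offset_of_first_bad or None)."""
    offset = 0
    count = 0
    while offset + HEADER.size <= len(buf):
        _event_id, _mask, _serial, stamp, size = HEADER.unpack_from(buf, offset)
        plausible = (min_time <= stamp <= max_time
                     and size <= max_size
                     and offset + HEADER.size + size <= len(buf))
        if not plausible:
            return count, offset       # header "does not make any sense"
        offset += HEADER.size + size
        count += 1
    # Leftover bytes shorter than a header also indicate a damaged file.
    return count, (None if offset == len(buf) else offset)
```

This catches missing or shifted sectors, but a flipped bit inside an event payload passes unnoticed, which is the case a checksum covers.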
/Stefan |
15 May 2015, Konstantin Olchanski, Suggestion, checksums for midas data files
|
> > Any thoughts on this?
>
> We use binary midas files now for ~20 years and never felt the necessity to put any checksums or even encryption on these files ...
>
"I have never seen a corrupted file, therefore nobody should ever need checksums". Well,
1) actually if you write mid.gz files, you get gzip checksums "for free" (but the checksums are not recorded anywhere, so 5 years later you cannot confirm that the file did not change).
2) I had a defective computer once where reading the same file several times yielded different data. (the defect was on the motherboard, not in the disks)
3) I am presently testing the btrfs filesystem which (like ZFS) keeps checksums for all data. For these tests I am using 3rd quality disks and I see btrfs regularly detect (and correct) "data corruption" events - where data on disk has changed.
4) there was a report from CERN(?) where they checked the checksums on a large number of data files and found a good number of corrupted files.
So bit rot does exist.
In more practical terms:
a) CRC32C is "free" to compute (hardware accelerated on latest CPUs), but does not detect malicious file modifications
b) SHA256 does detect that (but for how long?), but probably too expensive to compute (speed measurement TBD).
c) gzip compressed files have internal whole-file CRC32
d) bzip2 compressed files have internal per-block CRC32
e) lz4 compressed files have internal per-block xxhash checksums
Personally, when dealing with compressed files, I prefer to have a checksum recorded somewhere that I can check against after I decompress the file.
I think there is no need to add checksums to the MIDAS data files format itself (see c,d,e above).
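As a sketch of checking a recorded checksum after decompression, the following Python computes SHA-256 of both the .gz file as stored and of its uncompressed contents. The function names are made up for the example; reading through the gzip module also verifies the CRC32 stored inside the gzip stream (item c above).

```python
import gzip
import hashlib

def sha256_of_stream(stream, chunk_size=1 << 20):
    """SHA-256 of everything readable from a binary stream."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def checksums_for_gzip(path):
    """Return SHA-256 of the compressed file and of the data inside it."""
    with open(path, "rb") as raw:
        compressed = sha256_of_stream(raw)
    with gzip.open(path, "rb") as unpacked:    # raises on a bad internal CRC32
        uncompressed = sha256_of_stream(unpacked)
    return {"compressed": compressed, "uncompressed": uncompressed}
```

Keeping the two values apart, and labelling which one belongs to which file name, avoids confusing the checksum of the data with the checksum of the archive.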
K.O. |
05 Oct 2016, Lee Pool, Suggestion, checksums for midas data files
|
Hi
> On one side, such checksums help me confirm that uncompressed data contents is the same as original
> data (compression/decompression is okey).
>
> I can write the computed checksums into midas.log, or into runNNN.crc32, runNNN.sha256, etc files. (or
> both).
>
Just a thought on my side. I have been using a checksum, on data produced by our experiments via mlogger, the runxxxx.mid.gz, in
the same manner you proposed and I see now implemented.
I have a slight, objection, if I may call it that, to how the checksum is saved to disk, in
run00007.mid.gz.sha256 as an example.
$ cat ~/Data/run00007.mid.gz.sha256
f315af7caf6ca204cc082132862cb4227d77066cb60c6e2b1039d6dc5b04d1ee 650597 Data/run00007.mid.gz
It seems a little misleading to have the gzip'd filename paired with the checksum of the uncompressed content.
May I suggest that the pairing should be ,
f315af7caf6ca204cc082132862cb4227d77066cb60c6e2b1039d6dc5b04d1ee run00007.mid as an example.
As I find, this information will sit in an archive, database in my case for a long period, and it might
be confusing later on, when verification of the checksum is required. |
13 Oct 2016, Konstantin Olchanski, Suggestion, checksums for midas data files
|
Confirmed, this is a bug in mlogger. It should be creating *2* files, one with the before-compression checksum and one with the after-compression checksum. At
least both checksums are written to midas.log, so you can grep them from there. K.O. |
13 Mar 2017, Konstantin Olchanski, Suggestion, checksums for midas data files
|
> Confirmed, this is a bug in mlogger. It should be creating *2* files, one with the before-compression checksum and one with the after-compression checksum. At
> least both checksums are written to midas.log, so you can grep them from there. K.O.
This should be fixed now. Thank you for nudging me.
K.O. |
13 Mar 2017, Konstantin Olchanski, Info, improved mhttpd sounds
|
I reworked the alarm sounds in mhttpd - now you can turn off all sounds without disabling the
alarm system for everybody.
a) new checkbox on the "alarms" page to turn off the alarm buzzer sound
b) fixed a bug where the status page will speak the last alarm even if the "speak" checkbox is
unchecked on the "alarms" page (was coming through the TALK messages)
c) made sure the chat messages are only spoken if "speak" is enabled on the "chat" page
d) these speech and sounds settings are now stored in the browser "localStorage", which means
they are shared across all open tabs and windows and are preserved across browser sessions and
computer reboots.
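As an illustration of item d), here is a minimal sketch of keeping a sound setting in a Web Storage object. The key name "exampleSpeakEnabled" and the helper functions are made up for this sketch and are not the names mhttpd uses. The storage object is passed in as a parameter, so in a browser you would pass window.localStorage, while the in-memory stand-in lets the sketch run elsewhere.

```javascript
// Sketch only: persist a boolean "speak" setting in a Web Storage object.
// "exampleSpeakEnabled" is a hypothetical key, not the one mhttpd uses.
const SPEAK_KEY = "exampleSpeakEnabled";

function saveSpeakEnabled(storage, enabled) {
  // Web Storage only stores strings, so encode the boolean explicitly.
  storage.setItem(SPEAK_KEY, enabled ? "1" : "0");
}

function loadSpeakEnabled(storage, defaultValue) {
  const v = storage.getItem(SPEAK_KEY);
  // getItem() returns null when the key has never been written.
  return v === null ? defaultValue : v === "1";
}

// Minimal in-memory stand-in with the same getItem/setItem interface.
function makeMemoryStorage() {
  const map = new Map();
  return {
    getItem: (k) => (map.has(k) ? map.get(k) : null),
    setItem: (k, v) => { map.set(k, String(v)); },
  };
}

const demoStorage = makeMemoryStorage();
saveSpeakEnabled(demoStorage, false);
console.log(loadSpeakEnabled(demoStorage, true));
```

Because localStorage is per origin rather than per tab, a value written this way is seen by every tab and window opened on the same mhttpd URL.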
I hope this is an improvement.
There is still one bug remaining - the first (last?) alarm is always spoken twice - 1st time in the loop
over all alarms and 2nd time through the TALK messages. I do not know how to fix this.
K.O. |
27 Feb 2017, William Moore, Suggestion, analyzer failing to load ODB parameters
|
Hi,
I am attempting to compile and run analysis code on a completely different,
unconnected system than the DAQ computer for the experiment. The analyzer was
developed previously and my goal is to get it running and then update it to
achieve my needs. Before compiling the analyzer, I load a backup ODB file in
odbedit, and compile experim.h. I then compile the analyzer with that experim.h
file. When I run the analyzer I get the following output:
> MIDAS version 2.1ROOT version 5.34/36Root server listening on port 9090...
> Running analyzer offline. Stop with "!"
> Configuration file "/somedir/switches.odb" loaded
> [Analyzer,INFO] Set run number 1290 in ODB
> Load ODB from run 1290...[Analyzer,INFO] cannot load value "Client Notify":
write protected
> [Analyzer,INFO] cannot load value "Prompt": write protected
.
.
.
> [Analyzer,INFO] cannot load value "LANSCE-ops": write protected
> MIDAS version 2.1ROOT version 5.34/36OK
> Configuration file "/somedir/switches.odb" loaded
> Data_Raw/run01290.mid.gz:16355 Data_Analyzed/run01290.root:15208 events, 0.43s
I have confirmed all files being used have read/write access to all users. The
analyzer does populate a .root output file with filled histograms, however not
all histograms are filled. I believe this is because histograms that relied on
an ODB parameter that failed to load did not populate. Any ideas as to what I am
doing wrong or how I could resolve this issue would be greatly appreciated.
Thanks,
William Moore |
15 Feb 2017, NguyenMinhTruong, Bug Report, increase event buffer size
|
Dear all,
I have a problem with the event buffer size.
When running MIDAS, I got the error "total event size (1307072) larger than buffer size
(1048576)", so I guess that EVENT_BUFFER_SIZE is too small.
I changed EVENT_BUFFER_SIZE in midas.h from 0x100000 to 0x200000. After compiling
and running MIDAS, I got another error, "Shared memory segment with key 0x4d040761
already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c", from
system.c.
I looked up the shmget() function called in system.c, and it is said that the error comes from
shared memory segments larger than 16,773,120 bytes and from creating teraspace shared
memory segments.
Has anyone had this problem before?
Thanks for your help
M.T |
16 Feb 2017, Konstantin Olchanski, Bug Report, increase event buffer size
|
> I have problem in event buffer size.
>
> When run MIDAS, I got error "total event size (1307072) larger than buffer size
> (1048576)", so I guess that the EVENT_BUFFER_SIZE is small.
>
Correct. You have a choice of sending smaller events or increasing the buffer size.
Increasing the buffer size consumes computer memory, how much memory do you have on your machine?
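The numbers in the error message can be checked with a few lines of arithmetic. This is only a sketch: it compares the raw sizes and ignores any header overhead or limits MIDAS itself applies, and rounding up to a power of two is just a convenient choice here, not a MIDAS requirement.

```javascript
// Sketch only: compare the reported event size with candidate buffer sizes.
const eventSize = 1307072;   // "total event size" from the error message
const oldBuffer = 0x100000;  // 1048576 bytes, the original buffer size
const newBuffer = 0x200000;  // 2097152 bytes, the enlarged buffer size

// True if an event of eventBytes fits into a buffer of bufferBytes.
function fits(eventBytes, bufferBytes) {
  return eventBytes <= bufferBytes;
}

// Smallest power of two that is >= n (for n >= 1).
function nextPowerOfTwo(n) {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}

console.log(fits(eventSize, oldBuffer));
console.log(fits(eventSize, newBuffer));
console.log("0x" + nextPowerOfTwo(eventSize).toString(16));
```

So 0x200000 is already enough for a 1307072-byte event; larger values only add headroom.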
>
> I change EVENT_BUFFER_SIZE in midas.h from 0x100000 to 0x200000. After compiling
> and run MIDAS, I got other error "Shared memory segment with key 0x4d040761
> already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" in
> system.C
>
This is not normal. In recent versions of MIDAS (for the last few years)
a) buffer size is changed via ODB "/Experiment/buffer sizes", no need to edit midas.h
b) shared memory was switched from SYSV shared memory to POSIX shared memory, and you should not see any references to
SYSV shared memory functions like "ipcrm", "shmget" and "segment key".
Are you using a very old version of MIDAS? Or maybe you have a MIDAS installation that still uses SYSV shared memory. Check
the contents of .SHM_TYPE.TXT (in the same directory as .ODB.SHM); it would normally say "POSIXv2_SHM". If it says
something else, it is best to convert to POSIX SHM. The simplest way is to stop everything, save the ODB to a text file, delete
.SHM_TYPE.TXT, restart the ODB with odbedit, and reload from the text file. Then check that .SHM_TYPE.TXT says "POSIXv2_SHM".
>
> I check the shmget() function in system.C and it is said that error come from
> Shared memory segments larger than 16,773,120 bytes and create teraspace shared
> memory segments
>
What teraspace?!? You changed the size from 1 Mbyte to 2 Mbyte (0x200000), this is still below even the value you have above
(16,773,120).
In the end, it is not clear what your problem is. After changing the shared memory size (via odb or via midas.h),
midas *will* complain about the mismatch in size (existing vs expected) and will tell you how to fix it (run "ipcrm").
After you do this, is there still an error? Normally everything will just work. (You might also have to erase .SYSTEM.SHM;
midas will tell you to do so if it is needed.)
So what is your final error? (After running ipcrm?)
K.O. |
20 Feb 2017, NguyenMinhTruong, Bug Report, increase event buffer size
|
I am sorry for my late reply.
The memory in my PC is 16 GB.
I checked the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".
But there is no "buffer sizes" entry in "/Experiment".
After running "ipcrm -M 0x4d040761 size0x204a3c", removing .SYSTEM.SHM and running MIDAS again, I still get the error "Shared memory segment
with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T |
20 Feb 2017, Konstantin Olchanski, Bug Report, increase event buffer size
|
> memory in my PC is 16 GB
You can safely go to buffer size 100 Mbytes or more.
> I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".
Good.
> But there is no buffer sizes in "/Experiment"
This is strange. How old is your midas? What does it say on the "help" page in "Revision"?
> After run "ipcrm -M 0x4d040761 size0x204a3c"
This command is wrong. It probably gave you an error instead of removing the shared memory, that's why
nothing worked afterwards.
My copy of system.c reads this:
cm_msg(MERROR, "ss_shm_open", "Shared memory segment with key 0x%x already exists, please remove it manually: ipcrm -M 0x%x", key,
key);
Note how there is no text "size0x..." in my copy? What does your copy say? Did somebody change it?
> remove .SYSTEM.SHM and run MIDAS again, I still get error "Shared memory segment
> with key 0x4d040761 already exists, please remove it manually: ipcrm -M 0x4d040761 size0x204a3c" M.T
Yes, that's because the ipcrm command is wrong and did not work,
it should read "ipcrm -M 0x4d040761" without the spurious "size..." text.
K.O. |
20 Feb 2017, Konstantin Olchanski, Bug Report, increase event buffer size
|
> > memory in my PC is 16 GB
>
> You can safely go to buffer size 100 Mbytes or more.
>
> > I check the contents of .SHM_TYPE.TXT and it is "POSIXv2_SHM".
>
> Good.
No, wait, this is all wrong. If it says POSIX shared memory, how come it later
complains about SYSV shared memory and tells you to run SYSV shared memory
commands like ipcrm?!?
> > But there is no buffer sizes in "/Experiment"
Now this kind of makes sense - you are probably running a strange mixture
of very old and recent MIDAS. Probably your current version is so old
that it does not use .SHM_TYPE.TXT and can only do SYSV shared memory
and so old it does not have "/Experiment/buffer sizes".
But at some point you must have run a recent version of midas, or you would
not have the file .SHM_TYPE.TXT in your experiment directory.
I say:
a) run the correct ipcrm command (without the spurious "size..." text)
b) review your computer contents to identify all the versions of midas
and to make sure you are using the midas you want to use (old or new,
whatever), but not some wrong version by accident (incorrect PATH setting, etc)
As MIDAS developers, we usually recommend that you use the latest version of MIDAS;
the latest version is certainly simpler to debug.
K.O. |
14 Feb 2017, Konstantin Olchanski, Info, mhttpd.js split into midas.js, mhttpd.js and obsolete.js
|
As discussed before, the midas omnibus javascript file mhttpd.js has been split into three pieces:
midas.js - midas "public api" for building web pages that interact with midas
mhttpd.js - javascript functions used by mhttpd web pages
obsolete.js - functions still in use, but not recommended for new designs, mostly because of the deprecated "Synchronous XMLHttpRequest" business.
Consider these use cases:
a) completely standalone web pages served from some other web server (not mhttpd): load midas.js, set the mhttpd location (base URL) via mjsonrpc_set_url(url) and issue
midas json-rpc requests as normal. (mhttpd fully supports cross-origin requests (CORS)).
b) custom pages loaded from mhttpd without midas styling: same as above, but no need to set the mhttpd base url.
c) custom pages loaded from mhttpd with midas styling: load midas.js, load mhttpd.js, load midas.css or mhttpd.css, see aaa_template.html or example.html to see how it all fits
together.
d) custom replacement for mhttpd standard web pages: to replace (for example) the standard "alarms" page, copy (or create a new one) alarms.html into the experiment directory
($MIDAS_DIR, same place as .ODB.SHM) and hack away. You can start from alarms.html, from aaa_template.html or from example.html.
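For use case a), it may help to see roughly what a JSON-RPC 2.0 request body looks like. This is a sketch under assumptions, not the midas.js code: buildRpcRequest is a hypothetical helper, the method name "db_get_values" and its "paths" parameter are inferred from the signature mjsonrpc_db_get_values(paths, id), and the base URL is a placeholder. How the body is actually delivered to mhttpd is left to midas.js.

```javascript
// Sketch only: the general shape of a JSON-RPC 2.0 request.
// buildRpcRequest is a hypothetical helper, not part of midas.js.
function buildRpcRequest(method, params, id) {
  return { jsonrpc: "2.0", method: method, params: params, id: id };
}

// Placeholder base URL of the mhttpd instance (assumption).
const exampleBaseUrl = "http://daq.example.org:8080/";

// Method name and parameter layout are assumptions inferred from
// the signature mjsonrpc_db_get_values(paths, id).
const exampleRequest = buildRpcRequest(
  "db_get_values",
  { paths: ["/Runinfo/Run number"] },
  1
);

// The serialized body is what would be sent to the base URL.
const exampleBody = JSON.stringify(exampleRequest);
console.log(exampleBaseUrl + " <- " + exampleBody);
```

The id is echoed back in the reply, which lets a page match responses to requests when several are outstanding.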
K.O.
P.S. I am also reviewing mhttpd.css - the existing css file severely changes standard html formatting, making it difficult to create custom web pages (all online tutorials and examples
look nothing like they are supposed to look like). The new CSS file midas.css fixes this by only changing formatting of html elements that explicitly ask for "midas styling", without
contaminating the standard html formatting. midas.css only works for example.html and aaa_template.html for now.
P.P.S. Here is the complete list of javascript functions in all 3 files:
8s-macbook-pro:resources 8ss$ grep ^function midas.js mhttpd.js obsolete.js
midas.js:function mjsonrpc_set_url(url)
midas.js:function mjsonrpc_send_request(req)
midas.js:function mjsonrpc_debug_alert(rpc) {
midas.js:function mjsonrpc_decode_error(error) {
midas.js:function mjsonrpc_error_alert(error) {
midas.js:function mjsonrpc_make_request(method, params, id)
midas.js:function mjsonrpc_call(method, params, id)
midas.js:function mjsonrpc_start_program(name, id) {
midas.js:function mjsonrpc_stop_program(name, unique, id) {
midas.js:function mjsonrpc_cm_exist(name, unique, id) {
midas.js:function mjsonrpc_al_reset_alarm(alarms, id) {
midas.js:function mjsonrpc_al_trigger_alarm(name, message, xclass, condition, type, id) {
midas.js:function mjsonrpc_db_copy(paths, id) {
midas.js:function mjsonrpc_db_get_values(paths, id) {
midas.js:function mjsonrpc_db_ls(paths, id) {
midas.js:function mjsonrpc_db_resize(paths, new_lengths, id) {
midas.js:function mjsonrpc_db_key(paths, id) {
midas.js:function mjsonrpc_db_delete(paths, id) {
midas.js:function mjsonrpc_db_paste(paths, values, id) {
midas.js:function mjsonrpc_db_create(paths, id) {
midas.js:function mjsonrpc_cm_msg(message, type, id) {
mhttpd.js:function ODBFinishInlineEdit(p, path, bracket)
mhttpd.js:function ODBInlineEditKeydown(event, p, path, bracket)
mhttpd.js:function ODBInlineEdit(p, odb_path, bracket)
mhttpd.js:function mhttpd_disable_button(button)
mhttpd.js:function mhttpd_enable_button(button)
mhttpd.js:function mhttpd_hide_button(button)
mhttpd.js:function mhttpd_unhide_button(button)
mhttpd.js:function mhttpd_init_overlay(overlay)
mhttpd.js:function mhttpd_hide_overlay(overlay)
mhttpd.js:function mhttpd_unhide_overlay(overlay)
mhttpd.js:function mhttpd_getParameterByName(name) {
mhttpd.js:function mhttpd_goto_page(page) {
mhttpd.js:function mhttpd_navigation_bar(current_page)
mhttpd.js:function mhttpd_page_footer()
mhttpd.js:function mhttpd_create_page_handle_create(mouseEvent)
mhttpd.js:function mhttpd_create_page_handle_cancel(mouseEvent)
mhttpd.js:function mhttpd_delete_page_handle_delete(mouseEvent)
mhttpd.js:function mhttpd_delete_page_handle_cancel(mouseEvent)
mhttpd.js:function mhttpd_start_run()
mhttpd.js:function mhttpd_stop_run()
mhttpd.js:function mhttpd_pause_run()
mhttpd.js:function mhttpd_resume_run()
mhttpd.js:function mhttpd_cancel_transition()
mhttpd.js:function mhttpd_reset_alarm(alarm_name)
mhttpd.js:function msg_load(f)
mhttpd.js:function msg_prepend(msg)
mhttpd.js:function msg_append(msg)
mhttpd.js:function findPos(obj) {
mhttpd.js:function msg_extend()
mhttpd.js:function alarm_load()
mhttpd.js:function aspeak_click(t)
mhttpd.js:function mhttpd_alarm_speak(t)
mhttpd.js:function chat_kp(e)
mhttpd.js:function rb()
mhttpd.js:function speak_click(t)
mhttpd.js:function chat_send()
mhttpd.js:function chat_load()
mhttpd.js:function chat_format(line)
mhttpd.js:function chat_prepend(msg)
mhttpd.js:function chat_append(msg)
mhttpd.js:function chat_reformat()
mhttpd.js:function chat_extend()
obsolete.js:function XMLHttpRequestGeneric()
obsolete.js:function ODBSetURL(url)
obsolete.js:function ODBSet(path, value, pwdname)
obsolete.js:function ODBGet(path, format, defval, len, type)
obsolete.js:function ODBMGet(paths, callback, formats)
obsolete.js:function ODBGetRecord(path)
obsolete.js:function ODBExtractRecord(record, key)
obsolete.js:function ODBKey(path)
obsolete.js:function ODBCopy(path, format)
obsolete.js:function ODBCall(url, callback)
obsolete.js:function ODBMCopy(paths, callback, encoding)
obsolete.js:function ODBMLs(paths, callback)
obsolete.js:function ODBMCreate(paths, types, arraylengths, stringlengths, callback)
obsolete.js:function ODBMResize(paths, arraylengths, stringlengths, callback)
obsolete.js:function ODBMRename(paths, names, callback)
obsolete.js:function ODBMLink(paths, links, callback)
obsolete.js:function ODBMReorder(paths, indices, callback)
obsolete.js:function ODBMKey(paths, callback)
obsolete.js:function ODBMDelete(paths, callback)
obsolete.js:function ODBRpc_rev0(name, rpc, args)
obsolete.js:function ODBRpc_rev1(name, rpc, max_reply_length, args)
obsolete.js:function ODBRpc(program_name, command_name, arguments_string, callback, max_reply_length)
obsolete.js:function ODBGetMsg(facility, start, n)
obsolete.js:function ODBGenerateMsg(type,facility,user,msg)
obsolete.js:function ODBGetAlarms()
obsolete.js:function ODBEdit(path)
obsolete.js:function getMouseXY(e)
8s-macbook-pro:resources 8ss$
K.O. |
08 Sep 2016, Amy Roberts, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
I've recently run into issues when using JSON.parse on ODB keys containing
8-bit data.
For JSON.parse to successfully parse a string, (A) the string must be valid
UTF-8, (B) several whitespace characters, control characters, and the
characters " and \ must be escaped, and (C) you've got to follow the key-
value rules laid out in http://www.json.org/.
The web browser takes care of (A), and I verified that for this key Midas
handled (C) correctly. In principle, the function json_write in odb.c
handles (B) - but json_write does not escape control characters.
To manage this problem, I modified json_write (in odb.c) to replace any
control character with the more-innocuous character, 'C'. My default case
now looks like:
    default:
    {
       // if a char is a control character,
       // print 'C' in its place
       // note that this loses data:
       // a more-correct method would be to print
       // \uXXXX, where XXXX is the character in hex
       // (cast to unsigned char: passing a negative char to iscntrl() is undefined)
       if (iscntrl((unsigned char)*s)) {
          (*buffer)[(*buffer_end)++] = 'C';
          s++;
       } else {
          (*buffer)[(*buffer_end)++] = *s++;
       }
    }
Where the call to iscntrl(*s) requires the addition of the ctype.h header
file.
I'm guessing a blanket replacement of control characters with 'C' isn't
something all Midas users would want to do. Replacing the control character
with its hex value seems like a good choice - but not without adding bounds
checking!
An alternative to changing odb.c could be to add a regex to Midas response
text which removes all control characters (U+0000 - U+001F):
var resp_lint = req.response.replace(/[\u{0000}-\u{001F}]/gmu, '');
var json_obj = JSON.parse(resp_lint);
Unfortunately, the 'u' regex flag doesn't work on the Firefox version
included in Scientific Linux 6.8. |
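A possible workaround for browsers without the 'u' flag, offered as a sketch rather than a tested fix for that Firefox version: the range U+0000 to U+001F lies entirely in the Basic Multilingual Plane, so the classic four-digit \uXXXX escapes describe the same character class without needing 'u'. The function name stripControlChars is made up here. Like the regex above, this drops data, and it also removes tabs and newlines used as formatting between JSON tokens, which is harmless for parsing.

```javascript
// Sketch only: remove C0 control characters (U+0000 to U+001F)
// using classic \uXXXX escapes, so the 'u' regex flag is not needed.
function stripControlChars(text) {
  return text.replace(/[\u0000-\u001F]/g, "");
}

// A response-like text with a raw U+0001 inside a JSON string value.
const rawText = '{ "mystring" : "first\u0001second" }';

// JSON.parse rejects unescaped control characters inside strings.
try {
  JSON.parse(rawText);
} catch (e) {
  console.log("raw text rejected: " + e.name);
}

// After stripping, the same text parses.
const cleanedObj = JSON.parse(stripControlChars(rawText));
console.log(cleanedObj.mystring);
```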
30 Sep 2016, Konstantin Olchanski, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> I've recently run into issues when using JSON.parse on ODB keys containing
> 8-bit data.
I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid
UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
it is impractical to handle or permit invalid UTF-8 strings.
Certainly in the general case, replacing all control characters with something else or escaping them or
otherwise changing the value of TID_STRING data would wreck *valid* UTF-8 strings, which I would
assume to be the normal use.
In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - just as
we do not check that TID_FLOAT and TID_DOUBLE are valid IEEE-754 values, we should not check
that TID_STRING is valid UTF-8.
But in your specific case, why do you have random control characters in your TID_STRING data?
Maybe you are using TID_STRING as general storage instead of arrays of TID_CHAR or
TID_DWORD?
K.O. |
25 Oct 2016, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> > I've recently run into issues when using JSON.parse on ODB keys containing
> > 8-bit data.
>
> I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid
> UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
> it is impractical to handle or permit invalid UTF-8 strings.
> ....
> But in your specific case, why do you have random control characters in your TID_STRING data?
> Maybe you are using TID_STRING as general storage instead of arrays of TID_CHAR or
> TID_DWORD?
I'm a little confused by this report and want to make sure I understand the situation. Konstantin points
out that the TID_STRING should be valid UTF-8. But I think that Amy agreed that the string was valid UTF-8.
My understanding was that Amy's contention was that the valid UTF-8 string didn't get returned as valid JSON.
But I am having trouble reproducing your behaviour, Amy. I created an ODB string variable containing a tab
control character
sprintf(mystring,"first line \t second line");
status = db_set_value(hDB, 0,"/test2/mystring", &mystring, size, 1, TID_STRING);
and when I tried to pull the ODB value using jcopy
http://neut18:8081/?cmd=jcopy&odb=/test2/mystring&format=json
I got
{
"mystring/key" : { "type" : 12, "item_size" : 32, "access_mode" : 7, "last_written" : 1477416322 },
"mystring" : "first line \t second line"
}
which seems to be valid JSON.
I only tried this with tab. Are there other control characters that you are having trouble with? Or maybe
I misunderstand the question? |
01 Dec 2016, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> > I've recently run into issues when using JSON.parse on ODB keys containing
> > 8-bit data.
>
> I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid
> UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
> it is impractical to handle or permit invalid UTF-8 strings.
>
> Certainly in the general case, replacing all control characters with something else or escaping them or
> otherwise changing the value if TID_STRING data would wreck *valid* UTF-8 strings, which I would
> assume to be the normal use.
>
> In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - as
> we do not check the TID_FLOAT and TID_DOUBLE is valid IEEE-754 values, we should not check
> that TID_STRING is valid UTF-8.
I agree; I think we should start requiring strings to be UTF-8 encoded Unicode.
I'd suggest that before worrying about the TID_STRING data, we should start by sanitizing the ODB key names.
I've seen a couple cases where the ODB key name is a non-UTF-8 string. It is very awkward to use odbedit
to delete these keys.
I attach a suggested modification to odb.c that rejects calls to db_create_key with non-UTF-8 key names. It
uses some random function I found on the internet that is supposed to check if a string is valid UTF-8. I
checked a couple of strings with invalid UTF-8 characters and it correctly identified them. But I won't
claim to be certain that this is really identifying all UTF-8 vs non-UTF-8 cases. Maybe others have a
better way of identifying this. |
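For comparison on the JavaScript side, a strict UTF-8 check can be sketched with the standard TextDecoder in "fatal" mode, which throws on any malformed byte sequence. This is not the function attached to odb.c, and isValidUtf8 is a name made up for this sketch. The invalid sample uses the byte 0x80, which is how Windows-1252 encodes the euro sign and which is not valid on its own in UTF-8.

```javascript
// Sketch only: strict UTF-8 validation of a byte array.
// With { fatal: true }, decode() throws on malformed input
// instead of substituting replacement characters.
function isValidUtf8(bytes) {
  try {
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (e) {
    return false;
  }
}

// "Eur" followed by the euro sign encoded as UTF-8 (E2 82 AC).
const goodName = Uint8Array.from([0x45, 0x75, 0x72, 0xE2, 0x82, 0xAC]);
// "Eur" followed by a lone 0x80 byte (euro sign in Windows-1252).
const badName = Uint8Array.from([0x45, 0x75, 0x72, 0x80]);

console.log(isValidUtf8(goodName));
console.log(isValidUtf8(badName));
```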
15 Jan 2017, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> > In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - as
> > we do not check the TID_FLOAT and TID_DOUBLE is valid IEEE-754 values, we should not check
> > that TID_STRING is valid UTF-8.
> ...
> I attach a suggested modification to odb.c that rejects calls to db_create_key with non-UTF-8 key names. It
> uses some random function I found on the internet that is supposed to check if a string is valid UTF-8. I
> checked a couple of strings with invalid UTF-8 characters and it correctly identified them. But I won't
> claim to be certain that this is really identifying all UTF-8 vs non-UTF-8 cases. Maybe others have a
> better way of identifying this.
At Konstantin's suggestion, I committed the function I found for checking if a string was UTF-8 compatible to
odb.c. The function is currently not used; I commented out a proposed use in db_create_key. Experts can decide
if the code was good enough to use. |
23 Jan 2017, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> At Konstantin's suggestion, I committed the function I found for checking if a string was UTF-8 compatible to
> odb.c. The function is currently not used; I commented out a proposed use in db_create_key. Experts can decide
> if the code was good enough to use.
After more discussion, I have enabled the parts of the ODB code that check that key names are UTF-8 compliant.
This check will show up in (at least) two ways:
1) Attempts to create a new ODB variable will fail if the ODB key name is not UTF-8 compliant. You will see error messages like
[fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_create_key: UTF-8 incompatible
string
2) When a program first connects to the ODB, it runs a check to ensure that the ODB is valid. This will now include
a check that all key names are UTF-8 compliant. Any non-UTF8 compliant key names will be replaced by a string of the
pointer to the key. You will see error messages like:
[fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_validate_key: UTF-8
incompatible string
[fesimdaq,ERROR] [odb.c:647:db_validate_key,ERROR] Warning: corrected key "/Equipment/SIMDAQ/Eur€": invalid name
"Eur€" replaced with "0x7f74be63f970"
This behaviour (checking UTF-8 compatibility and automatically fixing ODB names) can be disabled by setting an
environment variable
MIDAS_INVALID_STRING_IS_OK
It doesn't matter what the environment variable is set to; it just needs to be set. Note also that this variable is
only checked once, when a program starts. |
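The "it just needs to be set" rule means the test is for presence, not for a truthy value. A small sketch of that distinction, with the environment passed in as a plain object (in Node.js one could pass process.env); utf8ChecksDisabled is a hypothetical helper and not the MIDAS implementation.

```javascript
// Sketch only: a presence test for an environment variable.
// An empty value or "0" still counts as "set".
const DEFEAT_VARIABLE = "MIDAS_INVALID_STRING_IS_OK";

// env is any object mapping variable names to string values.
function utf8ChecksDisabled(env) {
  return env[DEFEAT_VARIABLE] !== undefined;
}

console.log(utf8ChecksDisabled({}));
console.log(utf8ChecksDisabled({ MIDAS_INVALID_STRING_IS_OK: "" }));
console.log(utf8ChecksDisabled({ MIDAS_INVALID_STRING_IS_OK: "0" }));
```

Since the variable is only read at program start, changing it affects programs started afterwards, not ones already running.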
30 Jan 2017, Stefan Ritt, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
>
> > At Konstantin's suggestion, I committed the function I found for checking if a string was UTF-8 compatible to
> > odb.c. The function is currently not used; I commented out a proposed use in db_create_key. Experts can decide
> > if the code was good enough to use.
>
> After more discussion, I have enabled the parts of the ODB code that check that key names are UTF-8 compliant.
>
> This check will show up in (at least) two ways:
>
> 1) Attempts to create a new ODB variable if the ODB key is not UTF-8 compliant. You will see error messages like
>
> [fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_create_key: UTF-8 incompatible
> string
>
> 2) When a program first connects to the ODB, it runs a check to ensure that the ODB is valid. This will now include
> a check that all key names are UTF-8 compliant. Any non-UTF8 compliant key names will be replaced by a string of the
> pointer to the key. You will see error messages like:
>
> [fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_validate_key: UTF-8
> incompatible string
> [fesimdaq,ERROR] [odb.c:647:db_validate_key,ERROR] Warning: corrected key "/Equipment/SIMDAQ/Eur€": invalid name
> "Eur€" replaced with "0x7f74be63f970"
>
> This behaviour (checking UTF-8 compatibility and automatically fixing ODB names) can be disabled by setting an
> environment variable
>
> MIDAS_INVALID_STRING_IS_OK
>
> It doesn't matter what the environment variable is set to; it just needs to be set. Note also that this variable is
> only checked once, when a program starts.
I see you put some switches into the environment ("MIDAS_INVALID_STRING_IS_OK"). Do you think this is a good idea? Most variables are
sitting in the ODB (/experiment/xxx), except those which cannot be in the ODB because we need it before we open the ODB, like MIDAS_DIR.
Having them in the ODB has the advantage that everything is in one place, and we see a "list" of things we can change. From an empty
environment it is not clear that such a thing like "MIDAS_INVALID_STRING_IS_OK" does exist, while if it would be an ODB key it would be
obvious. Can I convince you to move this flag into the ODB? |
01 Feb 2017, Konstantin Olchanski, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
>
> I see you put some switches into the environment ("MIDAS_INVALID_STRING_IS_OK"). Do you think this is a good idea? Most variables are
> sitting in the ODB (/experiment/xxx), except those which cannot be in the ODB because we need it before we open the ODB, like MIDAS_DIR.
> Having them in the ODB has the advantage that everything is in one place, and we see a "list" of things we can change. From an empty
> environment it is not clear that such a thing like "MIDAS_INVALID_STRING_IS_OK" does exist, while if it would be an ODB key it would be
> obvious. Can I convince you to move this flag into the ODB?
>
Some additional explanation.
Time passed, the world turned, and the current web-compatible standard for text strings is UTF-8 encoded Unicode, see
https://en.wikipedia.org/wiki/UTF-8
(ObCanadianContent, UTF-8 was invented by the Canadian Rob Pike https://en.wikipedia.org/wiki/Rob_Pike)
(and by some other guy https://en.wikipedia.org/wiki/Ken_Thompson).
It turns out that not every combination of 8-bit characters (char*) is valid UTF-8 Unicode.
In the MIDAS world we run into this when MIDAS ODB strings are exported to Javascript running inside web
browsers ("custom pages", etc). ODB strings (TID_STRING) and ODB key names that are not valid UTF-8
make such web pages malfunction.
One solution to this is to declare that ODB strings (TID_STRING) and ODB key names *must* be valid UTF-8 Unicode.
The present commits implemented this solution. Invalid UTF-8 is rejected by db_create() & co and by the ODB integrity validator.
This means some existing running experiment may suddenly break because somehow they have "old-style" ODB entries
or they mistakenly use TID_STRING to store arbitrary binary data (use array of TID_CHAR instead).
To permit such experiments to use current releases of MIDAS, we include a "defeat" device - to disable UTF-8 checks
until they figure out where non-UTF-8 strings come from and correct the problem.
Why is this defeat device not an ODB entry? Because it is not a normal mode of operation - there is no use-case where
an experiment will continue to use non-UTF-8 compatible ODB indefinitely, in the long term. For example, as the MIDAS user
interface moves more and more to HTML+Javascript+"AJAX", such experiments will see that non-UTF-8 compatible ODB entries
cause all sorts of problems and will have to convert.
K.O. |
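To illustrate the point that not every sequence of 8-bit characters is valid UTF-8, here is a minimal C sketch of a well-formedness check. It is not the code used in odb.c; the function name is hypothetical, and the rules it applies (lead and continuation byte patterns, no overlong forms, no surrogates, nothing above U+10FFFF) are the standard UTF-8 rules.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the NUL-terminated string is
 * well-formed UTF-8, 0 otherwise. */
int sketch_is_valid_utf8(const char *str)
{
    assert(str != NULL);
    const unsigned char *s = (const unsigned char *)str;

    while (*s) {
        unsigned int cp; /* decoded code point */
        int extra;       /* number of continuation bytes expected */

        if (s[0] < 0x80) { /* plain ASCII */
            s++;
            continue;
        } else if ((s[0] & 0xE0) == 0xC0) {
            cp = s[0] & 0x1F; extra = 1;
        } else if ((s[0] & 0xF0) == 0xE0) {
            cp = s[0] & 0x0F; extra = 2;
        } else if ((s[0] & 0xF8) == 0xF0) {
            cp = s[0] & 0x07; extra = 3;
        } else {
            return 0; /* stray continuation byte or invalid lead byte */
        }

        for (int i = 1; i <= extra; i++) {
            /* A NUL terminator fails this test too, so a truncated
             * sequence is rejected before reading past the end. */
            if ((s[i] & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (s[i] & 0x3F);
        }

        if (extra == 1 && cp < 0x80) return 0;      /* overlong encoding */
        if (extra == 2 && cp < 0x800) return 0;     /* overlong encoding */
        if (extra == 3 && cp < 0x10000) return 0;   /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* UTF-16 surrogate */
        if (cp > 0x10FFFF) return 0;                /* beyond Unicode range */

        s += extra + 1;
    }
    return 1;
}
```

For example, "Eur" followed by the three bytes E2 82 AC (the UTF-8 euro sign) passes, while "Eur" followed by the single byte 80 (the euro sign in Windows-1252) is rejected.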
01 Feb 2017, Stefan Ritt, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> Some additional explanation.
>
> Time passed, the world turned, and the current web-compatible standard for text strings is UTF-8 encoded Unicode, see
> https://en.wikipedia.org/wiki/UTF-8
> (ObCanadianContent, UTF-8 was invented the Canadian Rob Pike https://en.wikipedia.org/wiki/Rob_Pike)
> (and by some other guy https://en.wikipedia.org/wiki/Ken_Thompson).
>
> It turns out that not every combination of 8-bit characters (char*) is valid UTF-8 Unicode.
>
> In the MIDAS world we run into this when MIDAS ODB strings are exported to Javascript running inside web
> browsers ("custom pages", etc). ODB strings (TID_STRING) and ODB key names that are not valid UTF-8
> make such web pages malfunction and do not work right.
>
> One solution to this is to declare that ODB strings (TID_STRING) and ODB key names *must* be valid UTF-8 Unicode.
>
> The present commits implemented this solution. Invalid UTF-8 is rejected by db_create() & co and by the ODB integrity validator.
>
> This means some existing running experiment may suddenly break because somehow they have "old-style" ODB entries
> or they mistakenly use TID_STRING to store arbitrary binary data (use array of TID_CHAR instead).
>
> To permit such experiments to use current releases of MIDAS, we include a "defeat" device - to disable UTF-8 checks
> until they figure out where non-UTF-8 strings come from and correct the problem.
>
> Why is this defeat device non an ODB entry? Because it is not a normal mode of operation - there is no use-case where
> an experiment will continue to use non-UTF-8 compatible ODB indefinitely, in the long term. For example, as the MIDAS user
> interface moves to more and more to HTML+Javascript+"AJAX", such experiments will see that non-UTF-8 compatible ODB entries
> cause all sorts of problems and will have to convert.
>
>
> K.O.
Ok, I agree.
Stefan |
15 Dec 2016, Kevin Giovanetti, Bug Report, midas.h error
|
I am creating a frontend on a Mac (macOS Sierra, OS X 10).
When I include the midas.h file and compile with Xcode, I get an error caused by
this entry in midas.h:
#if !defined(OS_IRIX) && !defined(OS_VMS) && !defined(OS_MSDOS) && \
    !defined(OS_UNIX) && !defined(OS_VXWORKS) && !defined(OS_WINNT)
#error MIDAS cannot be used on this operating system
#endif
Perhaps I should not use Xcode?
Perhaps I won't need Midas.h?
The MIDAS system is running on my Mac, but I need to add a very simple frontend
for testing and I encountered this error. |
15 Dec 2016, Stefan Ritt, Bug Report, midas.h error
|
> creating a frontend on MAC Sierra OSX 10
> include the midas.h file and when compiling with XCode I get an error based on
> this entry in the midas.h include
>
> #if !defined(OS_IRIX) && !defined(OS_VMS) && !defined(OS_MSDOS) &&
> !defined(OS_UNIX) && !defined(OS_VXWORKS) && !defined(OS_WINNT)
> #error MIDAS cannot be used on this operating system
> #endif
>
>
> Perhaps I should not use Xcode?
> Perhaps I won't need Midas.h?
>
> The MIDAS system is running on my MAC but I need to add a very simple front end
> for testing and I encounted this error.
If you compile with the included Makefile, you will see a
-DOS_LINUX -DOS_DARWIN
flag which tells the compiler that we are on a mac. If you do this with XCode, you have to do it via "Build Settings" (see
attached picture).
Stefan |
01 Feb 2017, Konstantin Olchanski, Bug Report, midas.h error
|
>
> If you compile with the included Makefile, you will see a
>
> -DOS_LINUX -DOS_DARWIN
>
Moving forward, it looks like I can define these variables in midas.h and remove the need to define them on the compiler command line.
This would be part of the Makefile and header files cleanup to get things working on Windows10.
K.O. |
01 Feb 2017, Stefan Ritt, Bug Report, midas.h error
|
> >
> > If you compile with the included Makefile, you will see a
> >
> > -DOS_LINUX -DOS_DARWIN
> >
>
> Moving forward, it looks like I can define these variables in midas.h and remove the need to define them on the compiler command line.
>
> This would be part of the Makefile and header files cleanup to get things working on Windows10.
>
> K.O.
Will you detect the underlying OS automatically in midas.h? Note that you have several compilers in MacOS (llvm and gcc), and they might use different
predefined symbols. I would, however, appreciate getting rid of these flags in the Makefile.
Stefan |
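As a rough illustration of what automatic detection could look like, the C sketch below maps compiler-predefined macros onto OS symbols. The SKETCH_OS_* names and the function are hypothetical (they are deliberately not the real midas.h symbols); the predefined macros used (_WIN32, __APPLE__ with __MACH__, __linux__) are, to my knowledge, set by both gcc and clang on the respective platforms, but this should be verified for every compiler one intends to support.

```c
/* Hypothetical symbols, chosen so they cannot clash with midas.h. */
#if defined(_WIN32)
#define SKETCH_OS_WINNT 1
#elif defined(__APPLE__) && defined(__MACH__)
#define SKETCH_OS_DARWIN 1
#define SKETCH_OS_UNIX 1
#elif defined(__linux__)
#define SKETCH_OS_LINUX 1
#define SKETCH_OS_UNIX 1
#endif

/* Hypothetical helper: report which branch the preprocessor took. */
const char *sketch_os_name(void)
{
#if defined(SKETCH_OS_WINNT)
    return "winnt";
#elif defined(SKETCH_OS_DARWIN)
    return "darwin";
#elif defined(SKETCH_OS_LINUX)
    return "linux";
#else
    return "unknown"; /* a real header would probably use #error here */
#endif
}
```

A symbol passed on the compiler command line could still take precedence if each block were additionally guarded with a check that no OS symbol is defined yet.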
14 Oct 2016, Konstantin Olchanski, Info, Javascript based run start and stop pages.
|
I switched mhttpd to use the new javascript based run start and stop pages.
There are two new html pages:
resources/start.html - mimics the old run start page exactly - where you can enter the "edit on
start" parameters and start the run.
resources/transition.html - monitors the transition progress, shows the status of every transition
client, their sequence number, waiting list dependency, time spent making rpc calls, etc.
If the new pages do not work for you, please report it here and switch to the old pages
by editing src/mhttpd.cxx - comment-out the line "#define NEW_START_STOP 1"
K.O. |
05 Dec 2016, Thomas Lindner, Info, Javascript based run start and stop pages.
|
> I switched mhttpd to use the new javascript based run start and stop pages.
One initial complaint: the transition.html page doesn't seem to deal well with a frontend program using
a deferred transition. Specifically, I find with my simulated frontend ([1]), which has a deferred
end-of-run transition, that two problems happen:
i) the page doesn't give any indication that a frontend has a deferred transition; in fact it says that
the frontend immediately has finished the transition.
ii) once the deferred transition has finished, the page doesn't switch to saying that the run has
stopped. In fact, even if I reload the transition page it still continues to show that the run is
ongoing; the status page, by contrast, shows that the run has stopped.
I separately still think that the transition page should automatically go away after 5 seconds
(assuming that all the transitions were successful). I think it is annoying that you need to click
back to the status page.
[1] https://github.com/thomaslindner/fesimdaq |
01 Feb 2017, Konstantin Olchanski, Info, Javascript based run start and stop pages.
|
> > I switched mhttpd to use the new javascript based run start and stop pages.
>
> One initial complaint: the transition.html page doesn't seem to deal well with a frontend program using
> a deferred transition.
>
We now have a test frontend for deferred transitions, and this problem will likely be fixed.
>
> I separately still think that the transition page should automatically go away after 5 seconds
>
This is a user-interface philosophy issue.
Instead of using personal preferences one should follow established design principles
(there is research done and books written about this).
I did not recently look at current recommendations for this type of interaction, but generally
one expects web pages to "do things" (such as switch to a different page) only when directed
by user input (press a button).
My personal opinion is that half the users will find 5 sec delay too slow, the other half will
find 5 sec too fast and the 3rd half will wonder "what happened, the web page flashed and disappeared,
did I miss something important, how do I get back to whatever it was?!?".
One idea is to implement the transition page as an implant on the status page - after the "start" page
you go back to the status page where you can see the progress of the transition. After the transition
completes, its progress window "collapses" into a "success/failure" display with a link to the full
transition page to see any details of what happened. Any volunteers? (I would html-ize the status page first).
K.O. |
01 Feb 2017, Stefan Ritt, Info, Javascript based run start and stop pages.
|
> > > I switched mhttpd to use the new javascript based run start and stop pages.
> >
> > One initial complaint: the transition.html page doesn't seem to deal well with a frontend program using
> > a deferred transition.
> >
>
> We now have a test frontend for deferred transitions, and this problem will likely be fixed.
>
> >
> > I separately still think that the transition page should automatically go away after 5 seconds
> >
>
> This is a user-interface philosophy issue.
>
> Instead of using personal preferences one should follow established design principles
> (there is research done and books written about this).
>
> I did not recently look at current recommendations for this type of interaction, but generally
> one expects web pages to "do things" (such as switch to a different page) only when directed
> by user input (press a button).
>
> My personal opinion is that half the users will find 5 sec delay too slow, the other half will
> find 5 sec too fast and the 3rd half will wonder "what happened, the web page flashed and disappeared,
> did I miss something important, how do I get back to whatever is was?!?".
>
> One idea is to implement the transition page as a implant on the state page - after the "start" page
> you go back to the status page where you can see the progress of the transition. After the transition
> completes, it's progress window "collapses" into a "success/failure" display with a link to the full
> transition page to see any details of what happened. Any volunteers? (I would html-ize the status page first).
>
> K.O.
I agree with Konstantin's plans and volunteer for the "collapsible" display. We will address this during my next visit to TRIUMF. |
01 Dec 2016, Konstantin Olchanski, Info, midas wiki updated to mediawiki 1.27.1
|
midas wiki at https://midas.triumf.ca/MidasWiki/index.php/Main_Page
was updated to MediaWiki version 1.27.1, the current MediaWiki LTS release.
Everything should work as before, but if you see any problems or anomalies, please report
them on this forum here.
K.O. |
24 Oct 2016, Tim Gorringe, Bug Report, problem with error code DB_NO_MEMORY from db_open_record() call when establish additional hotlinks
|
Hi Midas forum,
I'm having a problem with odb hotlinks after increasing sub-directories in an
odb. I now get the error code DB_NO_MEMORY after some db_open_record() calls. I
tried
1) increasing the parameter DEFAULT_ODB_SIZE in midas.h and make clean, make
but got the same error
2) increasing the parameter MAX_OPEN_RECORDS in midas.h and make clean, make
but got fatal errors from odbedit and my midas FE and couldn't run anything
3) deleting my experiment's SHM files and starting odbedit with "odbedit -e SLAC -s
0x1000000" to increase the odb size but got the same error
4) I tried a different computer and got the same error code DB_NO_MEMORY
Maybe I am running into some system limit that restricts the number of open records?
Or maybe I've not increased the correct midas parameter?
Best, Tim. |
25 Oct 2016, Tim Gorringe, Bug Report, problem with error code DB_NO_MEMORY from db_open_record() call when establish additional hotlinks
|
One additional comment. I was able to trace the setting of the error code DB_NO_MEMORY
to a call to db_add_open_record() by mserver that is initiated during the start-up
of my frontend via an RPC call. I checked with a debug printout that I have indeed
reached the number of MAX_OPEN_RECORDS.
> Hi Midas forum,
>
> I'm having a problem with odb hotlinks after increasing sub-directories in an
> odb. I now get the error code DB_NO_MEMORY after some db_open_record() calls. I
> tried
>
> 1) increasing the parameter DEFAULT_ODB_SIZE in midas.h and make clean, make
> but got the same error
>
> 2) increasing the parameter MAX_OPEN_RECORDS in midas.h and make clean, make
> but got fatal errors from odbedit and my midas FE and couldnt run anything
>
> 3) deleting my expts SHM files and starting odbedit with "odbedit -e SLAC -s
> 0x1000000" to increse the odb size but got the same error?
>
> 4) I tried a different computer and got the same error code DB_NO_MEMORY
>
> Maybe I running into some system limit that restricts the humber of open records?
> Or maybe I've not increased the correct midas parameter?
>
> Best ,Tim. |
04 Nov 2016, Thomas Lindner, Bug Report, problem with error code DB_NO_MEMORY from db_open_record() call when establish additional hotlinks
|
Hi Tim,
I reproduced your problem and then managed to go through a procedure to increase the number
of allowable open records. The following is the procedure that I used
1) Use odbedit to save current ODB
odbedit
save current_odb.odb
2) Stop all the running MIDAS processes, including mlogger and mserver using the web
interface. Then stop mhttpd as well.
3) Remove your old ODB (we will recreate it after modifying MIDAS, using the backup you just
made).
mv .ODB.SHM .ODB.SHM.20161104
rm /dev/shm/thomas_ODB_SHM
4) Make the following modifications to midas. In this particular case I have increased the
max number of open records from 256 to 1024. You would need to change the constants if you
want to change to other values
diff --git a/include/midas.h b/include/midas.h
index 02b30dd..33be7be 100644
--- a/include/midas.h
+++ b/include/midas.h
@@ -254,7 +254,7 @@ typedef std::vector<std::string> STRING_LIST;
-#define MAX_OPEN_RECORDS 256 /**< number of open DB records */
+#define MAX_OPEN_RECORDS 1024 /**< number of open DB records */
diff --git a/src/odb.c b/src/odb.c
index 47ace8f..ac1bef3 100755
--- a/src/odb.c
+++ b/src/odb.c
@@ -699,8 +699,8 @@ static void db_validate_sizes()
- assert(sizeof(DATABASE_CLIENT) == 2112);
- assert(sizeof(DATABASE_HEADER) == 135232);
+ assert(sizeof(DATABASE_CLIENT) == 8256);
+ assert(sizeof(DATABASE_HEADER) == 528448);
The calculation is as follows (in case you want a different number of open records):
DATABASE_CLIENT = 64 + 8*MAX_OPEN_RECORDS = 64 + 8*1024 = 8256
DATABASE_HEADER = 64 + 64*DATABASE_CLIENT = 64 + 64*8256 = 528448
5) Rebuild MIDAS
make clean; make
6) Create new ODB
odbedit -s 1000000
Change the size of the ODB to whatever you want.
7) reload your original ODB
load current_odb.odb
8) Rebuild your frontend against new MIDAS; then it should work and you should be able to
produce more open records.
8.5*) Actually, I had a weird error where I needed to remove my .SYSTEM.SHM file as well
when I first restarted my front-end. Not sure if that was some unrelated error, but I
mention it here for completeness.
This was a procedure based on something that originally was used for T2K (procedure by Renee
Poutissou). It is possible that not all steps are necessary and that there is a better way.
But this worked for me.
Also, any objections from other developers to tweaking the assert checks in odb.c so that
the values are calculated automatically and MIDAS only needs to be touched in one place to
modify the number of open records?
Let me know if it worked for you and I'll add these instructions to the Wiki.
Thomas
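The size arithmetic from step 4 can be written down as a small C sketch, which may help when choosing a different number of open records. The constants 64, 8 and 64 are taken from the calculation in this post, not from the MIDAS structure definitions, so they are only valid as long as that formula holds; the function names are hypothetical.

```c
#include <stddef.h>

/* Expected sizeof(DATABASE_CLIENT) according to the formula in this post:
 * 64 bytes of fixed fields plus 8 bytes per open record. */
size_t sketch_client_size(size_t max_open_records)
{
    return 64 + 8 * max_open_records;
}

/* Expected sizeof(DATABASE_HEADER) according to the formula in this post:
 * 64 bytes of fixed fields plus 64 client slots. */
size_t sketch_header_size(size_t max_open_records)
{
    return 64 + 64 * sketch_client_size(max_open_records);
}
```

For 256 open records this gives 2112 and 135232, and for 1024 it gives 8256 and 528448, matching the assert values in the diff above. The same expressions could presumably be used inside the assert checks themselves, so that only MAX_OPEN_RECORDS needs to be edited.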
> oOne additional comment. I was able to trace the setting of the error code DB_NO_MEMORY
> to a call to the db_add_open_record() by mserver that is initiated during the start-up
> of my frontend via an RPC call. I checked with a debug printout that I have indeed
> reached the number of MAX_OPEN_RECORDS
>
> > Hi Midas forum,
> >
> > I'm having a problem with odb hotlinks after increasing sub-directories in an
> > odb. I now get the error code DB_NO_MEMORY after some db_open_record() calls. I
> > tried
> >
> > 1) increasing the parameter DEFAULT_ODB_SIZE in midas.h and make clean, make
> > but got the same error
> >
> > 2) increasing the parameter MAX_OPEN_RECORDS in midas.h and make clean, make
> > but got fatal errors from odbedit and my midas FE and couldnt run anything
> >
> > 3) deleting my expts SHM files and starting odbedit with "odbedit -e SLAC -s
> > 0x1000000" to increse the odb size but got the same error?
> >
> > 4) I tried a different computer and got the same error code DB_NO_MEMORY
> >
> > Maybe I running into some system limit that restricts the humber of open records?
> > Or maybe I've not increased the correct midas parameter?
> >
> > Best ,Tim. |
25 Nov 2016, Thomas Lindner, Bug Report, problem with error code DB_NO_MEMORY from db_open_record() call when establish additional hotlinks
|
The procedure I wrote seemed to work for Tim too, so I added a page to the wiki about it here:
https://midas.triumf.ca/MidasWiki/index.php/FAQ
> Hi Tim,
>
> I reproduced your problem and then managed to go through a procedure to increase the number
> of allowable open records. The following is the procedure that I used
>
> 1) Use odbedit to save current ODB
>
> odbedit
> save current_odb.odb
>
> 2) Stop all the running MIDAS processes, including mlogger and mserver using the web
> interface. Then stop mhttpd as well.
>
>
> 3) Remove your old ODB (we will recreate it after modifying MIDAS, using the backup you just
> made).
>
> mv .ODB.SHM .ODB.SHM.20161104
> rm /dev/shm/thomas_ODB_SHM
>
> 4) Make the following modifications to midas. In this particular case I have increased the
> max number of open records from 256 to 1024. You would need to change the constants if you
> want to change to other values
>
> diff --git a/include/midas.h b/include/midas.h
> index 02b30dd..33be7be 100644
> --- a/include/midas.h
> +++ b/include/midas.h
> @@ -254,7 +254,7 @@ typedef std::vector<std::string> STRING_LIST;
> -#define MAX_OPEN_RECORDS 256 /**< number of open DB records */
> +#define MAX_OPEN_RECORDS 1024 /**< number of open DB records */
> diff --git a/src/odb.c b/src/odb.c
> index 47ace8f..ac1bef3 100755
> --- a/src/odb.c
> +++ b/src/odb.c
> @@ -699,8 +699,8 @@ static void db_validate_sizes()
> - assert(sizeof(DATABASE_CLIENT) == 2112);
> - assert(sizeof(DATABASE_HEADER) == 135232);
> + assert(sizeof(DATABASE_CLIENT) == 8256);
> + assert(sizeof(DATABASE_HEADER) == 528448);
>
> The calculation is as follows (in case you want a different number of open records):
> DATABASE_CLIENT = 64 + 8*MAX_OPEN_ERCORDS = 64 + 8*1024 = 8256
> DATABASE_HEADER = 64 + 64*DATABASE_CLIENT = 64 + 64*8256 = 528448
>
> 5) Rebuild MIDAS
>
> make clean; make
>
> 6) Create new ODB
>
> odbedit -s 1000000
>
> Change the size of the ODB to whatever you want.
>
> 7) reload your original ODB
>
> load current_odb.odb
>
> 8) Rebuild your frontend against new MIDAS; then it should work and you should be able to
> produce more open records.
>
> 8.5*) Actually, I had a weird error where I needed to remove my .SYSTEM.SHM file as well
> when I first restarted my front-end. Not sure if that was some unrelated error, but I
> mention it here for completeness.
>
> This was a procedure based on something that originally was used for T2K (procedure by Renee
> Poutissou). It is possible that not all steps are necessary and that there is a better way.
> But this worked for me.
>
> Also, any objections from other developers to tweaking the assert checks in odb.c so that
> the values are calculated automatically and MIDAS only needs to be touched in one place to
> modify the number of open records?
>
> Let me know if it worked for you and I'll add these instructions to the Wiki.
>
> Thomas
>
>
>
> > oOne additional comment. I was able to trace the setting of the error code DB_NO_MEMORY
> > to a call to the db_add_open_record() by mserver that is initiated during the start-up
> > of my frontend via an RPC call. I checked with a debug printout that I have indeed
> > reached the number of MAX_OPEN_RECORDS
> >
> > > Hi Midas forum,
> > >
> > > I'm having a problem with odb hotlinks after increasing sub-directories in an
> > > odb. I now get the error code DB_NO_MEMORY after some db_open_record() calls. I
> > > tried
> > >
> > > 1) increasing the parameter DEFAULT_ODB_SIZE in midas.h and make clean, make
> > > but got the same error
> > >
> > > 2) increasing the parameter MAX_OPEN_RECORDS in midas.h and make clean, make
> > > but got fatal errors from odbedit and my midas FE and couldnt run anything
> > >
> > > 3) deleting my expts SHM files and starting odbedit with "odbedit -e SLAC -s
> > > 0x1000000" to increse the odb size but got the same error?
> > >
> > > 4) I tried a different computer and got the same error code DB_NO_MEMORY
> > >
> > > Maybe I running into some system limit that restricts the humber of open records?
> > > Or maybe I've not increased the correct midas parameter?
> > >
> > > Best ,Tim. |
14 Oct 2016, Luka Pavelic, Forum, Wiener PCIVME link
|
Hello,
I'm trying to make Wiener PCIVME link work with MIDAS.
In the documentation, under "VME drivers", it says: "wevmemm.c PCI/VME Wiener board
supported. (see Wiener PCI)".
The provided link is dead. Does anyone have that file? I would appreciate it very
much if someone could send it to me.
Thank you and best regards,
L.P. |
14 Oct 2016, Konstantin Olchanski, Forum, Wiener PCIVME link
|
> Hello,
> I'm trying to make Wiener PCIVME link work with MIDAS.
> In documentation/VME dirvers/ it's saying: "wevmemm.c PCI/VME Wiener board
> supported. (see Wiener PCI)".
> Provided link is dead. Does anyone have that file? I would appreciate very very
> much if someone could send it to me.
>
> Thank you and best regards,
> L.P.
Hi, I am not familiar with this module, I am pretty sure I have never seen one.
I do not see any code for it in the midas distribution.
I do not see any reference to it on the wiener web site (http://www.wiener-d.com/)
For obsolete modules, they direct us to http://file.wiener-d.com/ which is dead.
The next best step is to contact Wiener customer support. They usually reply very quickly.
If you have no luck getting an answer directly from Wiener, you can ask me to contact them through
our sales representative. He is always very helpful.
K.O. |
14 Oct 2016, Pierre-Andre Amaudruz, Forum, Wiener PCIVME link
|
> > Hello,
> > I'm trying to make Wiener PCIVME link work with MIDAS.
> > In documentation/VME dirvers/ it's saying: "wevmemm.c PCI/VME Wiener board
> > supported. (see Wiener PCI)".
> > Provided link is dead. Does anyone have that file? I would appreciate very very
> > much if someone could send it to me.
> >
> > Thank you and best regards,
> > L.P.
>
> Hi, I am not familiar with this module, I am pretty sure I have never seen one.
> I do not see any code for it in the midas distribution.
> I do not see any reference to it on the wiener web site (http://www.wiener-d.com/)
>
> For obsolete modules, they direct us to http://file.wiener-d.com/ which is dead.
>
> The next best step is to contact Wiener customer support. They usually reply very quickly.
>
> If you have no luck getting answer directly from Wiener, you can ask me to contact them through
> our sales representative. He is always super very helpful.
>
> K.O.
Hi, I do recall that we had this interface a while ago.
I'll be meeting with Wiener during the weekend and will post my findings
later.
PAA |
13 Oct 2016, Konstantin Olchanski, Info, new odbinit utility
|
odbinit is a new utility program to initialize new ODB and to recover from corrupted ODB.
Right now, midas odb has some strange properties different from typical behavior of other
database packages:
a) a new odb of default size is automatically created by running *any* midas program (surprise: no
way to specify the size of odb).
b) the size of ODB is not saved anywhere. If your experiment requires an ODB of big size, one
always forgets to use "odbedit -s" when recovering from odb corruption, leading to massive
confusion: nothing works, odb is corrupted? (maybe not), recreate odb (of default size instead of
large size), reload odb, (reload fails, odb is too small), now really for sure nothing works. Been
there, done that myself 100 times. Tired.
c) there is no midas tool to automatically recover from odb corruption (or any generic ODB
malfunction, such as stuck ODB semaphore): shared memory has to be deleted, old .ODB.SHM
has to be deleted, old semaphore has to be deleted. Some of these steps are different on Linux
and MacOS (hello Apple, where is MacOS "ls -l /dev/shm"?!?).
The new odbinit tool corrects these problems:
1) ODB size is saved to .ODB_SIZE.TXT, then is used to recreate ODB after corruption recovery
2) "odbinit -s different_size_from_saved_size" will ask "are you sure?". No way to unintentionally
change size of ODB.
3) if you already have an ODB, it will insist that you say "odbinit --cleanup"
4) there is a "-n" mode, to report what will be done, but "do nothing"
5) "odbinit --cleanup" tries very hard to recover from any and all possible ODB problems.
6) old .ODB.SHM is never deleted, always renamed to .ODB.SHM.timestamp
7) if "odbinit" gets to "Done!", you have a working ODB, 100% guaranteed, for sure.
8) output of "odbinit" is very verbose for pasting into this forum here to make it possible to debug
your problem. (in the unlikely case odbinit fails).
Next step will be to remove the automatic creation of ODB (and event buffers) and require running
"odbinit" to create a new experiment. ("odbedit -s nnn" will be removed).
But not today, as all that requires changes to the midas internal APIs: ss_shm_open() needs to
return the size of connected shared memory, there needs to be ss_shm_create() and
db_create_database(), etc.
This will make ODB work more like a normal database: with a tool to create a new database and
a tool to recover from corruption/malfunction.
K.O. |
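To make points 1, 2 and 6 above more concrete, here is a small C sketch of the bookkeeping involved: remembering the ODB size in a text file, asking for confirmation when a different size is requested, and renaming the old file with a timestamp instead of deleting it. This is not the odbinit source; all function names, the one-number file format and the timestamp layout are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Build "<shm_file>.<UTC timestamp>" in out. Returns 0 on success,
 * -1 on bad input or if the buffer is too small. */
int sketch_backup_name(char *out, size_t out_size, const char *shm_file, time_t when)
{
    char stamp[32];
    struct tm *utc = gmtime(&when);
    if (shm_file == NULL || strlen(shm_file) == 0 || utc == NULL)
        return -1;
    if (strftime(stamp, sizeof(stamp), "%Y%m%d_%H%M%S", utc) == 0)
        return -1;
    int n = snprintf(out, out_size, "%s.%s", shm_file, stamp);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Save the ODB size as a single decimal number (assumed file format). */
int sketch_save_size(const char *path, long size_bytes)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    int ok = fprintf(fp, "%ld\n", size_bytes) > 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the saved size back; returns -1 if the file is missing or malformed. */
long sketch_load_size(const char *path)
{
    long size = -1;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &size) != 1)
        size = -1;
    fclose(fp);
    return size;
}

/* A size change needs an "are you sure?" only if a size was saved before
 * and the requested size differs from it. */
int sketch_needs_confirmation(long saved_size, long requested_size)
{
    return saved_size > 0 && saved_size != requested_size;
}
```

A recovery tool along these lines would call rename() with the generated backup name rather than remove(), so the old .ODB.SHM is always kept.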
|