26 Jul 2013, Konstantin Olchanski, Bug Report, abort on buffer overflow in odb.c::merge_records()
|
The odb.c function merge_records() has a fixed size array of 10000 bytes to handle the data and it
aborts with an assert() if passed data bigger than that. It is called from db_create_record() which
already allocates a data buffer of correct size for it's operations.
K.O. |
26 Jul 2013, Konstantin Olchanski, Bug Report, odbedit fixed size buffer overrun
|
odbedit uses a fixed size buffer for ODB data. If an array in ODB is bugger than this size,
db_get_data() correctly returns DB_TRUNCATED and there is no memory overwrite, but the following
code for printing the data does not know about this truncation and proceeds printing memory
values contained after the end of the fixed size data buffer - in the current case, this memory
somehow had the contents of the shell history - this very confused the MIDAS users as they though
that the funny printout was actually in ODB. This is in the print_key() function. If db_get_data()
returns DB_TRUNCATED, it should allocate a buffer of bigger size. This (and the previous) bug found
by the TIGRESS experiment. K.O. |
02 Aug 2013, Konstantin Olchanski, Bug Fix, multithreaded run transitions work!
|
As of commit
https://bitbucket.org/tmidas/midas/commits/dfa5fb1a93cae11a2960d441044c7fd277e1f0ec
(we are now liberated from the tyranny of SVN IDs),
multithreaded run transitions seem to work reliably and are now the default in mhttpd.
In odbedit and mtransition, the default is the old sequential transitions. "-m" and "-a" flags activate
the new multithread run transitions. mhttpd now uses the equivalent of "mtransition -a"
(mutithreaded asynchronous).
This is one of the new features implemented by Stefan while at TRIUMF.
K.O.
(We hope to write up all the recent changes soon). |
21 Aug 2013, Konstantin Olchanski, Info, Documentation for ODBGet() & co, Javascript and AJAX functions.
|
The bulk of the MIDAS AJAX and Javascript functions is now documented on the MIDAS Wiki:
https://midas.triumf.ca/MidasWiki/index.php/Mhttpd.js
https://midas.triumf.ca/MidasWiki/index.php/AJAX
Enjoy,
K.O. |
22 Aug 2013, Konstantin Olchanski, Info, Documentation for ODBGet() & co, Javascript and AJAX functions.
|
> The bulk of the MIDAS AJAX and Javascript functions is now documented on the MIDAS Wiki:
>
> https://midas.triumf.ca/MidasWiki/index.php/Mhttpd.js
> https://midas.triumf.ca/MidasWiki/index.php/AJAX
>
The documentation was updated again.
All functions and AJAX methods except jset, jget, jrpc (ODBGet, ODBSet, ODBRpc) and inline edit are now fully
documented. AJAX methods jset/jget and their javascript wrappers ODBSet/ODBGet/ODBMGet/ODBGetRecord() are
partially documented. Inline edit will have to be documented by Stefan.
When using these functions please read the "BUG" sections carefully.
When using the Javascript functions (ODBGet, ODBSet, ODBMCopy, etc) please pay special attention to the rules for URI-
encoding function arguments - some functions require that some arguments be pre-encoded by
"encodeURIComponent()", some functions require some arguments to NOT be pre-encoded. The examples in
examples/javascript1/example.html are mostly correct.
Special confusion is created by special handling in mhttpd of URI-encoding of parameters named "format".
Special confusion is created by ODBSet(path, value), where "path" should be pre-encoded, while "value" is now encoded
internally, which is a recent change introduced with the inline edit function. Older versions of mhttpd.js still require that
"value" be URI-encoded.
Going forward, I hope to resolve most of the confusion by providing a cleaner interface for reading ODB
- ODBMCopy() already looks good with full async JSON/JSONP support (already implemented)
- ODBMKey() to read just the keys, with async JSON/JSONP support (to be added)
- ODBMCreate() to create ODB keys (RPC for db_create()) (to be added)
- ODBMSet() to write into ODB. (to be added)
K.O. |
26 Aug 2013, Konstantin Olchanski, Bug Fix, Enable cross-site requests in mhttpd
|
Javascript "AJAX" functions (and their MIDAS wrappers - ODBGet/ODBSet) are subject to something called
"same origin policy" intended to prevent something called "cross-site scripting attacks", i.e. see
http://en.wikipedia.org/wiki/Same-origin_policy
In practice it means that if you load the MIDAS custom web page from test.foo.com and try to access
mhttpd at midas.foo.com, ODBSet/ODBGet will not work.
I always thought that this meant that the requests are blocked by the browser and are a form of
protection of the web server - only scripts loaded from mhttpd can do AJAX (ODBGet/ODBSet) to mhttpd.
It turns out that I was wrong. This is what actually happens: the "cross-site" requests are still sent to the
server (mhttpd), the response it received, parsed and discarded if "same origin" conditions are not met.
This means that the "same origin" policy does not protect mhttpd at all - any script from any page
anywhere can issue AJAX requests into any mhttpd, these requests will be successfully sent, received
and processed by mhttpd, including requests for writing into ODB ("jset" command using the HTTP GET
method).
So for the case of MIDAS, "same origin" does not prevent malicious (or buggy) scripts from writing into the
wrong mhttpd of the wrong experiment.
All it does is prevent desired and intentional access to mhttpd (ODBGet) from scripts that happen to have
been loaded outside of mhttpd (i.e. from a developer own test page).
Then it turns out that there is an "official" way to disable this unwanted protection policy, called CORS, see
http://www.w3.org/TR/cors/
I have now implemented this in mhttpd and added an mhttpd.js function ODBSetURL() to explicitly set the
URL of mhttpd that we want to talk to.
This work is on the feature/ajax branch, to be merged soon. For the impatient, here is what you need to
do in mhttpd:
diff --git a/src/mhttpd.cxx b/src/mhttpd.cxx
index 1d9d1cc..0460cec 100755
--- a/src/mhttpd.cxx
+++ b/src/mhttpd.cxx
@@ -1070,6 +1070,7 @@ void show_text_header()
{
rsprintf("HTTP/1.0 200 Document follows\r\n");
rsprintf("Server: MIDAS HTTP %d\r\n", mhttpd_revision());
+ rsprintf("Access-Control-Allow-Origin: *\r\n");
rsprintf("Pragma: no-cache\r\n");
rsprintf("Expires: Fri, 01 Jan 1983 00:00:00 GMT\r\n");
rsprintf("Content-Type: text/plain; charset=iso-8859-1\r\n\r\n");
K.O. |
13 Sep 2013, Konstantin Olchanski, Forum, MIDAS CITATION
|
>
> I have been using your software in my lab (APC, Paris)
> to run our data acqusition system. It is very robust and flexible.s
>
> I would like to give you the large amount of credit which you are due.
> How should I cite both MIDAS and ROODY? I have not been able to find any
> information in the usual places.
>
Good to hear from a happy user.
I think the best way to give us credit is to recommend MIDAS to 10 of your friends.
For MIDAS citations, I think Pierre and Stefan have a standard one somewhere, we should have it linked from
midas.triumf.ca.
For ROODY citations, I am not sure we have one. The idea behind it was to make a ROOT version of PAW++.
Main authors are Greg King (UBC COOP student, where is he now?), Joe Chuma (also author of PHYSICA,
R.I.P.), Pierre Amaudruz, myself and a few others.
K.O. |
13 Sep 2013, Konstantin Olchanski, Bug Report, mhttpd truncates string variables to 32 characters
|
I can confirm part of the problem - the new inline-edit function - after you finish editing - shows you what you
have typed, not what's actually in ODB - at the very end it should do an ODBGet() to load the actual ODB
contents and show *that* to the user.
The truncation to 32 characters - most likely it is a failure to resize the ODB string - is probably in mhttpd and
I can take a quick look into it.
There is a 3rd problem - the mhttpd ODB editor "create" function does not ask for the string length to create.
Actually, in ODB, "create" and "set string size" are two separate functions - db_create_key(TID_STRING) creates
a string of length zero, then db_set_data() creates an empty string of desired length.
In the new AJAX interface these two functions are separate (ODBCreate just calls db_create_key()).
In the present ODBSet() function the two are mixed together - and the ODB inline edit function uses ODBSet().
K.O.
> I find that new mhttpd has strange behaviour for ODB strings.
>
> - I create a new STRING variable in ODB through mhttpd. It defaults to size 32.
>
> - I then edit the STRING variable through mhttpd, writing a new string larger
> than 32 characters.
>
> - Initially everything looks fine; it seems as if the new string value has been
> accepted.
>
> - But if you reload the page, or navigate back to the page, you realize that
> mhttpd has silently truncated the string back to 32 characters.
>
> You can reproduce this problem on a test page here:
>
> http://midptf01.triumf.ca:8081/AnnMessage
>
> Older versions of mhttpd (I'm testing one from 2 years ago) don't have this
> 'feature'. For older mhttpd the string variable would get resized when a larger
> string was inputted. That definitely seems like the right behavior to me.
>
> I am using fresh copy of midas from bitbucket as of this morning. (How do I get
> a particular tag/hash of the version of midas that I am using?) |
14 Sep 2013, Konstantin Olchanski, Info, mktime() and daylight savings time
|
I would like to share with you a silly problem with mktime() and daylight savings time (Summer
time/Winter time) that I have run into while working on the mhttpd history query page.
I am implementing 1 hour granularity for the queries (was 1 day granularity) and somehow all my queries
were off by 1 hour.
It turns out that the mktime() and localtime() functions for converting between time_t and normal time
units (days, hours) are not exact inverses of each other.
Daylight savings time (DST) is to blame.
While localtime() always applies the current DST, mktime() will return the wrong answer unless tm_isdst is
set correctly.
For tm_isdst, the default value 0 is wrong 50% of the time in most locations as it means "DST off" (whether
that's Summer time or Winter time depends on your location).
Today in Vancouver, BC, DST is in effect, and localtime(mktime()) is off by 1 hour.
If I were doing this in January, I would not see this problem.
"man mktime" talks about "tm_isdst" special value "-1" that is supposed to fix this. But the wording of
"man mktime" on Linux and on MacOS is different (I am amused by the talk about "attempting to divine
the DST setting"). Wording at http://pubs.opengroup.org/onlinepubs/007908799/xsh/mktime.html is
different again. MS Windows (Visual Studio) documentation says different things for different versions.
So for mhttpd I use the following code. First mktime() gets the approximate time, a call to localtime()
returns the DST setting in effect for that date, a second mktime() with the correct DST setting returns the
correct time. (By "correct" I mean that localtime(mktime(t)) == t).
time_t mktime_with_dst(const struct tm* ptms)
{
// this silly stuff is required to correctly handle daylight savings time (Summer time/Winter time)
// when we fill "struct tm" from user input, we cannot know if daylight savings time is in effect
// and we do not know how to initialize the value of tms->tm_isdst.
// This can cause the output of mktime() to be off by one hour.
// (Rules for daylight savings time are set by national and local govt and in some locations, changes
yearly)
// (There are no locations with 2 hour or half-hour daylight savings that I know of)
// (Yes, "man mktime" talks about using "tms->tm_isdst = -1")
//
// We assume the user is using local time and we convert in two steps:
//
// first we convert "struct tm" to "time_t" using mktime() with unknown tm_isdst
// second we convert "time_t" back to "struct tm" using localtime_r()
// this fills "tm_isdst" with correct value from the system time zone database
// then we reset all the time fields (except for sub-minute fields not affected by daylight savings)
// and call mktime() again, now with the correct value of "tm_isdst".
// K.O. 2013-09-14
struct tm tms = *ptms;
struct tm tms2;
time_t t1 = mktime(&tms);
localtime_r(&t1, &tms2);
tms2.tm_year = ptms->tm_year;
tms2.tm_mon = ptms->tm_mon;
tms2.tm_mday = ptms->tm_mday;
tms2.tm_hour = ptms->tm_hour;
tms2.tm_min = ptms->tm_min;
time_t t2 = mktime(&tms2);
//printf("t1 %.0f, t2 %.0f, diff %d\n", (double)t1, (double)t2, (int)(t1-t2));
return t2;
}
K.O. |
18 Sep 2013, Konstantin Olchanski, Bug Report, mhttpd truncates string variables to 32 characters
|
I confirm the second part of the problem.
Inline edit uses ODBSet(), which uses the "jset" AJAX call to mhttpd which does not extend string variables.
This is the jset code. The best I can tell it truncates string variables to the existing size in ODB:
db_find_key(hDB, 0, str, &hkey)
db_get_key(hDB, hkey, &key);
memset(data, 0, sizeof(data));
size = sizeof(data);
db_sscanf(getparam("value"), data, &size, 0, key.type);
db_set_data_index(hDB, hkey, data, key.item_size, index, key.type);
These original jset/jget functions are a little bit too complicated and there is no documentation (what exists is done by me trying to read the existing code).
We now have a jcopy/ODBMCopy() as a sane replacement for jget, but nothing comparable for jset, yet.
I think this quirk of inline edit cannot be fixed in javascript - the mhttpd code for "jset" has to change.
K.O.
>
> I can confirm part of the problem - the new inline-edit function - after you finish editing - shows you what you
> have typed, not what's actually in ODB - at the very end it should do an ODBGet() to load the actual ODB
> contents and show *that* to the user.
>
> The truncation to 32 characters - most likely it is a failure to resize the ODB string - is probably in mhttpd and
> I can take a quick look into it.
>
> There is a 3rd problem - the mhttpd ODB editor "create" function does not ask for the string length to create.
>
> Actually, in ODB, "create" and "set string size" are two separate functions - db_create_key(TID_STRING) creates
> a string of length zero, then db_set_data() creates an empty string of desired length.
>
> In the new AJAX interface these two functions are separate (ODBCreate just calls db_create_key()).
>
> In the present ODBSet() function the two are mixed together - and the ODB inline edit function uses ODBSet().
>
> K.O.
>
>
>
> > I find that new mhttpd has strange behaviour for ODB strings.
> >
> > - I create a new STRING variable in ODB through mhttpd. It defaults to size 32.
> >
> > - I then edit the STRING variable through mhttpd, writing a new string larger
> > than 32 characters.
> >
> > - Initially everything looks fine; it seems as if the new string value has been
> > accepted.
> >
> > - But if you reload the page, or navigate back to the page, you realize that
> > mhttpd has silently truncated the string back to 32 characters.
> >
> > You can reproduce this problem on a test page here:
> >
> > http://midptf01.triumf.ca:8081/AnnMessage
> >
> > Older versions of mhttpd (I'm testing one from 2 years ago) don't have this
> > 'feature'. For older mhttpd the string variable would get resized when a larger
> > string was inputted. That definitely seems like the right behavior to me.
> >
> > I am using fresh copy of midas from bitbucket as of this morning. (How do I get
> > a particular tag/hash of the version of midas that I am using?) |
24 Sep 2013, Konstantin Olchanski, Info, mktime() and daylight savings time
|
> I vaguely remember that I had a similar problem with ELOG. The solution was to call tzset() at the beginning of the program. The man page says that
> this function is called automatically by programs using time zones, but apparently it is not. Can you try that? There is also the TZ environment
> variable and /etc/localtime. I never understood the details, but playing with these things can influence mktime() and localtime().
I confirm that the timezone is set correctly - I do get the correct time eventually - so there is no missing call to tzet().
K.O. |
25 Sep 2013, Konstantin Olchanski, Info, Documentation for ODBGet() & co, Javascript and AJAX functions.
|
> > The bulk of the MIDAS AJAX and Javascript functions is now documented on the MIDAS Wiki:
> >
> > https://midas.triumf.ca/MidasWiki/index.php/Mhttpd.js
> > https://midas.triumf.ca/MidasWiki/index.php/AJAX
> >
>
> The documentation was updated again.
>
Newly documented are the additional Javascript and AJAX functions present in the GIT branch "feature/ajax":
ODBMCreate(paths, types);
ODBMCreate(paths, types, arraylengths, stringlengths, callback);
ODBMResize(paths, arraylengths, stringlengths, callback);
ODBMRename(paths, names, callback);
ODBMLink(paths, links, callback);
ODBMReorder(paths, indices, callback);
ODBMKey(paths, callback);
ODBMDelete(paths, callback);
All these functions permit asynchronous use (with callback on completion) and the underlying AJAX functions permit JSON-P encoding.
ODBSetUrl("http://mhttpd.somewhere.com:8080") : this new function removes the restriction that custom scripts had to be loaded from the same mhttpd that they will
access. Together with the newly added CORS support in mhttpd, allows loading custom scripts from any web server, including local file, and having then access any one (or
any several) mhttpd data sources.
I think these new functions are now stable (I still had to make some changes to ODBMCreate() recently) and after some more testing this branch will be merged into
"develop".
To use this branch, do either:
a) git clone midas; git pull; git checkout feature/ajax
b) git clone midas; git checkout develop; git pull; git checkout -b ajaxtest; git merge feature/ajax;
(Option (b) creates a local branch with the latest "develop" and "feature/ajax" merged together).
K.O. |
27 Sep 2013, Konstantin Olchanski, Info, ODB JSON support
|
> odbedit can now save ODB in JSON-formatted files.
>
> JSON encoding implementation follows specifications at:
> http://json.org/
>
> The result passes validation by:
> http://jsonlint.com/
>
A bug was reported in my JSON ODB encoder: NaN values are not encoded correctly. A quick review found this:
1) the authors of JSON smoked some bad mushrooms and specifically disallowed NaN and Inf values for floating point numbers:
http://tools.ietf.org/html/rfc4627
2) most JSON encoders and decoders do reasonable and unreasonable things with NaN and Inf values. The worst ones encode them as zero. More bad
mushrooms.
There is a quick survey at: http://lavag.org/topic/16217-cr-json-labview/?p=99058
<pre>
Some Javascript engines allow it since it is valid Javascript but not valid Json however there is no concensus.
cmj-JSON4Lua: raw tostring() output (invalid JSON).
dkjson: 'null' (like in the original JSON-implementation).
Fleece: NaN is 0.0000, freezes on +/-Inf.
jf-JSON: NaN is 'null', Inf is 1e+9999 (the encode_pretty function still outputs raw tostring()).
Lua-Yajl: NaN is -0, Inf is 1e+666.
mp-CJSON: raises invalid JSON error by default, but runtime configurable ('null' or Nan/Inf).
nm-luajsonlib: 'null' (like in the original JSON-implementation).
sb-Json: raw tostring() output (invalid JSON).
th-LuaJSON: JavaScript? constants: NaN is 'NaN', Inf is 'Infinity' (this is valid JavaScript?, but invalid JSON).
</pre>
For the MIDAS JSON encoder (and decoder) I have several choices:
a) encode NaN and Inf using the printf("%f") encoding (as strings, making it valid JSON)
b) encode NaN and Inf as strings using the Javascript special values: "NaN", "Infinity" and "-Infinity", see
http://www.w3schools.com/jsref/jsref_positive_infinity.asp
I note that the Python JSON encoder does (b), see section 18.2.3.3 at http://docs.python.org/2/library/json.html
In either case, behaviour of the JSON decoder on the Javascript side needs to be tested. (Silent conversion to value of zero is not acceptable).
If anybody has an suggestion on this, please let me know.
P.S. If you do not know all about NaN, Inf, "-0" and other floating point funnies, please read: https://www.ualberta.ca/~kbeach/phys420_580_2010/docs/ACM-Goldberg.pdf
P.P.S. If you ever used the type "float" or "double", used the "/" operator or the function "sqrt()" you also should read that reference.
K.O. |
01 Oct 2013, Konstantin Olchanski, Info, MacOS select() problem
|
The following code found in mhttpd does not work on MacOS (BSD UNIX).
On Linux, the do-loop will finish after 2 seconds as expected. On MacOS (and other BSD systems), it will
loop forever.
The cause is the MIDAS watchdog alarm() signal that fires every 1 second and always interrupts the 2
second sleep of select(). The Linux select() updates it's timeout argument to reflect time already slept, so
eventually we finish. The MacOS (BSD) select() does not update the timeout argument and select goes back
to sleep for another 2 seconds (to be again interrupted half-way through).
The POSIX standard (specification for select() & co) permits either behaviour. Compare "man select" on
MacOS and on Linux.
If the select() timeout were not 2 seconds, but 0.9 seconds; or if the MIDAS watchdog alarm fired every
2.1 seconds, this problem would also not exist.
I think there are several places in MIDAS with code like this. An audit is required.
{
FD_ZERO(&readfds);
FD_SET(_sock, &readfds);
timeout.tv_sec = 2;
timeout.tv_usec = 0;
do {
status = select(FD_SETSIZE, &readfds, NULL, NULL, &timeout);
/* if an alarm signal was cought, restart with reduced timeout */
} while (status == -1 && errno == EINTR);
}
K.O. |
09 Oct 2013, Konstantin Olchanski, Info, ODB JSON support
|
> > odbedit can now save ODB in JSON-formatted files.
> A bug was reported in my JSON ODB encoder: NaN values are not encoded correctly.
Tested the browser-builtin JSON.stringify() function in google-chrome, firefox, safari, opera:
everybody encodes numeric values NaN and Inf as JSON value [null].
To me, this clearly demonstrates a severe defect in the JSON standard and in it's Javascript implementation:
a) NaN, Inf and -Inf are valid, useful and commonly used numeric values defined by the IEEE754/854 standard (as opposed to the special value "-0", which is also defined by the standard, but is not nearly as useful)
b) they are all distinct numeric values, encoding them all into the same JSON value [null] is the same as encoding all even numbers into the JSON value [42].
c) on the decoding end, JSON value [null] is decoded into Javascript value [null], which works as 0 for numeric computation, so effectively NaN, Inf and -Inf are made equal to zero. A neat trick.
Note that (c) - NaN, Inf is same as 0 - eventually produces incorrect numerical results by breaking the IEEE754/854 standard specification that number+NaN->NaN, number+infinity->infinity, etc.
In MIDAS we have a requirement that results be numerically correct: if an ODB value is "infinity", the corresponding web page should not show "0".
In addition we have a requirement that JSON encoding should be lossess: i.e. ODB contents encoded by JSON should decode back into the same ODB contents.
To satisfy both requirements, I now encode NaN, Inf and -Inf as JSON string values "NaN", "Infinity" and "-Infinity". (Corresponding to the respective Javascript values).
Notes:
1) this is valid JSON
2) it survives decode/encode in the browser (ODBMCopy()/JSON.parse/modify some values/JSON.stringify/ODBMPaste() does not destroy these special values)
3) it is numerically correct for "NaN" values (Javascript [1+"NaN"] -> NaN)
4) it fails in an obvious way for Inf and -Inf values (Javascript [1+"Infinity"] is NaN instead of Infinity).
https://bitbucket.org/tmidas/midas/commits/82dd203cc95dacb6ec9c0a24bc97ffd45bb58427
K.O. |
22 Oct 2013, Konstantin Olchanski, Info, midas programs "auto start", etc
|
MIDAS "programs" settings include: /programs/xxx/"auto start", "auto restart" and "auto stop". What do
they do?
"auto start":
if set to "y", the program's "start command" will be unconditionally executed at the beginning of the run
start transition.
Because there are no checks or tests, the "start command" will be executed even if the program is already
running. It means that this function cannot be used to start frontend programs - a new copy will be
started each time, and a previously running copy will be killed.
Also the timing of the program startup and run transition is wrong - in my tests, the program starts too
late to see the run transition. If the program is a frontend, it will never see the begin-of-run transition.
1st conclusion: "auto start" should be "n" for frontend programs and for any other programs that are
supposed to be continuously running (mlogger, lazylogger, etc).
2nd conclusion: "auto start" does the same thing as "/programs/execute on start run".
"auto stop":
if set to "y", the program will be stopped after the end of run. (using cm_shutdown).
"auto restart":
this has nothing to do with starting and stopping runs. Instead, it works in conjunction with the alarm
system and the "program is not running" alarm.
The alarm system periodically calls al_check(). al_check() checks all programs defined under /Programs to
see if they are running (using cm_exist()). If a program is not running and an alarm is defined, the alarm is
raised ("program is not running" alarm). If there is a start command and "auto restart" is set to "y", the
start command is executed.
When using these "auto start" and "auto restart" functions, one needs to be careful about the context
where the start command will be executed: midas clients may be running from different directories, under
different user names and on different computers.
In "auto start", the start command is executed from cm_transition. For remote clients, this will happen on
the remote computer. (against the expectation that the program will be started on the main computer).
In "auto restart", the start command is executed by al_check() which always runs locally (for remote
clients, it runs inside the mserver). So the started program will always run on the main computer, but
maybe not in the same directory as when started from the mhttpd "programs -> start" button.
Conclusion:
"programs auto start" : works but has strange interactions and side effects, do not use it.
"programs auto stop" : works, can be used to stop programs at the end of run (but what for?)
"programs auto restart" : works, seems to work correctly, can be used to auto restart mlogger, frontends,
etc.
K.O. |
22 Oct 2013, Konstantin Olchanski, Info, audit of db_get_record()
|
Record-oriented ODB functions db_create_record(), db_get_record(), db_check_record() and
db_set_record() require special attention to the consistency between their "C struct"s (usually defined in
midas.h), their initialization strings (usually defined in midas.h) and the contents of ODB.
When these 3 items become inconsistent, the corresponding midas functions tend to break.
Unlike ODB internal structures and event buffer internal structures, these record-oriented functions are
not part of the midas binary-compatibility abi and they are not protected by db_validate_sizes().
From time to time, new items are added to some of these data structures. Usually this does not cause
problems, but recently we had some difficulty with the runinfo and equipment structures, prompting this
audit.
db_check_record: note: (C) means that this record is created there
alarm.c: alarm_odb_str(C)
mana.c: skipped
mfe.c: equipment_common_str, equipment_statistics_str(C), event_descrip(C), bank_list(C)
mhttpd.cxx: cgif_label_str(C), cgif_bar_str(C), runinfo_str(C), equipment_common_str(C)
mlogger.cxx: ch_settings_str(C)
sequencer.cxx: sequencer_str(C)
db_create_record:
alarm.c: alarm_odb_str, alarm_periodic_str, alarm_class_str
fal.c: skipped
mfe.c: equipment_common_str
midas.c: program_info_str (maybe)
odb.c: (maybe)
lazylogger.cxx: lazy_settings, lazy_statistics
mhttpd.cxx: runinfo_str
mlogger.cxx: chn_settings_str
db_get_record: (hard to do with grep, will have to check every db_get_record by hand)
alarm.c: alarm, class, program_info
fal.c: skipped
mana.c: skipped
midas.c: program_info
odb.c: (maybe)
lazylogger.cxx: lazyst
mhttpd.cxx: runinfo, equipment, ?hkeytemp?, chn_settings, chn_stats, ?label?, ?bar?
mlogger.cxx: ?, ?, chn_stats, chn, settings
sequencer.cxx: hkeyseq
db_set_record:
alarm.c: hkeyalarm, hkeyclass, ???, program_info
fal.c: skipped
mana.c: skipped
mfe.c: equipment_info, ?event structure?
odb.c: (maybe)
lazelogger.cxx: lazyst
mlogger.cxx: chn_stat
sequencer.cxx: seq
db_open_record: note: (W) means MODE_WRITE
fal.c: skipped
mana.c: skipped
mfe.c: equipment_info, equipment_stats(W)
midas.c: requested_transition
odbedit.c: key_update - generic test of hotlink
lazylogger.cxx: runstate, lazyst(W), lazy?
mlogger.cxx: history, chn_statistics, chn_settings
sequencer.cxx: seq
K.O. |
25 Oct 2013, Konstantin Olchanski, Bug Fix, fixed mlogger run auto restart bug
|
A problem existed in midas for some time: when recording long data sets of time (or event) limited runs
with logger run auto restart set to "yes", the runs will automatically stop and restart as expected, but
sometimes the run will stop and never restart and beam will be lost until the experiment operator on shift
wakes up and restarts the run manually.
I have now traced this problem to a race condition inside the mlogger - when a run is being stopped from
the mlogger, the mlogger run transition handler (tr_stop) triggers an immediate attempt to start the next
run, without waiting for the run-stop transition to actually complete. If the run-stop transition does not
finish quickly enough, a safety check in start_the_run() will cause the run restart attempt to silently fail
without any error message.
This race condition is pretty rare but somehow I managed to replicate it while debugging the
multithreaded transitions. It is fixed by making mlogger wait until the run-stop transition completes.
https://bitbucket.org/tmidas/midas/commits/b2631fbed5f7b1ec80e8a6c8781ada0baed7702b
K.O. |
25 Oct 2013, Konstantin Olchanski, Info, MacOS select() problem
|
> The following code found in mhttpd does not work on MacOS (BSD UNIX). ...
Because of this problem, on MacOS, run transitions can get stuck forever - most timeouts do not work. (Specifically, recv_string() never times out)
K.O. |
28 Oct 2013, Konstantin Olchanski, Bug Fix, fixed mlogger run auto restart bug
|
>
> More generally I kind of consider the mlogger auto restart facility as deprecated. It works in the background and the operator does not have a clue
> what is going on. We use now the sequencer to achieve exactly the same functionality.
>
Before subruns were available, most experiments at TRIUMF have used the "auto restart" function. Now, I think most of them use subruns,
with the notable exception of PIENU where the analysis framework could not handle subruns. (PIENU is now shutdown and disassembled).
>
> It just requires a few lines of sequencer code:
>
> LOOP INFINITE
> TRANSITION start
> WAIT events, 5000
> TRANSITION stop
> ENDLOOP
>
Mouse click "auto restart" to "yes" is a little bit simpler than setting up a sequencer file, and it survives a crash of mhttpd.
Does the sequencer survive a crash or a restart of mhttpd?
K.O. |
|