09 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline
|
Hello Stefan,
tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should
presumably not be the case, since CloseRootOutputFile() frees the trees at eor().
------------------- output ---------------------------
lamb@lamb2:~/midas/root_3705> ./analyzer -e
exa_root -i /tmp/midas/examples/root/run%05d.mid -o /tmp/midas/run%05d.root -r 1 2
Root server listening on port 9090...
Running analyzer offline. Stop with "!"
book_ttree: tree_struct.n_tree = 1
book_ttree: tree_struct.n_tree = 2
Set run number 1 in ODB
Load ODB from run 1...OK
/tmp/midas/examples/root/run00001.mid:2722 /tmp/midas/run00001.root:2720 events,
0.21s
book_ttree: tree_struct.n_tree = 3 <<---- !!!!
book_ttree: tree_struct.n_tree = 4
Set run number 2 in ODB
Load ODB from run 2...OK
/tmp/midas/examples/root/run00002.mid:2347 /tmp/midas/run00002.root:2345 events,
0.18s
*** Break *** segmentation violation
----------------- \output ----------------------------
Adding this one line fixes the segfault problem for the ROOT example experiment.
----------------- code -------------------------
lamb@lamb2:/data/software/midas/midas_3705/src/src> svn diff mana.c
Index: mana.c
===================================================================
--- mana.c (revision 3705)
+++ mana.c (working copy)
@@ -1496,6 +1496,7 @@
/* delete event tree */
free(tree_struct.event_tree);
tree_struct.event_tree = NULL;
+ tree_struct.n_tree = 0;
// go to ROOT root directory
gROOT->cd();
---------------- \code ---------------------------
Please check if this gives the intended behaviour. I am not very familiar with the
midas internals.
Unfortunately my own analyzer's segfault problem is not solved by this patch. I
guess I have to keep searching for a bug on my side..... :-)
Cheers,
Randolf |
10 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline
|
> tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should
> presumably not be the case, since CloseRootOutputFile() frees the trees at eor().
Yes, this is indeed a bug. I applied your change and committed the new code. |
11 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline
|
Hello again,
just for the record, in case somebody else runs into the same problem...
I have hunted down "my" segfault problem to the fact that I book histograms not
in <module>_init, but in <module>_bor. I have to do so, because only in bor do I
know which histograms to book, as this information comes from the ODB (booking
only histograms for CAMAC modules which were set to "read" in the ODB). The core
dump happens on the first access (->Fill, ->SetName,...) of one of these histos
in the 2nd run analyzed offline ("./analyzer -r n m").
In mana.c:bor (line 1854) it is stated that "all ROOT objects created by user module
bor() functions go to the output file", and the code then does a gManaOutputFile->cd();
Consequently, the histograms vanish after the file is closed, therefore the
segfault when trying to access them in the 2nd run. (I keep track of existing
histograms, only booking the missing histos in bor.)
The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with
TFolders and booking the histogram.
I do, however, not really understand the intention why histos booked in bor() go
only to the file, whereas histos booked in init() go to memory. Could you please
comment briefly? Maybe I missed the most important point. And what about online
mode, should this work?
Thanks a lot in advance,
Randolf |
11 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline
|
> I have hunted down "my" segfault problem to the fact that I book histograms not
> in <module>_init, but in <module>_bor. I have to do so, because only in bor do I
> know which histograms to book, as this information comes from the ODB (booking
> only histograms for CAMAC modules which were set to "read" in the ODB). The core
> dump happens on the first access (->Fill, ->SetName,...) of one of these histos
> in the 2nd run analyzed offline ("./analyzer -r n m").
>
> In mana.c:bor (line 1854) it is stated that "all ROOT objects created by user module
> bor() functions go to the output file", and the code then does a gManaOutputFile->cd();
> Consequently, the histograms vanish after the file is closed, therefore the
> segfault when trying to access them in the 2nd run. (I keep track of existing
> histograms, only booking the missing histos in bor.)
>
> The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with
> TFolders and booking the histogram.
ROOT has the strange concept of a "current working directory", coming from the fact that
ROOT was written by Fortran and PAW people, who were used to having directories and
subdirectories with a persistent state (not really object-oriented style). So one can
set the "current working directory" to the root (= memory) with gROOT->cd() and to a
subdirectory which will later be written into a file with gManaOutputFile->cd(). If
you do the former, the histograms are created only in memory, while in the latter
case they are also created in memory, but will later be written into the output file
in the routine CloseRootOutputFile(). So if you do a gROOT->cd() in <module>_bor,
these histograms will not be written to file. So I guess your solution is not a real
solution.
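To illustrate the two booking modes, here is a minimal sketch of the plain ROOT behaviour described above (gManaOutputFile is the output file opened by mana.c; the histogram names are made up):
#include "TROOT.h"
#include "TFile.h"
#include "TH1D.h"
extern TFile *gManaOutputFile;   // global opened by mana.c at begin-of-run
void book_examples()
{
   // Booked under gROOT: lives in memory only and survives the file close,
   // but CloseRootOutputFile() will NOT write it to the .root file.
   gROOT->cd();
   TH1D *h_mem = new TH1D("h_mem", "memory-only histogram", 100, 0., 100.);
   // Booked under the output file: written out by CloseRootOutputFile(),
   // but owned by the file, so it vanishes when the file is closed.
   gManaOutputFile->cd();
   TH1D *h_file = new TH1D("h_file", "histogram written to the run file", 100, 0., 100.);
   (void)h_mem; (void)h_file;    // booking only - normally filled in the event loop
}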
> I do, however, not really understand the intention why histos booked in bor() go
> only to the file, whereas histos booked in init() go to memory. Could you please
> comment briefly? Maybe I missed the most important point. And what about online
> mode, should this work?
The root output file is opened in bor() and closed in eor(). For a histo to go to the
file, it must be booked after opening the file, that is after bor() in mana.c and
therefore after the gManaOutputFile->cd().
I agree with you that the current scheme is not satisfactory. When running online, you
want to keep the histos between the runs. When running offline, you delete and
re-create them for each run. It would be better to create all histos online and
offline under gROOT, and just copy them to gManaOutputFile before writing them. I have
to admit that this root code was never really used in a productive environment for
offline analysis, so there might be some issues here and there. Some people write
directly root files in the logger, and then do a root-only (without the midas
analyzer) analysis. Unfortunately I'm busy these days and cannot write any code right
now. But if you feel like something should be modified in mana.c, please send it to me
and I can incorporate it into the standard code. |
12 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline
|
Hi
> So I guess your solution is not a real solution.
I was not precise enough about what I do. This way the histograms persist in memory, but
they are also written to every file:
e.g. in module "trig_tdc":
TDirectory *savedir = gDirectory;   // will restore this afterwards
gROOT->cd();                        // go to memory (the ROOT root directory)
// make sure we are in the right "analyzer module folder"
TDC_Folder = (TFolder *) gROOT->FindObjectAny("trig_tdc");
gHistoFolderStack->Add((TObject *) TDC_Folder);
...(loop over all TDCs, figure out which histos exist, and which need to be booked)
open_subfolder("raw4208");
hrTDC = h1_book(....);   // create histo in memory, but it shows up in the file, too
close_subfolder();       // raw4208
// restore gHistoFolderStack (we added a folder when entering this routine)
gHistoFolderStack->Remove(gHistoFolderStack->Last());
// restore current directory
savedir->cd();
When deleting histos I do:
gManaHistosFolder->RecursiveRemove(*pHisto);
(*pHisto)->Delete();
(*pHisto) = NULL; // for my book-keeping of existing histos.
You don't have to clear the histos explicitly between runs. gManaHistosFolder does this
magic for you.
> But if you feel like something should be modified in mana.c, please send it to me
> and I can incorporate it into the standard code.
No, the code is fine. I just wanted to explain my problem and a solution to it, because
I thought that somebody might run into the same problem, too.
Ciao,
Randolf |
12 Jun 2010, hai qu, Forum, crash on start run
|
Dear experts,
I use Fedora 12 and MIDAS revision 4680. There is a problem starting a run, even though
the frontend application runs fine.
# odbedit -c start
Starting run #18
[midas.c:8423:rpc_client_connect,ERROR] timeout on receive remote computer info:
[midas.c:3659:cm_transition,ERROR] cannot connect to client
"feTPCPacketReceiver" on host tpcdaq0, port 36663, status 503
[midas.c:8423:rpc_client_connect,ERROR] timeout on receive remote computer info:
[midas.c:4880:cm_shutdown,ERROR] Cannot connect to client 'frontend' on host
'hostname', port 36663
[midas.c:4883:cm_shutdown,ERROR] Killing and Deleting client
'feTPCPacketReceiver' pid 24516
[midas.c:3857:cm_transition,ERROR] Could not start a run: cm_transition() status
503, message 'Cannot connect to client 'frontend''
Run #18 start aborted
Error: Cannot connect to client 'frontend'
11:03:42 [Logger,INFO] Deleting previous file "/home/daq/Run/online/run00018.mid"
11:03:42 [Logger,INFO] Client 'feTPCPacketReceiver' on buffer 'SYSMSG' removed
by cm_watchdog because client pid 24516 does not exist
11:03:42 [Logger,ERROR] [system.c:563:ss_shm_close,ERROR]
shmctl(shmid=7274511,IPC_RMID) failed, errno 1 (Operation not permitted)
11:03:42 [ODBEdit,INFO] Run #18 start aborted
==========================================================================
There are several Ethernet cards on the host machine. eth0 connects the host
machine to the gateway machine, and the frontend application listens on eth1 for
the incoming data packets:
eth0 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx
inet addr:10.0.1.1 Bcast:10.0.1.63 Mask:255.255.255.0
inet6 addr: fe80::f6ce:46ff:fe99:709b/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:470870 errors:0 dropped:0 overruns:0 frame:0
TX packets:515987 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:345000246 (329.0 MiB) TX bytes:377269124 (359.7 MiB)
Interrupt:17
eth1 Link encap:Ethernet HWaddr xx:xx:xx:xx:xx:xx
inet addr:10.0.1.2 Bcast:10.255.255.255 Mask:255.0.0.0
inet6 addr: fe80::226:55ff:fed6:56a9/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:1836 (1.7 KiB)
Memory:ec180000-ec1a0000
Thanks for any hints. |
14 Jun 2010, Stefan Ritt, Forum, crash on start run
|
> I use Fedora 12 and MIDAS revision 4680. There is a problem starting a run, even though
> the frontend application runs fine.
I don't know exactly what is wrong, but I would check the following things:
- does your feTPCPacketReceiver die during the start-of-run? Maybe you have a segfault
in the begin-of-run routine. Can you STOP a run?
- is there any network problem due to your two cards? When you try to stop your fe from
odbedit with
# odbedit -c "shutdown feTPCPacketReceiver"
do you then get the same error? The shutdown functionality uses the same RPC channel as
the start/stop run. Some people had firewall problems, on both sides (host AND client),
so make sure all firewalls are disabled.
- if you disable one network card, do you still get the same problem? |
14 Jun 2010, hai qu, Forum, crash on start run
|
> - does your feTPCPacketReceiver die during the start-of-run? Maybe you have a segfault
> in the begin-of-run routine. Can you STOP a run?
When I start a run, it brings up the mtransition process, and I guess the server tries to talk to the
client; then it fails, and the frontend application gets killed since it does not respond.
> When you try to stop your fe from odbedit with
> # odbedit -c "shutdown feTPCPacketReceiver"
I get:
[midas.c:8423:rpc_client_connect,ERROR] timeout on receive remote computer info:
[midas.c:4880:cm_shutdown,ERROR] Cannot connect to client
"feTPCPacketReceiver" on host 'tpcdaq0', port 35865
[midas.c:4883:cm_shutdown,ERROR] Killing and Deleting client
'feTPCPacketReceiver' pid 27250
Client feTPCPacketReceiver not active
What does this error mean?
11:03:42 [Logger,ERROR] [system.c:563:ss_shm_close,ERROR]
shmctl(shmid=7274511,IPC_RMID) failed, errno 1 (Operation not permitted)
thanks
hai
P.S. The code runs fine on my laptop with Ubuntu 9, so it may also be that some
configuration on my side is not right and causes the problem. |
08 Sep 2016, Amy Roberts, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
I've recently run into issues when using JSON.parse on ODB keys containing
8-bit data.
For JSON.parse to successfully parse a string, (A) the string must be valid
UTF-8, (B) several whitespace characters, control characters, and the
characters " and \ must be escaped, and (C) you've got to follow the key-
value rules laid out in http://www.json.org/.
The web browser takes care of (A), and I verified that for this key Midas
handled (C) correctly. In principle, the function json_write in odb.c
handles (B) - but json_write does not escape control characters.
To manage this problem, I modified json_write (in odb.c) to replace any
control character with the more innocuous character 'C'. My default case
now looks like:
default:
   {
      // if a char is a control character, print 'C' in its place.
      // note that this loses data: a more correct method would be
      // to print \uXXXX, where XXXX is the character code in hex
      if (iscntrl(*s)) {
         (*buffer)[(*buffer_end)++] = 'C';
         s++;
      } else {
         (*buffer)[(*buffer_end)++] = *s++;
      }
   }
Where the call to iscntrl(*s) requires the addition of the ctype.h header
file.
I'm guessing a blanket replacement of control characters with 'C' isn't
something all Midas users would want to do. Replacing the control character
with its hex value seems like a good choice - but not without adding bounds
checking!
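For what it's worth, a bounds-checked \uXXXX replacement could look roughly like the following sketch (the helper name and buffer interface are made up for illustration; this is not the actual odb.c code):
#include <stdio.h>
#include <ctype.h>
/* Append 's' to 'buf' (of size 'buf_size'), escaping control characters
   as \u00XX. Returns the number of bytes written, or -1 if 'buf' is too
   small. Hypothetical helper, for illustration only. */
static int json_append_escaped(char *buf, int buf_size, const char *s)
{
   int n = 0;
   for (; *s; s++) {
      unsigned char c = (unsigned char) *s;
      if (iscntrl(c)) {
         if (n + 6 >= buf_size)    /* bounds check: "\u00XX" needs 6 bytes */
            return -1;
         n += sprintf(buf + n, "\\u%04X", c);
      } else {
         if (n + 1 >= buf_size)    /* bounds check: 1 byte plus terminator */
            return -1;
         buf[n++] = *s;
      }
   }
   buf[n] = 0;
   return n;
}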
An alternative to changing odb.c could be to add a regex to Midas response
text which removes all control characters (U+0000 - U+001F):
var resp_lint = req.response.replace(/[\u{0000}-\u{001F}]/gmu, '');
var json_obj = JSON.parse(resp_lint);
Unfortunately, the 'u' regex flag doesn't work on the Firefox version
included in Scientific Linux 6.8. |
30 Sep 2016, Konstantin Olchanski, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> I've recently run into issues when using JSON.parse on ODB keys containing
> 8-bit data.
I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid
UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
it is impractical to handle or permit invalid UTF-8 strings.
Certainly in the general case, replacing all control characters with something else, escaping them, or
otherwise changing the value of TID_STRING data would wreck *valid* UTF-8 strings, which I would
assume to be the normal use.
In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion: just as
we do not check that TID_FLOAT and TID_DOUBLE contain valid IEEE-754 values, we should not check
that TID_STRING is valid UTF-8.
But in your specific case, why do you have random control characters in your TID_STRING data?
Maybe you are using TID_STRING as general storage instead of arrays of TID_CHAR or
TID_DWORD?
K.O.
>
> For JSON.parse to successfully parse a string, (A) the string must be valid
> UTF-8, (B) several whitespace characters, control characters, and the
> characters " and \ must be escaped, and (C) you've got to follow the key-
> value rules laid out in http://www.json.org/.
>
> The web browser takes care of (A), and I verified that for this key Midas
> handled (C) correctly. In principle, the function json_write in odb.c
> handles (B) - but json_write does not escape control characters.
>
> To manage this problem, I modified json_write (in odb.c) to replace any
> control character with the more innocuous character 'C'. My default case
> now looks like:
>
> default:
> {
> // if a char is a control character,
> // print 'C' in its place
> // note that this loses data:
> // a more-correct method would be to print
> // \uXXXX, where XXXX is the character in hex
> if(iscntrl(*s)){
> (*buffer)[(*buffer_end)++] = 'C';
> s++;
> } else {
> (*buffer)[(*buffer_end)++] = *s++;
> }
> }
>
> Where the call to iscntrl(*s) requires the addition of the ctype.h header
> file.
>
> I'm guessing a blanket replacement of control characters with 'C' isn't
> something all Midas users would want to do. Replacing the control character
> with its hex value seems like a good choice - but not without adding bounds
> checking!
>
> An alternative to changing odb.c could be to add a regex to Midas response
> text which removes all control characters (U+0000 - U+001F):
>
> var resp_lint = req.response.replace(/[\u{0000}-\u{001F}]/gmu, '');
> var json_obj = JSON.parse(resp_lint);
>
> Unfortunately, the 'u' regex flag doesn't work on the Firefox version
> included in Scientific Linux 6.8. |
25 Oct 2016, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> > I've recently run into issues when using JSON.parse on ODB keys containing
> > 8-bit data.
>
> I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid
> UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
> it is impractical to handle or permit invalid UTF-8 strings.
> ....
> But in your specific case, why do you have random control characters in your TID_STRING data?
> Maybe you are using TID_STRING as general storage instead of arrays of TID_CHAR or
> TID_DWORD?
I'm a little confused by this report and want to make sure I understand the situation. Konstantin points
out that the TID_STRING should be valid UTF-8. But I think that Amy agreed that the string was valid UTF-8.
My understanding was that Amy's contention was that the valid UTF-8 string didn't get returned as valid JSON.
But I am having trouble reproducing your behaviour, Amy. I created an ODB string variable with a tab
control character
char mystring[32];
int size = sizeof(mystring);
sprintf(mystring, "first line \t second line");
status = db_set_value(hDB, 0, "/test2/mystring", mystring, size, 1, TID_STRING);
and then tried to pull the value from the ODB using jcopy:
http://neut18:8081/?cmd=jcopy&odb=/test2/mystring&format=json
I got
{
"mystring/key" : { "type" : 12, "item_size" : 32, "access_mode" : 7, "last_written" : 1477416322 },
"mystring" : "first line \t second line"
}
which seems to be valid JSON.
I only tried this with tab. Are there other control characters that you are having trouble with? Or maybe
I misunderstand the question?
>
> >
> > For JSON.parse to successfully parse a string, (A) the string must be valid
> > UTF-8, (B) several whitespace characters, control characters, and the
> > characters " and \ must be escaped, and (C) you've got to follow the key-
> > value rules laid out in http://www.json.org/.
> >
> > The web browser takes care of (A), and I verified that for this key Midas
> > handled (C) correctly. In principle, the function json_write in odb.c
> > handles (B) - but json_write does not escape control characters.
> >
> > To manage this problem, I modified json_write (in odb.c) to replace any
> > control character with the more innocuous character 'C'. My default case
> > now looks like:
> >
> > default:
> > {
> > // if a char is a control character,
> > // print 'C' in its place
> > // note that this loses data:
> > // a more-correct method would be to print
> > // \uXXXX, where XXXX is the character in hex
> > if(iscntrl(*s)){
> > (*buffer)[(*buffer_end)++] = 'C';
> > s++;
> > } else {
> > (*buffer)[(*buffer_end)++] = *s++;
> > }
> > }
> >
> > Where the call to iscntrl(*s) requires the addition of the ctype.h header
> > file.
> >
> > I'm guessing a blanket replacement of control characters with 'C' isn't
> > something all Midas users would want to do. Replacing the control character
> > with its hex value seems like a good choice - but not without adding bounds
> > checking!
> >
> > An alternative to changing odb.c could be to add a regex to Midas response
> > text which removes all control characters (U+0000 - U+001F):
> >
> > var resp_lint = req.response.replace(/[\u{0000}-\u{001F}]/gmu, '');
> > var json_obj = JSON.parse(resp_lint);
> >
> > Unfortunately, the 'u' regex flag doesn't work on the Firefox version
> > included in Scientific Linux 6.8. |
01 Dec 2016, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> > I've recently run into issues when using JSON.parse on ODB keys containing
> > 8-bit data.
>
> I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid
> UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
> it is impractical to handle or permit invalid UTF-8 strings.
>
> Certainly in the general case, replacing all control characters with something else or escaping them or
> otherwise changing the value if TID_STRING data would wreck *valid* UTF-8 strings, which I would
> assume to be the normal use.
>
> In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - as
> we do not check the TID_FLOAT and TID_DOUBLE is valid IEEE-754 values, we should not check
> that TID_STRING is valid UTF-8.
I agree that we should start requiring strings to be UTF-8 encoded Unicode.
I'd suggest that before worrying about the TID_STRING data, we should start by sanitizing the ODB key names.
I've seen a couple of cases where the ODB key name is a non-UTF-8 string. It is very awkward to use odbedit
to delete these keys.
I attach a suggested modification to odb.c that rejects calls to db_create_key with non-UTF-8 key names. It
uses some random function I found on the internet that is supposed to check if a string is valid UTF-8. I
checked a couple of strings with invalid UTF-8 characters and it correctly identified them. But I won't
claim to be certain that this is really identifying all UTF-8 vs non-UTF-8 cases. Maybe others have a
better way of identifying this. |
15 Jan 2017, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> > In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - as
> > we do not check the TID_FLOAT and TID_DOUBLE is valid IEEE-754 values, we should not check
> > that TID_STRING is valid UTF-8.
> ...
> I attach a suggested modification to odb.c that rejects calls to db_create_key with non-UTF-8 key names. It
> uses some random function I found on the internet that is supposed to check if a string is valid UTF-8. I
> checked a couple of strings with invalid UTF-8 characters and it correctly identified them. But I won't
> claim to be certain that this is really identifying all UTF-8 vs non-UTF-8 cases. Maybe others have a
> better way of identifying this.
At Konstantin's suggestion, I committed the function I found for checking if a string was UTF-8 compatible to
odb.c. The function is currently not used; I commented out a proposed use in db_create_key. Experts can decide
if the code was good enough to use. |
23 Jan 2017, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> At Konstantin's suggestion, I committed the function I found for checking if a string was UTF-8 compatible to
> odb.c. The function is currently not used; I commented out a proposed use in db_create_key. Experts can decide
> if the code was good enough to use.
After more discussion, I have enabled the parts of the ODB code that check that key names are UTF-8 compliant.
This check will show up in (at least) two ways:
1) Attempts to create a new ODB variable will fail if the key name is not UTF-8 compliant. You will see error messages like:
[fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_create_key: UTF-8 incompatible
string
2) When a program first connects to the ODB, it runs a check to ensure that the ODB is valid. This now includes
a check that all key names are UTF-8 compliant. Any non-UTF-8-compliant key name will be replaced by the string
representation of the key's pointer. You will see error messages like:
[fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_validate_key: UTF-8
incompatible string
[fesimdaq,ERROR] [odb.c:647:db_validate_key,ERROR] Warning: corrected key "/Equipment/SIMDAQ/Eur€": invalid name
"Eur€" replaced with "0x7f74be63f970"
This behaviour (checking UTF-8 compatibility and automatically fixing ODB names) can be disabled by setting an
environment variable
MIDAS_INVALID_STRING_IS_OK
It doesn't matter what the environment variable is set to; it just needs to be set. Note also that this variable is
only checked once, when a program starts. |
30 Jan 2017, Stefan Ritt, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
>
> > At Konstantin's suggestion, I committed the function I found for checking if a string was UTF-8 compatible to
> > odb.c. The function is currently not used; I commented out a proposed use in db_create_key. Experts can decide
> > if the code was good enough to use.
>
> After more discussion, I have enabled the parts of the ODB code that check that key names are UTF-8 compliant.
>
> This check will show up in (at least) two ways:
>
> 1) Attempts to create a new ODB variable will fail if the key name is not UTF-8 compliant. You will see error messages like:
>
> [fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_create_key: UTF-8 incompatible
> string
>
> 2) When a program first connects to the ODB, it runs a check to ensure that the ODB is valid. This now includes
> a check that all key names are UTF-8 compliant. Any non-UTF-8-compliant key name will be replaced by the string
> representation of the key's pointer. You will see error messages like:
>
> [fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_validate_key: UTF-8
> incompatible string
> [fesimdaq,ERROR] [odb.c:647:db_validate_key,ERROR] Warning: corrected key "/Equipment/SIMDAQ/Eur€": invalid name
> "Eur€" replaced with "0x7f74be63f970"
>
> This behaviour (checking UTF-8 compatibility and automatically fixing ODB names) can be disabled by setting an
> environment variable
>
> MIDAS_INVALID_STRING_IS_OK
>
> It doesn't matter what the environment variable is set to; it just needs to be set. Note also that this variable is
> only checked once, when a program starts.
I see you put some switches into the environment ("MIDAS_INVALID_STRING_IS_OK"). Do you think this is a good idea? Most variables are
sitting in the ODB (/experiment/xxx), except those which cannot be in the ODB because we need them before we open the ODB, like MIDAS_DIR.
Having them in the ODB has the advantage that everything is in one place, and we see a "list" of things we can change. From an empty
environment it is not clear that such a thing as "MIDAS_INVALID_STRING_IS_OK" exists, while if it were an ODB key it would be
obvious. Can I convince you to move this flag into the ODB? |
01 Feb 2017, Konstantin Olchanski, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
>
> I see you put some switches into the environment ("MIDAS_INVALID_STRING_IS_OK"). Do you think this is a good idea? Most variables are
> sitting in the ODB (/experiment/xxx), except those which cannot be in the ODB because we need them before we open the ODB, like MIDAS_DIR.
> Having them in the ODB has the advantage that everything is in one place, and we see a "list" of things we can change. From an empty
> environment it is not clear that such a thing as "MIDAS_INVALID_STRING_IS_OK" exists, while if it were an ODB key it would be
> obvious. Can I convince you to move this flag into the ODB?
>
Some additional explanation.
Time passed, the world turned, and the current web-compatible standard for text strings is UTF-8 encoded Unicode, see
https://en.wikipedia.org/wiki/UTF-8
(ObCanadianContent: UTF-8 was invented by the Canadian Rob Pike, https://en.wikipedia.org/wiki/Rob_Pike,
and by some other guy, https://en.wikipedia.org/wiki/Ken_Thompson).
It turns out that not every combination of 8-bit characters (char*) is valid UTF-8 Unicode.
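For reference, the UTF-8 byte-pattern rules can be checked with something like the following (a simplified sketch, not the function actually committed to odb.c; it does not reject every overlong or surrogate encoding):
// Return 1 if the NUL-terminated string 's' is valid UTF-8, 0 otherwise.
// Simplified sketch of the standard lead-byte/continuation-byte rules.
int is_valid_utf8(const unsigned char *s)
{
   while (*s) {
      if (s[0] < 0x80) {                      // 0xxxxxxx: plain ASCII
         s += 1;
      } else if ((s[0] & 0xE0) == 0xC0) {     // 110xxxxx 10xxxxxx
         if (s[0] < 0xC2 || (s[1] & 0xC0) != 0x80)   // C0/C1 are overlong
            return 0;
         s += 2;
      } else if ((s[0] & 0xF0) == 0xE0) {     // 1110xxxx 10xxxxxx 10xxxxxx
         if ((s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80)
            return 0;
         s += 3;
      } else if ((s[0] & 0xF8) == 0xF0 && s[0] <= 0xF4) {   // 4-byte form
         if ((s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80 || (s[3] & 0xC0) != 0x80)
            return 0;
         s += 4;
      } else {
         return 0;                            // invalid lead byte
      }
   }
   return 1;
}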
In the MIDAS world we run into this when MIDAS ODB strings are exported to Javascript running inside web
browsers ("custom pages", etc). ODB strings (TID_STRING) and ODB key names that are not valid UTF-8
make such web pages malfunction and do not work right.
One solution to this is to declare that ODB strings (TID_STRING) and ODB key names *must* be valid UTF-8 Unicode.
The present commits implemented this solution. Invalid UTF-8 is rejected by db_create() & co and by the ODB integrity validator.
This means some existing running experiment may suddenly break because somehow they have "old-style" ODB entries
or they mistakenly use TID_STRING to store arbitrary binary data (use array of TID_CHAR instead).
To permit such experiments to use current releases of MIDAS, we include a "defeat" device - to disable UTF-8 checks
until they figure out where non-UTF-8 strings come from and correct the problem.
Why is this defeat device not an ODB entry? Because it is not a normal mode of operation: there is no use case where
an experiment will continue to use a non-UTF-8-compatible ODB indefinitely, in the long term. For example, as the MIDAS user
interface moves more and more to HTML+JavaScript+"AJAX", such experiments will see that non-UTF-8-compatible ODB entries
cause all sorts of problems and will have to convert.
K.O. |
01 Feb 2017, Stefan Ritt, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
|
> Some additional explanation.
>
> Time passed, the world turned, and the current web-compatible standard for text strings is UTF-8 encoded Unicode, see
> https://en.wikipedia.org/wiki/UTF-8
> (ObCanadianContent: UTF-8 was invented by the Canadian Rob Pike, https://en.wikipedia.org/wiki/Rob_Pike,
> and by some other guy, https://en.wikipedia.org/wiki/Ken_Thompson).
>
> It turns out that not every combination of 8-bit characters (char*) is valid UTF-8 Unicode.
>
> In the MIDAS world we run into this when MIDAS ODB strings are exported to Javascript running inside web
> browsers ("custom pages", etc). ODB strings (TID_STRING) and ODB key names that are not valid UTF-8
> make such web pages malfunction and do not work right.
>
> One solution to this is to declare that ODB strings (TID_STRING) and ODB key names *must* be valid UTF-8 Unicode.
>
> The present commits implemented this solution. Invalid UTF-8 is rejected by db_create() & co and by the ODB integrity validator.
>
> This means some existing running experiment may suddenly break because somehow they have "old-style" ODB entries
> or they mistakenly use TID_STRING to store arbitrary binary data (use array of TID_CHAR instead).
>
> To permit such experiments to use current releases of MIDAS, we include a "defeat" device - to disable UTF-8 checks
> until they figure out where non-UTF-8 strings come from and correct the problem.
>
> Why is this defeat device not an ODB entry? Because it is not a normal mode of operation: there is no use case where
> an experiment will continue to use a non-UTF-8-compatible ODB indefinitely, in the long term. For example, as the MIDAS user
> interface moves more and more to HTML+JavaScript+"AJAX", such experiments will see that non-UTF-8-compatible ODB entries
> cause all sorts of problems and will have to convert.
>
>
> K.O.
Ok, I agree.
Stefan |
22 Feb 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger
|
Dear all,
we are experiencing a connection problem to the MySQL server that we use to log information. Is there an
option to retry the I/O on the MySQL server multiple times?
The error we are experiencing is the following (hiding the IP address):
[Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't
connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
Then the logger stops and must be restarted. This happens only during the BOR or the EOR.
Best,
Stefano. |
22 Feb 2023, Stefan Ritt, Info, connection to a MySQL server: retry procedure in the Logger
|
> Dear all,
>
> we are experiencing a connection problem to the MySQL server that we use to log information. Is there an
> option to retry the I/O on the MySQL server multiple times?
>
> The error we are experiencing is the following (hiding the IP address):
>
> [Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't
> connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
>
> Then the logger stops and must be restarted. This happens only during the BOR or the EOR.
What would you propose? If the connection does not work, most likely the server is down or busy. If we retry,
the connection still might not work. If we retry many times, people will complain that the run start or stop
takes very long. If we then just continue (without stopping the logger), the MySQL database will miss important
information and the runs probably cannot be analyzed later. So I believe it's better to really stop the logger
so that people become aware that there is a problem and fix the source, rather than curing the symptoms.
In the MEG experiment at PSI we run the logger with a MySQL database, and we never see any connection issue,
except when the MySQL server goes into maintenance (once a year), but usually we don't take data then. Since we
use the same logger code, the problem cannot be in the logger itself. So I would try to fix the problem on the MySQL side.
Best,
Stefan |
07 Mar 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger
|
> > Dear all,
> >
> > we are experiencing a connection problem to the MySQL server that we use to log information. Is there an
> > option to retry the I/O on the MySQL server multiple times?
> >
> > The error we are experiencing is the following (hiding the IP address):
> >
> > [Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't
> > connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
> >
> > Then the logger stops and must be restarted. This happens only during the BOR or the EOR.
>
> What would you propose? If the connection does not work, most likely the server is down or busy. If we retry,
> the connection still might not work. If we retry many times, people will complain that the run start or stop
> takes very long. If we then just continue (without stopping the logger), the MySQL database will miss important
> information and the runs probably cannot be analyzed later. So I believe it's better to really stop the logger
> so that people become aware that there is a problem and fix the source, rather than curing the symptoms.
>
> In the MEG experiment at PSI we run the logger with a MySQL database, and we never see any connection issue,
> except when the MySQL server goes into maintenance (once a year), but usually we don't take data then. Since we
> use the same logger code, the problem cannot be in the logger itself. So I would try to fix the problem on the MySQL side.
>
> Best,
> Stefan
Dear Stefan,
a possible solution could be to define the number of retries as a parameter that is 0 by default, as well as a wait time between two subsequent tries. This
would leave the decision on how to handle a possible failed connection to the user. In our case, for example, we would prefer not to stop the acquisition in case
of a failed connection to the external SQL server. In addition, we have other software that, with a retry procedure, doesn't fail: with one retry and a sleep time of 0.5 s
we already recover 100% of the faults.
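For illustration, the proposed retry around the connect call could look roughly like this (a sketch using the standard MySQL C API; n_retry and wait_ms are the hypothetical parameters described above, not existing mlogger options):
#include <mysql.h>
#include <unistd.h>
// Try to connect, retrying 'n_retry' extra times with 'wait_ms' milliseconds
// between attempts; n_retry = 0 reproduces the current single-attempt behaviour.
MYSQL *connect_with_retry(const char *host, const char *user, const char *passwd,
                          const char *db, unsigned int port,
                          int n_retry, int wait_ms)
{
   for (int attempt = 0; attempt <= n_retry; attempt++) {
      MYSQL *conn = mysql_init(NULL);
      if (conn && mysql_real_connect(conn, host, user, passwd, db, port, NULL, 0))
         return conn;                  // success
      if (conn)
         mysql_close(conn);            // clean up the failed handle
      if (attempt < n_retry)
         usleep(wait_ms * 1000);       // wait before the next attempt
   }
   return NULL;                        // all attempts failed - caller reports it
}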
Anyway, we implemented a local database, which is a mirror of the external one, and the problems disappeared.
Thanks,
Stefano. |