Midas DAQ System
    Reply  09 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline 
Hello Stefan,

tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should 
presumably not be the case, since CloseRootOutputFile() frees the trees at eor().

------------------- output ---------------------------
lamb@lamb2:~/midas/root_3705> ./analyzer -e exa_root -i /tmp/midas/examples/root/run%05d.mid -o /tmp/midas/run%05d.root -r 1 2
Root server listening on port 9090...
Running analyzer offline. Stop with "!"
book_ttree: tree_struct.n_tree = 1
book_ttree: tree_struct.n_tree = 2
Set run number 1 in ODB
Load ODB from run 1...OK
/tmp/midas/examples/root/run00001.mid:2722  /tmp/midas/run00001.root:2720  events, 0.21s
book_ttree: tree_struct.n_tree = 3     <<---- !!!!
book_ttree: tree_struct.n_tree = 4
Set run number 2 in ODB
Load ODB from run 2...OK
/tmp/midas/examples/root/run00002.mid:2347  /tmp/midas/run00002.root:2345  events, 0.18s

 *** Break *** segmentation violation
----------------- \output ----------------------------

Adding this one line fixes the segfault problem for the root example expt.

----------------- code -------------------------
lamb@lamb2:/data/software/midas/midas_3705/src/src> svn diff mana.c
Index: mana.c
===================================================================
--- mana.c      (revision 3705)
+++ mana.c      (working copy)
@@ -1496,6 +1496,7 @@
    /* delete event tree */
    free(tree_struct.event_tree);
    tree_struct.event_tree = NULL;
+   tree_struct.n_tree = 0;
 
    // go to ROOT root directory
    gROOT->cd();
---------------- \code ---------------------------

Please check if this gives the intended behaviour. I am not very familiar with the 
midas internals.
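
To illustrate why the missing reset matters, here is a minimal sketch in plain C. It is not the actual mana.c code: TreeRegistry, registry_book() and registry_close() are hypothetical stand-ins for tree_struct, book_ttree() and CloseRootOutputFile(), and the sketch assumes the array grows by one entry per booked tree.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one entry of tree_struct.event_tree. */
typedef struct {
    int event_id;
} TreeEntry;

/* Hypothetical stand-in for tree_struct: a counted, heap-allocated array. */
typedef struct {
    int n_tree;
    TreeEntry *event_tree;
} TreeRegistry;

/* Book one tree; returns the new count, or -1 if allocation fails. */
int registry_book(TreeRegistry *reg, int event_id)
{
    TreeEntry *grown = realloc(reg->event_tree,
                               (size_t)(reg->n_tree + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    reg->event_tree = grown;
    reg->event_tree[reg->n_tree].event_id = event_id;
    reg->n_tree++;
    return reg->n_tree;
}

/* End of run: free the array and reset the count together. */
void registry_close(TreeRegistry *reg)
{
    free(reg->event_tree);
    reg->event_tree = NULL;
    reg->n_tree = 0; /* without this line the next run starts at a stale index */
}
```

With the reset in place, the first tree booked in the second run lands at index 0 again. Without it, the count would continue at 3 and 4 as in the output above, and the entries below that index would never be initialised.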

Unfortunately my own analyzer's segfault problem is not solved by this patch. I 
guess I have to keep searching for a bug on my side.....  :-)


Cheers,

Randolf
    Reply  10 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline 
> tree_struct.n_tree keeps counting up from run to run (in book_ttree). This should 
> presumably not be the case, since CloseRootOutputFile() frees the trees at eor().

Yes, this is indeed a bug. I applied your change and committed the new code.
    Reply  11 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline 
Hello again,

just for the record, in case somebody else runs into the same problem...

I have hunted down "my" segfault problem to the fact that I book histograms not 
in <module>_init, but in <module>_bor. I have to do so, because only in bor do I 
know which histograms to book, as this information comes from the ODB (booking 
only histograms for CAMAC modules which were set to "read" in the ODB). The core 
dump happens on the first access (->Fill, ->SetName,...) of one of these histos 
in the 2nd run analyzed offline ("./analyzer -r n m").

mana.c:bor (line 1854) states that "all ROOT objects created by user module 
bor() functions go to the output file", and then calls gManaOutputFile->cd().
Consequently, the histograms vanish after the file is closed, hence the 
segfault when trying to access them in the 2nd run. (I keep track of existing 
histograms, only booking the missing histos in bor.)

The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with 
TFolders and booking the histogram.


I do not, however, really understand why histos booked in bor() go 
only to the file, whereas histos booked in init() go to memory. Could you please 
comment briefly? Maybe I missed the most important point. And what about online 
mode, should this work?


Thanks a lot in advance,

Randolf
    Reply  11 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline 
> I have hunted down "my" segfault problem to the fact that I book histograms not 
> in <module>_init, but in <module>_bor. I have to do so, because only in bor do I 
> know which histograms to book, as this information comes from the ODB (booking 
> only histograms for CAMAC modules which were set to "read" in the ODB). The core 
> dump happens on the first access (->Fill, ->SetName,...) of one of these histos 
> in the 2nd run analyzed offline ("./analyzer -r n m").
> 
> mana.c:bor (line 1854) states that "all ROOT objects created by user module 
> bor() functions go to the output file", and then calls gManaOutputFile->cd().
> Consequently, the histograms vanish after the file is closed, hence the 
> segfault when trying to access them in the 2nd run. (I keep track of existing 
> histograms, only booking the missing histos in bor.)
> 
> The problem goes away with "gROOT->cd()" in <module>_bor, before fiddling with 
> TFolders and booking the histogram.

ROOT has the strange concept of a "current working directory", which comes from the fact that
ROOT was written by Fortran and PAW people, who were used to having directories and
subdirectories with a persistent state (not really object-oriented style). So one can
set the "current working directory" to the root (=memory) with gROOT->cd() and to a
subdirectory which will later be written into a file with gManaOutputFile->cd(). If
you do the first one, the histograms are created only in memory, while in the latter
case they are also created in memory, but will later be written into the output file
in the routine CloseRootOutputFile(). So if you do a gROOT->cd() in <module>_bor,
these histograms will not be written to file. So I guess your solution is not a real
solution.
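
The ownership effect can be pictured with a small toy model in plain C. This is only a sketch and not ROOT code: Directory, dir_cd(), histo_book() and dir_close() are hypothetical names, and the model assumes that closing a directory deletes the objects attached to it, as the vanishing histograms suggest.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_OBJECTS 16

/* Toy histogram: just an entry counter. */
typedef struct {
    long entries;
} Histo;

/* A directory owns every object booked while it was current. */
typedef struct {
    Histo *objects[MAX_OBJECTS];
    int n_objects;
} Directory;

/* Toy analogue of the "current working directory". */
Directory *current_dir = NULL;

void dir_cd(Directory *dir)
{
    current_dir = dir;
}

/* Book a histogram; it is attached to whatever directory is current. */
Histo *histo_book(void)
{
    Histo *h;

    if (current_dir == NULL || current_dir->n_objects >= MAX_OBJECTS)
        return NULL;
    h = calloc(1, sizeof *h);
    if (h != NULL)
        current_dir->objects[current_dir->n_objects++] = h;
    return h;
}

/* Returns 1 if the directory still owns h, 0 otherwise. */
int dir_owns(const Directory *dir, const Histo *h)
{
    int i;

    for (i = 0; i < dir->n_objects; i++)
        if (dir->objects[i] == h)
            return 1;
    return 0;
}

/* Close a directory: "write" its objects, then delete them.
   Returns the number of objects that were written. */
int dir_close(Directory *dir)
{
    int i;
    int written = dir->n_objects;

    for (i = 0; i < dir->n_objects; i++) {
        free(dir->objects[i]);
        dir->objects[i] = NULL;
    }
    dir->n_objects = 0;
    if (current_dir == dir)
        current_dir = NULL;
    return written;
}
```

In this model a histogram booked after dir_cd(&file) is gone once dir_close(&file) has run, so a pointer kept from the first run must not be used in the second run, while a histogram booked under the memory directory survives.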

> I do not, however, really understand why histos booked in bor() go 
> only to the file, whereas histos booked in init() go to memory. Could you please 
> comment briefly? Maybe I missed the most important point. And what about online 
> mode, should this work?

The root output file is opened in bor() and closed in eor(). For a histo to go to the
file, it must be booked after opening the file, that is after bor() in mana.c and
therefore after the gManaOutputFile->cd().

I agree with you that the current scheme is not satisfactory. When running online, you
want to keep the histos between the runs. When running offline, you delete and
re-create them for each run. It would be better to create all histos online and
offline under gROOT, and just copy them to gManaOutputFile before writing them. I have
to admit that this root code was never really used in a production environment for
offline analysis, so there might be some issues here and there. Some people write
root files directly in the logger, and then do a root-only (without the midas
analyzer) analysis. Unfortunately I'm busy these days and cannot write any code right
now. But if you feel like something should be modified in mana.c, please send it to me
and I can incorporate it into the standard code.
    Reply  12 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline 
Hi

> So I guess your solution is not a real solution.

I was not precise enough on what I do. This way the histograms persist in memory, but 
they are also written to every file:

e.g. in module "trig_tdc":

  TDirectory *savedir = gDirectory;  // will restore this afterwards
  gROOT->cd();     // go to the ROOT root directory (memory), not the file

  // make sure we are in the right "analyzer module folder"
  TDC_Folder = (TFolder *) gROOT->FindObjectAny("trig_tdc");
  gHistoFolderStack->Add((TObject *) TDC_Folder);

  ...(loop over all TDCs, figure out which histos exist, and which need to be booked)

  open_subfolder("raw4208");
  hrTDC = h1_book(....);   // create histo in memory, but it shows up in the file, too.
  close_subfolder(); //raw4208

  // restore gHistoFolderStack (we added a folder when entering routine)
  gHistoFolderStack->Remove(gHistoFolderStack->Last());

  // restore current directory
  savedir->cd();

When deleting histos I do:

    gManaHistosFolder->RecursiveRemove(*pHisto);
    (*pHisto)->Delete();
    (*pHisto) = NULL;  // for my book-keeping of existing histos.

You don't have to clear the histos explicitly between runs. gManaHistosFolder does this 
magic for you.

> But if you feel like something should be modified in mana.c, please send it to me
> and I can incorporate it into the standard code.

No, the code is fine. I just wanted to explain my problem and a solution to it, because 
I thought that somebody might run into the same problem, too. 

Ciao,

Randolf
Entry  12 Jun 2010, hai qu, Forum, crash on start run 
Dear experts,

I use Fedora 12 and midas 4680. Starting a run fails even though the frontend
application runs fine. 


# odbedit -c start


Starting run #18
[midas.c:8423:rpc_client_connect,ERROR] timeout on receive remote computer info: 
[midas.c:3659:cm_transition,ERROR] cannot connect to client
"feTPCPacketReceiver" on host tpcdaq0, port 36663, status 503
[midas.c:8423:rpc_client_connect,ERROR] timeout on receive remote computer info: 
[midas.c:4880:cm_shutdown,ERROR] Cannot connect to client 'frontend' on host
'hostname', port 36663
[midas.c:4883:cm_shutdown,ERROR] Killing and Deleting client
'feTPCPacketReceiver' pid 24516
[midas.c:3857:cm_transition,ERROR] Could not start a run: cm_transition() status
503, message 'Cannot connect to client 'frontend''
Run #18 start aborted
Error: Cannot connect to client 'frontend'

11:03:42 [Logger,INFO] Deleting previous file "/home/daq/Run/online/run00018.mid"

11:03:42 [Logger,INFO] Client 'feTPCPacketReceiver' on buffer 'SYSMSG' removed
by cm_watchdog because client pid 24516 does not exist

11:03:42 [Logger,ERROR] [system.c:563:ss_shm_close,ERROR]
shmctl(shmid=7274511,IPC_RMID) failed, errno 1 (Operation not permitted)

11:03:42 [ODBEdit,INFO] Run #18 start aborted
==========================================================================

There are several Ethernet cards on the host machine. eth0 connects the host
machine to the gateway machine, and the frontend application listens on eth1 for
the incoming data packets:

eth0      Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx  
          inet addr:10.0.1.1  Bcast:10.0.1.63  Mask:255.255.255.0
          inet6 addr: fe80::f6ce:46ff:fe99:709b/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:470870 errors:0 dropped:0 overruns:0 frame:0
          TX packets:515987 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:345000246 (329.0 MiB)  TX bytes:377269124 (359.7 MiB)
          Interrupt:17 

eth1      Link encap:Ethernet  HWaddr xx:xx:xx:xx:xx:xx   
          inet addr:10.0.1.2  Bcast:10.255.255.255  Mask:255.0.0.0
          inet6 addr: fe80::226:55ff:fed6:56a9/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:1836 (1.7 KiB)
          Memory:ec180000-ec1a0000 


Thanks for any hints.
    Reply  14 Jun 2010, Stefan Ritt, Forum, crash on start run 
> I use Fedora 12 and midas 4680. Starting a run fails even though the frontend
> application runs fine. 

I don't know exactly what is wrong, but I would check the following things:

- does your feTPCPacketReceiver die during the start-of-run? Maybe you get a segfault 
in the begin-of-run routine. Can you STOP a run?

- is there any network problem due to your two cards? When you try to stop your fe from 
odbedit with

# odbedit -c "shutdown feTPCPacketReceiver"

do you then get the same error? The shutdown functionality uses the same RPC channel as 
the start/stop run. Some people had firewall problems, on both sides (host AND client), 
so make sure all firewalls are disabled.

- if you disable one network card, do you still get the same problem?
    Reply  14 Jun 2010, hai qu, Forum, crash on start run 
> - does your feTPCPacketReceiver die during the start-of-run? Maybe you get a segfault 
> in the begin-of-run routine. Can you STOP a run?

When I start a run, it brings up the mtransition process, and I guess the server tries to talk to the
client; that fails and the frontend application gets killed because it does not respond.

> When you try to stop your fe from odbedit with
> # odbedit -c "shutdown feTPCPacketReceiver"

It gives:
[midas.c:8423:rpc_client_connect,ERROR] timeout on receive remote computer info: 
[midas.c:4880:cm_shutdown,ERROR] Cannot connect to client
"feTPCPacketReceiver" on host 'tpcdaq0', port 35865
[midas.c:4883:cm_shutdown,ERROR] Killing and Deleting client
'feTPCPacketReceiver' pid 27250
Client feTPCPacketReceiver not active


What does this error mean?
11:03:42 [Logger,ERROR] [system.c:563:ss_shm_close,ERROR]
shmctl(shmid=7274511,IPC_RMID) failed, errno 1 (Operation not permitted)


thanks
hai

P.S. The same code runs fine on my laptop with Ubuntu 9, so it is also possible that something
in my configuration is not right and causes the problem.
Entry  08 Sep 2016, Amy Roberts, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail 
I've recently run into issues when using JSON.parse on ODB keys containing 
8-bit data.

For JSON.parse to successfully parse a string, (A) the string must be valid 
UTF-8, (B) several whitespace characters, control characters, and the 
characters " and \ must be escaped, and (C) you've got to follow the key-
value rules laid out in http://www.json.org/.

The web browser takes care of (A), and I verified that for this key Midas 
handled (C) correctly.  In principle, the function json_write in odb.c 
handles (B) - but json_write does not escape control characters.

To manage this problem, I modified json_write (in odb.c) to replace any 
control character with the more innocuous character 'C'.  My default case 
now looks like:

default:
         {
           // if a char is a control character,
           // print 'C' in its place
           // note that this loses data:
           // a more-correct method would be to print
           // \uXXXX, where XXXX is the character in hex
           if(iscntrl(*s)){
             (*buffer)[(*buffer_end)++] = 'C';
             s++;
           } else {
             (*buffer)[(*buffer_end)++] = *s++;
           }
         }
      
Where the call to iscntrl(*s) requires the addition of the ctype.h header 
file.

I'm guessing a blanket replacement of control characters with 'C' isn't 
something all Midas users would want to do.  Replacing the control character 
with its hex value seems like a good choice - but not without adding bounds 
checking!
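
A possible shape for the \uXXXX variant with bounds checking is sketched below. This is not the json_write code from odb.c: json_escape_string() and its fixed-size output buffer interface are assumptions made for illustration. It escapes " and \, writes bytes below 0x20 as \u00XX, passes everything else through unchanged, and returns -1 instead of writing past the end of the output buffer.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Escape src as the body of a JSON string into dst.
 * dst_size is the capacity of dst in bytes, including the terminating NUL.
 * Returns the number of bytes written (not counting the NUL),
 * or -1 if dst is too small.
 */
int json_escape_string(const char *src, char *dst, size_t dst_size)
{
    const unsigned char *s;
    size_t used = 0;

    for (s = (const unsigned char *)src; *s != '\0'; s++) {
        char esc[7]; /* longest escape is "\uXXXX" plus NUL */
        size_t len, i;

        if (*s == '"' || *s == '\\') {
            esc[0] = '\\';
            esc[1] = (char)*s;
            len = 2;
        } else if (*s < 0x20) {
            /* control character: no data is lost, unlike the 'C' replacement */
            snprintf(esc, sizeof esc, "\\u%04x", (unsigned)*s);
            len = 6;
        } else {
            esc[0] = (char)*s; /* includes bytes >= 0x80 of multi-byte UTF-8 */
            len = 1;
        }

        /* bounds check: keep one byte free for the terminating NUL */
        if (used + len + 1 > dst_size)
            return -1;
        for (i = 0; i < len; i++)
            dst[used++] = esc[i];
    }

    if (used + 1 > dst_size)
        return -1;
    dst[used] = '\0';
    return (int)used;
}
```

A tab comes out as \u0009 here rather than \t. Both are valid JSON, and the single code path keeps the sketch short.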

An alternative to changing odb.c could be to add a regex to Midas response 
text which removes all control characters (U+0000 - U+001F): 

var resp_lint = req.response.replace(/[\u{0000}-\u{001F}]/gmu, '');
var json_obj = JSON.parse(resp_lint);

Unfortunately, the 'u' regex flag doesn't work on the Firefox version 
included in Scientific Linux 6.8.  
    Reply  30 Sep 2016, Konstantin Olchanski, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail 
> I've recently run into issues when using JSON.parse on ODB keys containing 
> 8-bit data.

I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid 
UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
it is impractical to handle or permit invalid UTF-8 strings.

Certainly in the general case, replacing all control characters with something else or escaping them or 
otherwise changing the value of TID_STRING data would wreck *valid* UTF-8 strings, which I would 
assume to be the normal use.

In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - just as 
we do not check that TID_FLOAT and TID_DOUBLE are valid IEEE-754 values, we should not check 
that TID_STRING is valid UTF-8.

But in your specific case, why do you have random control characters in your TID_STRING data? 
Maybe you are using TID_STRING as general storage instead of arrays of TID_CHAR or 
TID_DWORD?

K.O.



    Reply  25 Oct 2016, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail 
> > I've recently run into issues when using JSON.parse on ODB keys containing 
> > 8-bit data.
> 
> I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid 
> UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
> it is impractical to handle or permit invalid UTF-8 strings.
> ....
> But in your specific case, why do you have random control characters in your TID_STRING data? 
> Maybe you are using TID_STRING as general storage instead of arrays of TID_CHAR or 
> TID_DWORD?

I'm a little confused by this report and want to make sure I understand the situation.  Konstantin points
out that the TID_STRING should be valid UTF-8.  But I think that Amy agreed that the string was valid UTF-8.
 My understanding was that Amy's contention was that the valid UTF-8 string didn't get returned as valid JSON.

But I am having trouble reproducing your behaviour, Amy.  I created an ODB string variable with a tab
control character

  sprintf(mystring,"first line \t second line");
  status = db_set_value(hDB, 0,"/test2/mystring", &mystring, size, 1, TID_STRING);

and when I tried to pull the ODB value using jcopy

http://neut18:8081/?cmd=jcopy&odb=/test2/mystring&format=json

I got 

{
"mystring/key" : { "type" : 12, "item_size" : 32, "access_mode" : 7, "last_written" : 1477416322 },
"mystring" : "first line \t second line"
}

which seems to be valid JSON.  

I only tried this with tab.  Are there other control characters that you are having trouble with?  Or maybe
I misunderstand the question?





    Reply  01 Dec 2016, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail odb_modifications.txt
> > I've recently run into issues when using JSON.parse on ODB keys containing 
> > 8-bit data.
> 
> I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid 
> UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
> it is impractical to handle or permit invalid UTF-8 strings.
> 
> Certainly in the general case, replacing all control characters with something else or escaping them or 
> otherwise changing the value of TID_STRING data would wreck *valid* UTF-8 strings, which I would 
> assume to be the normal use.
> 
> In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - just as 
> we do not check that TID_FLOAT and TID_DOUBLE are valid IEEE-754 values, we should not check 
> that TID_STRING is valid UTF-8.

I agree that we should start requiring strings to be UTF-8 encoded Unicode. 

I'd suggest that before worrying about the TID_STRING data, we should start by sanitizing the ODB key names.
I've seen a couple of cases where the ODB key name is a non-UTF-8 string.  It is very awkward to use odbedit
to delete these keys.

I attach a suggested modification to odb.c that rejects calls to db_create_key with non-UTF-8 key names.  It
uses some random function I found on the internet that is supposed to check if a string is valid UTF-8.  I
checked a couple of strings with invalid UTF-8 characters and it correctly identified them.  But I won't
claim to be certain that this is really identifying all UTF-8 vs non-UTF-8 cases.  Maybe others have a
better way of identifying this.
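
For reference, a check of this kind can be written in a few lines of C. The sketch below is not the function in the attachment; is_valid_utf8() is a hypothetical name. It assumes the usual definition of well-formed UTF-8: sequences of one to four bytes, no overlong encodings, no surrogate code points, and nothing above U+10FFFF.

```c
#include <stddef.h>

/* Returns 1 if the NUL-terminated string is well-formed UTF-8, else 0. */
int is_valid_utf8(const char *str)
{
    const unsigned char *s = (const unsigned char *)str;

    while (*s != '\0') {
        unsigned long cp; /* decoded code point */
        int extra;        /* number of continuation bytes expected */
        int i;

        if (*s < 0x80) {  /* plain ASCII */
            s++;
            continue;
        } else if ((*s & 0xE0) == 0xC0) {
            cp = *s & 0x1F;
            extra = 1;
        } else if ((*s & 0xF0) == 0xE0) {
            cp = *s & 0x0F;
            extra = 2;
        } else if ((*s & 0xF8) == 0xF0) {
            cp = *s & 0x07;
            extra = 3;
        } else {
            return 0;     /* stray continuation byte or invalid lead byte */
        }
        s++;

        for (i = 0; i < extra; i++, s++) {
            if ((*s & 0xC0) != 0x80)
                return 0; /* also catches a string that ends too early */
            cp = (cp << 6) | (unsigned long)(*s & 0x3F);
        }

        /* reject overlong forms, surrogates and out-of-range values */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) ||
            cp > 0x10FFFF)
            return 0;
    }
    return 1;
}
```

A single byte with the high bit set, as produced by legacy 8-bit encodings, is rejected, while the same character encoded as a proper multi-byte sequence is accepted.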
    Reply  15 Jan 2017, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail 
> > In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - just as 
> > we do not check that TID_FLOAT and TID_DOUBLE are valid IEEE-754 values, we should not check 
> > that TID_STRING is valid UTF-8.
> ...
> I attach a suggested modification to odb.c that rejects calls to db_create_key with non-UTF-8 key names.  It
> uses some random function I found on the internet that is supposed to check if a string is valid UTF-8.  I
> checked a couple of strings with invalid UTF-8 characters and it correctly identified them.  But I won't
> claim to be certain that this is really identifying all UTF-8 vs non-UTF-8 cases.  Maybe others have a
> better way of identifying this.

At Konstantin's suggestion, I committed the function I found for checking if a string was UTF-8 compatible to
odb.c.  The function is currently not used; I commented out a proposed use in db_create_key.  Experts can decide
if the code was good enough to use.
    Reply  23 Jan 2017, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail 
> At Konstantin's suggestion, I committed the function I found for checking if a string was UTF-8 compatible to
> odb.c.  The function is currently not used; I commented out a proposed use in db_create_key.  Experts can decide
> if the code was good enough to use.

After more discussion, I have enabled the parts of the ODB code that check that key names are UTF-8 compliant. 

This check will show up in (at least) two ways:

1) Attempts to create a new ODB variable will be rejected if the ODB key name is not UTF-8 compliant.  You will see error messages like

[fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_create_key: UTF-8 incompatible
string

2) When a program first connects to the ODB, it runs a check to ensure that the ODB is valid.  This will now include
a check that all key names are UTF-8 compliant. Any non-UTF-8 compliant key name will be replaced by a string
representation of the pointer to the key.  You will see error messages like:

[fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_validate_key: UTF-8
incompatible string
[fesimdaq,ERROR] [odb.c:647:db_validate_key,ERROR] Warning: corrected key "/Equipment/SIMDAQ/Eur€": invalid name
"Eur€" replaced with "0x7f74be63f970"

This behaviour (checking UTF-8 compatibility and automatically fixing ODB names) can be disabled by setting an
environment variable

MIDAS_INVALID_STRING_IS_OK

It doesn't matter what the environment variable is set to; it just needs to be set.  Note also that this variable is
only checked once, when a program starts.
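
The "checked once, value ignored" behaviour can be pictured as follows. This is a sketch only, not the code in odb.c, and utf8_check_disabled() is a hypothetical name; only the name of the environment variable is taken from above.

```c
#include <stdlib.h>

/*
 * Returns 1 if the UTF-8 checks are switched off via the environment.
 * Only the presence of the variable matters, not its value, and the
 * environment is consulted only on the first call.
 */
int utf8_check_disabled(void)
{
    static int cached = -1; /* -1: environment not consulted yet */

    if (cached < 0)
        cached = (getenv("MIDAS_INVALID_STRING_IS_OK") != NULL) ? 1 : 0;
    return cached;
}
```

Because the result is cached, setting or unsetting the variable while a program is running has no effect on that program; it has to be restarted.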
    Reply  30 Jan 2017, Stefan Ritt, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail 
> 
> > At Konstantin's suggestion, I committed the function I found for checking if a string was UTF-8 compatible to
> > odb.c.  The function is currently not used; I commented out a proposed use in db_create_key.  Experts can decide
> > if the code was good enough to use.
> 
> After more discussion, I have enabled the parts of the ODB code that check that key names are UTF-8 compliant. 
> 
> This check will show up in (at least) two ways:
> 
> 1) Attempts to create a new ODB variable will be rejected if the ODB key name is not UTF-8 compliant.  You will see error messages like
> 
> [fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_create_key: UTF-8 incompatible
> string
> 
> 2) When a program first connects to the ODB, it runs a check to ensure that the ODB is valid.  This will now include
> a check that all key names are UTF-8 compliant. Any non-UTF-8 compliant key name will be replaced by a string
> representation of the pointer to the key.  You will see error messages like:
> 
> [fesimdaq,ERROR] [odb.c:572:db_validate_name,ERROR] Invalid name "Eur€" passed to db_validate_key: UTF-8
> incompatible string
> [fesimdaq,ERROR] [odb.c:647:db_validate_key,ERROR] Warning: corrected key "/Equipment/SIMDAQ/Eur€": invalid name
> "Eur€" replaced with "0x7f74be63f970"
> 
> This behaviour (checking UTF-8 compatibility and automatically fixing ODB names) can be disabled by setting an
> environment variable
> 
> MIDAS_INVALID_STRING_IS_OK
> 
> It doesn't matter what the environment variable is set to; it just needs to be set.  Note also that this variable is
> only checked once, when a program starts.



I see you put some switches into the environment ("MIDAS_INVALID_STRING_IS_OK"). Do you think this is a good idea? Most variables are 
sitting in the ODB (/experiment/xxx), except those which cannot be in the ODB because we need them before we open the ODB, like MIDAS_DIR. 
Having them in the ODB has the advantage that everything is in one place, and we see a "list" of things we can change. From an empty 
environment it is not clear that something like "MIDAS_INVALID_STRING_IS_OK" exists, while if it were an ODB key it would be 
obvious. Can I convince you to move this flag into the ODB?
    Reply  01 Feb 2017, Konstantin Olchanski, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail 
> 
> I see you put some switches into the environment ("MIDAS_INVALID_STRING_IS_OK"). Do you think this is a good idea? Most variables are 
> sitting in the ODB (/experiment/xxx), except those which cannot be in the ODB because we need it before we open the ODB, like MIDAS_DIR. 
> Having them in the ODB has the advantage that everything is in one place, and we see a "list" of things we can change. From an empty 
> environment it is not clear that such a thing like "MIDAS_INVALID_STRING_IS_OK" does exist, while if it would be an ODB key it would be 
> obvious. Can I convince you to move this flag into the ODB?
>


Some additional explanation.

Time passed, the world turned, and the current web-compatible standard for text strings is UTF-8 encoded Unicode, see 
https://en.wikipedia.org/wiki/UTF-8
(ObCanadianContent, UTF-8 was invented by the Canadian Rob Pike https://en.wikipedia.org/wiki/Rob_Pike)
(and by some other guy https://en.wikipedia.org/wiki/Ken_Thompson).

It turns out that not every combination of 8-bit characters (char*) is valid UTF-8 Unicode.

In the MIDAS world we run into this when MIDAS ODB strings are exported to Javascript running inside web
browsers ("custom pages", etc). ODB strings (TID_STRING) and ODB key names that are not valid UTF-8
make such web pages malfunction.

One solution to this is to declare that ODB strings (TID_STRING) and ODB key names *must* be valid UTF-8 Unicode.

The present commits implemented this solution. Invalid UTF-8 is rejected by db_create() & co and by the ODB integrity validator.

This means some existing running experiment may suddenly break because somehow they have "old-style" ODB entries
or they mistakenly use TID_STRING to store arbitrary binary data (use array of TID_CHAR instead).

To permit such experiments to use current releases of MIDAS, we include a "defeat" device - to disable UTF-8 checks
until they figure out where non-UTF-8 strings come from and correct the problem.

Why is this defeat device not an ODB entry? Because it is not a normal mode of operation - there is no use-case where
an experiment will continue to use a non-UTF-8 compatible ODB indefinitely, in the long term. For example, as the MIDAS user
interface moves more and more to HTML+Javascript+"AJAX", such experiments will see that non-UTF-8 compatible ODB entries
cause all sorts of problems and will have to convert.


K.O.
    Reply  01 Feb 2017, Stefan Ritt, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail 
> Some additional explanation.
> 
> Time passed, the world turned, and the current web-compatible standard for text strings is UTF-8 encoded Unicode, see 
> https://en.wikipedia.org/wiki/UTF-8
> (ObCanadianContent, UTF-8 was invented by the Canadian Rob Pike https://en.wikipedia.org/wiki/Rob_Pike)
> (and by some other guy https://en.wikipedia.org/wiki/Ken_Thompson).
> 
> It turns out that not every combination of 8-bit characters (char*) is valid UTF-8 Unicode.
> 
> In the MIDAS world we run into this when MIDAS ODB strings are exported to Javascript running inside web
> browsers ("custom pages", etc). ODB strings (TID_STRING) and ODB key names that are not valid UTF-8
> make such web pages malfunction.
> 
> One solution to this is to declare that ODB strings (TID_STRING) and ODB key names *must* be valid UTF-8 Unicode.
> 
> The present commits implemented this solution. Invalid UTF-8 is rejected by db_create() & co and by the ODB integrity validator.
> 
> This means that some existing running experiments may suddenly break, either because they somehow have "old-style" ODB entries
> or because they mistakenly use TID_STRING to store arbitrary binary data (use an array of TID_CHAR instead).
> 
> To permit such experiments to use current releases of MIDAS, we include a "defeat" device - to disable UTF-8 checks
> until they figure out where non-UTF-8 strings come from and correct the problem.
> 
> Why is this defeat device not an ODB entry? Because it is not a normal mode of operation - there is no use case where
> an experiment will continue to use a non-UTF-8-compatible ODB indefinitely. For example, as the MIDAS user
> interface moves more and more to HTML+Javascript+"AJAX", such experiments will see that non-UTF-8-compatible ODB entries
> cause all sorts of problems and will have to convert.
> 
> 
> K.O.

Ok, I agree.

Stefan
Entry  22 Feb 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger 
Dear all,

we are experiencing a connection problem with the MySQL server that we use to log information. Is there an 
option to retry the I/O to MySQL multiple times?

The error we are experiencing is the following (hiding the IP address):

[Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)

Then the logger stops and must be restarted. This happens only during the BOR or the EOR.

Best,
Stefano.
    Reply  22 Feb 2023, Stefan Ritt, Info, connection to a MySQL server: retry procedure in the Logger 
> Dear all,
> 
> we are experiencing a connection problem with the MySQL server that we use to log information. Is there an 
> option to retry the I/O to MySQL multiple times?
> 
> The error we are experiencing is the following (hiding the IP address):
> 
> [Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
> connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
> 
> Then the logger stops and must be restarted. This happens only during the BOR or the EOR.

What would you propose? If the connection does not work, most likely the server is down or busy. If we retry, 
the connection still might not work. If we retry many times, people will complain that the run start or stop 
takes very long. If we then just continue (without stopping the logger), the MySQL database will miss important 
information and the runs probably cannot be analyzed later. So I believe it's better to really stop the logger 
so that people become aware that there is a problem and fix the source, rather than curing the symptoms.

In the MEG experiment at PSI we run the logger with a MySQL database and we never see any connection issue, 
except when the MySQL server is under maintenance (once a year), but usually we don't take data then. Since we 
use the same logger code, it cannot be a problem there. So I would try to fix the problem on the MySQL side.

Best,
Stefan
    Reply  07 Mar 2023, Stefano Piacentini, Info, connection to a MySQL server: retry procedure in the Logger 
> > Dear all,
> > 
> > we are experiencing a connection problem with the MySQL server that we use to log information. Is there an 
> > option to retry the I/O to MySQL multiple times?
> > 
> > The error we are experiencing is the following (hiding the IP address):
> > 
> > [Logger,ERROR] [mlogger.cxx:2455:write_runlog_sql,ERROR] Failed to connect to database: Error: Can't 
> > connect to MySQL server on 'xxx.xxx.xxx.xxx:6033' (110)
> > 
> > Then the logger stops and must be restarted. This happens only during the BOR or the EOR.
> 
> What would you propose? If the connection does not work, most likely the server is down or busy. If we retry, 
> the connection still might not work. If we retry many times, people will complain that the run start or stop 
> takes very long. If we then just continue (without stopping the logger), the MySQL database will miss important 
> information and the runs probably cannot be analyzed later. So I believe it's better to really stop the logger 
> so that people become aware that there is a problem and fix the source, rather than curing the symptoms.
> 
> In the MEG experiment at PSI we run the logger with a MySQL database and we never see any connection issue, 
> except when the MySQL server is under maintenance (once a year), but usually we don't take data then. Since we 
> use the same logger code, it cannot be a problem there. So I would try to fix the problem on the MySQL side.
> 
> Best,
> Stefan


Dear Stefan,

a possible solution could be to define the number of times to retry as a parameter that is 0 by default, as well as a wait time between two subsequent tries. This 
would leave the decision on how to handle a possible failed connection to the user. In our case, for example, we would prefer to not stop the acquisition in case 
of a failed connection to the external SQL. In addition, we have other software that, with a retry procedure, doesn’t fail: with 1 re-try and a sleep time of 0.5 s 
we already recover 100% of the faults.
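
A minimal sketch of this idea, for illustration only: the names RetryPolicy and connect_with_retry() are hypothetical and do not exist in mlogger, and the actual connection attempt is passed in as a callback, so no MySQL API is assumed here.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical settings (illustrative names, not existing logger options).
struct RetryPolicy {
   int max_retries = 0;                  // 0 by default: fail on the first error
   std::chrono::milliseconds wait{500};  // pause between two subsequent tries
};

// Calls connect() once and, on failure, up to policy.max_retries more times,
// sleeping policy.wait between attempts. Returns true as soon as one attempt
// succeeds, false if all attempts fail.
bool connect_with_retry(const std::function<bool()>& connect,
                        const RetryPolicy& policy)
{
   for (int attempt = 0; ; attempt++) {
      if (connect())
         return true;
      if (attempt >= policy.max_retries)
         return false;  // give up; the caller decides whether to stop
      std::this_thread::sleep_for(policy.wait);
   }
}
```

With max_retries left at 0 the behaviour is the same as without any retry logic. In this sketch the time spent sleeping at run start or stop is at most max_retries times the wait time (0.5 s for one retry with a 0.5 s wait), on top of however long the connection attempts themselves take.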

Anyway, we implemented a local database, which is a mirror of the external one, and the problems disappeared.

Thanks,
Stefano.