| > I've recently run into issues when using JSON.parse on ODB keys containing 
> 8-bit data.
I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid 
UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
it is impractical to handle or permit invalid UTF-8 strings.
Certainly in the general case, replacing all control characters with something else or escaping them or 
otherwise changing the value if TID_STRING data would wreck *valid* UTF-8 strings, which I would 
assume to be the normal use.
In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - as 
we do not check the TID_FLOAT and TID_DOUBLE is valid IEEE-754 values, we should not check 
that TID_STRING is valid UTF-8.
But in your specific case, why do you have random control characters in your TID_STRING data? 
Maybe you are using TID_STRING as general storage instead of arrays of TID_CHAR or 
TID_DWORD?
K.O.
> 
> For JSON.parse to successfully parse a string, (A) the string must be valid 
> UTF-8, (B) several whitespace characters, control characters, and the 
> characters " and \ must be escaped, and (C) you've got to follow the key-
> value rules laid out in http://www.json.org/.
> 
> The web browser takes care of (A), and I verified that for this key Midas 
> handled (C) correctly.  In principle, the function json_write in odb.c 
> handles (B) - but json_write does not escape control characters.
> 
> To manage this problem, I modified json_write (in odb.c) to replace any 
> control character with the more-inocuous character, 'C'.  My default case 
> now looks like:
> 
> default:
>          {
>            // if a char is a control character,
>            // print 'C' in its place
>            // note that this loses data:
>            // a more-correct method would be to print
>            // \uXXXX, where XXXX is the character in hex
>            if(iscntrl(*s)){
>              (*buffer)[(*buffer_end)++] = 'C';
>              s++;
>            } else {
>              (*buffer)[(*buffer_end)++] = *s++;
>            }
>          }
>       
> Where the call to iscntrl(*s) requires the addition of the ctype.h header 
> file.
> 
> I'm guessing a blanket replacement of control characters with 'C' isn't 
> something all Midas users would want to do.  Replacing the control character 
> with its hex value seems like a good choice - but not without adding bounds 
> checking!
> 
> An alternative to changing odb.c could be to add a regex to Midas response 
> text which removes all control characters (U+0000 - U+001F): 
> 
> var resp_lint = req.response.replace(/[\u{0000}-\u{001F}]/gmu, '');
> var json_obj = JSON.parse(resp_lint);
> 
> Unfortunately, the 'u' regex flax doesn't work on the Firefox version 
> included in Scientific Linux 6.8.   |