Midas DAQ System
08 Sep 2016, Amy Roberts, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
30 Sep 2016, Konstantin Olchanski, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
25 Oct 2016, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
01 Dec 2016, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
15 Jan 2017, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
23 Jan 2017, Thomas Lindner, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
30 Jan 2017, Stefan Ritt, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
01 Feb 2017, Konstantin Olchanski, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail
01 Feb 2017, Stefan Ritt, Bug Report, control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail

Message ID: 1223
Entry time: 01 Dec 2016
In reply to: 1204
Reply to this: 1227
Author: Thomas Lindner
Topic: Bug Report
Subject: control characters not sanitized by json_write - can cause JSON.parse of mhttpd result to fail

> > I've recently run into issues when using JSON.parse on ODB keys containing
> > 8-bit data.
>
> I am tempted to take a hard line and say that in general MIDAS TID_STRING data should be valid
> UTF-8 encoded Unicode. In the modern mixed javascript/json/whatever environment I think
> it is impractical to handle or permit invalid UTF-8 strings.
>
> Certainly in the general case, replacing all control characters with something else or escaping them or
> otherwise changing the value of TID_STRING data would wreck *valid* UTF-8 strings, which I would
> assume to be the normal use.
>
> In other words, non-UTF-8 strings are following non-IEEE-754 floating point values into oblivion - as
> we do not check that TID_FLOAT and TID_DOUBLE are valid IEEE-754 values, we should not check
> that TID_STRING is valid UTF-8.

I agree that we should start requiring strings to be UTF-8 encoded Unicode.

Before worrying about the TID_STRING data, though, I'd suggest that we start by sanitizing the ODB key names.
I've seen a couple of cases where an ODB key name is a non-UTF-8 string, and it is very awkward to delete
such keys with odbedit.

I attach a suggested modification to odb.c that rejects calls to db_create_key with non-UTF-8 key names. It
uses a function I found on the internet that is supposed to check whether a string is valid UTF-8. I
tested it on a couple of strings containing invalid UTF-8 sequences and it correctly identified them, but I
can't claim that it distinguishes valid from invalid UTF-8 in every case. Maybe others have a
better way of checking this.
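
For readers without the attachment, here is a rough sketch of what such a check could look like. This is not the function from the attached patch: the name is_valid_utf8 is hypothetical, and the rules it applies (no overlong forms, no surrogates, nothing above U+10FFFF) are my reading of RFC 3629, so treat it as a starting point rather than a vetted implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not the code from the attached odb.c patch).
 * Returns true if the NUL-terminated string s is well-formed UTF-8:
 * correct lead/continuation bytes, no overlong encodings,
 * no UTF-16 surrogates, and no code point above U+10FFFF. */
bool is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p) {
        unsigned char c = *p;
        size_t extra;       /* continuation bytes expected after the lead byte */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point allowed for this length */

        if (c < 0x80) {     /* plain ASCII */
            p++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return false;   /* stray continuation byte or invalid lead byte */
        }

        p++;
        for (size_t i = 0; i < extra; i++, p++) {
            /* A terminating NUL fails this test too, so a truncated
             * sequence is rejected without reading past the string. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }

        if (cp < min)                        /* overlong encoding */
            return false;
        if (cp >= 0xD800 && cp <= 0xDFFF)    /* UTF-16 surrogate */
            return false;
        if (cp > 0x10FFFF)                   /* beyond the Unicode range */
            return false;
    }
    return true;
}
```

A key-creation routine could call this on the requested key name and return an error status when it fails, which is the behaviour the suggested odb.c change is aiming for. Note that this only checks the encoding. ASCII control characters are valid UTF-8 and would still pass.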