Back Midas Rome Roody Rootana
  Midas DAQ System, Page 79 of 142  Not logged in ELOG logo
ID Datedown Author Topic Subject
  1287   02 May 2017 Konstantin OlchanskiForumProblem with logger at run start
> Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:9106:rpc_client_connect,ERROR] cannot connect to host "scar
> lett", port 44858: connect() returned -1, errno 113 (No route to host)

Forgot to reply to this: if you read the error messages, you will see the actual problem is "no route to host". Next step
is to ping the same hostname or try "telnet hostname 22" (cut-and-paste the hostname from the error message
to avoid the common pitfall of not seeing a typo, i.e. ping host00 works while midas connect to hostOO does not (zero vs capital-o)).
In your case you had the wrong hostname ("foo" and "foo.localdomain" resolve to different IP addresses, one works the other
one does not). You can also try to use the IP address instead of hostname, this will avoid hostname resolution problems
(inconsistency between /etc/hosts and hostnames in DNS is very easy to have when using self-made private networks).

K.O.
  1286   02 May 2017 Konstantin OlchanskiInfomhttpd inline-editor change
I changed the mhttpd odb inline editor to use the json-rpc interface. Good things:

- browser no longer complains about obsolete synchronous ajax calls
- can edit strings of arbitrary length (was limited to the max URL length)
- funny characters " (quote), > and < (angle brackets) are correctly escaped.
- after editing, the actual value from odb is loaded and displayed (confirming that the edit "took").

K.O.
  1285   02 May 2017 Konstantin OlchanskiBug Reportmhttpd inline-editor and web MAX_STRING_LENGTH, stop form odbedit broken
> > I shall check on the use of MAX_STRING_LENGTH at least in odb itself...

Also tested the web interface:

In the odb editor, overlong strings show truncated to MAX_STRING_LENGTH (via db_sprintf()),
but the odb inline-editor can handle overlong strings correctly.

The inline-editor implementation that uses ODBSet() had a string length limitation to maximum
URL length (ODBSet uses AJAX jset with call parameters encoded into the URL).

I now converted the inline-editor to use the json-rpc api (uses http post) and I confirm that this can handle
arbitrary long strings.

K.O.
  1284   02 May 2017 Konstantin OlchanskiInfoadded db_resize_string()
> Since we have been regularly running into problems with db_get_xxx(TID_STRING) and string buffers of mismatched size,
> I now implemented db_get_value_string(hdb, hkey, key_name, index, &string, create).

I run into problems with string arrays - non-array strings have unlimited length, but string arrays have fixed string length, usually set at creation time.

This causes a problem with growing arrays using db_get_value_string(), when converting a non-array variable to an array, the wrong
string length gets used, and one gets an array with useless string length. There is no way to specify the correct array string length
without adding more parameters to db_get_value_string() and confusing and complicating it for the typical case where it is used
against simple (non-array) odb entries.

To clarify the situation, db_get_value_string() was changed to reject attempts to resize an array and
calls of db_get_value_string(index>0 and create==TRUE) now return an error.

To create and resize string arrays, I added a new function - db_resize_array(hdb, hkey, key_name, num_values, max_string_size).

Here,
num_values is the new array size, making it possible to grow or shrink an array
max_string_size is the new string size, making it possible to change the array string length after the array was created (there was no midas function to do this before now).

I added a json-rpc call for db_resize_string().

But it still needs to be added to odbedit and mhttpd.

K.O.
  1283   26 Apr 2017 Francesco RengaForumProblem with logger at run start
Dear Stefan,
           thank you very much for your reply. We could finally fix the problem by replacing "scarlett" with "scarlett.localdomain" in our
hostname configuration file /etc/hostname (under debian).

Best Regards,
        Francesco

> Dear Francesco,
> 
> Your error (No route to host) typically means that you have a network problem outside of MIDAS. Your computer has to "find itself" and 
> this is probably broken. Try to do a "ping scarlett" or "nslookup scarlett" and you will see that the DNS server can't be reached or is 
> wrongly configured. Sometimes it helps to put scarlett explicitly into /etc/hosts
> 
> Stefan
> 
> 
> > Dear experts,
> >     we have a problem when trying to run a MIDAS DAQ which worked in the past on the same PC (but on a different
> > network). We get the following error messages when starting a new run:
> > 
> > Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:9106:rpc_client_connect,ERROR] cannot connect to host "scar
> > lett", port 44858: connect() returned -1, errno 113 (No route to host)
> > Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:3539:cm_transition_call,ERROR] cannot connect to client "Lo
> > gger" on host scarlett, port 44858, status 503
> > 
> > (scarlett is indeed the hostname of the PC). The error occurs even if the PC is disconnected from the network.
> > 
> > Any suggestion?
> > 
> > Best Regards,
> >         Francesco
  1282   26 Apr 2017 Stefan RittInfoadded db_get_value_string()
Just some thought for discussion:

Rather than "spicing up" the MIDAS library here and there with C++ objects such as std::string, wouldn't it make more sense to "cleanly" wrap an ODB value in a C++ class? We could use then 
both APIs in parallel, and encourage the C++ API for new developments. We could then write things like:

   ODBKEY<std::string> name("/Experiment/Name"); // constructor calls automatically db_get_value
   name = "New Name"; // overloading the "=" operator, will call db_set_value()

or even

   ODBKEY<std::vector, std::string> nameArray("...");
   for (auto &s : nameArray)
      std::cout << s << std::endl; // print all elements of string array

so we treat ODB arrays as vectors, which fixes array boundary violations nicely.

If the key does not exist, we could properly throw exceptions and forget about tons of nested return parameters for error conditions.

Many nice things could be done, common errors could be prevented, and we can do a "smooth" migration: We don't have to change the whole library completely, just where we feel it's currently 
needed. So over time the code would be "objectified". Would be nice if we could rely on C++11 (like the "auto" feature above). Not sure about VxWorks, but every other OS should be fine.

Stefan

> Since we have been regularly running into problems with db_get_xxx(TID_STRING) and string buffers of mismatched size,
> I now implemented db_get_value_string(hdb, hkey, key_name, index, &string, create).
> 
> It works the same as db_get_value(TID_STRING), except that the string value is returned into an std::string object,
> memory allocation is handled by std::string and there is no string length limit (other than std::string limits).
> 
> Accessing string arrays is done explicitly via an "index" parameter, if index is bigger than odb array size DB_OUT_OF_RANGE is returned
> without logging an error message (e.g. db_get_data_index() will log an error). This makes is safe to iterate over array entries with a simple
> loop of index from 0 and up until db_get returns an error.
> 
> As before, if the odb entry does not exist, it will be created (if create==true) and initialized with the value of the string parameter (zero-terminated in odb).
> 
> There is also newly added db_set_value_string() and cm_get_path_string(). if you want more of these, please ask, or send patches.
> 
> K.O.
  1281   26 Apr 2017 Stefan RittForumProblem with logger at run start
Dear Francesco,

Your error (No route to host) typically means that you have a network problem outside of MIDAS. Your computer has to "find itself" and 
this is probably broken. Try to do a "ping scarlett" or "nslookup scarlett" and you will see that the DNS server can't be reached or is 
wrongly configured. Sometimes it helps to put scarlett explicitly into /etc/hosts

Stefan


> Dear experts,
>     we have a problem when trying to run a MIDAS DAQ which worked in the past on the same PC (but on a different
> network). We get the following error messages when starting a new run:
> 
> Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:9106:rpc_client_connect,ERROR] cannot connect to host "scar
> lett", port 44858: connect() returned -1, errno 113 (No route to host)
> Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:3539:cm_transition_call,ERROR] cannot connect to client "Lo
> gger" on host scarlett, port 44858, status 503
> 
> (scarlett is indeed the hostname of the PC). The error occurs even if the PC is disconnected from the network.
> 
> Any suggestion?
> 
> Best Regards,
>         Francesco
  1280   26 Apr 2017 Konstantin OlchanskiInfoadded db_get_value_string()
Since we have been regularly running into problems with db_get_xxx(TID_STRING) and string buffers of mismatched size,
I now implemented db_get_value_string(hdb, hkey, key_name, index, &string, create).

It works the same as db_get_value(TID_STRING), except that the string value is returned into an std::string object,
memory allocation is handled by std::string and there is no string length limit (other than std::string limits).

Accessing string arrays is done explicitly via an "index" parameter, if index is bigger than odb array size DB_OUT_OF_RANGE is returned
without logging an error message (e.g. db_get_data_index() will log an error). This makes is safe to iterate over array entries with a simple
loop of index from 0 and up until db_get returns an error.

As before, if the odb entry does not exist, it will be created (if create==true) and initialized with the value of the string parameter (zero-terminated in odb).

There is also newly added db_set_value_string() and cm_get_path_string(). if you want more of these, please ask, or send patches.

K.O.
  1279   26 Apr 2017 Francesco RengaForumProblem with logger at run start
Dear experts,
    we have a problem when trying to run a MIDAS DAQ which worked in the past on the same PC (but on a different
network). We get the following error messages when starting a new run:

Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:9106:rpc_client_connect,ERROR] cannot connect to host "scar
lett", port 44858: connect() returned -1, errno 113 (No route to host)
Wed Apr 26 23:03:12 2017 [mhttpd,ERROR] [midas.c:3539:cm_transition_call,ERROR] cannot connect to client "Lo
gger" on host scarlett, port 44858, status 503

(scarlett is indeed the hostname of the PC). The error occurs even if the PC is disconnected from the network.

Any suggestion?

Best Regards,
        Francesco
  1278   24 Apr 2017 Stefan RittBug Reportstop form odbedit broken
> CSS File = STRING : [1024] mhttpd.css
> Sqlite dir = STRING : [1024]
> History dir = STRING : [1024]
> Sound = STRING : [1000] alarm.mp3

After a quick discussion with Konstantin, I changed these strings to a length of 256 chars 
(MAX_STRING_LENGTH). Actually all changes I had to made was on code introduced by KO, so I hope I 
did everything correctly. He should carefully check my changes (actually I would have preferred if he 
could change his code himself...).

I agree with KO that the preferred format for saving the ODB should be JSON, but there might be 
experiments with have some old ODB dumps in other formats, so we should not remove the possibility to 
read those formats back.

Stefan
  1277   22 Apr 2017 Konstantin OlchanskiBug ReportMAX_STRING_LENGTH, stop form odbedit broken
> ODB name lengths (the name of a key) are limited to 256 characters, the length of strings in the ODB should NOT be limited.

Right, I was not ever aware of such limitation until I just now looked at the .odb and .xml writing code. Definitely string length
is truncated to MAX_STRING_LENGTH on writing, chokes or truncates on reading.

The new json reader/writer handles overlength strings correctly. I would say we should deprecate the old formats and go forward
with json. Most current software can work with json data much easier than xml or custom .odb.

> I see now that db_paste & co. is hopelessly broken. To fix it, everything should be changed to std::string which is in my opinion 
> the only 'clean' solution. That would also remove the cumbersome strlcpy and strlcat.

Yes, that's the code for reading .odb format.

>
> But looking at odb.c, replacing everything with std::string would probably take a brave programmer a couple of weeks. Not sure if we should dive into that adventure right now.
>

I agree. Too much of an adventure.

Simpler solution could be add a db_get_data(), db_get_value() that allocates a data buffer of correct size (user has to remember to free it).

> a) The strings "CSS File", "Sqlite dir" etc. reported below get reduced to 256 characters (MAX_STRING_LENGTH).

We should fix the inconsistency, my vote is it should be either MAX_STRING_LENGTH or PATH_MAX (from limits.h).

K.O.
  1276   22 Apr 2017 Konstantin OlchanskiBug ReportMAX_STRING_LENGTH, stop form odbedit broken
> > Fixed a small buglet, now saving and reloading odb in the old ".odb" format will silently truncate all overlong strings to 256 bytes. (I think it always did that).
> 
> Not sure that we want that. There might be cases where people want to store long strings. I would remove the truncation completely when saving .odb or .xml files, and fix the load routines to 
> deal with overlong strings.
> 

Since I just looked at the code for reading/writing .odb format, I see that it uses fixed size buffer for reading lines from a file,
currently 2*MAX_STRING_LENGTH). I am not in the mood to rewrite and retest all that code. Never looked at the xml reader,
probably has same problem (xml writer truncates long strings via truncation in db_sprintf()).

Since we already have the json odb reader/writer that handles unlimited string length correctly (also handles unicode and
unusual odb names), perhaps we should make json as the default and be done with it.

K.O.
  1275   19 Apr 2017 Stefan RittBug ReportMAX_STRING_LENGTH, stop form odbedit broken
> Fixed a small buglet, now saving and reloading odb in the old ".odb" format will silently truncate all overlong strings to 256 bytes. (I think it always did that).

Not sure that we want that. There might be cases where people want to store long strings. I would remove the truncation completely when saving .odb or .xml files, and fix the load routines to 
deal with overlong strings.

Stefan
  1274   19 Apr 2017 Stefan RittBug ReportMAX_STRING_LENGTH, stop form odbedit broken
ODB name lengths (the name of a key) are limited to 256 characters, the length of strings in the ODB should NOT be limited. At some point we wanted to have complete web pages inside the ODB, 
which for sure are longer than 256 characters. While this was the idea, I see now that db_paste & co. is hopelessly broken. To fix it, everything should be changed to std::string which is in my opinion 
the only 'clean' solution. That would also remove the cumbersome strlcpy and strlcat.

But looking at odb.c, replacing everything with std::string would probably take a brave programmer a couple of weeks. Not sure if we should dive into that adventure right now. The quick fix would be:

a) The strings "CSS File", "Sqlite dir" etc. reported below get reduced to 256 characters (MAX_STRING_LENGTH). The value of 256 characters came from the file system limitation in linux (some many 
years ago), where a full path of a file could not exceed 256 characters. Not sure if this limit is still valid today, but having all file names in the ODB limited to 256 characters is maybe not a bad idea 
anyhow (who wants to type in file names with more than 256 characters ???).

b) Change the max string length in db_paste to 1024 to cover the few exceptions above.


If we go with a), KO has to change his ODB file names, in case of b) I can do the change.

So what is your opinion?

Best regards,
Stefan

> > 
> > I shall check on the use of MAX_STRING_LENGTH at least in odb itself...
> > 
> 
> Ok, I looked at the use of MAX_STRING_LENGTH in ODB (odb.c):
> 
> a) it is not used in any critical places for the database itself, so it is not a limit on maximum length of TID_STRING data. good.
> b) it is used in the code for saving/loading odb from .odb files (old format), not sure how it works against overlong strings, but probably 
> truncates/corrupts/crashes.
> c) it is used in the code for saving odb to odb.xml files. Overlong strings are truncated (I added a message about it).
> d) code for loading/saving to json files handles overlong strings okey.
> e) odbedit "ls" truncates overlong strings, mhttpd has some oddities against overlong strings.
> f) db_sprintf() truncates string text to MAX_STRING_LENGTH to avoid output buffer overflow (should use db_snprintf() instead).
> 
> Conclusion, overlong strings should be okey, but do not use the old .odb and .xml save files. (mlogger saves odb to output .mid file in xml 
> format, we should switch it to use json format).
> 
> > > CSS File = STRING : [1024] mhttpd.css
> > > Sqlite dir = STRING : [1024]
> > > History dir = STRING : [1024]
> > > Sound = STRING : [1000] alarm.mp3
> > > are exceeding the MAX_STRING_LENGTH 256 (defined in msystem.h)
> 
> So these should not cause any corruption or problem unless actual content length exceeds 255 bytes,
> even then they are okey if odb is only saved and loaded into json files.
> 
> > > 1) I get the error message that some strings are too long (exceeding
> > > MAX_STRING_LENGTH). Unfortunately the underlying routine doesn't tell which ODB
> > > variables this is.
> 
> this is in db_check_record(), where it compares odb content with user-supplied data descriptions (there is no system-supplied
> data descriptions with strings longer than MAX_STRING_LENGTH).
> 
> so I think what happened is you created a data structure with overlong strings, passed it to db_paste() or something,
> db_check_record() complained about it, and db_paste() corrupted memory.
> 
> > > 
> > > 2) After this reload, essentially nothing is working anymore. Any client I tried to start just crashed.
> > > 
> 
> Somebody corrupted some shared memory, most likely it was db_paste() corrupted odb shared memory.
> 
> K.O.
  1273   18 Apr 2017 Andreas SuterBug Reportrun start/stop oddity
I stumbled over an oddity which I would like to understand better. Here the
boundaries:
- Enable non-localhost RPC -> y
- Disable RPC hosts check  -> y

1) I am starting a run from ODBedit (start now -v):

07:13:11.272 2017/04/19 [ODBEdit,INFO] Run #26 started

07:13:25.516 2017/04/19 [Logger,LOG] File '/data/max/dlog/lem17_0026.root'
CRC32C checksum: 0x05ca4e7e, 1523383 bytes

On this little test experiment there is not much running, but it already shows
the effect I wanted to understand.

2) I am stopping the run from ODBedit (stop -v):

07:13:25.519 2017/04/19 [ODBEdit,INFO] Run #26 stopped

So, everything looks perfectly fine up to this point.

3) Now the 'strange' thing happens. To any point in time after this, I will stop
ODBEdit which results in the following messages:

07:13:32.335 2017/04/19 [ODBEdit,INFO] Program ODBEdit on host pc7962 stopped

07:13:32.335 2017/04/19 [Logger,ERROR] [midas.c:14079:rpc_server_receive,ERROR]
rpc check ok, abort canceled

This I do NOT understand! It looks as if the Logger (or any other client which
gets the run state transition) thinks that some Client (here ODBEdit) has a
broken connection. At least this is how I understand the comment in midas.c /
rpc_server_receive(). Is something broken in the de-registration from the RPC
server? By the way, all clients where running on the localhost, i.e. no remote
connection used here.

All this only happens if a run transition took place.

Unfortunately I do not understand the system well enough to suggest any fix to
this :-( and hence would appreciate any help. 
  1272   15 Apr 2017 Konstantin OlchanskiBug ReportMAX_STRING_LENGTH, stop form odbedit broken
> > 
> > I shall check on the use of MAX_STRING_LENGTH at least in odb itself...
> > 
> 
> Ok, I looked at the use of MAX_STRING_LENGTH in ODB (odb.c):
> 

Fixed a small buglet, now saving and reloading odb in the old ".odb" format will silently truncate all overlong strings to 256 bytes. (I think it always did that).

K.O.
  1271   15 Apr 2017 Konstantin OlchanskiBug ReportMAX_STRING_LENGTH, stop form odbedit broken
> 
> I shall check on the use of MAX_STRING_LENGTH at least in odb itself...
> 

Ok, I looked at the use of MAX_STRING_LENGTH in ODB (odb.c):

a) it is not used in any critical places for the database itself, so it is not a limit on maximum length of TID_STRING data. good.
b) it is used in the code for saving/loading odb from .odb files (old format), not sure how it works against overlong strings, but probably 
truncates/corrupts/crashes.
c) it is used in the code for saving odb to odb.xml files. Overlong strings are truncated (I added a message about it).
d) code for loading/saving to json files handles overlong strings okey.
e) odbedit "ls" truncates overlong strings, mhttpd has some oddities against overlong strings.
f) db_sprintf() truncates string text to MAX_STRING_LENGTH to avoid output buffer overflow (should use db_snprintf() instead).

Conclusion, overlong strings should be okey, but do not use the old .odb and .xml save files. (mlogger saves odb to output .mid file in xml 
format, we should switch it to use json format).

> > CSS File = STRING : [1024] mhttpd.css
> > Sqlite dir = STRING : [1024]
> > History dir = STRING : [1024]
> > Sound = STRING : [1000] alarm.mp3
> > are exceeding the MAX_STRING_LENGTH 256 (defined in msystem.h)

So these should not cause any corruption or problem unless actual content length exceeds 255 bytes,
even then they are okey if odb is only saved and loaded into json files.

> > 1) I get the error message that some strings are too long (exceeding
> > MAX_STRING_LENGTH). Unfortunately the underlying routine doesn't tell which ODB
> > variables this is.

this is in db_check_record(), where it compares odb content with user-supplied data descriptions (there is no system-supplied
data descriptions with strings longer than MAX_STRING_LENGTH).

so I think what happened is you created a data structure with overlong strings, passed it to db_paste() or something,
db_check_record() complained about it, and db_paste() corrupted memory.

> > 
> > 2) After this reload, essentially nothing is working anymore. Any client I tried to start just crashed.
> > 

Somebody corrupted some shared memory, most likely it was db_paste() corrupted odb shared memory.

K.O.
  1270   15 Apr 2017 Konstantin OlchanskiBug Reportstop form odbedit broken
> when I try to stop a run from odbedit I get a core dump.
> [ODBEdit1,INFO] Run #31 stopped odbedit: src/system.c:1223: ss_shm_flush:
> Assertion `size == mmap_size[handle]' failed. Aborted (core dumped)
> 

I am puzzled. The crash is at the very end of everything (save odb shared memory to odb.shm),
does the run actually stop, or the crash is before the run is fully stopped? (I guess if you want
to run more odbedit commands after stopping the run, so you care about not crashing).

K.O.
  1269   15 Apr 2017 Konstantin OlchanskiBug Reportbadly managed case in history_schema.cxx: dat file empty
> For an unknown reason, Logger died few days ago while writing the history. The
> file mhf_1489577446_20170315_system.dat was created, but was empty.

I ran into same problem installing new midas in the alpha experiment at cern. It should be fixed now:
https://bitbucket.org/tmidas/midas/commits/788021d9cb39a348a40e36f1b35b1440e06aa744

K.O.

> 
> When trying to restart Logger, I would get a seg fault without any special error
> message.
> 
> I tracked the issue to the "read_file_schema" function in history_schema.cxx
> 
> * L4731, a pointer to HsFileSchema *s is declared.
> * L4747, We enter a while(1) loop.
> * L4749, get char on the filename.
> In our case, the file was empty, so the variable "b" gets NULL and the loop breaks.
> 
> Problem: the memory allocation for "s" is later in the loop, L4768.
> Upon exiting the loop, L4854, we try to access record_size on a NULL pointer ==>
> SegFault. 
> 
> It would be nice to at least have a message before breaking the loop...
  1268   15 Apr 2017 Konstantin OlchanskiBug Reportwhere to report bugs, stop form odbedit broken
>
> What is the preferred channel to report potential bugs (elog / bitbucket issues)? 
>

I prefer that bugs be reported on this forum here. Most bugs affect every midas user, so best to notify the 
whole community.

Bitbucket have a nice bug tracking system, but there is a couple of problems:
a) only a couple of people see the bug reports for midas, minimizing probability of fix.
b) bug reports on bitbucket stay on bitbucket, we do not have backups and archives
of bug reports, if tomorrow bitbucket goes belly-up, our bug database goes poof! with them.
c) I can search the bug report on this forum using "grep" (i am sure there is a "find" button
on the bitbucket web page and it finds what I am looking for right away).

So if you have a bug report that others should know about (i.e. the "+" button on the status page does 
not work), I say use this forum.

If you have a bug that you think is unique to you, not interesting to others (i.e. my midas crashes when I 
do X), file it on bitbucket. If you see no activity on the bitbucket for a week or two, repost it here.

K.O.
ELOG V3.1.4-2e1708b5