Midas DAQ System, Page 144 of 145
ID Date Author Topic Subject
  111   31 Oct 2003 Konstantin Olchanski Do not frob "/runinfo" in mhttpd.c
I found where we tickle the race condition in db_create_record().

1) in mhttpd.c,  every time we show the status page, we call
db_create_record(hDB, 0, "/Runinfo", strcomb(runinfo_str));
2) internally db_create_record() deletes /RunInfo
3) other programs that read "/runinfo/run number" while it is deleted do not
check the db_get_value() error code and happily get a zero run number.

Stefan fixed the race condition, and I have now committed an mhttpd.c change
that only calls db_create_record(hDB, 0, "/Runinfo", strcomb(runinfo_str)); if
/runinfo does not exist. Even this call seems redundant with a similar call in
cm_connect_experiment1(), which runs each time a new client starts up.
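
To illustrate the "create only if missing" idea, here is a minimal sketch. It does not use the MIDAS API: toy_odb, toy_create_record() and toy_ensure_record() are hypothetical stand-ins, and the "race window" is only marked by a comment, since the sketch is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the "/Runinfo" record; NOT the real ODB. */
typedef struct {
    int exists;      /* 1 if the record is present */
    int run_number;  /* value of "run number" */
    int recreations; /* how many times the record was rebuilt */
} toy_odb;

/* Models the unconditional behaviour: delete, then rebuild. */
static void toy_create_record(toy_odb *db)
{
    int saved;
    assert(db != NULL);
    saved = db->exists ? db->run_number : 0;
    db->exists = 0;          /* a concurrent reader here would see no record */
    db->run_number = saved;
    db->exists = 1;
    db->recreations++;
}

/* Guarded variant: leave an existing record alone. */
static void toy_ensure_record(toy_odb *db)
{
    assert(db != NULL);
    if (!db->exists)
        toy_create_record(db);
}
```

With the guard, showing the status page repeatedly never opens the delete/rebuild window as long as the record already exists.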

Files changed:
src/mhttpd.c

K.O.
  107   31 Oct 2003 Konstantin Olchanski more odb "run number" error checking
I added error checking to the places where we read "/runinfo/run number". In
general, I do this:

  status = db_get_value("/runinfo/run number",&run_number);
  assert(status==SUCCESS);
  assert(run_number >= 0); /* or run_number > 0, where appropriate */

Here is the rationale: if we cannot read the run number, something must be
very terribly wrong. I cannot think of any recovery action other than
abort() and make a core dump for our debugging enjoyment.

I considered and rejected adding a "retry" loop: if we allow db_get_value()
to intermittently fail, then every use of it has to be wrapped in a retry
loop, which should then live inside db_get_value() itself, making external
"retry" loops pointless.

I am now pondering proposing a "db_get_value_cannot_possibly_fail()"
function (it would abort(), exit() with an error or commit harakiri if it
can't get the value). The way most db_xxx() functions are used in midas,
maybe they should be made "void" and "infallible", with "STATUS
db_xxx_yes_I_can_fail_and_return_an_error_code()" evil twins. I guess this
is why "they" invented C++ exceptions. Anyway, something to think about.
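
A rough sketch of what such a "cannot fail" wrapper could look like. Everything here is hypothetical: toy_getter, toy_get_or_die() and TOY_SUCCESS are illustration names, and the getter is passed in as a function pointer so the sketch does not depend on the real db_get_value() signature.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define TOY_SUCCESS 1 /* assumed "ok" status for this sketch */

/* Hypothetical getter: fills *value and returns a status code. */
typedef int (*toy_getter)(const char *path, int *value);

/* Returns the value, or aborts so that a core dump is left behind. */
static int toy_get_or_die(toy_getter get, const char *path)
{
    int value = 0;
    int status;

    assert(get != NULL && path != NULL);
    status = get(path, &value);
    if (status != TOY_SUCCESS) {
        fprintf(stderr, "cannot read \"%s\" (status %d), aborting\n",
                path, status);
        abort();
    }
    return value;
}

/* Example getter for the sketch: always succeeds with a fixed value. */
static int toy_fixed_getter(const char *path, int *value)
{
    (void)path;
    *value = 123;
    return TOY_SUCCESS;
}
```

Callers then write "run_number = toy_get_or_die(toy_fixed_getter, path);" and have no status code left to forget about.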

Affected files:
src/lazylogger.c
src/odbedit.c
src/mlogger.c
src/mfe.c
src/odb.c
src/mana.c
src/midas.c
src/mhttpd.c

K.O.
  117   31 Oct 2003 Konstantin Olchanski Disable "tab"s in xemacs
The default C indentation style in xemacs uses "tab" characters, violating
the MIDAS coding convention. To disable this misfeature in xemacs (emacs
too?), put this incantation in your .xemacs/custom.el file:

(custom-set-variables
 '(indent-tabs-mode nil))

K.O.
  114   31 Oct 2003 Konstantin Olchanski mana.c without ROOT and HBOOK
Stefan, why did you prohibit building mana.c without ROOT and HBOOK
support? I think such a configuration is valid and should be allowed.

Also, this prohibition broke the Midas Makefile: it now bombs building
mana.c. The Makefile is set up for building hmana.c with HBOOK support,
rmana.c with ROOT support (if ROOTSYS is set) and mana.c without HBOOK and
ROOT support (currently bombs on #error in mana.c).

K.O.
  118   30 Oct 2003 Stefan Ritt Fixed several potential problems for ODB corruption
I just realized that db_set_value, db_set_data, db_set_num_values and 
db_merge_data do not check for num_values == 0. With such a parameter the 
ODB can become corrupted, since zero length ODB entries are not allowed. I 
fixed the according places in odb.c and committed the changes. Everyone 
with ODB corruption problems should update that code.
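
For readers who want to see the shape of such a check, here is a minimal sketch. It is not the code committed to odb.c: toy_check_num_values() and the TOY_ status codes are hypothetical names, and rejecting a NULL data pointer and negative counts is an extra assumption beyond the num_values == 0 case described above.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_SUCCESS = 1, TOY_INVALID_PARAM = 2 }; /* assumed status codes */

/* Reject arguments that would create a zero-length entry. */
static int toy_check_num_values(const void *data, int num_values)
{
    if (data == NULL || num_values <= 0)
        return TOY_INVALID_PARAM;
    return TOY_SUCCESS;
}

/* Usage: validate before touching the database at all. */
static void toy_example_num_values(void)
{
    int values[3] = {1, 2, 3};

    assert(toy_check_num_values(values, 3) == TOY_SUCCESS);
    assert(toy_check_num_values(values, 0) == TOY_INVALID_PARAM);
}
```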
  119   30 Oct 2003 Stefan Ritt 'umask' added to lazylogger for FTP connections
I had to add a 'umask' option to the loggers (lazy and mlogger) for the new 
PSI archive. One can now put a filename into the settings like:

archive,21,user,pw,dir,run%05d.mid,026

where the optional last parameter is used for a "umask 026" command just 
sent to the FTP server after the connection has been established. This 
changes the mode bits of the newly transferred file. We needed that so that 
the files are group readable, since several people from one group want to 
read the data.
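
As an illustration of how a settings string with an optional trailing umask can be split, here is a small sketch. It is not the parser in mlogger.c or ybos.c: toy_ftp_target and toy_parse_ftp_target() are hypothetical, and the field meanings (host, port, user, password, directory, file name pattern, umask) are assumed from the example string above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char host[64], user[64], password[64];
    char directory[128], file_pattern[128];
    int port;
    int has_umask;       /* 1 if the optional 7th field was given */
    unsigned umask_bits; /* 7th field, parsed as octal */
} toy_ftp_target;

/* Copy at most dst_size-1 characters and always terminate. */
static void toy_copy_field(char *dst, size_t dst_size,
                           const char *src, size_t len)
{
    if (len >= dst_size)
        len = dst_size - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Returns 1 on success, 0 if there are fewer than 6 or more than 7 fields. */
static int toy_parse_ftp_target(const char *spec, toy_ftp_target *out)
{
    const char *field[7];
    size_t len[7];
    char num[16];
    int n = 0;
    const char *p = spec;

    assert(spec != NULL && out != NULL);
    for (;;) {
        const char *comma = strchr(p, ',');
        if (n == 7)
            return 0; /* too many fields */
        field[n] = p;
        len[n] = comma ? (size_t)(comma - p) : strlen(p);
        n++;
        if (comma == NULL)
            break;
        p = comma + 1;
    }
    if (n < 6)
        return 0;

    memset(out, 0, sizeof(*out));
    toy_copy_field(out->host, sizeof(out->host), field[0], len[0]);
    toy_copy_field(num, sizeof(num), field[1], len[1]);
    out->port = atoi(num);
    toy_copy_field(out->user, sizeof(out->user), field[2], len[2]);
    toy_copy_field(out->password, sizeof(out->password), field[3], len[3]);
    toy_copy_field(out->directory, sizeof(out->directory), field[4], len[4]);
    toy_copy_field(out->file_pattern, sizeof(out->file_pattern),
                   field[5], len[5]);
    if (n == 7) {
        toy_copy_field(num, sizeof(num), field[6], len[6]);
        out->umask_bits = (unsigned)strtoul(num, NULL, 8);
        out->has_umask = 1;
    }
    return 1;
}
```

Old six-field settings still parse, with has_umask left at 0, so in this sketch the extra parameter stays optional.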

I committed mlogger.c and ybos.c, which contains the FTP code (it should 
actually go into lazylogger.c instead of ybos.c).
  121   28 Oct 2003 Stefan Ritt Updated thread functions
> ss_thread_create now returns the thread ID on success, and zero on failure.
> Previously returned SS_SUCCESS or SS_NO_THREAD. User must now test the
> return value to determine result.
> 
> ss_thread_kill added to kill the passed thread ID. Returns SS_SUCCESS or
> SS_NO_THREAD.
> 
> Any thread creation must be verified now, and old code must be examined to
> ensure the return value is checked.

Thank you for that post. Internally, threads are not used in midas, so there 
should be no problem. Only experiments that use threads explicitly need to 
take care.
  120   16 Oct 2003 David Morris Updated thread functions
ss_thread_create now returns the thread ID on success, and zero on failure.
Previously returned SS_SUCCESS or SS_NO_THREAD. User must now test the
return value to determine result.

ss_thread_kill added to kill the passed thread ID. Returns SS_SUCCESS or
SS_NO_THREAD.

Any thread creation must be verified now, and old code must be examined to
ensure the return value is checked.
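
A small sketch of why old callers need a look. It does not call the real ss_thread_create(): toy_thread_create() is a hypothetical stand-in that follows the new convention (thread ID on success, zero on failure), and TOY_SS_SUCCESS is an assumed value for the old success code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SS_SUCCESS 1 /* assumed value of the old success code */

typedef unsigned long toy_thread_id;

/* Stand-in for thread creation under the NEW convention. */
static toy_thread_id toy_thread_create(int (*func)(void *), void *param,
                                       int simulate_failure)
{
    static toy_thread_id next_id = 1000;

    (void)param;
    if (func == NULL || simulate_failure)
        return 0;
    return next_id++;
}

/* Old-style test: compares against the success code. */
static int toy_old_check_ok(toy_thread_id r) { return r == TOY_SS_SUCCESS; }

/* New-style test: any non-zero ID means the thread exists. */
static int toy_new_check_ok(toy_thread_id r) { return r != 0; }

static int toy_worker(void *param) { (void)param; return 0; }

/* Usage: the same successful call passes the new test and fails the old. */
static void toy_example_thread_check(void)
{
    toy_thread_id id = toy_thread_create(toy_worker, NULL, 0);

    assert(toy_new_check_ok(id));
    assert(!toy_old_check_ok(id));
}
```

Code that still compares the result against the old success code would report a failure for almost every successfully created thread.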
  124   15 Oct 2003 Stefan Ritt test
> > test
> > test
> > test
> 
> another test
> 
> K.O.

I got the two email notifications, if you have tried that...
  123   15 Oct 2003 Konstantin Olchanski test
> test
> test
> test

another test

K.O.
  122   15 Oct 2003 Konstantin Olchanski test
test
test
test
  128   13 Oct 2003 Stefan Ritt mhttpd: add Elog text to outgoing email.
> > > around to implement it, until now. I also added assert() traps for the most
> > > common array overruns in the Elog code.
> > 
> > In addition to the assert() one should use strlcat() and strlcpy() all over 
> > the code to avoid buffer overruns. The ELOG standalone code does that already 
> > properly.
> > 
> > - Stefan
> 
> Yes, the original authors should have used strlcat(). Now that I uncovered this source of mhttpd 
> memory corruption, maybe some volunteer will fix it up properly.
> 
> K.O.

I am the original author and will fix all that once I have merged mhttpd and 
elog. Given my current task list, this will probably happen in November.

- Stefan
  132   13 Oct 2003 Konstantin Olchanski Array overruns in mhttpd.c::submit_elog()
> > > While adding new functionality to submit_elog() ....
> 
> The whole elog functionality in mhttpd will be replaced (sometime) ...

I humbly submit that this has been the standard reply for the last 2 years since I was aware of 
the "last N days does not always work" problem (just saw it again yesterday).

K.O.
  127   13 Oct 2003 Konstantin Olchanski mhttpd: add Elog text to outgoing email.
> > around to implement it, until now. I also added assert() traps for the most
> > common array overruns in the Elog code.
> 
> In addition to the assert() one should use strlcat() and strlcpy() all over 
> the code to avoid buffer overruns. The ELOG standalone code does that already 
> properly.
> 
> - Stefan

Yes, the original authors should have used strlcat(). Now that I uncovered this source of mhttpd 
memory corruption, maybe some volunteer will fix it up properly.

K.O.
  126   13 Oct 2003 Stefan Ritt mhttpd: add Elog text to outgoing email.
> around to implement it, until now. I also added assert() traps for the most
> common array overruns in the Elog code.

In addition to the assert() one should use strlcat() and strlcpy() all over 
the code to avoid buffer overruns. The ELOG standalone code does that already 
properly.

- Stefan
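
For reference, a sketch of the strlcpy()/strlcat() semantics. These functions are not available in every C library, so the sketch defines its own versions under the hypothetical names toy_strlcpy() and toy_strlcat(); they are meant to follow the BSD behaviour (always terminate, return the length that was attempted).

```c
#include <assert.h>
#include <string.h>

/* Copy src into dst (capacity size); returns strlen(src). */
static size_t toy_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);

    if (size > 0) {
        size_t n = src_len < size - 1 ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}

/* Append src to dst (capacity size); returns the length it tried to create. */
static size_t toy_strlcat(char *dst, const char *src, size_t size)
{
    size_t dst_len = 0;

    while (dst_len < size && dst[dst_len] != '\0')
        dst_len++;
    if (dst_len == size) /* dst is not terminated within size */
        return size + strlen(src);
    return dst_len + toy_strlcpy(dst + dst_len, src, size - dst_len);
}

/* Usage: a return value >= the buffer size signals truncation. */
static void toy_example_strl(void)
{
    char buf[8];

    assert(toy_strlcpy(buf, "elog", sizeof(buf)) == 4);
    assert(toy_strlcat(buf, " message", sizeof(buf)) >= sizeof(buf));
    assert(strlen(buf) == sizeof(buf) - 1);
}
```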
  131   13 Oct 2003 Stefan Ritt Array overruns in mhttpd.c::submit_elog()
> > While adding new functionality to submit_elog() (add the message text to the
> > outgoing email), I noticed that the email text is being stored into an array
> > of size 256, mail_text[256], without any checks for array overrun. This
> > cannot be good. How should this be corrected?
> > K.O.
> 
> Similar problem exists in midas.c::el_submit(). The array "message[10000]" is
> easy to overrun by submitting a long elog message.
> 
> K.O.

The whole elog functionality in mhttpd will be replaced (sometime) by the 
standalone ELOG package, linked against mhttpd. The ELOG functionality is 
much richer and does not contain the mentioned problems, which were fixed 
there some time ago. For the time being it might however be worth fixing 
the mentioned problems, but without spending too much time on it.
  134   12 Oct 2003 Konstantin Olchanski Refuse to set run number zero
> I am debugging the frequent problem where the run number is mysteriously
> reset to zero. As a first step, I am committing changes to mhttpd.c and midas.c:
> - abort on obviously corrupted "run number < 0"
> - abort on cm_transition() to run 0 (the only place where the run number is
> explicitly written to ODB)
> - in the mhttpd "Start run" form, reject user setting the run number to <= 0.

- abort on cm_transition() from run 0 to 1 during auto restart in mlogger.

Cvs diff:

RCS file: /usr/local/cvsroot/midas/src/mlogger.c,v
retrieving revision 1.65
diff -r1.65 mlogger.c
3277a3278,3283
>         if (run_number <= 0)
>           {
>           cm_msg(MERROR, "main", "aborting on attempt to use invalid run number %d", run_number);
>           abort();
>           }
> 

K.O.
  133   12 Oct 2003 Konstantin Olchanski Refuse to set run number zero
I am debugging the frequent problem where the run number is mysteriously
reset to zero. As a first step, I am committing changes to mhttpd.c and midas.c:
- abort on obviously corrupted "run number < 0"
- abort on cm_transition() to run 0 (the only place where the run number is
explicitly written to ODB)
- in the mhttpd "Start run" form, reject user setting the run number to <= 0.

Here is the CVS diff:

===================================================================
RCS file: /usr/local/cvsroot/midas/src/mhttpd.c,v
retrieving revision 1.253
diff -r1.253 mhttpd.c
2451a2452,2457
>   if (run_number < 0)
>     {
>     cm_msg(MERROR, "show_elog_new", "aborting on attempt to use invalid run number %d",run_number);
>     abort();
>     }
> 
2506a2513,2519
> 
>     if (run_number < 0)
>       {
>       cm_msg(MERROR, "show_elog_new", "aborting on attempt to use invalid run number %d",run_number);
>       abort();
>       }
> 
3582a3596,3602
> 
>   if (run_number < 0)
>     {
>     cm_msg(MERROR, "show_form_query", "aborting on attempt to use invalid run number %d",run_number);
>     abort();
>     }
> 
5730a5751,5756
>   if (rn < 0) // value "zero" is okey
>     {
>     cm_msg(MERROR, "show_start_page", "aborting on attempt to use invalid run number %d",rn);
>     abort();
>     }
> 
9684a9711,9719
>       if (i <= 0)
>         {
>         cm_msg(MERROR, "interprete", "Start run: invalid run number %d",i);
>         memset(str,0,sizeof(str));
>         snprintf(str,sizeof(str)-1,"Invalid run number %d",i);
>         show_error(str);
>         return;
>         }
> 
Index: src/midas.c
===================================================================
RCS file: /usr/local/cvsroot/midas/src/midas.c,v
retrieving revision 1.193
diff -r1.193 midas.c
3786c3786
<         status = cm_transition(_requested_transition | TR_DEFERRED, 0, str, 256, SYNC, FALSE);
---
>         status = cm_transition(_requested_transition | TR_DEFERRED, 0, str, sizeof(str), SYNC, FALSE);
3906a3907,3912
>   if (run_number <= 0)
>     {
>     cm_msg(MERROR, "cm_transition", "aborting on attempt to use invalid run number %d",run_number);
>     abort();
>     }
> 
16069a16076,16081
>     }
> 
>   if (run_number < 0)
>     {
>     cm_msg(MERROR, "el_submit", "aborting on attempt to use invalid run number %d", run_number);
>     abort();

K.O.
  125   12 Oct 2003 Konstantin Olchanski mhttpd: add Elog text to outgoing email.
This commit adds the elog message text to the outgoing email message. This
functionality was requested a long time ago, but I guess nobody got
around to implementing it, until now. I also added assert() traps for the most
common array overruns in the Elog code.

Here is the cvs diff:

Index: src/mhttpd.c
===================================================================
RCS file: /usr/local/cvsroot/midas/src/mhttpd.c,v
retrieving revision 1.252
diff -r1.252 mhttpd.c
768a769
> #include <assert.h>
3740c3741
< char   mail_to[256], mail_from[256], mail_text[256], mail_list[256],
---
> char   mail_to[256], mail_from[256], mail_text[10000], mail_list[256],
3921a3923,3925
>         // zero out the array. needed because later strncat() does not always add the trailing '\0'
>         memset(mail_text,0,sizeof(mail_text));
> 
3931a3936,3945
> 
>         assert(strlen(mail_text) + 100 < sizeof(mail_text)); // bomb out on array overrun.
> 
>         strcat(mail_text+strlen(mail_text),"\n");
>         // this strncat() depends on the mail_text array being zeroed out:
>         // strncat() does not always add the trailing '\0'
>         strncat(mail_text+strlen(mail_text),getparam("text"),sizeof(mail_text)-strlen(mail_text)-50);
>         strcat(mail_text+strlen(mail_text),"\n");
> 
>         assert(strlen(mail_text) < sizeof(mail_text)); // bomb out on array overrun.
Index: src/midas.c
===================================================================
RCS file: /usr/local/cvsroot/midas/src/midas.c,v
retrieving revision 1.192
diff -r1.192 midas.c
604a605
> #include <assert.h>
16267a16269,16270
> 
>   assert(strlen(message) < sizeof(message)); // bomb out on array overrun.

K.O.
  130   12 Oct 2003 Konstantin Olchanski Array overruns in mhttpd.c::submit_elog()
> While adding new functionality to submit_elog() (add the message text to the
> outgoing email), I noticed that the email text is being stored into an array
> of size 256, mail_text[256], without any checks for array overrun. This
> cannot be good. How should this be corrected?
> K.O.

Similar problem exists in midas.c::el_submit(). The array "message[10000]" is
easy to overrun by submitting a long elog message.

K.O.
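
One possible way to keep appends inside a fixed array such as mail_text[] or message[] is to let snprintf() do the bounds work. This is only a sketch, not the code in mhttpd.c or midas.c, and toy_append_line() is a hypothetical helper. For what it is worth, in standard C strncat() always writes the terminating '\0' (it is strncpy() that may not), but its size argument limits the characters appended rather than the total buffer size, which makes it easy to misuse.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append text plus a newline to buf (capacity buf_size).
   Returns 1 if everything fit, 0 if the text was truncated.
   buf must already hold a terminated string. */
static int toy_append_line(char *buf, size_t buf_size, const char *text)
{
    size_t used;
    int written;

    assert(buf != NULL && text != NULL && buf_size > 0);
    used = strlen(buf);
    assert(used < buf_size);

    /* snprintf never writes more than the remaining space, incl. '\0' */
    written = snprintf(buf + used, buf_size - used, "%s\n", text);
    return written >= 0 && (size_t)written < buf_size - used;
}
```

The caller can then decide what to do on truncation (log it, refuse the message) instead of overrunning the array or aborting.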