Back Midas Rome Roody Rootana
  Midas DAQ System, Page 74 of 150  Not logged in ELOG logo
ID Date Authordown Topic Subject
  266   08 Jun 2006 Konstantin OlchanskiBug ReportMidas does not build on Fedora 5
Fresh svn checkout of MIDAS does not build on Fedora 5, I get this error:

cc -c -g -O2 -Wall -Wuninitialized -Iinclude -Idrivers -I../mxml -Llinux/lib
-DINCLUDE_FTPLIB   -D_LARGEFILE64_SOURCE -DHAVE_ROOT -pthread
-I/triumfcs/trshare/olchansk/root/root_v5.10.00_SL40/include -m32 -DOS_LINUX
-fPIC -Wno-unused-function -o linux/lib/odb.o src/odb.c
src/odb.c: In function 'db_open_database':
src/odb.c:805: warning: dereferencing type-punned pointer will break
strict-aliasing rules
src/odb.c: In function 'db_lock_database':
src/odb.c:1350: warning: dereferencing type-punned pointer will break
strict-aliasing rules
cc: Internal error: Segmentation fault (program cc1)
Please submit a full bug report.
See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.
make: *** [linux/lib/odb.o] Error 1

If I compile odb.c without "-O2", the rest of MIDAS builds without any more errors.

The observed warnings are (I do not know what they mean):
warning: dereferencing type-punned pointer will break strict-aliasing rules
warning: missing sentinel in function call (Cannot do without sentinels, eh?)
warning: pointer targets in passing argument 3 of 'getsockname' differ in signedness
warning: non-local variable '<anonymous struct> out_info' uses anonymous type

The "invalid lvalue" errors seem to have been successfully vanquished.

K.O.
  277   25 Jul 2006 Konstantin OlchanskiBug Reportmhttpd passwords broken for MacOS 10.4 Safari
I observe that the mhttpd passwords do not work correctly for the Safari web browser on MacOS 10.4.7: 
Safari 2.0.4 (419.3). For example, I cannot submit elog messages- the system gets stuck on the 
"Password" page. The Safari browser in MacOS 10.3 works fine. Mozilla/Firefox works fine. (Also would be 
useful if "remember password" worked with MIDAS, in any browser). K.O.
  281   28 Jul 2006 Konstantin OlchanskiBug Fixmhttpd: use more strlcpy(), fix a few bugs
While investigating the mhttpd password error with the MacOS Safari browser, I
found that it was caused by an strcpy() buffer overflow. With Stefan's blessing,
I now converted most uses of strcpy() and strcat() to strlcpy() and strlcat().

This fixes the Safari password problem (it was memory corruption in mhttpd).

While validating these changes, I also found an incorrect use of sizeof() in the
mhttpd history code for plotting run markers. I fixed that as well.

P.S. The remaining strcpy() calls look safe wrt buffer overflows. There are no
strcat() calls left. But there is still a large number of unsafe-looking
sprintf() uses.

K.O.
  282   31 Jul 2006 Konstantin OlchanskiBug FixFix user memory corruption in ODB
We have been seeing consistent user memory corruption while setting up a new
experiment. This has been traced to a user memory overwrite in ODB db_set_data()
function and this problem is now fixed. This error was triggered by our frontend
code constantly changing the size of a MIDAS data bank that was also written
into ODB via the RO_ODB option. K.O.
  283   01 Aug 2006 Konstantin OlchanskiBug FixUser-tunable buffer sizes
By default, MIDAS creates shared memory event data buffers of default size
EVENT_BUFFER_SIZE defined in midas.h and until now making of large data buffers
for high data rate or large event size experiments was complicated.

Now, bm_open_buffer() will try to read the event buffer size from ODB. If
"/Experiment/Buffer Sizes/BUFFER_NAME" of type DWORD exists, it's value is used
as the buffer size, overriding the default value.

For example, to increase the size of the default MIDAS event buffer ("SYSTEM")
to 2000000 bytes, shutdown all MIDAS programs, delete the old .SYSTEM.SHM file
(and the shared memory segment, using ipcrm). Then run odbedit, cd /Experiment,
mkdir "Buffer Sizes", cd "Buffer Sizes", create DWORD SYSTEM, set SYSTEM
2000000. Then start the rest of the MIDAS programs. Check that the buffer has
the correct size by looking at the size of .SYSTEM.SHM and of the shared memory
segment (ipcs).

This method work for all MIDAS buffers, except for ODB, where the size has to be
specified at creation time using the odbedit command "-s" argument.

K.O.
  290   07 Aug 2006 Konstantin OlchanskiBug FixRefactoring and rewrite of event buffer code
In close cooperation with Stefan, I refactored and rewrote the MIDAS event
buffering code (bm_send_event, bm_flush_cache, bm_receive_event and bm_push_event).

The main goal of this update is to make sure the event buffering code does not
have any infinite loops: in the past, we have seen mlogger and some frontends
loop forever consuming 100% CPU in the event buffering code. This should now be
completely fixed.

As additional bonuses, the refactored code is easier to read, has less code
duplication and should be more robust. A few potential logical problems have
been corrected and one case of reproducible infinite looping has been fixed.

The new code has passed the low-level consumer-producer tests, but has not yet
been used in anger in any real experiment. One hopes any new bugs introduced
would cause outright failures and core dumps (rather than silent data corruption).

All are welcome to try the new code. If it explodes, please send me the error
messages, stack traces and core dumps.

K.O.
  291   07 Aug 2006 Konstantin OlchanskiBug FixFix crash in mfe.c
Some time ago, I accidentally introduced a bug in mfe.c- if there is data
congestion in the system, mfe.c can exit with the error "bm_flush_cache(ASYNC)
error 209" because it did not expect the valid return value BM_ASYNC_RETURN
(209) from bm_flush_cache(ASYNC). This error has now been fixed. K.O.
  292   09 Aug 2006 Konstantin OlchanskiBug FixRefactoring and rewrite of event buffer code
> In close cooperation with Stefan, I refactored and rewrote the MIDAS event
> buffering code (bm_send_event, bm_flush_cache, bm_receive_event and bm_push_event).
>
> All are welcome to try the new code. If it explodes, please send me the error
> messages, stack traces and core dumps.

Stefan quickly found one new error (a typoe in a check against infinite looping) and
then I found one old error present in the old code that caused event loss when the
buffer became exactly 100% full (0 bytes free).

Both errors are now fixed in svn commit 3294.

K.O.
  294   17 Aug 2006 Konstantin OlchanskiBug Report"double" values are truncated
The mhttpd ODB displays and mhist truncate values of "float" and "double"
floating point variables to 6 digits. In reality, "float" has 7 significant
digits and "double" has 16. I recommend that db_sprintf() in odb.c be changed to
read this:

      case TID_FLOAT:
         sprintf(string, "%.7g", *(((float *) data) + index));
         break;
      case TID_DOUBLE:
         sprintf(string, "%.16g", *(((double *) data) + index));
         break;

K.O.
  296   19 Aug 2006 Konstantin OlchanskiBug Fixfixes for minor mhttpd problems
I commited fix for minor mhttpd problems (rev 3314):
- for a newly created experiment, the "history" button gave the error [history
panel "" does not exist] (new problem introduced in revision 3150)
- for very long history panel names (close to the 32-character limit) history
plots produce the error "Cannot find /history/display/foo/bar/variables" (broke
in revision 3190 "use strlcpy()", in previous revisions, this bug was silent
stack corruption)
- elog attachments did not work for file names containing character plus (+)
(attachement URLs should be properly encoded to escape special CGI characters)
K.O.
  297   26 Aug 2006 Konstantin OlchanskiBug Fixfixes for minor mhttpd problems
> I commited fix for minor mhttpd problems (rev 3314):
> - elog attachments did not work for file names containing character plus (+)
> (attachement URLs should be properly encoded to escape special CGI characters)

I accidentally indirectly learned that the above change produced incorrect URLs
when more than one experiment is defined. I now commited a fix to this problem.

K.O.
  299   04 Sep 2006 Konstantin OlchanskiBug FixFix MIDAS on MacOS 10.4.7
I commited minor fixes for building MIDAS on MacOS 10.4.7:
1) there is no linux/unistd.h
2) gcc 4.0.0 does not like "struct { ... } var;" although "struct Foo { ... } var;" is fine
3) there is no "_syscall0(...)" macro
4) there is no "gettid()", I used pthread_self() instead.
K.O.

P.S. ss_gettid() returns "int" instead of "midas_thread_t" (pthread_t, really). On MacOS 10.4.7 at least, 
pthread_t appears to be a pointer, not an int. Is that right?
  300   05 Sep 2006 Konstantin OlchanskiForumForums moved from dasdevpc.triumf.ca to ladd00.triumf.ca
For the record, the MIDAS (& co) forums have been physically moved from
dasdevpc.triumf.ca to our new server machine ladd00.triumf.ca. This change
should be transparent to all users, but if anything stops working, please let me
know at olchansk-at-triumf-dot-ca. K.O.
  304   23 Sep 2006 Konstantin OlchanskiBug Reportmhttpd elog corruption via double-edit
Aparently the mhttpd elog will corrupt the elog files if two (or more?) elog entries are being edited at the 
same time. K.O.
  305   23 Sep 2006 Konstantin OlchanskiBug Fixcommit latest ccusb.c CAMAC-USB driver
> I commited the latest driver for the Wiener CCUSB USB-CAMAC driver. It
> implements all functions from mcstd.h and has been tested to be plug-compatible
> with at least one of our CAMAC frontends. K.O.

This driver is known to not work with the latest CCUSB firmware (20x, 204, 30x, 303). I know what 
modifications are required and an updated driver will be available shortly. If there is a delay, and you need the 
driver ASAP, please drop me an email.

Also, I am thinking about dropping support for the very old CCUSB firmware revisions (before 204). (Any 
comments?)

K.O.
  307   27 Sep 2006 Konstantin OlchanskiBug Reportmhttpd elog corruption via double-edit
[quote="Stefan Ritt"][Quote="K.O.]Aparently the mhttpd elog will corrupt the
elog files if two (or more\?) elog entries are being edited at the same time.
K.O.[/quote]

The corruption is very simple. mhttpd elog indexes the elog entries by the elog
file and offset inside the file, i.e. "http://ladd00:8088/EL/060927.318",
"060927" corresponds to log file "060927.log", "318" is the offset inside the
file where the message is located.

During "edit", the code "remembers" the offset of the original message and in
el_submit() blindly writes the edited message into the file at the remembered
offset.

If another message was edited before the edit of the first message is submitted,
the remembered offset becomes invalid (messages have shifted inside the file)
and el_submit() writes the edited text into the wrong place in the file,
corrupting it.

I have now added a check for this and we crash instead of corrupting the elog
file (midas.c rev 3340).

I do not know how to "properly" fix this bug without changing the indexing
scheme to something similar to what is used by elogd- message numbers instead of
file indices. In the existing scheme, message editing also breaks URLs shown in
the email notifications (they contain file indices that point to the wrong
places after messages are moved around by editing) and "reply threading" links.

Here is how I reproduce this bug:

1) start with an empty elog
2) create two messages
3) "edit" the second message, but do not submit it yet.
4) "edit" the first message, change the text to make sure the message size
becomes different; submit this change.
5) submit the "edit" of the first message. !!BOOM!!

K.O.
  308   27 Sep 2006 Konstantin OlchanskiSuggestionIncrease of maximum event size
> The current event size in midas is limited to 512k (MAX_EVENT_SIZE in midas.h)

Yes, 512 kBytes is rather small. For the T2K prototype TPC DAQ, I built and ran
MIDAS with 4 MByte events, and it worked fine.

Now, we have per-buffer tunable size (see message
https://ladd00.triumf.ca/elog/Midas/283) and in the long run, I would prefer the
compiled-in limit to go away: already all memory is allocated dynamically and
the MAX_EVENT_SIZE is only useful as kind of a sanity check against frontend
misconfiguration or against malformed events.

If MAX_EVENT_SIZE goes away, the maximum event size becomes limited by the
largest SysV shared memory segment permitted by Linux (via sysctl kernel.shmmax).

To go beyound the limit on SysV shared memories, on can use mmap() based shared
memory: this is limited by available RAM+swap (and disk space for the
.SYSTEM.SHM file). Current MIDAS system.c has an experimental implementation of
mmap() shared memory, but AFAIK it has not been used in any production system, yet.

K.O.
  325   22 Jan 2007 Konstantin OlchanskiForumMidas on a x86_64
>    has anyone managed to get midas to work on a x86_64 processor. I followed the
> instructions for the 64-bit opteron but i am getting runtime error when trying
> the examples.


We run 64-bit MIDAS on RHEL4 with 64-bit ROOT and everything generally works,
except for compatibility problems with 32-bit MIDAS.

Everything should work if you ensure that on your 64-bit machine everything is
compiled 64-bit (including the mserver - we always forget to install the correct version
to /usr/local/bin). 32-bit MIDAS programs running on other machine
can talk to 64-bit MIDAS via the mserver.

The big problem is that 64-bit and 32-bit ODB turned out to be incompatible - several data
fields have different sizes - and we did not decide yet how to fix this. Any fix will involve
breaking the binary ODB for one of the two platforms (we could break both, just to be fair, heh!)

>  When running example/basic/odb_test I getting errors like
> [odb.c:6818:db_get_record] struct size mismatch for "/Alarms/Alarms/Demo ODB" (464 instead of 452)

Yes, data size mismatch errors indicates that you mixed 32-bit and 64-bit MIDAS. Recompile everyting
as 64-bit, remove all the dot-ODB files, remove all the shared memory segments (ipcrm),
then everything should work.

K.O.
  338   05 Feb 2007 Konstantin OlchanskiBug Reportwrong version in include/midas.h?
The present .../include/midas.h contains
[alpha@laddvme06 ~/online]$ grep 1.9.5 /home/alpha/packages/midas/include/*
/home/alpha/packages/midas/include/midas.h:#define MIDAS_VERSION "1.9.5"

All MIDAS utilities (odbedit ver) presently report version 1.9.5, even for svn
trunk, and this may confuse people as to what version of midas they are using,
and may complicate reporting of bugs.

Perhaps the trunk version should say something like "svn-22233344" (the svn
revision number)? The present "1.9.5" is wrong...

K.O.
  343   11 Feb 2007 Konstantin OlchanskiInfosvn and "make indent" trashed my svn checkout tree...
Fuming, fuming, fuming.

The combination of "make indent" and "svn update" completely trashed my work copy of midas. Half of 
the files now show as status "M", half as status "C" ("in conflict"), even those I never edited myself (e.g. 
mscb firmware files).

I think what happened as that once I ran "make indent", the indent program did things to the source 
files (changed indentation, added spaces in "foo(a,b,c); --> foo(a, b, c);" etc, so now svn thinks that I 
edited the files and they are in conflict with later modifications.

I suggest that nobody ever ever ever should use "make indent", and if they do, they should better 
commit their "changes" made by indent very quickly, before their midas tree is trashed by the next "svn 
update".

And if they commit the changes made by "make indent", beware that "make indent" is not idempotent, 
running it multiple times, it keeps changing files (keeps moving some dox comments around).

Also beware of entering a tug-of-war with Stefan - at least on my machines, my "make indent" seems 
to produce different output from his.

Still fuming, even after some venting...
K.O.
ELOG V3.1.4-2e1708b5