Back Midas Rome Roody Rootana
  Midas DAQ System, Page 120 of 136  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
ID Date Author Topic Subject
  323   21 Jan 2007 Denis BilenkoBug Reportbuffer bugs
Hello,

We've been using midas and have stumbled upon some inconsistent behaviour:
1. Blocking calls to midas api aren't usable when client is connected through
mserver. This is true at least for bm_receive_event, but seems to be a more
general problem - midas application has call cm_yield within 10 seconds (or
whatever timeout is set) to remain alive.
That not the case when RPC is not used.

2. On Windows, two processes on the same machine can send/receive events to
each other only if they both use midas locally (through shared mem) or they
both use midas via RPC (through mserver), but not if they use different ways.

3. Receiving/sending same events from the same process - was possible in
1.9.5-1, not so in the current version (revision 3501, mxml revision 45). Is this an intended behavior fix?
To explain how to reproduce bugs, I will use 2 helper programs evprint.py and
evsend.py - for receiving and sending events respectively. You don't need
them, just something to send and receive events. (These are part of pymidas, which will be
released to public any time soon, but is quite usable already).

They both accept
* --path option in "host/experiment" format (for cm_connect_experiment call)
* --log option which command them to trace all midas' calls to terminal

evprint.py have two ways of receiving events
1) via looping over bm_receive_event
2) via providing callback to bm_request_event and looping over cm_yield(400) call

Example of use:
first-console$ python evprint.py receive
second-console$ python evsend.py 123
[first console]
id=2007 mask=2007 serial=2007 time=1169372833 len=3 '123'
So,

1. Blocking calls to midas api aren't usable when client is connected through
mserver.
$ python evprint.py --log --path 127.0.0.1/online receive"
cm_connect_experiment('127.0.0.1', 'online', 'evprint.py', None)
bm_open_buffer('SYSTEM', 1048576, &c_long(2)) -> BM_CREATED
bm_request_event(2, -1, -1, 2, &c_long(0), None)
... wait for a couple of seconds ...
[midas.c:9348:rpc_call] rpc timeout, routine = "bm_receive_event"
[system.c:3570:send_tcp] send(socket=0,size=8) returned -1, errno: 88 (Socket 
operation on non-socket)
[midas.c:9326:rpc_call] send_tcp() failed

bm_receive_event(2, ...) -> RPC_TIMEOUT

bm_remove_event_request(2, 0) -> BM_INVALID_HANDLE
bm_close_buffer(2) -> BM_INVALID_HANDLE
cm_disconnect_experiment()

2. Missing events on windows
a) Both use midas locally - works
   1: python evprint.py receive
   2: python evsend.py 123
   1: id=2007 mask=2007 serial=2007 time=1169372833 len=3 '123'
b) Both use midas via RPC - works
   1: python evprint.py --path 127.0.0.1/ dispatch
   2: python evsend.py --path 127.0.0.1/ 123
   1: id=2007 mask=2007 serial=2007 time=1169373366 len=3 '123'
c) Receiver uses midas locally, sender uses mserver - doesn't work on windows
   1: python evprint.py dispatch
   2: python evsend.py --path 127.0.0.1/ 123
   1: (nothing printed)
d) The other way around - doesn't work on windows
   1: python evprint.py --path 127.0.0.1/ dispatch
   2: python evsend.py 123
   1: (nothing printed)
No such problem on linux.

3. Receiving/sending same events from the same process.
To reproduce this, just request events, send one and then try to receive
it – via cm_yield. I care for this, because I have a test in pymidas which
relies on this behavior.

hope this will help.
  322   11 Jan 2007 Stefan RittForumShared memory problems
That sounds like you mix versions: You have an old executable (maybe your mlogger) which
has been linked against the old midas version, but you create the ODB with the new
odbedit or frontend. The new version complains if it finds an ODB from a previous version
(the error you reported first), but an old program does not have that version check, so
it finds a different binary ODB structure and crashes.

> Thanks for your help.  I tried again and it got me back to the initial problem I had.
>  The frontend will start, and the analyzer starts (complains about there not being a
> last.root, but other than that it's fine), and then when starting mlogger, I get:
> 
> [odb.c:860:db_validate_db] Warning: database corruption, first_free_key 0x0001A4
> 04
> [odb.c:3666:db_get_key] invalid key handle
> [midas.c:1970:cm_check_client] cannot delete client info
> [odb.c:3666:db_get_key] invalid key handle
> [midas.c:1970:cm_check_client] cannot delete client info
> [odb.c:3666:db_get_key] invalid key handle
> 
> 
> And it continues to shoot out error messages about invalid key handles until I kill
> it.  Then trying to start the frontend again fails until I remove the .ODB.SHM file. 
> Any other ideas?
> 
> > > Hello,
> > > 
> > > Just did a fresh install of MIDAS from the SVN repository under CentOS and
> > > everything compiles fine, but when I go to run the frontend (using dio), I get
> > > the following error message:
> > > 
> > > Connect to experiment ...[odb.c:868:db_open_database] Different database format:
> > >  Shared memory is 14, program is 2
> > > [midas.c:1763:cm_connect_experiment1] cannot open database
> > > 
> > > 
> > > Any ideas on what the problem could be, or how to fix it?  
> > 
> > You have an old .ODB.SHM from a previous version in your directoy (note the '.' in
> > front, so you need a 'ls -alg' to see it). Delete that file and try again.
  321   11 Jan 2007 Steve HardyForumShared memory problems
Thanks for your help.  I tried again and it got me back to the initial problem I had.
 The frontend will start, and the analyzer starts (complains about there not being a
last.root, but other than that it's fine), and then when starting mlogger, I get:

[odb.c:860:db_validate_db] Warning: database corruption, first_free_key 0x0001A4
04
[odb.c:3666:db_get_key] invalid key handle
[midas.c:1970:cm_check_client] cannot delete client info
[odb.c:3666:db_get_key] invalid key handle
[midas.c:1970:cm_check_client] cannot delete client info
[odb.c:3666:db_get_key] invalid key handle


And it continues to shoot out error messages about invalid key handles until I kill
it.  Then trying to start the frontend again fails until I remove the .ODB.SHM file. 
Any other ideas?

> > Hello,
> > 
> > Just did a fresh install of MIDAS from the SVN repository under CentOS and
> > everything compiles fine, but when I go to run the frontend (using dio), I get
> > the following error message:
> > 
> > Connect to experiment ...[odb.c:868:db_open_database] Different database format:
> >  Shared memory is 14, program is 2
> > [midas.c:1763:cm_connect_experiment1] cannot open database
> > 
> > 
> > Any ideas on what the problem could be, or how to fix it?  
> 
> You have an old .ODB.SHM from a previous version in your directoy (note the '.' in
> front, so you need a 'ls -alg' to see it). Delete that file and try again.
  320   11 Jan 2007 Stefan RittForumShared memory problems
> Hello,
> 
> Just did a fresh install of MIDAS from the SVN repository under CentOS and
> everything compiles fine, but when I go to run the frontend (using dio), I get
> the following error message:
> 
> Connect to experiment ...[odb.c:868:db_open_database] Different database format:
>  Shared memory is 14, program is 2
> [midas.c:1763:cm_connect_experiment1] cannot open database
> 
> 
> Any ideas on what the problem could be, or how to fix it?  

You have an old .ODB.SHM from a previous version in your directoy (note the '.' in
front, so you need a 'ls -alg' to see it). Delete that file and try again.
  319   11 Jan 2007 Steve HardyForumShared memory problems
Hello,

Just did a fresh install of MIDAS from the SVN repository under CentOS and
everything compiles fine, but when I go to run the frontend (using dio), I get
the following error message:

Connect to experiment ...[odb.c:868:db_open_database] Different database format:
 Shared memory is 14, program is 2
[midas.c:1763:cm_connect_experiment1] cannot open database


Any ideas on what the problem could be, or how to fix it?  


~Steve
  318   08 Jan 2007 Stefan RittSuggestionAccess to out_info from mana.c
I changed out_info into a global structure definition ANA_OUTPUT_INFO and put it into
midas.h, so it can be accessed easily from the user analyzer source code.

> Would it be relevant to transform out_info into a *non-static* variable of a type
> defined by a *named* struct?
> Currently,  programs that  try to access out_info cannot do it anymore; and they
> typically copy the struct definition from mana.c, which is not robust against future
> changes in mana.c.
> 
> If mana.c could be changed in the way described above, that would be great . 
> Otherwise, is it safe to patch it myself for local use?  or is there a better way of
> accessing out_info from mana.c?
> 
> As always, any help would be much appreciated :)
> 
> EOL
> 
> > Hello,
> > 
> > Is it possible to access out_info (defined in mana.c) from another program?
> > 
> > In fact, out_info is now defined as an (anonymous) "static struct" in mana.c,
> > which it seems to me precludes any direct use in another program.  Is there an
> > indirect way of getting ahold of out_info?  or of the information it contains?
> > 
> > out_info used to be defined as a *non-static* struct, and the code I'm currently
> > modifying used to compile seamlessly: it now stops the compilation during
> > linking time, as out_info is now static and the program I have to compile
> > contains an "extern struct {} out_info".
> > 
> > Any help would be much appreciated!  I searched in vain in this forum for
> > details about out_info and I really need to access the information it contains!
> > 
> > EOL (a pure MIDAS novice)
  317   05 Jan 2007 Eric-Olivier LE BIGOTSuggestionAccess to out_info from mana.c
Would it be relevant to transform out_info into a *non-static* variable of a type
defined by a *named* struct?
Currently,  programs that  try to access out_info cannot do it anymore; and they
typically copy the struct definition from mana.c, which is not robust against future
changes in mana.c.

If mana.c could be changed in the way described above, that would be great . 
Otherwise, is it safe to patch it myself for local use?  or is there a better way of
accessing out_info from mana.c?

As always, any help would be much appreciated :)

EOL

> Hello,
> 
> Is it possible to access out_info (defined in mana.c) from another program?
> 
> In fact, out_info is now defined as an (anonymous) "static struct" in mana.c,
> which it seems to me precludes any direct use in another program.  Is there an
> indirect way of getting ahold of out_info?  or of the information it contains?
> 
> out_info used to be defined as a *non-static* struct, and the code I'm currently
> modifying used to compile seamlessly: it now stops the compilation during
> linking time, as out_info is now static and the program I have to compile
> contains an "extern struct {} out_info".
> 
> Any help would be much appreciated!  I searched in vain in this forum for
> details about out_info and I really need to access the information it contains!
> 
> EOL (a pure MIDAS novice)
  316   27 Dec 2006 Eric-Olivier LE BIGOTForumAccess to out_info from mana.c
Hello,

Is it possible to access out_info (defined in mana.c) from another program?

In fact, out_info is now defined as an (anonymous) "static struct" in mana.c,
which it seems to me precludes any direct use in another program.  Is there an
indirect way of getting ahold of out_info?  or of the information it contains?

out_info used to be defined as a *non-static* struct, and the code I'm currently
modifying used to compile seamlessly: it now stops the compilation during
linking time, as out_info is now static and the program I have to compile
contains an "extern struct {} out_info".

Any help would be much appreciated!  I searched in vain in this forum for
details about out_info and I really need to access the information it contains!

EOL (a pure MIDAS novice)
  315   26 Oct 2006 Hans FynboForumSetup of Ortec ADC AD413A in MIDAS
We are new to MIDAS and try to setup a simple system with one ortec camac ADC
AD413A and the hytec1331 controler. Has anyone used this module in MIDAS we
would be grateful for the corresponding frontend.c etc. 

It would be very useful to have somewhere examples of files used by various
experiments in addition to the example files provided in the installation.

Best regards,
Hans 
  314   16 Oct 2006 Stefan RittBug FixBuild error with mana.c while using CERNLIB, svn 3366
Committed, thanks.
  313   16 Oct 2006 Stefan RittBug Fix"make install" error on MacOS 10.4.7, svn 3366
> While executing "make install" under MacOS 10.4.7, you may encounter errors about "dio". It is the 
> problem of "Makefile". I did some change to it and attach the diff file here.

I committed your patch. Thank you.
  312   16 Oct 2006 Exaos LeeBug Fix"make install" error on MacOS 10.4.7, svn 3366
While executing "make install" under MacOS 10.4.7, you may encounter errors about "dio". It is the 
problem of "Makefile". I did some change to it and attach the diff file here.
  311   16 Oct 2006 Exaos LeeBug FixBuild error with mana.c while using CERNLIB, svn 3366
If you use CERNLIB to build hmana.o, you may encounter the following error:
src/mana.c: In function ‘write_event_hbook’:
src/mana.c:2881: error: invalid assignment
or somthing like this:
src/mana.c: In function ‘write_event_hbook’:
src/mana.c:2881: warning: target of assignment not really an lvalue; this will be a hard error in the future
So I checked the mana.c and found these lines
2880            /* shift data pointer to next item */
2881            (char *) pdata += key.item_size * key.num_values;
should be changed to
2880            /* shift data pointer to next item */
2881            pdata += key.item_size * key.num_values * sizeof(char) ;
  310   28 Sep 2006 Stefan RittBug Reportmhttpd elog corruption via double-edit
> I do not know how to "properly" fix this bug without changing the indexing
> scheme to something similar to what is used by elogd- message numbers instead of
> file indices. In the existing scheme, message editing also breaks URLs shown in
> the email notifications (they contain file indices that point to the wrong
> places after messages are moved around by editing) and "reply threading" links.

Well, the development of elogd with it's message numbers was actually stimulated by
the problem you mentioned. After that all those problems went away. Another
incarnation of that problem is if you edit an mhttpd log file manually. Afterwards
the file offsets are different and the system gets corrupted. To fix this properly,
one would have to backport the el_xxx functions from elogd to mhttpd, or, even
simpler, remove the elog functionality in mhttpd and "force" everybody to use elogd
(after doing elconv to convert the files into the new format).
  309   28 Sep 2006 Stefan RittSuggestionIncrease of maximum event size

K.O. wrote:
Now, we have per-buffer tunable size (see message
https://ladd00.triumf.ca/elog/Midas/283) and in the long run, I would prefer the
compiled-in limit to go away: already all memory is allocated dynamically and
the MAX_EVENT_SIZE is only useful as kind of a sanity check against frontend
misconfiguration or against malformed events.

If MAX_EVENT_SIZE goes away, the maximum event size becomes limited by the
largest SysV shared memory segment permitted by Linux (via sysctl kernel.shmmax).

To go beyound the limit on SysV shared memories, on can use mmap() based shared
memory: this is limited by available RAM+swap (and disk space for the
.SYSTEM.SHM file). Current MIDAS system.c has an experimental implementation of
mmap() shared memory, but AFAIK it has not been used in any production system, yet.


MAX_EVENT_SIZE is also used for the RPC layer, since the receiving buffer must hold at
least one event. It is right that this can and should be made dynamically. Concerning
the shared memory there is the problem that it cannot be increased when any program is
running and attached to the shared memory, so it can only be defined at startup of the
first program creating the shared memory.

The sanity check in the frontend is done against max_event_size defined in frontend.c which can be smaller than MAX_EVENT_SIZE (some front-ends have limited memory).

So I agree that this issue may need revision, maybe something for me next visit Wink
  308   27 Sep 2006 Konstantin OlchanskiSuggestionIncrease of maximum event size
> The current event size in midas is limited to 512k (MAX_EVENT_SIZE in midas.h)

Yes, 512 kBytes is rather small. For the T2K prototype TPC DAQ, I built and ran
MIDAS with 4 MByte events, and it worked fine.

Now, we have per-buffer tunable size (see message
https://ladd00.triumf.ca/elog/Midas/283) and in the long run, I would prefer the
compiled-in limit to go away: already all memory is allocated dynamically and
the MAX_EVENT_SIZE is only useful as kind of a sanity check against frontend
misconfiguration or against malformed events.

If MAX_EVENT_SIZE goes away, the maximum event size becomes limited by the
largest SysV shared memory segment permitted by Linux (via sysctl kernel.shmmax).

To go beyound the limit on SysV shared memories, on can use mmap() based shared
memory: this is limited by available RAM+swap (and disk space for the
.SYSTEM.SHM file). Current MIDAS system.c has an experimental implementation of
mmap() shared memory, but AFAIK it has not been used in any production system, yet.

K.O.
  307   27 Sep 2006 Konstantin OlchanskiBug Reportmhttpd elog corruption via double-edit
[quote="Stefan Ritt"][Quote="K.O.]Aparently the mhttpd elog will corrupt the
elog files if two (or more\?) elog entries are being edited at the same time.
K.O.[/quote]

The corruption is very simple. mhttpd elog indexes the elog entries by the elog
file and offset inside the file, i.e. "http://ladd00:8088/EL/060927.318",
"060927" corresponds to log file "060927.log", "318" is the offset inside the
file where the message is located.

During "edit", the code "remembers" the offset of the original message and in
el_submit() blindly writes the edited message into the file at the remembered
offset.

If another message was edited before the edit of the first message is submitted,
the remembered offset becomes invalid (messages have shifted inside the file)
and el_submit() writes the edited text into the wrong place in the file,
corrupting it.

I have now added a check for this and we crash instead of corrupting the elog
file (midas.c rev 3340).

I do not know how to "properly" fix this bug without changing the indexing
scheme to something similar to what is used by elogd- message numbers instead of
file indices. In the existing scheme, message editing also breaks URLs shown in
the email notifications (they contain file indices that point to the wrong
places after messages are moved around by editing) and "reply threading" links.

Here is how I reproduce this bug:

1) start with an empty elog
2) create two messages
3) "edit" the second message, but do not submit it yet.
4) "edit" the first message, change the text to make sure the message size
becomes different; submit this change.
5) submit the "edit" of the first message. !!BOOM!!

K.O.
  306   24 Sep 2006 Stefan RittBug Reportmhttpd elog corruption via double-edit

K.O. wrote:
Aparently the mhttpd elog will corrupt the elog files if two (or more\?) elog entries are being edited at the same time. K.O.


That's strange. Since mhttpd is single threaded, there should not be any multi-thread/process conflict there, since the elog files cannot be written simultaneously from two different browser sessions. If entries are edited at the same time, they get then submitted one after the other. Of course it is possible to edit the same entry, in which case the second submission "wins", overwriting the first one without notification. Withing the standalone elog server there is the option to lock entries ("use lock = 1") to prevent this, but this feature is not present in the mhttpd elog.
  305   23 Sep 2006 Konstantin OlchanskiBug Fixcommit latest ccusb.c CAMAC-USB driver
> I commited the latest driver for the Wiener CCUSB USB-CAMAC driver. It
> implements all functions from mcstd.h and has been tested to be plug-compatible
> with at least one of our CAMAC frontends. K.O.

This driver is known to not work with the latest CCUSB firmware (20x, 204, 30x, 303). I know what 
modifications are required and an updated driver will be available shortly. If there is a delay, and you need the 
driver ASAP, please drop me an email.

Also, I am thinking about dropping support for the very old CCUSB firmware revisions (before 204). (Any 
comments?)

K.O.
  304   23 Sep 2006 Konstantin OlchanskiBug Reportmhttpd elog corruption via double-edit
Aparently the mhttpd elog will corrupt the elog files if two (or more?) elog entries are being edited at the 
same time. K.O.
ELOG V3.1.4-2e1708b5