11 Feb 2007, Konstantin Olchanski, Info, svn and "make indent" trashed my svn checkout tree...
|
Fuming, fuming, fuming.
The combination of "make indent" and "svn update" completely trashed my work copy of midas. Half of
the files now show as status "M", half as status "C" ("in conflict"), even those I never edited myself (e.g.
mscb firmware files).
I think what happened is that once I ran "make indent", the indent program did things to the source
files (changed indentation, added spaces as in "foo(a,b,c);" --> "foo(a, b, c);", etc.), so now svn thinks that I
edited the files and that they are in conflict with later modifications.
I suggest that nobody ever ever ever should use "make indent", and if they do, they should better
commit their "changes" made by indent very quickly, before their midas tree is trashed by the next "svn
update".
And if they commit the changes made by "make indent", beware that "make indent" is not idempotent:
each time it is run, it keeps changing files (it keeps moving some dox comments around).
Also beware of entering a tug-of-war with Stefan - at least on my machines, my "make indent" seems
to produce different output from his.
Still fuming, even after some venting...
K.O. |
05 Feb 2007, Konstantin Olchanski, Bug Report, wrong version in include/midas.h?
|
The present .../include/midas.h contains
[alpha@laddvme06 ~/online]$ grep 1.9.5 /home/alpha/packages/midas/include/*
/home/alpha/packages/midas/include/midas.h:#define MIDAS_VERSION "1.9.5"
All MIDAS utilities (odbedit ver) presently report version 1.9.5, even for svn
trunk, and this may confuse people as to what version of midas they are using,
and may complicate reporting of bugs.
Perhaps the trunk version should say something like "svn-22233344" (the svn
revision number)? The present "1.9.5" is wrong...
K.O. |
06 Feb 2007, Stefan Ritt, Bug Report, wrong version in include/midas.h?
|
> The present .../include/midas.h contains
> [alpha@laddvme06 ~/online]$ grep 1.9.5 /home/alpha/packages/midas/include/*
> /home/alpha/packages/midas/include/midas.h:#define MIDAS_VERSION "1.9.5"
>
> All MIDAS utilities (odbedit ver) presently report version 1.9.5, even for svn
> trunk, and this may confuse people as to what version of midas they are using,
> and may complicate reporting of bugs.
>
> Perhaps the trunk version should say something like "svn-22233344" (the svn
> revision number)? The present "1.9.5" is wrong...
Fully agree. I added a svn_revision string into midas.h, which gets reported now
by "odbedit ver". Unfortunately this reflects only changes in midas.c. If one
changes odb.c for example, the svn revision in midas.c does not get modified by
the SVN system. In addition I changed the present version 1.9.5 to 2.0.0. I made
the tar and zip files. After some internal testing, it will be announced
officially in a few days. |
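One common way to get a revision number into a source file is Subversion keyword expansion (the "svn:keywords" property with "Rev"). It is expanded per file, which would match the limitation described above. Whether midas.h is implemented this way is an assumption, and the names demo_svn_revision and parse_svn_revision below are made up for illustration. A minimal sketch:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example string. With svn:keywords=Rev set on a file,
   Subversion rewrites the text between the dollar signs for that file only. */
static const char demo_svn_revision[] = "$Rev: 3562 $";

/* Extract the number from an expanded "$Rev: NNNN $" keyword.
   Returns -1 if the keyword was never expanded or is malformed. */
int parse_svn_revision(const char *keyword)
{
    int rev = -1;

    assert(keyword != NULL);
    if (sscanf(keyword, "$Rev: %d $", &rev) != 1)
        return -1;
    return rev;
}

/* Revision recorded in this (hypothetical) source file. */
int demo_revision(void)
{
    return parse_svn_revision(demo_svn_revision);
}
```

Because each file carries its own keyword, a number obtained this way says nothing about changes committed to other files.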
02 Feb 2007, Exaos Lee, Bug Report, Compiling failed with SVN3562 under Ubuntu 6.10
|
The error log is as follows:
cc -c -g -O2 -Wall -Wuninitialized -Iinclude -Idrivers -I../mxml -Llinux/lib -DINCLUDE_FTPLIB -D_LARGEFILE64_SOURCE -DHAVE_MYSQL -DHAVE_ROOT -pthread -I/opt/root/current/include -DOS_LINUX -fPIC -Wno-unused-function -o linux/lib/system.o src/system.c
src/system.c:958: error: expected declaration specifiers or ‘...’ before ‘gettid’
src/system.c:958: warning: data definition has no type or storage class
src/system.c:958: warning: type defaults to ‘int’ in declaration of ‘_syscall0’
src/system.c: In function ‘ss_gettid’:
src/system.c:1005: warning: implicit declaration of function ‘gettid’
src/system.c: In function ‘ss_suspend_init_ipc’:
src/system.c:2948: warning: pointer targets in passing argument 3 of ‘getsockname’ differ in signedness
src/system.c: In function ‘ss_suspend’:
src/system.c:3414: warning: pointer targets in passing argument 6 of ‘recvfrom’ differ in signedness
src/system.c:3441: warning: pointer targets in passing argument 6 of ‘recvfrom’ differ in signedness
make: *** [linux/lib/system.o] Error 1
The error might be here:
void ss_force_single_thread()
{
_single_thread = TRUE;
}
#if defined(OS_DARWIN)
// blank
#elif defined(OS_LINUX)
_syscall0(pid_t,gettid);
#endif
INT ss_gettid(void)
I have no idea how _syscall0(...) is supposed to be used. |
02 Feb 2007, Exaos Lee, Bug Report, Compiling failed with SVN3562 under Ubuntu 6.10
|
I tried to solve the problem by adding a ";". That was wrong: in fact, the macro "_syscall0(..)" doesn't need the ";".
I searched and found that somebody said "the overall _syscall$magicnumber will disappear". I don't mind whether "_syscall" disappears or not; I just want to compile the code and do my job. I deleted the additional ";" and recompiled. The error output is in the attachment [elog:335/1]. |
02 Feb 2007, Exaos Lee, Bug Fix, Problem solved by Re-define _syscall0(...)
|
OK, I searched and found that my kernel doesn't support "_syscall0" any more. So I patched system.c as follows (from line 954):
#if defined(OS_DARWIN)
// blank
#elif defined(OS_LINUX)
#include <sys/syscall.h>
#include <unistd.h>
#undef _syscall0
#define _syscall0(type, name) \
type name(void) \
{\
return syscall(__NR_##name); \
}
_syscall0(pid_t,gettid)
#endif
My kernel version:
exaos@memes midas>$ uname -a
Linux memes 2.6.17-10-generic #2 SMP Tue Dec 5 22:28:26 UTC 2006 i686 GNU/Linux
Maybe it's not the perfect way, but it works. |
06 Feb 2007, Stefan Ritt, Bug Fix, Problem solved by Re-define _syscall0(...)
|
Exaos Lee wrote: | Maybe it's not the perfect way, but it works. |
I changed it to:
#ifdef OS_UNIX
return syscall(SYS_gettid);
#endif /* OS_UNIX */
without any #define.
Does this work for you?
- Stefan |
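For reference, a self-contained sketch of the macro-free approach on Linux. The function name demo_gettid is hypothetical and this is not the actual code in system.c; it only illustrates calling syscall(SYS_gettid) directly instead of going through _syscall0:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* makes sure syscall() is declared by <unistd.h> */
#endif
#include <assert.h>
#include <sys/syscall.h>     /* SYS_gettid */
#include <sys/types.h>       /* pid_t */
#include <unistd.h>          /* syscall(), getpid() */

/* Return the kernel thread ID of the calling thread (Linux only). */
pid_t demo_gettid(void)
{
    long tid = syscall(SYS_gettid);

    assert(tid > 0);         /* gettid always succeeds on Linux */
    return (pid_t) tid;
}
```

In the main thread of a process the value equals getpid(); in other threads it differs, which is why a thread ID is needed in the first place.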
30 Jan 2007, Stefan Ritt, Bug Report, Large files under Windows XP
|
Hello,
We have problems analyzing large files under Windows XP. For small file sizes,
everything is ok. We have events of 2.8 MB each, and we can read ~30 events per
second. But if the file gets larger than typically 600-800 MB, then access
becomes very slow, about 1 event per second. This is not the case under Linux,
where it stays at 30 Hz (~90 MB/sec).
Looking at the low level file access, it is obvious that this has nothing to do
with midas: the problem can be reproduced with a simple program reading chunks
of 3MB from a 1GB file. The Windows XP file system is NTFS, default formatting.
Has anyone else observed a similar problem, or does anyone maybe even have some
suggestions? Unfortunately many people here want to analyze midas data under
Windows...
Stefan Ritt |
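A test program of the kind described above could look roughly like the following. This is only a sketch, not the program that was actually used; the function name read_in_chunks is made up, and timespec_get() assumes a C11 runtime:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock time in seconds (C11 timespec_get). */
static double wall_seconds(void)
{
    struct timespec ts;

    timespec_get(&ts, TIME_UTC);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/* Read a file sequentially in chunks of chunk_size bytes.
   Returns the number of bytes read, or -1 on error. If seconds is not
   NULL, it receives the elapsed wall-clock time of the read loop. */
long long read_in_chunks(const char *path, size_t chunk_size, double *seconds)
{
    FILE *f;
    char *buf;
    size_t n;
    long long total = 0;
    double start;

    assert(path != NULL && chunk_size > 0);

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    buf = malloc(chunk_size);
    if (buf == NULL) {
        fclose(f);
        return -1;
    }

    start = wall_seconds();
    while ((n = fread(buf, 1, chunk_size, f)) > 0)
        total += (long long) n;
    if (seconds != NULL)
        *seconds = wall_seconds() - start;

    if (ferror(f))
        total = -1;
    free(buf);
    fclose(f);
    return total;
}
```

Calling it with a chunk size of 3 * 1024 * 1024 on files of increasing size, and dividing the returned byte count by the elapsed seconds, should show at which file size the rate drops.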
26 Jan 2007, Carl Metelko, Forum, Front end electronics broadcast data over ethernet, can midas read this in
|
Hi,
the system I'm building will have data read into the frontend nodes
via ethernet (optic). Is this possible? |
21 Jan 2007, Denis Bilenko, Bug Report, buffer bugs
|
Hello,
We've been using midas and have stumbled upon some inconsistent behaviour:
1. Blocking calls to the midas api aren't usable when the client is connected through
mserver. This is true at least for bm_receive_event, but seems to be a more
general problem - a midas application has to call cm_yield within 10 seconds (or
whatever timeout is set) to remain alive.
That is not the case when RPC is not used.
2. On Windows, two processes on the same machine can send/receive events to
each other only if they both use midas locally (through shared mem) or they
both use midas via RPC (through mserver), but not if they use different ways.
3. Receiving/sending same events from the same process - was possible in
1.9.5-1, not so in the current version (revision 3501, mxml revision 45). Is this an intended behavior fix?
To explain how to reproduce the bugs, I will use 2 helper programs, evprint.py and
evsend.py, for receiving and sending events respectively. You don't need
them, just something to send and receive events. (These are part of pymidas, which will be
released to the public soon, but is quite usable already.)
They both accept
* --path option in "host/experiment" format (for the cm_connect_experiment call)
* --log option, which tells them to trace all midas calls to the terminal
evprint.py has two ways of receiving events
1) via looping over bm_receive_event
2) via providing a callback to bm_request_event and looping over the cm_yield(400) call
Example of use:
first-console$ python evprint.py receive
second-console$ python evsend.py 123
[first console]
id=2007 mask=2007 serial=2007 time=1169372833 len=3 '123'
So,
1. Blocking calls to midas api aren't usable when client is connected through
mserver.
$ python evprint.py --log --path 127.0.0.1/online receive
cm_connect_experiment('127.0.0.1', 'online', 'evprint.py', None)
bm_open_buffer('SYSTEM', 1048576, &c_long(2)) -> BM_CREATED
bm_request_event(2, -1, -1, 2, &c_long(0), None)
... wait for a couple of seconds ...
[midas.c:9348:rpc_call] rpc timeout, routine = "bm_receive_event"
[system.c:3570:send_tcp] send(socket=0,size=8) returned -1, errno: 88 (Socket operation on non-socket)
[midas.c:9326:rpc_call] send_tcp() failed
bm_receive_event(2, ...) -> RPC_TIMEOUT
bm_remove_event_request(2, 0) -> BM_INVALID_HANDLE
bm_close_buffer(2) -> BM_INVALID_HANDLE
cm_disconnect_experiment()
2. Missing events on windows
a) Both use midas locally - works
1: python evprint.py receive
2: python evsend.py 123
1: id=2007 mask=2007 serial=2007 time=1169372833 len=3 '123'
b) Both use midas via RPC - works
1: python evprint.py --path 127.0.0.1/ dispatch
2: python evsend.py --path 127.0.0.1/ 123
1: id=2007 mask=2007 serial=2007 time=1169373366 len=3 '123'
c) Receiver uses midas locally, sender uses mserver - doesn't work on windows
1: python evprint.py dispatch
2: python evsend.py --path 127.0.0.1/ 123
1: (nothing printed)
d) The other way around - doesn't work on windows
1: python evprint.py --path 127.0.0.1/ dispatch
2: python evsend.py 123
1: (nothing printed)
No such problem on linux.
3. Receiving/sending same events from the same process.
To reproduce this, just request events, send one and then try to receive
it via cm_yield. I care about this because I have a test in pymidas which
relies on this behavior.
Hope this helps. |
22 Jan 2007, Stefan Ritt, Bug Report, buffer bugs
|
Denis Bilenko wrote: | 1. Blocking calls to midas api aren't usable when client is connected through mserver. This is true at least for bm_receive_event, but seems to be a more general problem - midas application has call cm_yield within 10 seconds (or whatever timeout is set) to remain alive.
That not the case when RPC is not used. |
The 10 second timeout you see comes from the RPC layer. If you call bm_receive_event and it blocks, then the client will report an RPC timeout after 10 seconds. This has nothing to do with cm_yield(). Calling a blocking function via a server connection is not a good idea anyhow, since the process then cannot respond to anything else, like run transitions. That's why I never used it and why I had not noticed that behaviour. I did however change it such that bm_receive_event, if called without the ASYNC flag, disables the RPC timeout for this call and restores it afterwards. This is now in midas.c revision 3502. You can try this easily with midas/examples/lowlevel/produce and consume.
Denis Bilenko wrote: | 2. On Windows, two processes on the same machine can send/receive events to each other only if they both use midas locally (through shared mem) or they both use midas via RPC (through mserver), but not if they use different ways. |
I just tried again and it did work. I used produce/consume. If you enter just <return> for the host name, these programs connect locally. So I tried producer locally with consumer remote, and vice versa, and both worked. I did however use consume with the callback functionality, and I did not try your Python programs. If you find that produce/consume work and your Python programs don't, then adapt your Python programs to resemble produce/consume.
Denis Bilenko wrote: | 3. Receiving/sending same events from the same process - was possible in 1.9.5-1, not so in the current version (revision 3501, mxml revision 45). Is this an intended behavior fix? |
Yes. It was introduced in revision 3186 on July 28th, 2006. It fixed a problem where the buffer level was always shown as 100% full, even if there were no other clients registered. By ignoring its own process, the buffer level now correctly shows the "contents" of a buffer from 0..100%. It also gave a small speed improvement. If you want to send events to your own process, you have to do it at the calling level: when you call bm_send_event(), you also manually call process_event, or whatever your event receiving routine is called. This is also much faster than going through the buffer. |
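The change to bm_receive_event mentioned in this reply (disable the RPC timeout around a blocking call, then restore it) follows a simple save/disable/restore pattern. The sketch below illustrates only that pattern with made-up stand-ins; demo_rpc_timeout_ms, call_without_timeout and demo_blocking_receive are hypothetical names and not part of the midas API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the RPC layer's timeout in milliseconds.
   In this sketch, 0 means "wait forever". */
static int demo_rpc_timeout_ms = 10000;

typedef int (*blocking_call_fn)(void *arg);

/* Run a blocking call with the timeout disabled, then restore the
   previous timeout regardless of what the call returned. */
int call_without_timeout(blocking_call_fn fn, void *arg)
{
    int saved, status;

    assert(fn != NULL);
    saved = demo_rpc_timeout_ms;
    demo_rpc_timeout_ms = 0;          /* disable for this call only */
    status = fn(arg);
    demo_rpc_timeout_ms = saved;      /* restore afterwards */
    return status;
}

/* Fake blocking call: records the timeout that was in effect while it ran. */
int demo_blocking_receive(void *arg)
{
    *(int *) arg = demo_rpc_timeout_ms;
    return 1;
}
```

Restoring the saved value, rather than a fixed default, keeps any timeout the application had configured earlier.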
23 Jan 2007, Denis Bilenko, Bug Report, buffer bugs
|
1 & 3 - thanks for the fix and the explanation. As for 2 - I've tried consume and produce
and still have a problem:
Config: GET_ALL, event id = 1, event size = 10, Receive via callback,
OS = Windows XP SP2
I restart mserver manually from command-line every time (not using system service).
I start produce first, then I start consume.
In two cases out of four, starting 'consume' causes 'produce' to exit immediately.
Guess which two:
both local or both remote - works (i.e. non-zero rates in both consoles)
produce local, consume via rpc and vice versa - 'produce' exits with error
1. produce via rpc, consume locally
First console:
D:\denis\cmd\midas\current\06jan21-export\midas\NT\bin>produce.exe
ID of event to produce: 1
Host to connect: 127.0.0.1
Event size: 10
Level: 0.0 %, Rate: 0.64 MB/sec
flush
Level: 0.0 %, Rate: 0.64 MB/sec
Level: 0.0 %, Rate: 0.63 MB/sec
Level: 0.0 %, Rate: 0.64 MB/sec
Level: 0.0 %, Rate: 0.61 MB/sec
Level: 0.0 %, Rate: 0.62 MB/sec
Level: 0.0 %, Rate: 0.62 MB/sec
Level: 0.0 %, Rate: 0.64 MB/sec
Level: 0.0 %, Rate: 0.63 MB/sec
Level: 0.0 %, Rate: 0.63 MB/sec
Level: 0.0 %, Rate: 0.64 MB/sec
flush
Level: 0.0 %, Rate: 0.62 MB/sec
## Now I've started consume in the other console ##
[system.c:3570:send_tcp] send(socket=1900,size=8136) returned -1, errno: 0 (No error)
send_tcp() returned -1
[midas.c:9669:rpc_send_event] send_tcp() failed
rpc_send_event returned error 503, event_size 10
Second console:
D:\denis\cmd\midas\current\06jan21-export\midas\NT\bin>consume.exe
ID of event to request: 1
Host to connect:
Get all events (0/1): 1
Receive via callback ([y]/n):
Level: 0.0 %, Rate: 0.00 MB/sec, ser mismatches: 0
Level: 0.0 %, Rate: 0.00 MB/sec, ser mismatches: 0
Level: 0.0 %, Rate: 0.00 MB/sec, ser mismatches: 0
Received break. Aborting...
mserver's output:
D:\denis\cmd\midas\current\06jan21-export\midas\NT\bin\mserver.exe started interactively
[midas.c:2315:bm_validate_client_index] Invalid client index 0 in buffer 'SYSTEM'.
Client name 'Power Consumer', pid 1964 should be 3216
2. produce locally, consume via rpc
D:\denis\cmd\midas\current\06jan21-export\midas\NT\bin>produce.exe
ID of event to produce: 1
Host to connect:
Event size: 10
Client 'Producer' (PID 2584) on 'ODB' removed by cm_watchdog (idle 144.1s,TO 10s)
Level: 0.0 %, Rate: 3.20 MB/sec
flush
Level: 0.0 %, Rate: 3.20 MB/sec
Level: 0.0 %, Rate: 3.11 MB/sec
Level: 0.0 %, Rate: 3.13 MB/sec
Level: 0.0 %, Rate: 3.06 MB/sec
Level: 0.0 %, Rate: 3.20 MB/sec
Level: 0.0 %, Rate: 2.96 MB/sec
Level: 0.0 %, Rate: 3.11 MB/sec
Level: 0.0 %, Rate: 3.18 MB/sec
Level: 0.0 %, Rate: 3.13 MB/sec
Level: 0.0 %, Rate: 3.17 MB/sec
flush
Level: 0.0 %, Rate: 3.19 MB/sec
Level: 0.0 %, Rate: 3.08 MB/sec
Level: 0.0 %, Rate: 3.06 MB/sec
## Now I've started consume ##
[midas.c:2315:bm_validate_client_index] Invalid client index 0 in buffer 'SYSTEM'. Client name '', pid 0 should be 760
Second console:
D:\denis\cmd\midas\current\06jan21-export\midas\NT\bin>consume.exe
ID of event to request: 1
Host to connect: 127.0.0.1
Get all events (0/1): 1
Receive via callback ([y]/n):
Level: 0.0 %, Rate: 0.00 MB/sec, ser mismatches: 0
Level: 0.0 %, Rate: 0.00 MB/sec, ser mismatches: 0
Received break. Aborting...
Level: 0.0 %, Rate: 0.00 MB/sec, ser mismatches: 0
mserver did not print anything.
3. Both remote (just for comparison)
D:\denis\cmd\midas\current\06jan21-export\midas\NT\bin>produce.exe
ID of event to produce: 1
Host to connect: 127.0.0.1
Event size: 10
Level: 0.0 %, Rate: 0.65 MB/sec
flush
Level: 0.0 %, Rate: 0.66 MB/sec
Level: 0.0 %, Rate: 0.65 MB/sec
Level: 0.0 %, Rate: 0.60 MB/sec
Level: 0.0 %, Rate: 0.64 MB/sec
Level: 0.0 %, Rate: 0.63 MB/sec
Level: 0.0 %, Rate: 0.61 MB/sec
Level: 0.0 %, Rate: 0.63 MB/sec
Level: 0.0 %, Rate: 0.65 MB/sec
Level: 0.0 %, Rate: 0.65 MB/sec
Level: 0.0 %, Rate: 0.67 MB/sec
flush
Level: 0.0 %, Rate: 0.66 MB/sec
Level: 0.0 %, Rate: 0.65 MB/sec
Level: 0.0 %, Rate: 0.65 MB/sec
Level: 0.0 %, Rate: 0.66 MB/sec
Level: 0.0 %, Rate: 0.66 MB/sec
Level: 0.0 %, Rate: 0.65 MB/sec
Level: 0.0 %, Rate: 0.66 MB/sec
Level: 0.0 %, Rate: 0.66 MB/sec
Level: 0.0 %, Rate: 0.66 MB/sec
Level: 66.8 %, Rate: 0.66 MB/sec
flush
Level: 0.0 %, Rate: 0.00 MB/sec
Level: 66.8 %, Rate: 0.31 MB/sec
Level: 57.2 %, Rate: 0.15 MB/sec
Level: 57.3 %, Rate: 0.14 MB/sec
Level: 57.3 %, Rate: 0.15 MB/sec
Level: 57.3 %, Rate: 0.14 MB/sec
Level: 57.3 %, Rate: 0.14 MB/sec
Level: 57.3 %, Rate: 0.14 MB/sec
Received break. Aborting...
Received 2nd break. Hard abort.
[midas.c:1581:] cm_disconnect_experiment not called at end of program
Second console:
D:\denis\cmd\midas\current\06jan21-export\midas\NT\bin>consume.exe
ID of event to request: 1
Host to connect: 127.0.0.1
Get all events (0/1): 1
Receive via callback ([y]/n):
[consume.c:73:process_event] Serial number mismatch: Ser: 1397076, OldSer: 0, ID: 1, size: 10
Level: 37.1 %, Rate: 0.00 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.15 MB/sec, ser mismatches: 1
Level: 95.4 %, Rate: 0.08 MB/sec, ser mismatches: 1
Level: 66.8 %, Rate: 0.14 MB/sec, ser mismatches: 1
Level: 66.8 %, Rate: 0.12 MB/sec, ser mismatches: 1
Level: 76.3 %, Rate: 0.12 MB/sec, ser mismatches: 1
Level: 95.4 %, Rate: 0.11 MB/sec, ser mismatches: 1
Level: 57.3 %, Rate: 0.15 MB/sec, ser mismatches: 1
Level: 66.8 %, Rate: 0.11 MB/sec, ser mismatches: 1
Level: 85.9 %, Rate: 0.11 MB/sec, ser mismatches: 1
Level: 95.5 %, Rate: 0.12 MB/sec, ser mismatches: 1
Level: 57.4 %, Rate: 0.15 MB/sec, ser mismatches: 1
Level: 9.7 %, Rate: 0.15 MB/sec, ser mismatches: 1
[Producer] [midas.c:1581:] cm_disconnect_experiment not called at end of program
Level: 0.0 %, Rate: 0.03 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.00 MB/sec, ser mismatches: 1
Received break. Aborting...
|
23 Jan 2007, Stefan Ritt, Bug Report, buffer bugs
|
Denis Bilenko wrote: | 1 & 3 - thanks for the fix and the explanation, as for 2 - I've tried consume and produce
and still has a problem |
Acknowledged. I could reproduce it with the information you supplied, thank you very much. Also the data rate is slower than what I expect. I will investigate and fix this, but it could take some time. |
24 Jan 2007, Stefan Ritt, Bug Report, buffer bugs
|
I tried again and could not reproduce the problem. Last time I was probably confused by some old mserver.exe executable I had lying around. I updated to the most recent version (3516) and ran C:\midas> nmake -f makefile.nt. Last time I was also confused about the low rate, but that was caused by an mserver.exe executable which was not compiled with optimization. For small event sizes (such as 10 bytes) there is a big difference between optimized and non-optimized code. So I got:
First Console wrote: | ID of event to produce: 1
Host to connect: localhost
Event size: 10
Level: 0.0 %, Rate: 0.46 MB/sec
flush
Level: 0.0 %, Rate: 0.43 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
Level: 0.0 %, Rate: 0.42 MB/sec
Level: 0.0 %, Rate: 0.42 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
Level: 0.0 %, Rate: 0.44 MB/sec
Level: 0.0 %, Rate: 0.42 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
flush
Level: 0.0 %, Rate: 0.44 MB/sec
Level: 0.0 %, Rate: 0.44 MB/sec
Level: 0.0 %, Rate: 0.40 MB/sec
Level: 0.0 %, Rate: 0.42 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
Level: 0.0 %, Rate: 0.44 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
Level: 0.0 %, Rate: 0.43 MB/sec
flush
|
and
Second Console wrote: | C:\midas\NT\bin>.\consume
ID of event to request: 1
Host to connect:
Get all events (0/1): 1
Receive via callback ([y]/n):
[consume.c:73:process_event] Serial number mismatch: Ser: 1169666, OldSer: 0, ID: 1, size: 10
Level: 0.0 %, Rate: 0.00 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.42 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.42 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 2.4 %, Rate: 0.35 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.50 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.41 MB/sec, ser mismatches: 1
Level: 0.0 %, Rate: 0.40 MB/sec, ser mismatches: 1
Received break. Aborting...
|
Actually, sending remote and receiving local is a very common thing. Most experiments use that: they have a remote frontend, and the logger and analyzer work locally. If that did not work, all these experiments would have a problem. So I can only encourage you to try again, and make sure to update and recompile the executables. Maybe delete any old *.SHM files. Maybe try on another PC or under Linux. |
11 Jan 2007, Steve Hardy, Forum, Shared memory problems
|
Hello,
Just did a fresh install of MIDAS from the SVN repository under CentOS and
everything compiles fine, but when I go to run the frontend (using dio), I get
the following error message:
Connect to experiment ...[odb.c:868:db_open_database] Different database format:
Shared memory is 14, program is 2
[midas.c:1763:cm_connect_experiment1] cannot open database
Any ideas on what the problem could be, or how to fix it?
~Steve |
11 Jan 2007, Stefan Ritt, Forum, Shared memory problems
|
> Hello,
>
> Just did a fresh install of MIDAS from the SVN repository under CentOS and
> everything compiles fine, but when I go to run the frontend (using dio), I get
> the following error message:
>
> Connect to experiment ...[odb.c:868:db_open_database] Different database format:
> Shared memory is 14, program is 2
> [midas.c:1763:cm_connect_experiment1] cannot open database
>
>
> Any ideas on what the problem could be, or how to fix it?
You have an old .ODB.SHM from a previous version in your directory (note the '.' in
front, so you need a 'ls -alg' to see it). Delete that file and try again. |
11 Jan 2007, Steve Hardy, Forum, Shared memory problems
|
Thanks for your help. I tried again and it got me back to the initial problem I had.
The frontend will start, and the analyzer starts (complains about there not being a
last.root, but other than that it's fine), and then when starting mlogger, I get:
[odb.c:860:db_validate_db] Warning: database corruption, first_free_key 0x0001A404
[odb.c:3666:db_get_key] invalid key handle
[midas.c:1970:cm_check_client] cannot delete client info
[odb.c:3666:db_get_key] invalid key handle
[midas.c:1970:cm_check_client] cannot delete client info
[odb.c:3666:db_get_key] invalid key handle
And it continues to shoot out error messages about invalid key handles until I kill
it. Then trying to start the frontend again fails until I remove the .ODB.SHM file.
Any other ideas?
> > Hello,
> >
> > Just did a fresh install of MIDAS from the SVN repository under CentOS and
> > everything compiles fine, but when I go to run the frontend (using dio), I get
> > the following error message:
> >
> > Connect to experiment ...[odb.c:868:db_open_database] Different database format:
> > Shared memory is 14, program is 2
> > [midas.c:1763:cm_connect_experiment1] cannot open database
> >
> >
> > Any ideas on what the problem could be, or how to fix it?
>
> You have an old .ODB.SHM from a previous version in your directory (note the '.' in
> front, so you need a 'ls -alg' to see it). Delete that file and try again. |
11 Jan 2007, Stefan Ritt, Forum, Shared memory problems
|
That sounds like you are mixing versions: you have an old executable (maybe your mlogger) which
has been linked against the old midas version, but you create the ODB with the new
odbedit or frontend. The new version complains if it finds an ODB from a previous version
(the error you reported first), but an old program does not have that version check, so
it finds a different binary ODB structure and crashes.
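The version check described above can be pictured roughly as follows. This is a minimal sketch under assumptions: the header layout, the names DEMO_DB_HEADER and demo_open_database, and the status codes are invented for illustration and are not the real ODB structures. The numbers 14 and 2 are taken from the error message reported earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format version compiled into the program. */
#define DEMO_DB_VERSION 2

/* Hypothetical header at the start of the shared memory file. */
typedef struct {
    int version;          /* format version written by the creating program */
    int first_free_key;   /* layout-dependent data follows */
} DEMO_DB_HEADER;

enum { DEMO_DB_OK = 1, DEMO_DB_VERSION_MISMATCH = 2 };

/* A program with the check refuses to touch a file written in another
   format. A program without it would go on to interpret the remaining
   fields with the wrong layout. */
int demo_open_database(const DEMO_DB_HEADER *hdr)
{
    assert(hdr != NULL);
    if (hdr->version != DEMO_DB_VERSION)
        return DEMO_DB_VERSION_MISMATCH;
    return DEMO_DB_OK;
}
```

An executable built before such a check existed skips the comparison entirely, which would fit the "invalid key handle" errors seen from the old mlogger.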
> Thanks for your help. I tried again and it got me back to the initial problem I had.
> The frontend will start, and the analyzer starts (complains about there not being a
> last.root, but other than that it's fine), and then when starting mlogger, I get:
>
> [odb.c:860:db_validate_db] Warning: database corruption, first_free_key 0x0001A404
> [odb.c:3666:db_get_key] invalid key handle
> [midas.c:1970:cm_check_client] cannot delete client info
> [odb.c:3666:db_get_key] invalid key handle
> [midas.c:1970:cm_check_client] cannot delete client info
> [odb.c:3666:db_get_key] invalid key handle
>
>
> And it continues to shoot out error messages about invalid key handles until I kill
> it. Then trying to start the frontend again fails until I remove the .ODB.SHM file.
> Any other ideas?
>
> > > Hello,
> > >
> > > Just did a fresh install of MIDAS from the SVN repository under CentOS and
> > > everything compiles fine, but when I go to run the frontend (using dio), I get
> > > the following error message:
> > >
> > > Connect to experiment ...[odb.c:868:db_open_database] Different database format:
> > > Shared memory is 14, program is 2
> > > [midas.c:1763:cm_connect_experiment1] cannot open database
> > >
> > >
> > > Any ideas on what the problem could be, or how to fix it?
> >
> > You have an old .ODB.SHM from a previous version in your directory (note the '.' in
> > front, so you need a 'ls -alg' to see it). Delete that file and try again. |
27 Dec 2006, Eric-Olivier LE BIGOT, Forum, Access to out_info from mana.c
|
Hello,
Is it possible to access out_info (defined in mana.c) from another program?
In fact, out_info is now defined as an (anonymous) "static struct" in mana.c,
which it seems to me precludes any direct use in another program. Is there an
indirect way of getting ahold of out_info? or of the information it contains?
out_info used to be defined as a *non-static* struct, and the code I'm currently
modifying used to compile seamlessly: it now stops the compilation during
linking time, as out_info is now static and the program I have to compile
contains an "extern struct {} out_info".
Any help would be much appreciated! I searched in vain in this forum for
details about out_info and I really need to access the information it contains!
EOL (a pure MIDAS novice) |
05 Jan 2007, Eric-Olivier LE BIGOT, Suggestion, Access to out_info from mana.c
|
Would it be relevant to transform out_info into a *non-static* variable of a type
defined by a *named* struct?
Currently, programs that try to access out_info cannot do it anymore; and they
typically copy the struct definition from mana.c, which is not robust against future
changes in mana.c.
If mana.c could be changed in the way described above, that would be great.
Otherwise, is it safe to patch it myself for local use? or is there a better way of
accessing out_info from mana.c?
As always, any help would be much appreciated :)
EOL
> Hello,
>
> Is it possible to access out_info (defined in mana.c) from another program?
>
> In fact, out_info is now defined as an (anonymous) "static struct" in mana.c,
> which it seems to me precludes any direct use in another program. Is there an
> indirect way of getting ahold of out_info? or of the information it contains?
>
> out_info used to be defined as a *non-static* struct, and the code I'm currently
> modifying used to compile seamlessly: it now stops the compilation during
> linking time, as out_info is now static and the program I have to compile
> contains an "extern struct {} out_info".
>
> Any help would be much appreciated! I searched in vain in this forum for
> details about out_info and I really need to access the information it contains!
>
> EOL (a pure MIDAS novice) |
08 Jan 2007, Stefan Ritt, Suggestion, Access to out_info from mana.c
|
I changed out_info into a global structure definition ANA_OUTPUT_INFO and put it into
midas.h, so it can be accessed easily from the user analyzer source code.
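To illustrate the difference this makes: a named struct type in a shared header lets any source file declare the variable as "extern" with a matching type, instead of copying an anonymous struct definition. The sketch below uses the hypothetical names DEMO_OUTPUT_INFO and demo_out_info with invented fields; it does not reproduce the real ANA_OUTPUT_INFO layout.

```c
#include <assert.h>
#include <string.h>

/* What a shared header could contain: a named type (fields are
   hypothetical) and an extern declaration of the variable. */
typedef struct {
    char filename[256];
    int  format;
} DEMO_OUTPUT_INFO;

extern DEMO_OUTPUT_INFO demo_out_info;

/* What exactly one .c file defines. Without "static" the symbol has
   external linkage, so other files can link against it. */
DEMO_OUTPUT_INFO demo_out_info;

/* Any file that includes the header can now read or write the fields. */
void demo_set_output(const char *name, int format)
{
    assert(name != NULL && strlen(name) < sizeof(demo_out_info.filename));
    strcpy(demo_out_info.filename, name);
    demo_out_info.format = format;
}
```

If the struct later gains a field, user code picks it up from the header on the next rebuild instead of silently keeping a stale copy.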
> Would it be relevant to transform out_info into a *non-static* variable of a type
> defined by a *named* struct?
> Currently, programs that try to access out_info cannot do it anymore; and they
> typically copy the struct definition from mana.c, which is not robust against future
> changes in mana.c.
>
> If mana.c could be changed in the way described above, that would be great .
> Otherwise, is it safe to patch it myself for local use? or is there a better way of
> accessing out_info from mana.c?
>
> As always, any help would be much appreciated :)
>
> EOL
>
> > Hello,
> >
> > Is it possible to access out_info (defined in mana.c) from another program?
> >
> > In fact, out_info is now defined as an (anonymous) "static struct" in mana.c,
> > which it seems to me precludes any direct use in another program. Is there an
> > indirect way of getting ahold of out_info? or of the information it contains?
> >
> > out_info used to be defined as a *non-static* struct, and the code I'm currently
> > modifying used to compile seamlessly: it now stops the compilation during
> > linking time, as out_info is now static and the program I have to compile
> > contains an "extern struct {} out_info".
> >
> > Any help would be much appreciated! I searched in vain in this forum for
> > details about out_info and I really need to access the information it contains!
> >
> > EOL (a pure MIDAS novice) |
26 Oct 2006, Hans Fynbo, Forum, Setup of Ortec ADC AD413A in MIDAS
|
We are new to MIDAS and are trying to set up a simple system with one Ortec CAMAC ADC
AD413A and the hytec1331 controller. If anyone has used this module in MIDAS, we
would be grateful for the corresponding frontend.c etc.
It would be very useful to have somewhere examples of files used by various
experiments in addition to the example files provided in the installation.
Best regards,
Hans |
16 Oct 2006, Exaos Lee, Bug Fix, Build error with mana.c while using CERNLIB, svn 3366
|
If you use CERNLIB to build hmana.o, you may encounter the following error:
src/mana.c: In function ‘write_event_hbook’:
src/mana.c:2881: error: invalid assignment
or something like this:
src/mana.c: In function ‘write_event_hbook’:
src/mana.c:2881: warning: target of assignment not really an lvalue; this will be a hard error in the future
So I checked the mana.c and found these lines
2880 /* shift data pointer to next item */
2881 (char *) pdata += key.item_size * key.num_values;
should be changed to
2880 /* shift data pointer to next item */
2881 pdata += key.item_size * key.num_values * sizeof(char) ;
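
A note on the change above: the result of a cast is not an lvalue in standard C, which is why gcc rejects "(char *) pdata += ...". Also, sizeof(char) is 1 by definition, so the extra factor does not change anything, and if pdata is declared as void * the "+=" form relies on a GNU extension. A portable way to write the same step is to put the cast on the right-hand side. The helper below is a minimal sketch; advance_bytes is a hypothetical name and the declared type of pdata in mana.c is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Advance an untyped data pointer by item_size * num_values bytes.
   The cast is applied to the value on the right-hand side, so nothing
   is assigned through a cast expression. */
static void *advance_bytes(void *pdata, size_t item_size, size_t num_values)
{
   assert(pdata != NULL);
   return (char *) pdata + item_size * num_values;
}
```

In the loop this would read "pdata = advance_bytes(pdata, key.item_size, key.num_values);" or simply "pdata = (char *) pdata + key.item_size * key.num_values;".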
|
16 Oct 2006, Stefan Ritt, Bug Fix, Build error with mana.c while using CERNLIB, svn 3366
|
Committed, thanks. |
23 Sep 2006, Konstantin Olchanski, Bug Report, mhttpd elog corruption via double-edit
|
Apparently the mhttpd elog will corrupt the elog files if two (or more?) elog entries are being edited at the
same time. K.O. |
24 Sep 2006, Stefan Ritt, Bug Report, mhttpd elog corruption via double-edit
|
K.O. wrote: | Apparently the mhttpd elog will corrupt the elog files if two (or more?) elog entries are being edited at the same time. K.O. |
That's strange. Since mhttpd is single threaded, there should not be any multi-thread/process conflict there, since the elog files cannot be written simultaneously from two different browser sessions. If entries are edited at the same time, they then get submitted one after the other. Of course it is possible to edit the same entry, in which case the second submission "wins", overwriting the first one without notification. Within the standalone elog server there is the option to lock entries ("use lock = 1") to prevent this, but this feature is not present in the mhttpd elog. |
27 Sep 2006, Konstantin Olchanski, Bug Report, mhttpd elog corruption via double-edit
|
> Apparently the mhttpd elog will corrupt the
> elog files if two (or more?) elog entries are being edited at the same time.
> K.O.
The corruption is very simple. mhttpd elog indexes the elog entries by the elog
file and offset inside the file, i.e. "http://ladd00:8088/EL/060927.318",
"060927" corresponds to log file "060927.log", "318" is the offset inside the
file where the message is located.
During "edit", the code "remembers" the offset of the original message and in
el_submit() blindly writes the edited message into the file at the remembered
offset.
If another message was edited before the edit of the first message is submitted,
the remembered offset becomes invalid (messages have shifted inside the file)
and el_submit() writes the edited text into the wrong place in the file,
corrupting it.
I have now added a check for this and we crash instead of corrupting the elog
file (midas.c rev 3340).
I do not know how to "properly" fix this bug without changing the indexing
scheme to something similar to what is used by elogd- message numbers instead of
file indices. In the existing scheme, message editing also breaks URLs shown in
the email notifications (they contain file indices that point to the wrong
places after messages are moved around by editing) and "reply threading" links.
Here is how I reproduce this bug:
1) start with an empty elog
2) create two messages
3) "edit" the second message, but do not submit it yet.
4) "edit" the first message, change the text to make sure the message size
becomes different; submit this change.
5) submit the "edit" of the second message. !!BOOM!!
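
The sequence above can be simulated in memory. The sketch below is not the midas.c code: the record format ("<id>:<text>\n"), the buffer and the function names are all hypothetical. It only shows how a remembered byte offset goes stale once an earlier record changes size, and how a check before writing can refuse the edit instead of overwriting the wrong bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LOG_CAP 256

static char demo_log[LOG_CAP];  /* stands in for the contents of one log file */

/* Append "<id>:<text>\n"; return the byte offset of the new record or -1. */
static int demo_append(int id, const char *text)
{
   size_t off = strlen(demo_log);
   int n;

   assert(text != NULL);
   n = snprintf(demo_log + off, LOG_CAP - off, "%d:%s\n", id, text);
   if (n < 0 || (size_t) n >= LOG_CAP - off) {
      demo_log[off] = '\0';     /* undo a truncated append */
      return -1;
   }
   return (int) off;
}

/* Replace the record of message 'id' at remembered offset 'off'.
   Returns 0 on success, -1 if the offset no longer points at that record. */
static int demo_submit_edit(int off, int id, const char *text)
{
   char tag[16], rec[LOG_CAP], tail[LOG_CAP];
   size_t len = strlen(demo_log);
   char *start, *end;
   int n;

   if (off < 0 || (size_t) off >= len)
      return -1;
   if (off > 0 && demo_log[off - 1] != '\n')
      return -1;                /* not at the start of a record */
   snprintf(tag, sizeof(tag), "%d:", id);
   if (strncmp(demo_log + off, tag, strlen(tag)) != 0)
      return -1;                /* a different record lives here now */

   start = demo_log + off;
   end = strchr(start, '\n');
   if (end == NULL)
      return -1;
   n = snprintf(rec, sizeof(rec), "%d:%s\n", id, text);
   if (n < 0 || (size_t) n >= sizeof(rec))
      return -1;
   strcpy(tail, end + 1);       /* everything after the old record */
   if ((size_t) off + (size_t) n + strlen(tail) >= LOG_CAP)
      return -1;
   memcpy(start, rec, (size_t) n);
   strcpy(start + n, tail);
   return 0;
}
```

With two messages appended, remembering the offset of message 2, then growing message 1 via demo_submit_edit(), a later demo_submit_edit() for message 2 at the remembered offset returns -1 and leaves the buffer untouched.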
K.O. |
28 Sep 2006, Stefan Ritt, Bug Report, mhttpd elog corruption via double-edit
|
> I do not know how to "properly" fix this bug without changing the indexing
> scheme to something similar to what is used by elogd- message numbers instead of
> file indices. In the existing scheme, message editing also breaks URLs shown in
> the email notifications (they contain file indices that point to the wrong
> places after messages are moved around by editing) and "reply threading" links.
Well, the development of elogd with its message numbers was actually stimulated by
the problem you mentioned. After that all those problems went away. Another
incarnation of that problem is if you edit an mhttpd log file manually. Afterwards
the file offsets are different and the system gets corrupted. To fix this properly,
one would have to backport the el_xxx functions from elogd to mhttpd, or, even
simpler, remove the elog functionality in mhttpd and "force" everybody to use elogd
(after doing elconv to convert the files into the new format). |
20 Sep 2006, Stefan Ritt, Suggestion, Increase of maximum event size
|
Dear midas users,
The current event size in midas is limited to 512k (MAX_EVENT_SIZE in midas.h). This is mainly due to old (pre 2.2) linux kernels which had only a very limited shared memory pool. These days this limit has increased considerably and I question if we should increase the default event size and to which size we should increase it.
The drawback of a larger event size is that the SYSTEM event buffer has to hold at least two events, and when the last midas program is stopped or started, this buffer has to be written to or read from the .SYSTEM.SHM file, which slows down the start/stop of the program. But writing/reading a few MB is fast these days anyhow so this again might not be a big problem. So how big do you think we should make the default max event size?
- Stefan |
20 Sep 2006, Stefan Ritt, Suggestion, Increase of maximum event size
|
Since nobody complained so far, I increased MAX_EVENT_SIZE to 2MB. If anybody has problems with this setting, please report. Note that after updating to SVN revision 3327 it will be necessary to recompile all midas programs and to delete any old SYSTEM.SHM or .SYSTEM.SHM. I added some code which should check for inconsistent SYSTEM.SHM sizes, but I'm not sure if it works everywhere. |
27 Sep 2006, Konstantin Olchanski, Suggestion, Increase of maximum event size
|
> The current event size in midas is limited to 512k (MAX_EVENT_SIZE in midas.h)
Yes, 512 kBytes is rather small. For the T2K prototype TPC DAQ, I built and ran
MIDAS with 4 MByte events, and it worked fine.
Now, we have per-buffer tunable size (see message
https://ladd00.triumf.ca/elog/Midas/283) and in the long run, I would prefer the
compiled-in limit to go away: already all memory is allocated dynamically and
the MAX_EVENT_SIZE is only useful as kind of a sanity check against frontend
misconfiguration or against malformed events.
If MAX_EVENT_SIZE goes away, the maximum event size becomes limited by the
largest SysV shared memory segment permitted by Linux (via sysctl kernel.shmmax).
To go beyond the limit on SysV shared memories, one can use mmap() based shared
memory: this is limited by available RAM+swap (and disk space for the
.SYSTEM.SHM file). Current MIDAS system.c has an experimental implementation of
mmap() shared memory, but AFAIK it has not been used in any production system, yet.
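
For reference, file-backed mmap() shared memory looks roughly like this on a POSIX system. This is a generic sketch, not the experimental code in system.c; demo_shm_map and demo_shm_unmap are hypothetical names and the backing file path is chosen by the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map 'size' bytes of the file 'path' as shared memory, creating the
   file if needed. Every process mapping the same file with MAP_SHARED
   sees the same bytes. Returns NULL on failure. */
static void *demo_shm_map(const char *path, size_t size)
{
   void *p;
   int fd;

   assert(path != NULL && size > 0);
   fd = open(path, O_RDWR | O_CREAT, 0644);
   if (fd < 0)
      return NULL;
   if (ftruncate(fd, (off_t) size) != 0) {      /* set the file size */
      close(fd);
      return NULL;
   }
   p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
   close(fd);                   /* the mapping stays valid after close() */
   return p == MAP_FAILED ? NULL : p;
}

/* Remove the mapping; returns 0 on success. */
static int demo_shm_unmap(void *p, size_t size)
{
   return munmap(p, size);
}
```

The size limit here comes from the file system and from available memory rather than from kernel.shmmax.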
K.O. |
28 Sep 2006, Stefan Ritt, Suggestion, Increase of maximum event size
|
K.O. wrote: | Now, we have per-buffer tunable size (see message
https://ladd00.triumf.ca/elog/Midas/283) and in the long run, I would prefer the
compiled-in limit to go away: already all memory is allocated dynamically and
the MAX_EVENT_SIZE is only useful as kind of a sanity check against frontend
misconfiguration or against malformed events.
If MAX_EVENT_SIZE goes away, the maximum event size becomes limited by the
largest SysV shared memory segment permitted by Linux (via sysctl kernel.shmmax).
To go beyond the limit on SysV shared memories, one can use mmap() based shared
memory: this is limited by available RAM+swap (and disk space for the
.SYSTEM.SHM file). Current MIDAS system.c has an experimental implementation of
mmap() shared memory, but AFAIK it has not been used in any production system, yet. |
MAX_EVENT_SIZE is also used for the RPC layer, since the receiving buffer must hold at
least one event. It is right that this can and should be made dynamic. Concerning
the shared memory there is the problem that it cannot be increased when any program is
running and attached to the shared memory, so it can only be defined at startup of the
first program creating the shared memory.
The sanity check in the frontend is done against max_event_size defined in frontend.c which can be smaller than MAX_EVENT_SIZE (some front-ends have limited memory).
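
The size relations mentioned in this thread (the event buffer holds at least two maximum-size events, and the frontend limit may be smaller than the compiled-in limit but not larger) can be written down as simple checks. This is a sketch with hypothetical names, not code from midas.c or mfe.c; the 2 MB value is the one quoted earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_EVENT_SIZE ((size_t) 2 * 1024 * 1024)  /* compiled-in limit */

/* Returns 1 if buffer size and frontend limit are consistent, 0 otherwise. */
static int demo_sizes_ok(size_t buffer_size, size_t fe_max_event_size)
{
   if (fe_max_event_size == 0 || fe_max_event_size > DEMO_MAX_EVENT_SIZE)
      return 0;                 /* frontend limit must fit the global limit */
   if (buffer_size < 2 * DEMO_MAX_EVENT_SIZE)
      return 0;                 /* buffer must hold at least two events */
   return 1;
}

/* Per-event check in the frontend: reject events above the local limit. */
static int demo_event_ok(size_t event_size, size_t fe_max_event_size)
{
   assert(fe_max_event_size > 0);
   return event_size <= fe_max_event_size;
}
```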
So I agree that this issue may need revision, maybe something for my next visit. |
05 Sep 2006, Konstantin Olchanski, Forum, Forums moved from dasdevpc.triumf.ca to ladd00.triumf.ca
|
For the record, the MIDAS (& co) forums have been physically moved from
dasdevpc.triumf.ca to our new server machine ladd00.triumf.ca. This change
should be transparent to all users, but if anything stops working, please let me
know at olchansk-at-triumf-dot-ca. K.O. |
04 Sep 2006, Konstantin Olchanski, Bug Fix, Fix MIDAS on MacOS 10.4.7
|
I committed minor fixes for building MIDAS on MacOS 10.4.7:
1) there is no linux/unistd.h
2) gcc 4.0.0 does not like "struct { ... } var;" although "struct Foo { ... } var;" is fine
3) there is no "_syscall0(...)" macro
4) there is no "gettid()", I used pthread_self() instead.
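
On item 4: POSIX treats pthread_t as an opaque type, so portable code should not assume it fits in an int and should compare thread ids with pthread_equal() rather than "==". A minimal sketch follows; demo_thread_t and the function names are hypothetical, not the MIDAS definitions.

```c
#include <assert.h>
#include <pthread.h>

typedef pthread_t demo_thread_t;        /* hypothetical thread id type */

/* Return the id of the calling thread without squeezing it into an int. */
static demo_thread_t demo_gettid(void)
{
   return pthread_self();
}

/* Compare two thread ids; pthread_t may be a pointer or a struct. */
static int demo_same_thread(demo_thread_t a, demo_thread_t b)
{
   return pthread_equal(a, b) != 0;
}

/* Debug helper: abort if called from a thread other than 'owner'. */
static void demo_assert_owner(demo_thread_t owner)
{
   assert(demo_same_thread(owner, demo_gettid()));
}
```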
K.O.
P.S. ss_gettid() returns "int" instead of "midas_thread_t" (pthread_t, really). On MacOS 10.4.7 at least,
pthread_t appears to be a pointer, not an int. Is that right? |
01 Sep 2006, pohl, Forum, Hytec 5331 CAMAC kernel 2.6 driver problem
|
Grüezi,
I am new to this list.
We are using MIDAS in the Muonic Hydrogen Lamb Shift experiment at PSI. Previously the DAQ was maintained by Paul Knowles. For the upcoming beamtime I took over.
Now I have problems with the kernel driver khyt1331_26 with Midas svn 3315.
I have compiled the driver, and modprobe khyt1331 works.
Then: "cat /proc/khyt1331" gives, with the CAMAC crate switched OFF:
Hytec 5331 card found at address 0xCC40, using interrupt 10
Device not in use
CAMAC crate 0: not responding
CAMAC crate 1: not responding
CAMAC crate 2: not responding
CAMAC crate 3: not responding
When I switch the crate on and do the "cat" again, the computer freezes.
When I switch the crate OFF again, the computer screen turns black and the computer beeps.
Is anybody using the Hytec 5331 PCI CAMAC card plus the Hytec 1331 CAMAC crate controller and can help me?
I would greatly appreciate any help. Otherwise I am lost.
Cheers,
Randolf
More info:
------------------------------------------------------
Using SuSE 9.3 on a P4. Tried HyperThreading on and off.
uname -a:
Linux mpq1p13 2.6.11.4-21.13-smp #1 SMP Mon Jul 17 09:21:59 UTC 2006 i686 i686 i386 GNU/Linux
------------------------------------------------------
This is exactly what I did (my logbook):
> cd $MIDASSYS/drivers/kernel/khyt1331_26
edit khyt1331.c:
replace (line 36):
# include <config/modversions.h>
with
# include <linux/config.h>
now
> make
> make install
Works, but produces irrelevant error:
install: cannot stat `../doc/*.9': No such file or directory
(Some doc stuff missing)
Finish "make install" by hand by typing
> /sbin/depmod
Load the driver and check it is there:
> modprobe khyt1331
> lsmod | grep khyt
gives on my machine:
"khyt1331 13084 0 "
Now try
> cat /proc/khyt1331
Gives on my machine (no CAMAC crate attached)
Hytec 5331 card found at address 0xCC40, using interrupt 10
Device not in use
CAMAC crate 0: not responding
CAMAC crate 1: not responding
CAMAC crate 2: not responding
CAMAC crate 3: not responding
Finally we need the character device with major number 60 ("char-major-60")
called "/dev/camac".
First check that no device with major=60 exists:
> ls -l /dev | grep "60,"
should not produce any output.
So we create this device by
> mknod /dev/camac c 60 0
And
> ls -l /dev | grep "60,"
results in
crw-r--r-- 1 root root 60, 0 2006-09-01 14:25 camac
(Here start the problems described above. I had the same problems when I tried the "cat" with CAMAC on BEFORE I did the "mknod")
----------------------------------------------------------
Uncommenting all "printk" in ../drivers/kernel/khyt1331_26/khyt1331.c I get the following kernel logs in /var/log/messages:
Sep 1 17:15:55 mpq1p13 kernel: khyt1331: module not supported by Novell, setting U taint flag.
Sep 1 17:15:55 mpq1p13 kernel: khyt1331: start initialization
Sep 1 17:15:55 mpq1p13 kernel: khyt1331: Found 5331 card at CC40, irq 10
Sep 1 17:15:55 mpq1p13 kernel: khyt1331: initialization finished
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 0
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 1
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 2
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 3
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 0
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 1
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 2
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 3
And then it dies. |
19 Aug 2006, Konstantin Olchanski, Bug Fix, fixes for minor mhttpd problems
|
I committed a fix for minor mhttpd problems (rev 3314):
- for a newly created experiment, the "history" button gave the error [history
panel "" does not exist] (new problem introduced in revision 3150)
- for very long history panel names (close to the 32-character limit) history
plots produce the error "Cannot find /history/display/foo/bar/variables" (broke
in revision 3190 "use strlcpy()", in previous revisions, this bug was silent
stack corruption)
- elog attachments did not work for file names containing character plus (+)
(attachment URLs should be properly encoded to escape special CGI characters)
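
For the last item, percent-encoding a file name before placing it in a URL can be sketched as below. This is not the mhttpd routine; demo_url_encode is a hypothetical name. It should be applied to the file name component only, since encoding a whole URL would also escape separators such as "?", "&" and "=".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Percent-encode 'src' into 'dst' (capacity 'size' bytes including the
   terminating NUL). Letters, digits and "-_.~" are copied unchanged,
   every other byte becomes %XX. Returns the encoded length or -1 if
   'dst' is too small. */
static int demo_url_encode(const char *src, char *dst, size_t size)
{
   static const char hex[] = "0123456789ABCDEF";
   size_t n = 0;

   assert(src != NULL && dst != NULL);
   for (; *src != '\0'; src++) {
      unsigned char c = (unsigned char) *src;

      if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
         if (n + 1 >= size)
            return -1;
         dst[n++] = (char) c;
      } else {
         if (n + 3 >= size)
            return -1;
         dst[n++] = '%';
         dst[n++] = hex[c >> 4];
         dst[n++] = hex[c & 0x0F];
      }
   }
   if (n >= size)
      return -1;
   dst[n] = '\0';
   return (int) n;
}
```

For example, the file name "a+b c.txt" is encoded as "a%2Bb%20c.txt".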
K.O. |
26 Aug 2006, Konstantin Olchanski, Bug Fix, fixes for minor mhttpd problems
|
> I committed a fix for minor mhttpd problems (rev 3314):
> - elog attachments did not work for file names containing character plus (+)
> (attachment URLs should be properly encoded to escape special CGI characters)
I accidentally indirectly learned that the above change produced incorrect URLs
when more than one experiment is defined. I have now committed a fix for this problem.
K.O. |
17 Aug 2006, Konstantin Olchanski, Bug Report, "double" values are truncated
|
The mhttpd ODB displays and mhist truncate values of "float" and "double"
floating point variables to 6 digits. In reality, "float" has 7 significant
digits and "double" has 16. I recommend that db_sprintf() in odb.c be changed to
read this:
   case TID_FLOAT:
      sprintf(string, "%.7g", *(((float *) data) + index));
      break;
   case TID_DOUBLE:
      sprintf(string, "%.16g", *(((double *) data) + index));
      break;
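
The truncation comes from the C library itself: when no precision is given, "%g" uses a precision of 6 significant digits. The helper below is a small sketch that makes the number of digits explicit via "%.*g"; demo_format_double is a hypothetical name, not part of odb.c.

```c
#include <assert.h>
#include <stdio.h>

/* Format 'value' with the given number of significant digits.
   Returns what snprintf() returns (the length of the full result). */
static int demo_format_double(char *buf, size_t size, double value, int digits)
{
   assert(buf != NULL && size > 0 && digits > 0);
   return snprintf(buf, size, "%.*g", digits, value);
}
```

With 6 digits (the "%g" default) the value 1234567.89 is formatted as "1.23457e+06"; with 16 digits it is formatted as "1234567.89".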
K.O. |
17 Aug 2006, Stefan Ritt, Bug Report, "double" values are truncated
|
> The mhttpd ODB displays and mhist truncate values of "float" and "double"
> floating point variables to 6 digits. In reality, "float" has 7 significant
> digits and "double" has 16. I recommend that db_sprintf() in odb.c be changed to
> read this:
>
>    case TID_FLOAT:
>       sprintf(string, "%.7g", *(((float *) data) + index));
>       break;
>    case TID_DOUBLE:
>       sprintf(string, "%.16g", *(((double *) data) + index));
>       break;
>
> K.O.
I had there
   case TID_FLOAT:
      if (ss_isnan(*(((float *) data) + index)))
         sprintf(string, "NAN");
      else
         sprintf(string, "%g", *(((float *) data) + index));
      break;
   case TID_DOUBLE:
      if (ss_isnan(*(((double *) data) + index)))
         sprintf(string, "NAN");
      else
         sprintf(string, "%lg", *(((double *) data) + index));
      break;
so I assumed that "%g" takes care of the maximal resolution. But apparently it does
not. So I changed it as you proposed. |
12 Aug 2006, Pierre-André Amaudruz, Release, Midas updates
|
Midas development:
Over the last 2 weeks (Jul26-Aug09), Stefan Ritt has been at Triumf for the "becoming" traditional Midas development 'brainstorming/hackathon' (every second year).
A list with action items has been set up combining the known problems and the wish list from several Midas users.
The online documentation has been updated to reflect the modifications.
Not all the points have been covered, as more points were added daily, but the main issues that have been dealt with or at least discussed are:
- ODB over Frontend precedence.
When starting a FE client, the equipment settings are taken from the ODB if this equipment already existed. This means the ODB has precedence over the EQUIPMENT structure, and whatever change you apply to the C structure will NOT be taken into consideration until you clean (remove) the equipment tree in the ODB.
- Revived 64 bit support. This was required as more operating systems now support this architecture. Originally Midas did support Alpha/OSF/1, which ran on 64-bit machines. This new code has been tested on SL4.2 with Dual-Core 64-bit AMD Opterons.
- Multi-threading in Slow Control equipments.
Check entry 289 in Midas Elog from Stefan.
- mhttpd using external Elog.
The standalone ELOG package can be coupled to an existing experiment and therefore supersede the internal elog functionality from mhttpd.
This requires a particular configuration which is described in the documentation.
- MySQL test in mlogger
A reminder that mlogger can generate entries in a MySQL database as long as the pre-compilation flag -HAVE_MYSQL is enabled during the system build. The access and form filling is then defined from the ODB under Logger/SQL once the logger is running, see documentation.
- Directory destination for midas.log and odb dump files
It is now possible to specify an individual directory to the default midas.log file as well as to the "ODB Dump file" destination. If either of these fields contains a preceding directory, it will take the string as an absolute path to the file.
- User defined "event Data buffer size" (ODB)
The event buffer size has been until now defined at the system level in midas.h. It is now possible to optimize the memory allocation specific to the event buffer with an entry in the ODB under /experiment, see documentation.
- History group display
It is now possible to display an individual group of history plots. No documentation on that topic as it should be self-explanatory.
- History export option
From the History web page, it is possible to export the history content to an ASCII .csv file. This file can later be imported into Excel, for example. No documentation on that topic as it should be self-explanatory.
- Multiple "minor" corrections:
- Alarm reset for multiple experiment (return directly to the experiment).
- mdump -b option bug fixed.
- Alarm evaluation function fixed.
- mlogger/SQL boolean handling fixed.
- bm_get_buffer_level() was returning a wrong value which has been fixed now.
- Event buffer bug traced and exterminated (Thanks to Konstantin).
|
07 Aug 2006, Konstantin Olchanski, Bug Fix, Refactoring and rewrite of event buffer code
|
In close cooperation with Stefan, I refactored and rewrote the MIDAS event
buffering code (bm_send_event, bm_flush_cache, bm_receive_event and bm_push_event).
The main goal of this update is to make sure the event buffering code does not
have any infinite loops: in the past, we have seen mlogger and some frontends
loop forever consuming 100% CPU in the event buffering code. This should now be
completely fixed.
As additional bonuses, the refactored code is easier to read, has less code
duplication and should be more robust. A few potential logical problems have
been corrected and one case of reproducible infinite looping has been fixed.
The new code has passed the low-level consumer-producer tests, but has not yet
been used in anger in any real experiment. One hopes any new bugs introduced
would cause outright failures and core dumps (rather than silent data corruption).
All are welcome to try the new code. If it explodes, please send me the error
messages, stack traces and core dumps.
K.O. |
09 Aug 2006, Konstantin Olchanski, Bug Fix, Refactoring and rewrite of event buffer code
|
> In close cooperation with Stefan, I refactored and rewrote the MIDAS event
> buffering code (bm_send_event, bm_flush_cache, bm_receive_event and bm_push_event).
>
> All are welcome to try the new code. If it explodes, please send me the error
> messages, stack traces and core dumps.
Stefan quickly found one new error (a typo in a check against infinite looping) and
then I found one old error present in the old code that caused event loss when the
buffer became exactly 100% full (0 bytes free).
Both errors are now fixed in svn commit 3294.
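
As a general illustration only (this message does not say what the actual defect in the event buffer code was), the exactly-full case is a classic edge case in ring buffers: when the read and write positions are equal, the buffer may be either empty or completely full. The sketch below, with hypothetical names and no relation to the bm_xxx functions, removes the ambiguity by tracking the number of stored bytes explicitly.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_RB_SIZE 8

typedef struct {
   unsigned char data[DEMO_RB_SIZE];
   size_t rp, wp;               /* read and write positions */
   size_t used;                 /* bytes stored; disambiguates rp == wp */
} DEMO_RB;

static size_t demo_rb_free(const DEMO_RB *rb)
{
   return DEMO_RB_SIZE - rb->used;
}

/* Write n bytes or nothing at all. Returns 1 on success, 0 if there is
   not enough room (the caller has to wait; no data is dropped). */
static int demo_rb_write(DEMO_RB *rb, const unsigned char *src, size_t n)
{
   size_t i;

   assert(rb != NULL && src != NULL);
   if (n > demo_rb_free(rb))
      return 0;
   for (i = 0; i < n; i++) {
      rb->data[rb->wp] = src[i];
      rb->wp = (rb->wp + 1) % DEMO_RB_SIZE;
   }
   rb->used += n;
   return 1;
}

/* Read n bytes or nothing at all. Returns 1 on success, 0 otherwise. */
static int demo_rb_read(DEMO_RB *rb, unsigned char *dst, size_t n)
{
   size_t i;

   assert(rb != NULL && dst != NULL);
   if (n > rb->used)
      return 0;
   for (i = 0; i < n; i++) {
      dst[i] = rb->data[rb->rp];
      rb->rp = (rb->rp + 1) % DEMO_RB_SIZE;
   }
   rb->used -= n;
   return 1;
}
```

After writing exactly DEMO_RB_SIZE bytes the positions are equal again, but demo_rb_free() reports 0, so the full buffer is not mistaken for an empty one.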
K.O. |
|