12 Jul 2007, Konstantin Olchanski, Forum, Midas on a x86_64 - incompatible with x86_32
|
> We run 64-bit MIDAS on RHEL4 with 64-bit ROOT and everything generally works,
> except for compatibility problems with 32-bit MIDAS.
>
> The big problem is that 64-bit and 32-bit ODB turned out to be incompatible ...
I have now identified 3 data structures that change size when compiled with "-m64":
EVENT_REQUEST: stores a pointer to a function. Pointer size is 4 bytes with -m32 and 8 bytes with -m64.
This structure is part of an array inside BUFFER_HEADER, resulting in a sizable size mismatch between 32
bit and 64 bit shared memory data buffers.
The fix is simple: the function pointer is not used anywhere. Replace is with a "DWORD unused_filler"
makes -m32 and -m64 data buffers compatible. (But breaks compatibility with previous -m64 compiled midas).
CHN_SETTINGS and CHN_STATISTICS: apparently, -m32 and -m64 GCC has different packing rules and in -m64
mode, 4 bytes of padding are added to these data structures. Size size mismatch appears to be benign,
but will result in "size mismatch" complaints from ODB.
The fix is simple: adding "__attribute__ ((__packed__))" to the definition of the data structure makes
-m64 identical to -m32.
The "svn diff" of changes involved is attached below.
The biggest problem here is that making 32-bit ODB and 64-bit ODB compatible requires breaking one or
the other (My proposed changes break the 64-bit version. Alternatively, one could add explicit padding
to these data structures and break the 32-bit ODB).
I think it is important to make 32-bit and 64-bit code compatible: at TRIUMF we have to use a mixed
environment because out latest host computers all run 64-bit Linux while all our VME processors and all
older machines can only run 32-bit code; this incompatibility causes us weekly headaches.
Any thoughts?
K.O.
(this output of svn diff is doctored for clarity)
ladd00:midas$ svn diff
Index: include/midas.h
===================================================================
--- include/midas.h (revision 3744)
+++ include/midas.h (working copy)
- void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
+ INT unused; // was void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
} EVENT_REQUEST;
--- include/msystem.h (revision 3744)
+++ include/msystem.h (working copy)
+#define PACKED __attribute__ ((__packed__)) <--- this goes into midas.h inside the #ifdef "we use GCC"
-typedef struct {
+typedef struct PACKED { ... CHN_SETTINGS
-typedef struct {
+typedef struct PACKED { ... CHN_STATISTICS |
13 Jul 2007, Stefan Ritt, Forum, Midas on a x86_64 - incompatible with x86_32
|
> The biggest problem here is that making 32-bit ODB and 64-bit ODB compatible requires breaking one or
> the other (My proposed changes break the 64-bit version. Alternatively, one could add explicit padding
> to these data structures and break the 32-bit ODB).
>
> I think it is important to make 32-bit and 64-bit code compatible: at TRIUMF we have to use a mixed
> environment because out latest host computers all run 64-bit Linux while all our VME processors and all
> older machines can only run 32-bit code; this incompatibility causes us weekly headaches.
>
> Any thoughts?
I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.
This ensures to keep the native 64-bit packing, which probably will be somehow optimized for 64-bit
architectures and therefore might be a bit faster in the long run, when most systems are 64-bit. After this
has been implemented and well tested, I would go with an official announcement of the 32-bit break in the ODB,
and release a new version, so people can update from a TAR file if necessary. Existing ODB's can be converted
to the new format by exporting them in XML form and importing them again after the upgrade. |
12 Aug 2007, Konstantin Olchanski, Forum, Midas on a x86_64 - incompatible with x86_32
|
> I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
> in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.
I now have the patches to implement this. Changes turned out to be minimal:
1) midas.h: remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
2) msystem.h: add 32-bit padding to CHN_STATISTICS and CHN_SETTINGS
(Pedantic note: the C/C++ languages permit compilers to arbitrary pad data members inside structures and one is
not supposed to rely on the specific layout of "struct"s, they could changing from day to day depending on
compiler vendor, version, 32/64 bit, optimization level, etc. This is quite silly, but I guess it was the only way
"they" could agree on a standard)
In practice, compilers are will behaved and one can follow simple rules and stay out of trouble.
1) if all data members are of the same size -> no padding
2) do not use "double" (64-bit) and "short" (16-bit), make all char[] arrays divisible by 4 -> size of everything
is 32-bit, see rule 1
3) if you have to use "short", they have to come in pairs to keep everything else aligned to 32-bit
4) if you have to use "double" (or uint64_t), keep them aligned to 64-bit, i.e. struct { int a,b,c; double x;} is
*bad* (4-byte padding may be added between c and x). struct { int a,b,c,d; double x; } is good.
Below are is "svn diff include/midas.h include/msystem.h". These changes have been tested on SL4 32-bit and
64-bit, SL5 32/64, F7 32/64 and SL4/ICC (Intel compiler) 32 bit and 64 bit.
The testing was done by adding checks on sizes of all struct's kept on ODB, i.e.
assert(sizeof(CHN_SETTINGS ) == 640); // ODB v3 with padding
assert(sizeof(CHN_STATISTICS ) == 32); // ODB v3 with padding
... etc ...
K.O.
ladd03:midas$ svn diff include/midas.h include/msystem.h
Index: include/midas.h
===================================================================
--- include/midas.h (revision 3798)
+++ include/midas.h (working copy)
@@ -38,7 +38,7 @@
* @{ */
/* has to be changed whenever binary ODB format changes */
-#define DATABASE_VERSION 2
+#define DATABASE_VERSION 3
/* MIDAS version number which will be incremented for every release */
#define MIDAS_VERSION "2.0.0"
@@ -810,8 +810,6 @@
short int event_id; /**< event ID */
short int trigger_mask; /**< trigger mask */
INT sampling_type; /**< GET_ALL, GET_SOME, GET_FARM */
- /**< dispatch function */
- void (*dispatch) (HNDLE, HNDLE, EVENT_HEADER *, void *);
} EVENT_REQUEST;
typedef struct {
Index: include/msystem.h
===================================================================
--- include/msystem.h (revision 3798)
+++ include/msystem.h (working copy)
@@ -454,6 +454,7 @@
INT event_id;
INT trigger_mask;
DWORD event_limit;
+ INT pad; // FIXME 64-bit "double" should be 64-bit aligned
double byte_limit;
double tape_capacity;
char subdir_format[32];
@@ -465,6 +466,7 @@
double bytes_written;
double bytes_written_total;
INT files_written;
+ INT pad; // FIXME pad data structure to be 64-bit aligned
} CHN_STATISTICS;
typedef struct {
ladd03:midas$ |
20 Aug 2007, Konstantin Olchanski, Forum, Midas on a x86_64 - incompatible with x86_32
|
> > I agree to make 32-bit and 64-bit compatible. In the long run, everything will be 64-bit, so I would suggest
> > in breaking the 32-bit ODB, add some padding there where needed, probably with some conditional compiling.
>
> I now have the patches to implement this. Changes turned out to be minimal:
>
> 1) midas.h: remove unused field "dispatch" from EVENT_REQUEST and bump DATABASE_VERSION from 2 to 3
> 2) msystem.h: add 32-bit padding to CHN_STATISTICS and CHN_SETTINGS
The padding of CHN_STATISTICS and CHN_SETTINGS is not working right - somehow mhttpd and mlogger keep recreating the
data in ODB and erasing the padding fields. I am looking into this.
K.O. |
22 Jan 2007, Carl Metelko, Forum, Midas on a x86_64
|
Hi,
has anyone managed to get midas to work on a x86_64 processor. I followed the
instructions for the 64-bit opteron but i am getting runtime error when trying
the examples.
When running example/basic/odb_test I getting errors like
[odb.c:6818:db_get_record] struct size mismatch for "/Alarms/Alarms/Demo ODB"
(464 instead of 452)
[odb.c:6818:db_get_record] struct size mismatch for "/Alarms/Alarms/Demo ODB"
(464 instead of 452)
[midas.c:16576:al_check] Cannot get alarm record
Any ideas what is wrong? |
22 Jan 2007, Konstantin Olchanski, Forum, Midas on a x86_64
|
> has anyone managed to get midas to work on a x86_64 processor. I followed the
> instructions for the 64-bit opteron but i am getting runtime error when trying
> the examples.
We run 64-bit MIDAS on RHEL4 with 64-bit ROOT and everything generally works,
except for compatibility problems with 32-bit MIDAS.
Everything should work if you ensure that on your 64-bit machine everything is
compiled 64-bit (including the mserver - we always forget to install the correct version
to /usr/local/bin). 32-bit MIDAS programs running on other machine
can talk to 64-bit MIDAS via the mserver.
The big problem is that 64-bit and 32-bit ODB turned out to be incompatible - several data
fields have different sizes - and we did not decide yet how to fix this. Any fix will involve
breaking the binary ODB for one of the two platforms (we could break both, just to be fair, heh!)
> When running example/basic/odb_test I getting errors like
> [odb.c:6818:db_get_record] struct size mismatch for "/Alarms/Alarms/Demo ODB" (464 instead of 452)
Yes, data size mismatch errors indicates that you mixed 32-bit and 64-bit MIDAS. Recompile everyting
as 64-bit, remove all the dot-ODB files, remove all the shared memory segments (ipcrm),
then everything should work.
K.O. |
26 Jan 2007, Carl Metelko, Forum, Midas on a x86_64
|
I upgraded from 1.9.5 to the latest on SVN an it works fine |
20 Oct 2009, Peter Simpson, Forum, Midas in linux
|
Hi,
I'm new to both Linux and Midas and having trouble installing the programme -
the install file suggeats that I should have a directory:
midas/unix
but this doesn't appear.
Also, when running the "make" for the library (step 3 of the installation), I
don't see the directory "zlib.a", but there is a "libz.a" and I note the
original .tar file has no libz.a file in it. Is this a typo on the
installation?
The terminal window at this point displays:
undefined reference to `errno'
make: ***[example] Error 1
Further help will no doubt be required as I've used windows throughout my
research and now looking to learn how to use linux. Any help greatly
appreciated. Thanks! |
08 Jun 2006, Konstantin Olchanski, Bug Report, Midas does not build on Fedora 5
|
Fresh svn checkout of MIDAS does not build on Fedora 5, I get this error:
cc -c -g -O2 -Wall -Wuninitialized -Iinclude -Idrivers -I../mxml -Llinux/lib
-DINCLUDE_FTPLIB -D_LARGEFILE64_SOURCE -DHAVE_ROOT -pthread
-I/triumfcs/trshare/olchansk/root/root_v5.10.00_SL40/include -m32 -DOS_LINUX
-fPIC -Wno-unused-function -o linux/lib/odb.o src/odb.c
src/odb.c: In function 'db_open_database':
src/odb.c:805: warning: dereferencing type-punned pointer will break
strict-aliasing rules
src/odb.c: In function 'db_lock_database':
src/odb.c:1350: warning: dereferencing type-punned pointer will break
strict-aliasing rules
cc: Internal error: Segmentation fault (program cc1)
Please submit a full bug report.
See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.
make: *** [linux/lib/odb.o] Error 1
If I compile odb.c without "-O2", the rest of MIDAS builds without any more errors.
The observed warnings are (I do not know what they mean):
warning: dereferencing type-punned pointer will break strict-aliasing rules
warning: missing sentinel in function call (Cannot do without sentinels, eh?)
warning: pointer targets in passing argument 3 of 'getsockname' differ in signedness
warning: non-local variable '<anonymous struct> out_info' uses anonymous type
The "invalid lvalue" errors seem to have been successfully vanquished.
K.O. |
29 Apr 2024, Musaab Al-Bakry, Forum, Midas Sequencer with less than 1 second wait
|
Hello there,
I am working on a task that involves some ODB changes that happen within 20-500
ms. The wait command for Midas Sequencer only works for > 1 second. As a
workaround, I tried calling a python script that has a time.sleep() command, but
the sequencer doesn't wait for the python script to terminate before moving to the
next command. Obviously, I could just move the entire script to python, but that
would cause further issues to us. Is there a way to have a wait that has precision
in order of milliseconds from within the Midas Sequencer? If there is no Midas-
native methods for doing this, what workaround will you suggest to get this to
work? |
29 Apr 2024, Stefan Ritt, Forum, Midas Sequencer with less than 1 second wait
|
I guess the simplest way to do that without breaking with existing code is to change the
second number to a float. So a
WAIT SECONDS, 1
will still work, and you can then write
WAIT SECONDS, 0.01
to get a 10 ms wait. Would that work for you?
Stefan |
30 Apr 2024, Scott Oser, Forum, Midas Sequencer with less than 1 second wait
|
> I guess the simplest way to do that without breaking with existing code is to change the
> second number to a float. So a
>
> WAIT SECONDS, 1
>
> will still work, and you can then write
>
> WAIT SECONDS, 0.01
>
> to get a 10 ms wait. Would that work for you?
This would work fine in principle, but isn't implemented in the current MIDAS sequencer as we understand it. (We tried!) Is your proposal to rewrite the sequencer
to allow fractional waits? Right now the code seems to store the start_time as a DWORD and uses atoi to parse the wait time, and uses ss_time (which seems only get
the time to the nearest second) to fetch the time. |
30 Apr 2024, Stefan Ritt, Forum, Midas Sequencer with less than 1 second wait
|
> This would work fine in principle, but isn't implemented in the current MIDAS sequencer as we understand it. (We tried!) Is your proposal to rewrite the sequencer
> to allow fractional waits? Right now the code seems to store the start_time as a DWORD and uses atoi to parse the wait time, and uses ss_time (which seems only get
> the time to the nearest second) to fetch the time.
No it's not implemented, was just my idea. If it would work for you, I can implement it in the next couple of days.
Stefan |
30 Apr 2024, Scott Oser, Forum, Midas Sequencer with less than 1 second wait
|
> > This would work fine in principle, but isn't implemented in the current MIDAS sequencer as we understand it. (We tried!) Is your proposal to rewrite the sequencer
> > to allow fractional waits? Right now the code seems to store the start_time as a DWORD and uses atoi to parse the wait time, and uses ss_time (which seems only get
> > the time to the nearest second) to fetch the time.
>
> No it's not implemented, was just my idea. If it would work for you, I can implement it in the next couple of days.
>
> Stefan
Yes, please! Something like WAIT seconds, 0.01 would be perfect. |
30 Apr 2024, Stefan Ritt, Forum, Midas Sequencer with less than 1 second wait
|
While I will do it, i'm not sure if this is what you want. If I understand correctly, some process gets triggered and then writes some values to the ODB, then the sequencer
should continue. Putting a wait there is dangerous. Maybe your process always takes like 10-20 ms, so you put a wait of let's say 100ms, and things are fine with you. Your
script runs many days, but then once in a while your machine is on heavy load because someone starts a web browser, and your process takes 110ms, and you script crashes.
I would rather go following path: put a "done" flag in the ODB, which is the last one which gets set by your process. Then the sequencer does a
WAIT ODBvalue, /path/value, =, 1
which will work always, independend of the delay of your process.
Stefan |
30 Apr 2024, Scott Oser, Forum, Midas Sequencer with less than 1 second wait
|
> While I will do it, i'm not sure if this is what you want. If I understand correctly, some process gets triggered and then writes some values to the ODB, then the sequencer
> should continue. Putting a wait there is dangerous. Maybe your process always takes like 10-20 ms, so you put a wait of let's say 100ms, and things are fine with you. Your
> script runs many days, but then once in a while your machine is on heavy load because someone starts a web browser, and your process takes 110ms, and you script crashes.
>
> I would rather go following path: put a "done" flag in the ODB, which is the last one which gets set by your process. Then the sequencer does a
>
> WAIT ODBvalue, /path/value, =, 1
>
> which will work always, independend of the delay of your process.
>
> Stefan
Our use case is pretty simple and I don't think is affected by the scenario you envision. We want to turn on a setting in our equipment, and turn it off again some 0.2 s later. We don't need msec timing. So something like:
ODBSET /somekey 1 # this will cause a front-end to flip a bit in our hardware
WAIT seconds, 0.2
ODBSET /somekey 0 # this will cause a front-end to reset a bit in our hardware
It is true that if the load is high there could be a little delay, and the time that the bit is set will not be 0.2 seconds, but on average it should work,
and it should be good enough we think.
Yes, we could also check an ODB key to see that something is done, but we'd still need the ability to wait for time intervals less than 1 second, which
right now doesn't exist. |
02 May 2024, Stefan Ritt, Forum, Midas Sequencer with less than 1 second wait
|
Ok, I implemented the float second wait function. Internally it works in ms, so the maximum resolution is 0.001 s.
Best,
Stefan |
02 May 2024, Scott Oser, Forum, Midas Sequencer with less than 1 second wait
|
> Ok, I implemented the float second wait function. Internally it works in ms, so the maximum resolution is 0.001 s.
>
> Best,
> Stefan
Thank you, we will test this soon and let you know if we see any issues (but we're not expecting any). |
05 May 2024, Musaab Al-Bakry, Forum, Midas Sequencer with less than 1 second wait
|
> > Ok, I implemented the float second wait function. Internally it works in ms, so the maximum resolution is 0.001 s.
> >
> > Best,
> > Stefan
>
> Thank you, we will test this soon and let you know if we see any issues (but we're not expecting any).
Hello Stefan,
Thank you for the help you provided for us so far. I tried your code changes on our midas fork. Now, I notice that any
wait command takes at least 0.2 seconds to run.
For example, when I use the following script:
SCRIPT source scripts/time_print.sh
WAIT Seconds, 0.1
SCRIPT source scripts/time_print.sh
WAIT Seconds, 0.1
SCRIPT source scripts/time_print.sh
The time_print.sh script prints time segments separated by almost exactly 0.2 seconds. Same goes for when I use 0.01
second waits.
However, when I use 0.2 seconds wait, then I get time segments separated by 0.3 seconds. I also tried something like
this:
SCRIPT source scripts/time_print.sh
WAIT Seconds, 0.2
WAIT Seconds, 0.2
SCRIPT source scripts/time_print.sh
WAIT Seconds, 0.2
WAIT Seconds, 0.2
SCRIPT source scripts/time_print.sh
This script results in time segements of 0.6 seconds difference. It is not immidiately clear to me from the sequencer
code what causes this effect. The code as it stands is a lot better than what we had before the changes, but I am
wondering if this can be reduced to the order of 1ms or 10ms.
Best regards,
Musaab Faozi |
06 May 2024, Stefan Ritt, Forum, Midas Sequencer with less than 1 second wait
|
Indeed there was a sleep(100ms) in the sequencer in each loop. I reduced it now to 10ms. I need at least 10ms since otherwise
the sequencer would run in an infinite loop during the wait and burn 100% CPU. The smallest time slice on Linux to sleep is
10ms, so that's why I set it to that. Give it a try.
Stefan |
|