Back Midas Rome Roody Rootana
  Midas DAQ System, Page 80 of 146  Not logged in ELOG logo
ID Datedown Author Topic Subject
  1333   04 Dec 2017 Frederik WautersBug Reportsmall bug in mfe.c init
> > There is a small bug in the mfe.c initialization for the EQ_POLLED mode. There 
> > is a routine where the number of polls fitting in eq_info->period is counted:
> > 
> > 
> >          count = 1;
> >          do {
> >             if (display_period)
> >                printf(".");
> > 
> >             start_time = ss_millitime();
> > 
> >             poll_event(equipment[idx].info.source, (INT)count, TRUE);
> > 
> >             delta_time = ss_millitime() - start_time;
> > 
> >             ...
> > 
> >             if (delta_time > 0)
> >                count = count * eq_info->period / delta_time;
> >             else
> >                count *= 100;
> >             
> >             // avoid overflows
> >             if (count > 2147483647.0) {
> >                count = 2147483647.0;
> >                break;
> >             }
> >             
> >          } while (delta_time > eq_info->period * 1.2 || delta_time < eq_info-
> > >period * 0.8);
> > 
> > As "start_time = ss_millitime();" resets "delta_time" each time, only the 
> > "avoid overflows" addition saves the day. 
> > 
> > start_time = ss_millitime(); show be out of the loop.
> 
> Nope.
> 
> What I want is to determine how often I have to call poll_event to stay there for a certain time (usually 100ms). So I iterate "count" until I roughly get to my 100ms. Each call to 
> poll_event with a different count is a new measurement, therefore I initialize start_time before each measurement. If i do it outside the loop, and kind of incrementally increase 
> it, then the whole code inside the loop is added to the measurement which makes it (slightly) wrong.
> 
> The whole loop optimization has some background. Polling can be sow (think of talking to a device via Ethernet which can easily take milli seconds). So how often do we poll 
> before we do other things in the main look (like looking if a run has been started). If I only poll once, then the average front-end response time would be poor, because I mostly 
> look if a run has been started in the main loop. This is not effective. If I poll too often inside the poll_event loop, then the front-end does not react on run stops any more. So 
> there is some optimum, and this is set by the polling time of usually 100ms. This ensures that the front-end does optimal polling - without ANYTHING in between - for about 
> 100ms. But how can I know how often I should poll for 100 ms? As said above, polling can be very fast (reading a memory cell) or very slow (network). The the best method I 
> found is to do a calibration at the startup, and this is what the code above does. Maybe there are better ways today, but that code worked nicely in the last 25 years.
> 
> Stefan

Thanks, I misunderstood the loop then. If poll_event(equipment[idx].info.source, (INT)count, TRUE); doesn`t do anything with "count", the loop becomes infinite except for the overflow 
check. 
  1332   01 Dec 2017 Stefan RittBug Reportsmall bug in mfe.c init
> There is a small bug in the mfe.c initialization for the EQ_POLLED mode. There 
> is a routine where the number of polls fitting in eq_info->period is counted:
> 
> 
>          count = 1;
>          do {
>             if (display_period)
>                printf(".");
> 
>             start_time = ss_millitime();
> 
>             poll_event(equipment[idx].info.source, (INT)count, TRUE);
> 
>             delta_time = ss_millitime() - start_time;
> 
>             ...
> 
>             if (delta_time > 0)
>                count = count * eq_info->period / delta_time;
>             else
>                count *= 100;
>             
>             // avoid overflows
>             if (count > 2147483647.0) {
>                count = 2147483647.0;
>                break;
>             }
>             
>          } while (delta_time > eq_info->period * 1.2 || delta_time < eq_info-
> >period * 0.8);
> 
> As "start_time = ss_millitime();" resets "delta_time" each time, only the 
> "avoid overflows" addition saves the day. 
> 
> start_time = ss_millitime(); show be out of the loop.

Nope.

What I want is to determine how often I have to call poll_event to stay there for a certain time (usually 100ms). So I iterate "count" until I roughly get to my 100ms. Each call to 
poll_event with a different count is a new measurement, therefore I initialize start_time before each measurement. If i do it outside the loop, and kind of incrementally increase 
it, then the whole code inside the loop is added to the measurement which makes it (slightly) wrong.

The whole loop optimization has some background. Polling can be sow (think of talking to a device via Ethernet which can easily take milli seconds). So how often do we poll 
before we do other things in the main look (like looking if a run has been started). If I only poll once, then the average front-end response time would be poor, because I mostly 
look if a run has been started in the main loop. This is not effective. If I poll too often inside the poll_event loop, then the front-end does not react on run stops any more. So 
there is some optimum, and this is set by the polling time of usually 100ms. This ensures that the front-end does optimal polling - without ANYTHING in between - for about 
100ms. But how can I know how often I should poll for 100 ms? As said above, polling can be very fast (reading a memory cell) or very slow (network). The the best method I 
found is to do a calibration at the startup, and this is what the code above does. Maybe there are better ways today, but that code worked nicely in the last 25 years.

Stefan
  Draft   01 Dec 2017 Frederik WautersForumODB as JSON file and reverse
> Hi. Is it currently possible to automatically save the MIDAS ODB as a JSON file?
> I can do it manually in odbedit, but it looks like the only option for the
> automatic ODB save for each run is the standard .ODB format. Is there a way to
> change this?

I have the reverse question: Once we have a json file, can it be loaded again in the odb? We also use the odb as our analysis config. I don`t like .odb files in our repo, so we currently load settings in shell scripts (e.g. run dependent detector calibrations)
  1330   01 Dec 2017 Frederik WautersBug Reportsmall bug in mfe.c init
There is a small bug in the mfe.c initialization for the EQ_POLLED mode. There 
is a routine where the number of polls fitting in eq_info->period is counted:


         count = 1;
         do {
            if (display_period)
               printf(".");

            start_time = ss_millitime();

            poll_event(equipment[idx].info.source, (INT)count, TRUE);

            delta_time = ss_millitime() - start_time;

            ...

            if (delta_time > 0)
               count = count * eq_info->period / delta_time;
            else
               count *= 100;
            
            // avoid overflows
            if (count > 2147483647.0) {
               count = 2147483647.0;
               break;
            }
            
         } while (delta_time > eq_info->period * 1.2 || delta_time < eq_info-
>period * 0.8);

As "start_time = ss_millitime();" resets "delta_time" each time, only the 
"avoid overflows" addition saves the day. 

start_time = ss_millitime(); show be out of the loop.
  1329   21 Nov 2017 Konstantin OlchanskiReleasePending release of midas
We are readying a new release of midas and it is almost here except for a few buglets on the new html status page.

The current release candidate branch is "feature/midas-2017-10" and if you have problems with the older versions
of midas, I recommend that you try this release candidate to check if your problem is already fixed. If the problem
still exists, please file a bug report on this forum or on the bitbucket issue tracker
https://bitbucket.org/tmidas/midas/issues?status=new&status=open

Highlights of the new release include
- new and improved web pages done in html and javascript
- many bug fixes and improvements for json and json-rpc support, including improvements in handling of long strings in odb
- locked (protected) operation of odb, where odb shared memory is not writable outside of odb operations
- improved multithead support for odb
- fixes for odb corruption when odb becomes 100% full

For the next release we hope to switch midas from C to fully C++ (building everything with C++ already works). To support el6 we avoid use of 
c++11 language constructs.

K.O.
  1328   21 Nov 2017 Konstantin OlchanskiInfoMIDAS support on el5?
It has been reported that the current midas release candidate does not build on el5 linux (SL/RHEL/CentOS-5).

According to Red Hat, el5 is end-of-life, last SL 5 (SL5.11) was done in 2014, so this linux is very old. Also as it happens, I do not have access to any 
el5 machines to check if midas builds or runs (but this can be fixed).

https://www.scientificlinux.org/downloads/sl-versions/sl5/
https://access.redhat.com/support/policy/updates/errata

On the midas web page (https://midas.triumf.ca) we do not explicitly state which versions of which linux we definitely support. Most other open-
source projects only support current major linux distributions, hardly anybody supports end-of-life linuxes such as el5. Some projects do not even 
support recent linuxes still widely in use (ROOT6 does not build on stock el6 and there is no KDE5 for el7).

So back to midas. Support for different operating systems comes down to:

1) C/C++ language support. We still use el6 (GCC 4.4.7), so use of c++-11 language features should be avoided
2) operating system features support:
a) sysv semaphores (sysv shared memory no longer used, cannot be used on macos)
aa) (macos also is missing parts of the sysv semaphore api, such as "wait for lock, with timeout", we are using an ugly work-around)
b) posix shared memory with mprotect() & co
c) posix mutexes, including recursive-type mutexes (this seems to be the problem on el5)
d) bsd networking (need to migrate from select() to poll() and from gethostbyname() to getaddrinfo() & co (for IPv6 support))

Not all of these operating system functions are required for all of midas. Running mhttpd and mlogger requires
pretty much everything. Running just a frontend connected to midas through the mserver requires the least features,
just the networking is enough, I think.

Obviously we cannot support midas in perpetuity on all versions of all operating systems, once I do not have
access to a machine, I cannot even check that midas builds and that it runs the basic functions.

Instead, we could provide a "feature reduced" build of midas (makefile target) that includes "just enough" of midas
to (say) run a frontend, maybe even odbedit. We already have some provisions for this, but no obvious documented
way actually doing it.

So back to el5.

How important it is to support very old operating systems?
How many people still use el5?
How about old  versions of Ubuntu? Macos?

If you use anything older than el6, can you speak up,
(and if possible say why you cannot migrate to an up-to-date linux).

K.O.
  1327   21 Nov 2017 Konstantin OlchanskiBug Reportbug in init of hv class driver
> 
> The original idea behind the hv driver is that all voltages in the ODB and the class driver are
> positive. If you have a negative power supply, then the voltage is inverted at the device
> driver level. That's why you have MIN and MAX in the class driver.
> 

This rings a bell. I used the hv class driver to write a frontend for the L1440 mainframe (negative voltage),
on ODB it will be positive values, when writing to the device I had to add a minus sign,
and when reading back they came back negative and I had to add an fabs() in the comparison
between readback and demand.

Persons with bipolar power supplies need not apply.

K.O.
  1326   21 Nov 2017 Stefan RittSuggestionFeature request: Separate ODB flag to show programs on "Programs page"
> > Currently one has to set the required flag in the ODB (e.g., /Programs/Logger/Required) to "y" for the program 
> > to appear on the "Programs page" and being able to start and stop the program easily.
> > 
> > However, if one wants to run with the "Prevent start on required progs" in /Experiment enabled, all the 
> > programs in the "Programs page" need to be running and one cannot have one of them stopped while still 
> > taking a run.
> > 
> > It would be nice to separate these two functionalities: Have a flag that makes the program appear on the 
> > "Programs page" and have a flag that controls the "Prevent start on required frogs" functionality.
> 
> I agree. All the programs should be always visible on the "programs" page, there should be /Programs/xxx/hidden to 
> hide them, and /Programs/xxx/required should be used for "Prevent start on required progs".

Konstantin, since you wrote the current "Programs" page, can you add that feature to the display (well, when you have time). I guess we 
event don't have to change the subdirectory structure (which might lead to incopatibilities), but just show a program if the "Start command" 
is non-null. If there is no start command, it does not make sense to start that program, so it can be hidden.

Stefan
  1325   21 Nov 2017 Stefan RittBug Reportbug in init of hv class driver
> bug in init
> -----------
> 
> I used the lv.c class driver, combined with a custom device driver, to control 
> our Keithley2611B source meter. This to set negative voltage on Si detectors.
> 
> In the 'init' routing, the class driver sets the hv:
> 
>   hv_info->demand_mirror[i] = MIN(hv_info->demand[i], hv_info->voltage_limit[i]);
> 
> This fails for negative voltage, as it sets the (negative) voltage limit, instead 
> of the demand voltage. A simple 'fabs' solves this.
> 
> suggestion for 'idle'
> ---------------------
> 
> I let the device do the ramping, not the driver. This also means I have to reset 
> the state of the device (current limit) after ramping. The easiest way to to 
> this, is using CMD_IDLE of the device driver. This is currently not done in the 
> hv.c class driver.

I can't find the line you quote in the class driver. Why don't you make a git pull request
and I will approve it.

The original idea behind the hv driver is that all voltages in the ODB and the class driver are
positive. If you have a negative power supply, then the voltage is inverted at the device
driver level. That's why you have MIN and MAX in the class driver.

Stefan
  1324   17 Nov 2017 Konstantin OlchanskiBug Reportbug in init of hv class driver
Hi, Frederick, this is my personal opinion on the slow controls hv classes, I have
used them a couple of times and I found them full of little buglets like this,
plus some incomplete functions, plus some missing features, plus it is all
written in C trying to do object oriented programming. On the balance my opinion
is that it is less work to write a high voltage control program in C++ from scratch
using the regular midas frontend infrastructure compared to having to understand
the hv class driver, write the missing bits, fix the little buglets, debug
the crashes in the C string handling, and what not. (For example I had to debug
mysterious failures to pass float and double values through the C stdarg interface,
there are more fun things to do out there).

K.O.

> bug in init
> -----------
> 
> I used the lv.c class driver, combined with a custom device driver, to control 
> our Keithley2611B source meter. This to set negative voltage on Si detectors.
> 
> In the 'init' routing, the class driver sets the hv:
> 
>   hv_info->demand_mirror[i] = MIN(hv_info->demand[i], hv_info->voltage_limit[i]);
> 
> This fails for negative voltage, as it sets the (negative) voltage limit, instead 
> of the demand voltage. A simple 'fabs' solves this.
> 
> suggestion for 'idle'
> ---------------------
> 
> I let the device do the ramping, not the driver. This also means I have to reset 
> the state of the device (current limit) after ramping. The easiest way to to 
> this, is using CMD_IDLE of the device driver. This is currently not done in the 
> hv.c class driver.
  1322   17 Nov 2017 Konstantin OlchanskiSuggestionFeature request: Separate ODB flag to show programs on "Programs page"
> Currently one has to set the required flag in the ODB (e.g., /Programs/Logger/Required) to "y" for the program 
> to appear on the "Programs page" and being able to start and stop the program easily.
> 
> However, if one wants to run with the "Prevent start on required progs" in /Experiment enabled, all the 
> programs in the "Programs page" need to be running and one cannot have one of them stopped while still 
> taking a run.
> 
> It would be nice to separate these two functionalities: Have a flag that makes the program appear on the 
> "Programs page" and have a flag that controls the "Prevent start on required frogs" functionality.

I agree. All the programs should be always visible on the "programs" page, there should be /Programs/xxx/hidden to 
hide them, and /Programs/xxx/required should be used for "Prevent start on required progs".

K.O.
  1321   15 Nov 2017 Andreas KnechtSuggestionFeature request: Separate ODB flag to show programs on "Programs page"
Currently one has to set the required flag in the ODB (e.g., /Programs/Logger/Required) to "y" for the program 
to appear on the "Programs page" and being able to start and stop the program easily.

However, if one wants to run with the "Prevent start on required progs" in /Experiment enabled, all the 
programs in the "Programs page" need to be running and one cannot have one of them stopped while still 
taking a run.

It would be nice to separate these two functionalities: Have a flag that makes the program appear on the 
"Programs page" and have a flag that controls the "Prevent start on required frogs" functionality.
  1320   10 Nov 2017 Frederik WautersBug Reportbug in init of hv class driver
bug in init
-----------

I used the lv.c class driver, combined with a custom device driver, to control 
our Keithley2611B source meter. This to set negative voltage on Si detectors.

In the 'init' routing, the class driver sets the hv:

  hv_info->demand_mirror[i] = MIN(hv_info->demand[i], hv_info->voltage_limit[i]);

This fails for negative voltage, as it sets the (negative) voltage limit, instead 
of the demand voltage. A simple 'fabs' solves this.

suggestion for 'idle'
---------------------

I let the device do the ramping, not the driver. This also means I have to reset 
the state of the device (current limit) after ramping. The easiest way to to 
this, is using CMD_IDLE of the device driver. This is currently not done in the 
hv.c class driver.
  1319   02 Nov 2017 Konstantin OlchanskiBug FixFixed mlogger memory corruption, updated mxml
I the agdaq system I see memory corruption in the mlogger. There were at least two bugs: one 
memory allocation error in mxml and one incorrect memset() in mlogger.cxx. The mxml bug is fixed 
in the mxml repository, mlogger.cxx bug is fixed in the midas-2017-10 branch.

I suggest that all update mxml to the latest version: (without waiting for the new midas release)
https://bitbucket.org/tmidas/mxml/commits/branch/master

K.O.
  1318   13 Oct 2017 Konstantin OlchanskiInfoodb multithread support repaired
multithreaded access to odb was implemented back in 2013-2014. but recently a bug surfaced - 
there was a race condition in the odb locking code against cm_watchdog(). Somehow this only 
affected the mserver for the DRAGON experiment at TRIUMF. This is now fixed on the branch 
feature/midas-2017-10. (this branch collects all the code that needs additional testing before 
merging into develop and becoming the next release of midas).
K.O.
  1317   11 Oct 2017 Konstantin OlchanskiInfoadded support for ucLinux
Support for building for ucLinux was added to MIDAS. I use the emcraft toolchain and userland on 
some kind of embedded ARM CPU that does not have an MMU. See the Makefile for details. The 
main difference of ucLinux is lack of fork(), which cannot be done without an MMU. Not everything 
works, but at the least I can run a frontend and connect to an experiment on a remote host 
computer (mserver connection). K.O.
  1316   13 Aug 2017 Stefan RittSuggestionIncreasing Max Number of Frontends
The type INT has been defined in 1989 when I for the first time sent data between a 16-bit MS-DOS computer and a 32-bit VAX computer (good old 
days!). At that time, uint32_t was not available at all. So much for the historical background.

I agree that switching from INT to int32_t is getting closer to standards and might help new people better understand things. This means however to 
touch all midas files and change about 5000 (!) locations:

BYTE -> uint8_t
WORD -> uint16_t
DWORD -> uint32_t
INT -> int32_t

Next we have the midas data types TID_xxx?

The nice thing now is that for example WORD and TID_WORD belong together and this is obvious. For uint16_t and TID_WORD is is not so obvious 
any more, so I guess we should rename TID_WORD to TID_UINT16_t. The same fore 

TID_BYTE -> TID_UINT8_T
TID_SBYTE -> TID_INT8_T
TID_WORD -> TID_UINT16_T
TID_DWORD -> TID_UINT32_T
TID_INT -> TID_INT32_T

But if we changer TID_XXX, the ASCII representations of the ODB break compatibility! Right now we have for example

[/Experiment]
midas http port = INT : 8080

which will become

[/Experiment]
midas http port = INT32_T : 8080

so one cannot load old ODB files any more!

With JSON encoding it's better because only the type number is stored, not the string. So INT -> 7 could stay, although in my opinion encoding the 
type in an integer number is not good for readability. Nobody knows what "7" means as a type. You always have to do a look-up in midas.c and count 
array indices manually.

I'm not sure how many experiments use the ASCII ODB format in one way or the other in some custom scripts. It might be that changing the format 
might have severe side effects for some experiments, so before we undertake this endeavor I would like to get some feedback here on the forum 
about people from other experiments and see what they think.

Stefan

> > if (sizeof(INT) != 4) then severe_error_and_stop_all_programs()
> 
> Quick reply.
> 
> Today, for fixed size data types one should use uint32_t & co, see
> stdint.h
> https://en.wikipedia.org/wiki/C_data_types#stdint.h
> https://en.wikipedia.org/wiki/C99 (scroll down and click to open "implementation -> compiler support"
> 
> The other popular convention is "u32" used by the Linux kernel, you will see it in the linux kernel drivers.
> 
> If I remember right, WORD and DWORD grow legs from the 16-bit Motorolla 68xxx processors,
> VxWorks and the VME bus. At some point the data buses were 16-bit wide and that we the WORD.
> 
> (I do not think UNIX ever used the WORD/DWORD names, i.e. MacOS has int32_t and u_int32_t).
> 
> K.O.
  1315   13 Aug 2017 Konstantin OlchanskiSuggestionIncreasing Max Number of Frontends
> if (sizeof(INT) != 4) then severe_error_and_stop_all_programs()

Quick reply.

Today, for fixed size data types one should use uint32_t & co, see
stdint.h
https://en.wikipedia.org/wiki/C_data_types#stdint.h
https://en.wikipedia.org/wiki/C99 (scroll down and click to open "implementation -> compiler support"

The other popular convention is "u32" used by the Linux kernel, you will see it in the linux kernel drivers.

If I remember right, WORD and DWORD grow legs from the 16-bit Motorolla 68xxx processors,
VxWorks and the VME bus. At some point the data buses were 16-bit wide and that we the WORD.

(I do not think UNIX ever used the WORD/DWORD names, i.e. MacOS has int32_t and u_int32_t).

K.O.
  1314   13 Aug 2017 Stefan RittSuggestionIncreasing Max Number of Frontends
I agree that the binary compatibility checks are crucial. But I kind of find it strange if one gets an assert failure some where if one tries to change MAX_CLIENTS. It is then not straight 
forward to relate both things and understand the consequences. That's why I put a comment next to the definition of MAX_CLIENTS saying:

/* note that if you change any of the following items, the ODB and the event shared memory buffers 
   become binary incopatible and one has to recompile ALL programs which are locally connected to the 
   ODB and to event buffers */

I think this is more descriptive than just a failing assert. 

If you look carefully in my proposal below, you will see that I rather used

sizeof(INT)

and not 

sizeof(int)

since as KO stated correctly sizeof(int) can change between different architectures. The derived type INT (all uppercase) has been carefully designed to have 32 bits on all architectures. So 
it will NOT change between them. If it does change, then we have a principal problem and many more things will break down. We should therefore have something like

if (sizeof(INT) != 4) then severe_error_and_stop_all_programs()

Now given that sizeof(INT) is everywhere the same, we can use it in the test

sizeof(BUFFER_HEADER) == NAME_LENGTH + 7*sizeof(INT) + MAX_CLIENTS *(NAME_LENGTH+13*sizeof(INT)+sizeof(float)+2*sizeof(DWORD)+MAX_EVENT_REQUESTS*4*sizeof(INT))

which then basically tests the structure byte alignment and padding. The comment above should warn users to change MAX_CLIENTS without thinking. 

Another strategy would be to put sizeof(BUFFER_HEADER) as the first two byes of the structure itself. We can the dynamically test the size of each bm_open_buffer(), and if the local size 
differs from the one saved in the buffer header, the program refuses to start, so we know exactly which program should have to be recompiled. The downside of this would be that the 
header structure has to be changed and we break binary compatibility with all existing programs. But maybe we should do this step once and be safe in the future.

Stefan


> The checks for byte sizes of critical data structures have been added to ensure (enforce) binary compatibility
> of midas with itself on different platforms (32-bit and 64-bit intel, on PPC, on ARM, etc).
> 
> This has worked well in the past and helped avoid problems and subtle bugs in the transition
> from 32-bit to 64-bit machines a few years ago. Of course now 32-bit machines are back
> as ARM CPUs and FPGA synthetic CPUs.
> 
> Replacing the checks with "computed" values will defeat this purpose because the values may be computed
> differently on different machines.
> 
> Specifically as proposed by Stefan, sizeof(int) can change depending on the target machine and depending
> on the compiler settings.
> 
> Of course this needs to be balanced against flexibility to adjust important settings like MAX_CLIENTS and MAX_EVENT_REQUESTS.
> 
> I would say the present system is just fine. You can change MAX_CLIENTS, rebuild MIDAS and it will not run (assert failure) giving
> you an indication that you are doing something non-trivial that will cause problems if you do it without thinking about it.
> 
> For example, one may think nothing of changing midas.h and recompiling MIDAS. But having to change odb.c
> may ring the little bell to tell you that you *also* have to rebuild *all* of your frontends. Even one unrebuilt frontend
> will corrupt all shared memory and crash everything.
> 
> I guess one other way to look at this is as a balance between something a few people do rarely against
> a function that protects everybody all the time.
> 
> That said, I think the checks should be reworked, instead of an assert failure they should give the error message
> and tell the user exactly what number to adjust in the size test. Also some checks are obsolete, there is no longer
> need to check the size of many ODB structures (equipment, etc). Once we are done with the db_get_record() rework,
> only checks for data structures in shared memory shall remain.
> 
> As the bottom line, to change MAX_CLIENTS, you already have to edit midas.h, asking you to also edit odb.c does
> not add much to the burden.
> 
> P.S. We are thinking how to make all these values dynamically changable, but basically it requires rolling out
> a new binary-incompatible version of MIDAS with added bugs. Maybe some day.
> 
> K.O.
> 
> 
> > The sizeof checks were originally invented by KO to check for binary compatibility between processes attached to the same ODB and event buffers. So if a 
> > compiler generates different structure sizes due to different padding, one would see that immediately. I wonder however if the absolute numbers make sense 
> > here. We could replace the 16444 by
> > 
> > NAME_LENGTH + 7*sizeof(INT) + MAX_CLIENTS *(NAME_LENGTH+13*sizeof(INT)+sizeof(float)+2*sizeof(DWORD)+MAX_EVENT_REQUESTS*4*sizeof(INT))
> > 
> > which makes this value automatically scale when one changes MAX_CLIENTS.
> > 
> > People of course have to be aware that if one changes MAX_CLIENTS, then all programs connected to the same ODB or event buffer need to be re-compiled 
> > and the ODB needs to be re-created from an ASCII file, but at least this would avoid tedious manual calculations.
> > 
> > Any opinion?
> > 
> > Stefan
> > 
> > 
> > > Below are the steps we used to increase the maximum number of frontends that we could run.
> > > 
> > > In midas.h
> > > 
> > > #define MAX_CLIENTS            64
> > > 
> > > changed to
> > > 
> > > #define MAX_CLIENTS            128
> > > 
> > > In msystem.h:
> > > 
> > > #define MAX_RPC_CONNECTION     64
> > > 
> > > changed to
> > > 
> > > #define MAX_RPC_CONNECTION     128
> > > 
> > > In odb.c:
> > > 
> > > assert(sizeof(BUFFER_HEADER) == 16444); 
> > > 
> > > GUESS: 256*64+60 = 16444, so change 64 to 128
> > > 
> > > changed to:                                                                                                                         
> > > 
> > > assert(sizeof(BUFFER_HEADER) == 32828); //256*128+60
> > > 
> > >  
> > > 
> > > DATABASE_HEADER = 64 + 64*DATABASE_CLIENT = 64 + 64*8256 = 528448
> > > 
> > > changed to:
> > > 
> > > DATABASE_HEADER = 64 + 128*DATABASE_CLIENT = 64 + 128*8256 = 1056832.
  1313   12 Aug 2017 Konstantin OlchanskiSuggestionIncreasing Max Number of Frontends
The checks for byte sizes of critical data structures have been added to ensure (enforce) binary compatibility
of midas with itself on different platforms (32-bit and 64-bit intel, on PPC, on ARM, etc).

This has worked well in the past and helped avoid problems and subtle bugs in the transition
from 32-bit to 64-bit machines a few years ago. Of course now 32-bit machines are back
as ARM CPUs and FPGA synthetic CPUs.

Replacing the checks with "computed" values will defeat this purpose because the values may be computed
differently on different machines.

Specifically as proposed by Stefan, sizeof(int) can change depending on the target machine and depending
on the compiler settings.

Of course this needs to be balanced against flexibility to adjust important settings like MAX_CLIENTS and MAX_EVENT_REQUESTS.

I would say the present system is just fine. You can change MAX_CLIENTS, rebuild MIDAS and it will not run (assert failure) giving
you an indication that you are doing something non-trivial that will cause problems if you do it without thinking about it.

For example, one may think nothing of changing midas.h and recompiling MIDAS. But having to change odb.c
may ring the little bell to tell you that you *also* have to rebuild *all* of your frontends. Even one unrebuilt frontend
will corrupt all shared memory and crash everything.

I guess one other way to look at this is as a balance between something a few people do rarely against
a function that protects everybody all the time.

That said, I think the checks should be reworked, instead of an assert failure they should give the error message
and tell the user exactly what number to adjust in the size test. Also some checks are obsolete, there is no longer
need to check the size of many ODB structures (equipment, etc). Once we are done with the db_get_record() rework,
only checks for data structures in shared memory shall remain.

As the bottom line, to change MAX_CLIENTS, you already have to edit midas.h, asking you to also edit odb.c does
not add much to the burden.

P.S. We are thinking how to make all these values dynamically changable, but basically it requires rolling out
a new binary-incompatible version of MIDAS with added bugs. Maybe some day.

K.O.


> The sizeof checks were originally invented by KO to check for binary compatibility between processes attached to the same ODB and event buffers. So if a 
> compiler generates different structure sizes due to different padding, one would see that immediately. I wonder however if the absolute numbers make sense 
> here. We could replace the 16444 by
> 
> NAME_LENGTH + 7*sizeof(INT) + MAX_CLIENTS *(NAME_LENGTH+13*sizeof(INT)+sizeof(float)+2*sizeof(DWORD)+MAX_EVENT_REQUESTS*4*sizeof(INT))
> 
> which makes this value automatically scale when one changes MAX_CLIENTS.
> 
> People of course have to be aware that if one changes MAX_CLIENTS, then all programs connected to the same ODB or event buffer need to be re-compiled 
> and the ODB needs to be re-created from an ASCII file, but at least this would avoid tedious manual calculations.
> 
> Any opinion?
> 
> Stefan
> 
> 
> > Below are the steps we used to increase the maximum number of frontends that we could run.
> > 
> > In midas.h
> > 
> > #define MAX_CLIENTS            64
> > 
> > changed to
> > 
> > #define MAX_CLIENTS            128
> > 
> > In msystem.h:
> > 
> > #define MAX_RPC_CONNECTION     64
> > 
> > changed to
> > 
> > #define MAX_RPC_CONNECTION     128
> > 
> > In odb.c:
> > 
> > assert(sizeof(BUFFER_HEADER) == 16444); 
> > 
> > GUESS: 256*64+60 = 16444, so change 64 to 128
> > 
> > changed to:                                                                                                                         
> > 
> > assert(sizeof(BUFFER_HEADER) == 32828); //256*128+60
> > 
> >  
> > 
> > DATABASE_HEADER = 64 + 64*DATABASE_CLIENT = 64 + 64*8256 = 528448
> > 
> > changed to:
> > 
> > DATABASE_HEADER = 64 + 128*DATABASE_CLIENT = 64 + 128*8256 = 1056832.
ELOG V3.1.4-2e1708b5