Back Midas Rome Roody Rootana
  Midas DAQ System, Page 129 of 144  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
ID Date Author Topic Subject
  308   27 Sep 2006 Konstantin OlchanskiSuggestionIncrease of maximum event size
> The current event size in midas is limited to 512k (MAX_EVENT_SIZE in midas.h)

Yes, 512 kBytes is rather small. For the T2K prototype TPC DAQ, I built and ran
MIDAS with 4 MByte events, and it worked fine.

Now, we have per-buffer tunable size (see message
https://ladd00.triumf.ca/elog/Midas/283) and in the long run, I would prefer the
compiled-in limit to go away: already all memory is allocated dynamically and
the MAX_EVENT_SIZE is only useful as kind of a sanity check against frontend
misconfiguration or against malformed events.

If MAX_EVENT_SIZE goes away, the maximum event size becomes limited by the
largest SysV shared memory segment permitted by Linux (via sysctl kernel.shmmax).

To go beyound the limit on SysV shared memories, on can use mmap() based shared
memory: this is limited by available RAM+swap (and disk space for the
.SYSTEM.SHM file). Current MIDAS system.c has an experimental implementation of
mmap() shared memory, but AFAIK it has not been used in any production system, yet.

K.O.
  307   27 Sep 2006 Konstantin OlchanskiBug Reportmhttpd elog corruption via double-edit
[quote="Stefan Ritt"][Quote="K.O.]Aparently the mhttpd elog will corrupt the
elog files if two (or more\?) elog entries are being edited at the same time.
K.O.[/quote]

The corruption is very simple. mhttpd elog indexes the elog entries by the elog
file and offset inside the file, i.e. "http://ladd00:8088/EL/060927.318",
"060927" corresponds to log file "060927.log", "318" is the offset inside the
file where the message is located.

During "edit", the code "remembers" the offset of the original message and in
el_submit() blindly writes the edited message into the file at the remembered
offset.

If another message was edited before the edit of the first message is submitted,
the remembered offset becomes invalid (messages have shifted inside the file)
and el_submit() writes the edited text into the wrong place in the file,
corrupting it.

I have now added a check for this and we crash instead of corrupting the elog
file (midas.c rev 3340).

I do not know how to "properly" fix this bug without changing the indexing
scheme to something similar to what is used by elogd- message numbers instead of
file indices. In the existing scheme, message editing also breaks URLs shown in
the email notifications (they contain file indices that point to the wrong
places after messages are moved around by editing) and "reply threading" links.

Here is how I reproduce this bug:

1) start with an empty elog
2) create two messages
3) "edit" the second message, but do not submit it yet.
4) "edit" the first message, change the text to make sure the message size
becomes different; submit this change.
5) submit the "edit" of the first message. !!BOOM!!

K.O.
  306   24 Sep 2006 Stefan RittBug Reportmhttpd elog corruption via double-edit

K.O. wrote:
Aparently the mhttpd elog will corrupt the elog files if two (or more\?) elog entries are being edited at the same time. K.O.


That's strange. Since mhttpd is single threaded, there should not be any multi-thread/process conflict there, since the elog files cannot be written simultaneously from two different browser sessions. If entries are edited at the same time, they get then submitted one after the other. Of course it is possible to edit the same entry, in which case the second submission "wins", overwriting the first one without notification. Withing the standalone elog server there is the option to lock entries ("use lock = 1") to prevent this, but this feature is not present in the mhttpd elog.
  305   23 Sep 2006 Konstantin OlchanskiBug Fixcommit latest ccusb.c CAMAC-USB driver
> I commited the latest driver for the Wiener CCUSB USB-CAMAC driver. It
> implements all functions from mcstd.h and has been tested to be plug-compatible
> with at least one of our CAMAC frontends. K.O.

This driver is known to not work with the latest CCUSB firmware (20x, 204, 30x, 303). I know what 
modifications are required and an updated driver will be available shortly. If there is a delay, and you need the 
driver ASAP, please drop me an email.

Also, I am thinking about dropping support for the very old CCUSB firmware revisions (before 204). (Any 
comments?)

K.O.
  304   23 Sep 2006 Konstantin OlchanskiBug Reportmhttpd elog corruption via double-edit
Aparently the mhttpd elog will corrupt the elog files if two (or more?) elog entries are being edited at the 
same time. K.O.
  303   20 Sep 2006 Stefan RittSuggestionIncrease of maximum event size
Since nobody complained so far, I increased MAX_EVENT_SIZE to 2MB. If anybody has problems with this setting, please report. Note that after updating to SVN revision 3327 it will be necessary to recompile all midas programs and to delete any old SYSTEM.SHM or .SYSTEM.SHM. I added some code which should check for inconsistent SYSTEM.SHM sizes, but I'm not sure if it works everywhere.
  302   20 Sep 2006 Stefan RittSuggestionIncrease of maximum event size
Dear midas users,

The current event size in midas is limited to 512k (MAX_EVENT_SIZE in midas.h). This is mainly due to old (pre 2.2) linux kernels which had only a very limited shared memory pool. These days this limit has increased considerably and I question if we should increase the default event size and to which size we should increase it.

The drawback of a larger event size is that the SYSTEM event buffer has to hold at least two events, and when the last midas program is stopped or started, this buffer has to be written to or read from the .SYSTEM.SHM file, which slows down the start/stop of the program. But writing/reading a few MB is fast these days anyhow so this again might now be a big problem. So what do you think how big we should make the default max event size?

- Stefan
  301   08 Sep 2006 Ryu SawadaBug ReportLatest FC5 Compilation attempt
GCC developers fixed this problem in development version of GCC 4.2.

There will not be this problem in GCC 4.2 release version.
  300   05 Sep 2006 Konstantin OlchanskiForumForums moved from dasdevpc.triumf.ca to ladd00.triumf.ca
For the record, the MIDAS (& co) forums have been physically moved from
dasdevpc.triumf.ca to our new server machine ladd00.triumf.ca. This change
should be transparent to all users, but if anything stops working, please let me
know at olchansk-at-triumf-dot-ca. K.O.
  299   04 Sep 2006 Konstantin OlchanskiBug FixFix MIDAS on MacOS 10.4.7
I commited minor fixes for building MIDAS on MacOS 10.4.7:
1) there is no linux/unistd.h
2) gcc 4.0.0 does not like "struct { ... } var;" although "struct Foo { ... } var;" is fine
3) there is no "_syscall0(...)" macro
4) there is no "gettid()", I used pthread_self() instead.
K.O.

P.S. ss_gettid() returns "int" instead of "midas_thread_t" (pthread_t, really). On MacOS 10.4.7 at least, 
pthread_t appears to be a pointer, not an int. Is that right?
  298   01 Sep 2006 pohlForumHytec 5331 CAMAC kernel 2.6 driver problem
Grüezi,

I am new to this list.
We are using MIDAS in the Muonic Hydrogen Lamb Shift experiment at PSI. Previously the DAQ was maintained by Paul Knowles. For the upcoming beamtime I took over.

Now I have problems with the kernel driver khyt1331_26 with Midas svn 3315.

I have compiled the driver, and modprobe khyt1331 works.
Then: "cat /proc/khyt1331" gives, with the CAMAC crate switched OFF:

Hytec 5331 card found at address 0xCC40, using interrupt 10
Device not in use
CAMAC crate 0: not responding
CAMAC crate 1: not responding
CAMAC crate 2: not responding
CAMAC crate 3: not responding


When I switch the crate on and do the "cat" again, the computer freezes.
When I switch the crate OFF again, the computer screen turns black and the computer beeps.


Is anybody using the Hytec 5331 PCI CAMAC card plus the Hytec 1331 CAMAC crate controller and can help me?

I would greatly appreciate any help. Otherwise I am lost.

Cheers,

Randolf


More info:
------------------------------------------------------
Using SuSE 9.3 on a P4. Tried HyterThreading on and off.
uname -a:
Linux mpq1p13 2.6.11.4-21.13-smp #1 SMP Mon Jul 17 09:21:59 UTC 2006 i686 i686 i386 GNU/Linux

------------------------------------------------------
This is exactly what I did (my logbook):
> cd $MIDASSYS/drivers/kernel/khyt1331_26
edit kyt1331.c:
replace (line 36):
# include <config/modversions.h>
with
# include <linux/config.h>
now
> make
> make install
Works, but produces irrelevant error:
install: cannot stat `../doc/*.9': No such file or directory
(Some doc stuff missing)
Finish "make install" by hand by typing
> /sbin/depmod

Load the driver and check it is there:
> modprobe khyt1331
> lsmod | grep khyt
gives on my machine:
"khyt1331 13084 0 "

Now try
> cat /proc/khyt1331

Gives on my machine (no CAMAC crate attached)
Hytec 5331 card found at address 0xCC40, using interrupt 10
Device not in use
CAMAC crate 0: not responding
CAMAC crate 1: not responding
CAMAC crate 2: not responding
CAMAC crate 3: not responding

Finally we need the character device with major number 60 ("char-major-60)
called "/dev/camac".
First check that no device with major=60 exitst:
> ls -l /dev | grep "60,"
should not produce any output.
So we create this device by
> mknod /dev/camac c 60 0
And
> ls -l /dev | grep "60,"
results in
crw-r--r-- 1 root root 60, 0 2006-09-01 14:25 camac

(Here start the problems described above. I had the same problems when I tried the "cat" with CAMAC on BEFORE I did the "mknod")

----------------------------------------------------------
Uncommenting all "prink" in ../drivers/kernel/khyt1331_26/khyt1331.c I get the following kernel logs in /var/log/messages:

Sep 1 17:15:55 mpq1p13 kernel: khyt1331: module not supported by Novell, settin
g U taint flag.
Sep 1 17:15:55 mpq1p13 kernel: khyt1331: start initialization
Sep 1 17:15:55 mpq1p13 kernel: khyt1331: Found 5331 card at CC40, irq 10
Sep 1 17:15:55 mpq1p13 kernel: khyt1331: initialization finished
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 0
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 1
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 2
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 3
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 0
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 1
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 2
Sep 1 17:15:59 mpq1p13 kernel: khyt1331: ioctl 3, param 3

And then it dies.
  297   26 Aug 2006 Konstantin OlchanskiBug Fixfixes for minor mhttpd problems
> I commited fix for minor mhttpd problems (rev 3314):
> - elog attachments did not work for file names containing character plus (+)
> (attachement URLs should be properly encoded to escape special CGI characters)

I accidentally indirectly learned that the above change produced incorrect URLs
when more than one experiment is defined. I now commited a fix to this problem.

K.O.
  296   19 Aug 2006 Konstantin OlchanskiBug Fixfixes for minor mhttpd problems
I commited fix for minor mhttpd problems (rev 3314):
- for a newly created experiment, the "history" button gave the error [history
panel "" does not exist] (new problem introduced in revision 3150)
- for very long history panel names (close to the 32-character limit) history
plots produce the error "Cannot find /history/display/foo/bar/variables" (broke
in revision 3190 "use strlcpy()", in previous revisions, this bug was silent
stack corruption)
- elog attachments did not work for file names containing character plus (+)
(attachement URLs should be properly encoded to escape special CGI characters)
K.O.
  295   17 Aug 2006 Stefan RittBug Report"double" values are truncated
> The mhttpd ODB displays and mhist truncate values of "float" and "double"
> floating point variables to 6 digits. In reality, "float" has 7 significant
> digits and "double" has 16. I recommend that db_sprintf() in odb.c be changed to
> read this:
> 
>       case TID_FLOAT:
>          sprintf(string, "%.7g", *(((float *) data) + index));
>          break;
>       case TID_DOUBLE:
>          sprintf(string, "%.16g", *(((double *) data) + index));
>          break;
> 
> K.O.

I had there

      case TID_FLOAT:
         if (ss_isnan(*(((float *) data) + index)))
            sprintf(string, "NAN");
         else
            sprintf(string, "%g", *(((float *) data) + index));
         break;
      case TID_DOUBLE:
         if (ss_isnan(*(((double *) data) + index)))
            sprintf(string, "NAN");
         else
            sprintf(string, "%lg", *(((double *) data) + index));
         break;

so I assumed that "%g" takes care of the maximal resolution. But apparently it does
not. So I changed it as you proposed.
  294   17 Aug 2006 Konstantin OlchanskiBug Report"double" values are truncated
The mhttpd ODB displays and mhist truncate values of "float" and "double"
floating point variables to 6 digits. In reality, "float" has 7 significant
digits and "double" has 16. I recommend that db_sprintf() in odb.c be changed to
read this:

      case TID_FLOAT:
         sprintf(string, "%.7g", *(((float *) data) + index));
         break;
      case TID_DOUBLE:
         sprintf(string, "%.16g", *(((double *) data) + index));
         break;

K.O.
  293   12 Aug 2006 Pierre-André AmaudruzReleaseMidas updates
Midas development:

Over the last 2 weeks (Jul26-Aug09), Stefan Ritt has been at Triumf for the "becoming" traditional Midas development 'brainstorming/hackathon' (every second year).

A list with action items has been setup combining the known problems and the wish list from several Midas users.
The online documentation has been updated to reflect the modifications.

Not all the points have been covered, as more points were added daily but the main issues that have been dealt or at least discussed are:

  • ODB over Frontend precedence.
    When starting a FE client, the equipment settings are taken from the ODB if this equipment already existed. This meant the ODB has precedence over the EQUIPEMENT structure and whatever change you apply to the C-Structure, it will NOT be taken in consideration until you clean (remove) the equipment tree in ODB.

  • Revived 64 bit support. This was required as more OS are already supporting such architecture. Originally Midas did support Alpha/OSF/1 which operated on 64 bit machine. This new code has been tested on SL4.2 with Dual-Core 64-bit AMD Opterons.

  • Multi-threading in Slow Control equipments.
    Check entry 289 in Midas Elog from Stefan.

  • mhttpd using external Elog.
    The standalone ELOG package can be coupled to an existing experiment and therefore supersede the internal elog functionality from mhttpd.
    This requires a particular configuration which is described in the documentation.

  • MySQL test in mlogger
    A reminder that mlogger can generate entries in a MySQL database as long as the pre-compilation flag -HAVE_MYSQL is enabled during system built. The access and form filling is then defined from the ODB under Logger/SQL once the logger is running, see documentation.

  • Directory destination for midas.log and odb dump files
    It is now possible to specify an individual directory to the default midas.log file as well as to the "ODB Dump file" destination. If either of these fields contains a preceding directory, it will take the string as an absolute path to the file.

  • User defined "event Data buffer size" (ODB)
    The event buffer size has been until now defined at the system level in midas.h. It is now possible to optimize the memory allocation specific to the event buffer with an entry in the ODB under /experiment, see documentation.

  • History group display
    It is now possible to display an individual group of history plots. No documentation on that topics as it should be self explanatory.

  • History export option
    From the History web page, it is possible to export to a ASCII .csv file the history content. This file can later be imported into excel for example. No documentation on that topics as it should be self explanatory.

  • Multiple "minor" corrections:
    - Alarm reset for multiple experiment (return directly to the experiment).
    - mdump -b option bug fixed.
    - Alarm evaluation function fixed.
    - mlogger/SQL boolean handling fixed.
    - bm_get_buffer_level() was returning a wrong value which has been fixed now.
    - Event buffer bug traced and exterminated (Thanks to Konstantin).
  292   09 Aug 2006 Konstantin OlchanskiBug FixRefactoring and rewrite of event buffer code
> In close cooperation with Stefan, I refactored and rewrote the MIDAS event
> buffering code (bm_send_event, bm_flush_cache, bm_receive_event and bm_push_event).
>
> All are welcome to try the new code. If it explodes, please send me the error
> messages, stack traces and core dumps.

Stefan quickly found one new error (a typoe in a check against infinite looping) and
then I found one old error present in the old code that caused event loss when the
buffer became exactly 100% full (0 bytes free).

Both errors are now fixed in svn commit 3294.

K.O.
  291   07 Aug 2006 Konstantin OlchanskiBug FixFix crash in mfe.c
Some time ago, I accidentally introduced a bug in mfe.c- if there is data
congestion in the system, mfe.c can exit with the error "bm_flush_cache(ASYNC)
error 209" because it did not expect the valid return value BM_ASYNC_RETURN
(209) from bm_flush_cache(ASYNC). This error has now been fixed. K.O.
  290   07 Aug 2006 Konstantin OlchanskiBug FixRefactoring and rewrite of event buffer code
In close cooperation with Stefan, I refactored and rewrote the MIDAS event
buffering code (bm_send_event, bm_flush_cache, bm_receive_event and bm_push_event).

The main goal of this update is to make sure the event buffering code does not
have any infinite loops: in the past, we have seen mlogger and some frontends
loop forever consuming 100% CPU in the event buffering code. This should now be
completely fixed.

As additional bonuses, the refactored code is easier to read, has less code
duplication and should be more robust. A few potential logical problems have
been corrected and one case of reproducible infinite looping has been fixed.

The new code has passed the low-level consumer-producer tests, but has not yet
been used in anger in any real experiment. One hopes any new bugs introduced
would cause outright failures and core dumps (rather than silent data corruption).

All are welcome to try the new code. If it explodes, please send me the error
messages, stack traces and core dumps.

K.O.
  289   07 Aug 2006 Stefan RittInfoNew multi-threaded midas slow control system
Multi-threaded slow control system

The Midas slow control system has been modified to support multi-threaded slow control front-ends. Each device gets it's own thread in the front-end, which has several advantages:

- the communication of all devices runs in parallel and therefor is much faster
- slow devices cannot block any more the front-end. Response times to run transitions etc. become therefore much faster.

This modification requires some minor modifications in the existing class and device drivers.

Dropping of CMD_xxx_ALL commands

The slow control commands CMD_SET_ALL, CMD_GET_ALL, CMD_SET_CURRENT_LIMIT_ALL, CMD_GET_CURRENT_LIMIT_ALL, etc. have been dropped. They were there to accomodate some slow devices, which sometimes works a bit faster if all channels are set or read at once. Since the inter-thread communication scheme implemented now does only allow passing one channel at a time, the "ALL" functions cannot be supported any more. On the other hand this is not such an issue any more, since slow devices are handled now in parallel, speeding up things considereably.

The command have been removed from midas.h and from all device and class drivers coming with the midas distribution. If you have your own drivers, just delete the sections wich use these commands.

Calling the device driver inside the class driver

The device drivers have now to be called differently in the class driver. The reason for that is that in a multi-threaded front-end, there is only one central device driver dispatcher, which communicates with the individual device driver threads. The device drivers do not need to be modified, but all existing class drivers need modification, if they are going to be run in a multi-threaded front-end. Old class drivers which are not used in a multi-threaded front-end do not to be modified.

Following modifications are necessary:

  • Remove following line:
    #define DRIVER(_i) ...

  • Find all lines containing
    DRIVER(i)(CMD_xxx, info->dd_info[i], ...)

    and replace them with
    device_driver(info->driver[i], CMD_xxx, ...)

    note that info->dd_info[i] is not passed any more. Instead, you pass info->driver[i]. Pleae note that the arguments passed after CMD_xxx are not checked by the compiler, since they are a variable argument list. Any error there will not produce a compiler warning, but will just crash the front-end.

  • Find the line with
    status = pequipment->driver[i].dd(CMD_INIT, hKey, &pequipment->driver[i].dd_info,
                                            pequipment->driver[i].channels,
                                            pequipment->driver[i].flags,
                                            pequipment->driver[i].bd);

    and replace it with
    status = device_driver(&pequipment->driver[i], CMD_INIT, hKey);

  • Find the line with
    pequipment->driver[i].dd(CMD_EXIT, pequipment->driver[i].dd_info);

    and replace it with
    device_driver(&pequipment->driver[i], CMD_EXIT);

  • Find following lines
    hv_info->driver[i] = pequipment->driver[index].dd;
    hv_info->dd_info[i] = pequipment->driver[index].dd_info;
    hv_info->channel_offset[i] = offset;
    hv_info->flags[i] = pequipment->driver[index].flags;

    and replace them with
    hv_info->driver[i] = &pequipment->driver[index];
    hv_info->channel_offset[i] = offset;

The class drivers multi.c and generic.c can be used as a reference for these modifications.

Implementing CMD_STOP command

For multithread-enabled device drivers it is necessary to support the CMD_STOP command, which is needed to stop all device threads before the actual device gets closed. Following code is necessary:
INT cd_xxx(INT cmd, EQUIPMENT * pequipment)
{
   INT i, status;

   switch (cmd) {
   case CMD_INIT:
      ...

   case CMD_STOP:
      for (i = 0; pequipment->driver[i].dd != NULL &&
                  pequipment->driver[i].flags & DF_MULTITHREAD ; i++)
         status = device_driver(&pequipment->driver[i], CMD_STOP);
      break;

   case CMD_IDLE:
      ...

   return status;
}

Enabling multi-thread support

To turn on multi-thread support for a device, the flag DF_MULTITHREAD must be used in the front-end user code device driver list, such as
DEVICE_DRIVER multi_driver[] = {
   {"Input", nulldev, 2, null, DF_INPUT | DF_MULTITHREAD},
   {"Output", nulldev, 2, null, DF_OUTPUT | DF_MULTITHREAD},
   {""}
};
ELOG V3.1.4-2e1708b5