> > cm_watchdog() has been removed from the latest midas sources
> > Removal of cm_watchdog() solves many problems in the midas code base:
> Removal of cm_watchdog() creates new problems:
> a) the bm_send_event(BM_WAIT) and bm_receive_event(BM_WAIT)
> b) frontends that talk to slow external equipment
> c) mhttpd sometimes dies from an odb timeout (with the default 10 sec timeout).
The watchdog is back, in a "light" form. Added:
- cm_watchdog_thread() - runs every 2 seconds and updates the timestamps on ODB and all open event buffers (SYSMSG, SYSTEM, etc).
- cm_start_watchdog_thread() - added to mfe.c and mhttpd - so user frontends work the same as before cm_watchdog() removal
- cm_stop_watchdog_thread() - added to cm_disconnect_experiment() to avoid leaving the thread running after we closed odb and all event buffers.
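
To illustrate the start/stop pattern described in the list above, here is a minimal generic sketch of a "light" watchdog thread in plain C++. It is not the midas implementation: the class name LightWatchdog, its methods, and the single in-memory timestamp are all hypothetical stand-ins for the real per-client timestamps on ODB and the event buffers. It assumes start() and stop() are called from one controlling thread.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical illustration only; this is NOT the midas code.
// A background thread refreshes a "last seen" timestamp at a fixed period
// and does nothing else (no dead-client cleanup, no alarm checks).
class LightWatchdog {
public:
    explicit LightWatchdog(std::chrono::milliseconds period) : period_(period) {}
    ~LightWatchdog() { stop(); }

    // Start the thread; a second call while running is a no-op.
    void start() {
        if (thread_.joinable()) return;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_requested_ = false;
        }
        thread_ = std::thread(&LightWatchdog::run, this);
    }

    // Ask the thread to exit and wait for it, so nothing touches the
    // timestamp after the caller has torn down its resources.
    void stop() {
        if (!thread_.joinable()) return;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_requested_ = true;
        }
        wakeup_.notify_all();
        thread_.join();
    }

    long long last_update_ms() const { return last_update_ms_.load(); }
    unsigned long updates() const { return updates_.load(); }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (!stop_requested_) {
            touch();
            // Sleep for one period, but wake immediately on stop().
            wakeup_.wait_for(lock, period_, [this] { return stop_requested_; });
        }
    }

    // Stand-in for "update the timestamps on ODB and all open event buffers".
    void touch() {
        auto now = std::chrono::steady_clock::now().time_since_epoch();
        last_update_ms_ =
            std::chrono::duration_cast<std::chrono::milliseconds>(now).count();
        ++updates_;
    }

    std::chrono::milliseconds period_;
    std::mutex mutex_;
    std::condition_variable wakeup_;
    bool stop_requested_ = false;
    std::thread thread_;
    std::atomic<long long> last_update_ms_{0};
    std::atomic<unsigned long> updates_{0};
};
```

In this sketch, a program would construct LightWatchdog with a 2 second period, call start() after connecting, and call stop() before closing its shared resources, which mirrors why cm_stop_watchdog_thread() is called from cm_disconnect_experiment().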
As before, the watchdog only runs on locally attached midas programs. For programs attached remotely via the mserver, the mserver handles the watchdog functions.
This new lightweight watchdog thread only updates the timestamps; it does not check for and remove dead clients, and it does not check the alarms. These functions are now performed
by cm_yield() and cm_periodic_tasks(). At least one program in an experiment should call them periodically (normally, at least mlogger and mhttpd will do that).
Programs that accidentally relied on SIGALRM firing at 1 Hz may still be affected. For example, with the old cm_watchdog(), ::sleep(1000) would sleep for only 1 second (interrupted by
SIGALRM); now it will sleep for the full 1000 seconds. Other syscalls, e.g. select(), are similarly affected.
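
The sleep() behaviour described above can be reproduced outside midas with a small POSIX sketch. The helper names on_alarm() and sleep_with_alarm() are hypothetical, and alarm() is used here only as a stand-in for the periodic SIGALRM that the old cm_watchdog() relied on; this assumes a single-threaded POSIX program.

```cpp
#include <signal.h>
#include <unistd.h>

// Hypothetical no-op handler: its only purpose is to make SIGALRM
// interrupt blocking calls instead of terminating the process.
extern "C" void on_alarm(int) {}

// Schedule SIGALRM after `alarm_after` seconds (0 = no alarm), then call
// sleep(sleep_for). Returns the number of seconds sleep() did NOT sleep.
unsigned sleep_with_alarm(unsigned alarm_after, unsigned sleep_for) {
    struct sigaction sa = {};
    struct sigaction old_sa = {};
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGALRM, &sa, &old_sa);

    alarm(alarm_after);                      // like the old watchdog tick
    unsigned remaining = sleep(sleep_for);   // cut short if the signal arrives

    alarm(0);                                // cancel any pending alarm
    sigaction(SIGALRM, &old_sa, nullptr);    // restore previous handler
    return remaining;
}
```

With a signal pending, sleep_with_alarm(1, 3) returns early with a nonzero remainder; with no signal, sleep_with_alarm(0, 1) sleeps the full time and returns 0. Code that depended on the early return needs an explicit short sleep or timeout instead.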
For now, I think only mfe.c frontends and mhttpd need the watchdog thread. With luck all the other midas programs (mlogger, mdump, etc) will run fine without it.
K.O.