1415   18 Dec 2018   Konstantin Olchanski   Info   mxml update
the mxml library was updated to make it thread-safe.
https://bitbucket.org/tmidas/mxml/src/master/

I also take this opportunity to remind everyone to update their copy to the latest version,
as I just stumbled on an old bug that I fixed a year ago (a crash of mlogger)
but forgot to propagate to each and every copy of mxml.

I also looked at the xml encoder and I see that it has several places where it may
truncate the data, but none of these places can cause truncation of ODB data
because the fixed-size internal buffers are big enough to hold the longest
values sent by the odb xml encoder.

K.O.
1419   26 Dec 2018   Konstantin Olchanski   Info   Partial refactoring of ODB code
> One additional comment: In the 90's when I developed this code, locking was expensive.
> Now the world has changed, we can do almost a million locks a second.

I am not sure this is quite true. A CPU can execute about 3000 million operations per second (a 3 GHz CPU, assuming 1 operation per clock cycle),
so at a million locks per second, one lock operation costs about 3000 normal operations. Of course cache misses and branch mispredictions mess up
this simple arithmetic...

But I think the cost of a mutex lock/unlock can easily be measured. (hmm... now I am curious).
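Such a measurement can be sketched as follows (illustration only, not midas code): time a large number of uncontended pthread mutex lock/unlock pairs. Note this measures an in-process mutex; the ODB lock is a cross-process semaphore plus (in the new code) MMU operations, which will be slower.

<pre>
/* mutex_cost.c - rough timing of uncontended pthread mutex lock/unlock.
   Not midas code, just an illustration of the measurement discussed above.
   Build with: cc -O2 mutex_cost.c -o mutex_cost -lpthread */
#include <stdio.h>
#include <time.h>
#include <pthread.h>

int main(void)
{
   pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
   const long n = 10*1000*1000;
   struct timespec t0, t1;

   clock_gettime(CLOCK_MONOTONIC, &t0);
   for (long i = 0; i < n; i++) {
      pthread_mutex_lock(&m);
      pthread_mutex_unlock(&m);
   }
   clock_gettime(CLOCK_MONOTONIC, &t1);

   double sec = (t1.tv_sec - t0.tv_sec) + 1e-9*(t1.tv_nsec - t0.tv_nsec);
   printf("%ld lock/unlock pairs in %.3f sec, %.1f ns per pair, %.0f pairs/sec\n",
          n, sec, 1e9*sec/n, n/sec);
   return 0;
}
</pre>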

The bigger question is architectural: nested/recursive locks are definitely a bad thing to do (not just my opinion).

But closer to home, now that I have implemented the "write protected" ODB, lock/unlock suddenly has to do MMU operations
(map/unmap memory) and this is *very* expensive.

Also as we start doing more multithreading, lock contention is becoming a problem, and the standard solution
is to implement read-locks and write-locks. (everybody holding a read-lock can read ODB at the same time
without waiting).

So moving towards separate read and write locks and a write-protected (and/or read-protected) ODB shared memory
all points towards reworking the ODB locks to remove the need for nested/recursive locks.

I think Stefan and I are in agreement here.

K.O.
1420   26 Dec 2018   Konstantin Olchanski   Info   bm_receive_event timeout in ROME
> There is a bug report in the ROME repository which says bm_receive_event timeouts.
> https://bitbucket.org/muegamma/rome3/issues/8/rome-with-midas-produces-timeout-after
> Does anybody have any ideas what could be causing the problem?

There could be a problem if bm_receive_event() is made to wait for an event for longer than
the rpc timeout. This rings a very small bell for me, but I do not remember the details.

As I now go through the midas event buffer code, I will check that bm_receive_event() connected 
through the mserver has correctly working timeouts.

Thank you for reminding me about this difficulty.

K.O.
1423   27 Dec 2018   Stefan Ritt   Info   Partial refactoring of ODB code
> I am not sure this is quite true. The CPU can execute 3000 million operations per second (3GHz CPU, assuming 1 op/Hz),
> so 1 lock operation is worth 3000 normal operations. Of course cache misses and branch mispredictions mess up
> this simple arithmetic...

You can try that with "t1" in odbedit. This times the number of db_get_data() calls midas can do per second. On my MacBook Pro I get 470'000 
accesses per second.
1426   28 Dec 2018   Konstantin Olchanski   Info   note on the midas event buffer code, part 1
In this technical note, I write down the workings of the midas event buffer code, the path 
that events travel from the frontend to the SYSTEM buffer to mlogger (and to disk).

The event buffer code has worked well in the past, but more recently we have seen a few
problems. There is the event buffer shared memory corruption problem in the alpha-g
detector daq. There are difficulties with GET_RECENT. There are timeouts in the bm_receive_event
RPC path in ROME. There is the 2 Gbyte limit on the event buffer size (limiting the
maximum event size to about 1 Gbyte), due to the 32-bit-ness of the event buffer size code -
in the day of 10gige networking (1 Gbyte/sec) and >1 Gbyte/sec storage arrays, a 2 Gbyte
buffer size is just barely sufficient. There is the lack of multithread safety in the event buffer
code. And there is the lack of a bm_receive_event() where I do not have to guess the maximum
event size (making event truncation impossible).

I have been looking at the event buffer code for many years. It is extremely well written,
but it is also probably the oldest code inside midas and its age shows. Good code from
1998 is just very hard to read and follow 20 years later in 2018. We no longer do "goto", we
are not afraid of using malloc(), we declare variables as we are about to use them (instead
of at the beginning of a function). The list goes on and on.

So after looking at this code for many years, I finally decided to bite the bullet and
rewrite/modernize it. To my surprise, the code took very well to this. I only had to rewrite
the parts that use difficult-to-follow "goto" logic; the rest of the code almost refactored
itself. I think this is a very good outcome, as people already familiar with the old code
(Stefan, myself, etc.) will find that while things have moved around, the basic logic has
remained the same.

But first, we need to understand and write down how the event buffer code works.

to be continued,
K.O.
1427   28 Dec 2018   Konstantin Olchanski   Info   note on the midas event buffer code, part 2, bm_send_event()
> In this technical note, I write down the workings of the midas event buffer code
> we need to understand and write down how the event buffer code works.

The data ingress part of the event buffer code is very simple: events are sent to the event buffer via a single
function, bm_send_event(). There is no other way to inject data into an event buffer.

bm_send_event() does this (a toy sketch of the full write path follows after the three lists below):
= if the write cache is active:
- the new event is written into the local write cache
- if the write cache is full, it is flushed into the event buffer via bm_flush_cache()
- if the new event does not fit into the write cache, the write cache is flushed and the event is written into the 
event buffer (preserving the event ordering).
= if the write cache is inactive, new events are written directly into the event buffer.
= to write an event into the event buffer:
- wait for free space via bm_wait_for_free_space()
- lock buffer semaphore
- copy data into buffer shared memory
- update write pointer
- unlock buffer semaphore
- notify all readers waiting for this event

bm_flush_cache() does this:
- wait for free space via bm_wait_for_free_space()
- lock buffer semaphore
- copy events from write cache to buffer shared memory
- at the same time keep track of readers that wait for these events
- update write pointer
- unlock buffer semaphore
- notify all readers waiting for previous cached events

bm_wait_for_free_space() does this:
= if buffer is full and sync_flag==BM_NO_WAIT
- return BM_ASYNC_RETURN immediately
- causing bm_send_event() to immediately return BM_ASYNC_RETURN without writing anything to the event 
buffer
= if buffer is full,
- sleep using ss_suspend(1000, MSG_BM), then check again
- there is no timeout, bm_wait_for_free_space() will wait forever
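
Putting the three lists above together, the write path can be modeled by the following toy program (illustration only, not midas code: the cache and buffer sizes are invented, the "wait for free space" and reader-notification steps are reduced to comments, and a plain pthread mutex stands in for the buffer semaphore):

<pre>
/* Toy model of the write path: a local write cache that batches small
   events and is flushed into a "shared" buffer under a single lock. */
#include <stdio.h>
#include <string.h>
#include <pthread.h>

#define CACHE_SIZE 1024
#define SHM_SIZE   8192

static pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER; /* stands in for the buffer semaphore */
static char   shm[SHM_SIZE];     /* stands in for the buffer shared memory */
static size_t shm_wp;            /* shared memory write pointer */

static char   cache[CACHE_SIZE]; /* local write cache */
static size_t cache_wp;

static void flush_cache(void)
{
   if (cache_wp == 0)
      return;
   /* real code: bm_wait_for_free_space() before taking the lock */
   pthread_mutex_lock(&buf_lock);        /* one lock per batch, not per event */
   memcpy(shm + shm_wp, cache, cache_wp);
   shm_wp += cache_wp;
   pthread_mutex_unlock(&buf_lock);
   cache_wp = 0;
   /* real code: notify readers waiting for the cached events ("B" message) */
}

static void send_event(const void *ev, size_t size)
{
   if (size > CACHE_SIZE) {              /* event does not fit into the cache */
      flush_cache();                     /* flush first, to preserve event ordering */
      pthread_mutex_lock(&buf_lock);
      memcpy(shm + shm_wp, ev, size);    /* write directly to the shared memory */
      shm_wp += size;
      pthread_mutex_unlock(&buf_lock);
      return;                            /* real code: notify waiting readers */
   }
   if (cache_wp + size > CACHE_SIZE)     /* cache is full: flush it first */
      flush_cache();
   memcpy(cache + cache_wp, ev, size);   /* then add the event to the cache */
   cache_wp += size;
}

int main(void)
{
   char ev[100] = "example event";
   for (int i = 0; i < 50; i++)
      send_event(ev, sizeof(ev));
   flush_cache();                        /* real code: mfe.c flushes about once per second */
   printf("wrote %zu bytes to the toy shared buffer\n", shm_wp);
   return 0;
}
</pre>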

The most expensive operation when writing to the event buffer is the locking: we must wait
until all readers finish their reading and all other writers finish their writing
and unlock the buffer for us (read up on "lock fairness and starvation"). In general, the less
locking we do, the better.

To reduce lock contention, the event buffer code has a write cache. In theory, when we write
a large number of small events, it is more efficient to "batch" them together, significantly
reducing the number of locking operations "per event". Even if the events are large, if there
is significant lock contention from multiple writers and multiple readers, batching the writes
is still a good idea. The downside of this is the cost of an extra memcpy(): instead of one memcpy() from
the user buffer to the shared memory, we do two memcpy()s - from the user buffer to the write cache, then from
the write cache to the shared memory. Today's typical PC-type machines have very fast RAM,
so memcpy() is inexpensive. However, embedded and low-power machines (ARM SoCs, FPGA-based SoCs,
etc.) tend to have pretty slow memory, so the extra memcpy() can be expensive.

The bottom line is that the write cache size should be tuned to the actual use case,
but in general it is less useful at low data rates (hardly any contention for the event buffer locks),
and more useful at high data rates especially with very small event sizes (high overhead from
locking on each event).

Right now the write cache is always enabled for mfe-based frontends:
bm_set_cache_size(..., 0, SERVER_CACHE_SIZE); // 100000 bytes
(perhaps the cache size should be made configurable via ODB /Eq/xxx/Common).

To ensure that events do not sit in the write cache for too long,
mfe-based frontends call bm_flush_cache() about once per second.
(see mfe.c, good luck!)
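
For reference, a user program that writes directly to a buffer could adjust its write cache along these lines (a hedged sketch: the buffer name, the 200000-byte cache size and the use of DEFAULT_BUFFER_SIZE are example assumptions, and error handling is minimal):

<pre>
/* Sketch only: enable a larger write cache on the buffer we write to. */
#include "midas.h"

INT setup_writer_cache(INT *phbuf)
{
   INT status = bm_open_buffer("SYSTEM", DEFAULT_BUFFER_SIZE, phbuf);
   if (status != BM_SUCCESS && status != BM_CREATED)
      return status;

   /* read cache = 0 (we only write), write cache = 200000 bytes;
      remember to call bm_flush_cache() periodically, as mfe.c does */
   return bm_set_cache_size(*phbuf, 0, 200000);
}
</pre>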

to be continued,
K.O.
1428   28 Dec 2018   Konstantin Olchanski   Info   note on the midas event buffer code, part 3, rpc_send_event()
> In this technical note, I write down the workings of the midas event buffer code
> we need to understand and write down how the event buffer code works.
> bm_send_event() does this ...

When connecting to midas remotely via the mserver, bm_send_event() works as expected through
the MIDAS RPC (RPC_BM_SEND_EVENT).

MIDAS RPC does a lot of work encoding and decoding the data, so for extra efficiency,
mfe-based frontends use rpc_send_event().

rpc_send_event() does this:
- if we are connected locally, call bm_send_event()
- if rpc_mode==0, an RPC_BM_SEND_EVENT request is constructed "by hand" and sent to the mserver, result is same as calling 
bm_send_event() but overhead is reduced because we avoid calling the generic rpc_call().
- if rpc_mode!=0, event data is sent to the mserver by writing it into the event tcp connection (_server_connection.event_sock).
- there is a small (8 kbyte) cache for batching tcp writes, probably not useful for modern tcp implementations, but it makes
rpc_send_event() non-thread-safe.

rpc_send_event() is NOT thread safe.

mserver receives event data in rpc_server_receive():
- read the data from the event tcp connection (_server_acception[idx].event_sock) via recv_event_server()
- decode the buffer handle and call bm_send_event(BM_WAIT)
- this is done in a loop for a maximum of 500 msec or until there is no more data in the event tcp connection
- maximum event size that can be received is set by ODB /Experiment/MAX_EVENT_SIZE.

This data path is the most efficient way for sending data from a remote client into midas.

Limitations:
- rpc_send_event() is not thread safe because
a) it uses a small and probably unnecessary data cache and
b) the communication protocol requires 2 syscalls to send an event (1st syscall to send 2 bytes of buffer handle, 2nd syscall to send the 
event data).
- rpc_server_receive() cannot receive arbitrarily large events because recv_event_server() does not know how to allocate memory (it does
know the event size).
 
to be continued,
K.O.
1429   28 Dec 2018   Konstantin Olchanski   Info   note on the midas event buffer code, part 4, reading from event buffer
> > In this technical note, I write down the workings of the midas event buffer code
> > we need to understand and write down how the event buffer code works.
> > bm_send_event() does this ...
> rpc_send_event() does this ...
> mserver rpc_server_receive() does this ...

There are two ways to read data from an event buffer.

The first way is to specify an event request without an event handler callback function and use bm_receive_event() to poll for new events.

The second way is to specify an event handler callback function and rely on midas internal event notifications
and polling to receive events via the callback function. The mlogger and mdump utilities use this method.

A third aspect involves delivering event notifications to remote midas clients connected via the mserver.
They cannot receive event buffer notifications directly - the UDP datagrams are sent on the localhost interface
and are received by the mserver, which has to forward them to the remote client.

In addition, there is an event buffer read cache, which works similarly to the write cache to reduce the number
of buffer lock operations and so reduce the locking overhead and the lock contention.

to be continued,
K.O.
1430   02 Jan 2019   Konstantin Olchanski   Info   note on the midas event buffer code, part 5, bm_read_buffer()
> > > In this technical note, I write down the workings of the midas event buffer code
> > > we need to understand and write down how the event buffer code works.
> > > bm_send_event() does this ...
> > rpc_send_event() does this ...
> > mserver rpc_server_receive() does this ...
> 
> There are two ways to read data from an event buffer.
>
> The first way is to specify an event request without an event handler callback function and use bm_receive_event() to poll for new events.
> 
> The second way is to specify an even handler callback function and rely on midas internal event notifications
> and polling to receive events via the callback function. The mlogger and mdump utilities use this method.
> 

The two ways of reading events from the event buffer are/were implemented by bm_receive_event() and bm_push_event(). The code
in these two functions is/was identical (as best I can tell) except that the first one copied the event data into a user buffer,
while the second one invoked the user event request callback function via bm_dispatch_event().

In the new code, I have consolidated both functions into one, called bm_read_buffer().

bm_read_buffer() does this:
- if read cache is enabled:
- read events from read cache
- if read cache is empty, fill it from the event buffer shared memory via bm_fill_read_cache()
- if the next event in the event buffer shared memory does not fit into the read cache, filling of the read cache stops
- then events are read from the read cache until it is empty, then we fall through to:
- if the read cache is disabled or next event in the event buffer does not fit into the read cache,
- read the next event directly from the event buffer (at this point, the read cache is empty).

Throughout this activity, all events from the event buffer shared memory that do not match any
event request are skipped. Only events that match an event request are copied into the read cache.

bm_dispatch_event() does this:
- calls some code for event defragmentation (I do not know if this code is used or if it works)
- loop over all event requests, for matching request, call the user event handler function (buffer handle, request id, event header, event data)
- if multiple requests match an event, the user event handler will be called multiple times (once per matching request).

The following functions are now available to users to read the event buffer (a usage sketch follows after this list):
- bm_push_event(buffer name) - dispatch next event from the read cache or poll the event buffer for new events, fill the read cache and dispatch the next event (via 
bm_dispatch_event()).
- bm_check_buffers() - call bm_push_event() for all open buffers in a loop for about 1000 ms.
- bm_receive_event() - read the next event into the supplied buffer (the buffer should be big enough to fit an event of maximum size, see ODB /Experiment/MAX_EVENT_SIZE).
- bm_receive_event_alloc() - read the next event into an automatically allocated buffer of the correct size to fit the event. This buffer should be free()ed after use.
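
To illustrate how these functions are used, here is a hedged sketch of a consumer using the callback path (bm_request_event() with an event handler, dispatched from cm_yield() via bm_dispatch_event()). The buffer name, event selection and DEFAULT_BUFFER_SIZE are example assumptions; the cm_connect_experiment()/cm_disconnect_experiment() boilerplate is omitted:

<pre>
/* Sketch of a consumer reading events via an event request callback. */
#include <stdio.h>
#include "midas.h"

void process_event(HNDLE hbuf, HNDLE request_id, EVENT_HEADER *pheader, void *pevent)
{
   /* called once per matching request for every dispatched event */
   printf("event id %d, serial %u, %u bytes\n",
          pheader->event_id, pheader->serial_number, pheader->data_size);
}

INT consume_events(void)
{
   HNDLE hbuf;
   INT request_id, status;

   status = bm_open_buffer("SYSTEM", DEFAULT_BUFFER_SIZE, &hbuf);
   if (status != BM_SUCCESS && status != BM_CREATED)
      return status;

   /* an event request with a callback - the "second way" of reading */
   bm_request_event(hbuf, EVENTID_ALL, TRIGGER_ALL, GET_ALL, &request_id, process_event);

   do {
      /* cm_yield() polls the buffer and dispatches matching events to process_event() */
      status = cm_yield(1000);
   } while (status != RPC_SHUTDOWN && status != SS_ABORT);

   return status;
}
</pre>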

This is it for the code internals that deal directly with the event data buffers shared memory.

Next to explore is the reading of events through the mserver, the polling of buffers in various programs and the communication between buffer readers and writers.

to be continued,
K.O.
1431   02 Jan 2019   Konstantin Olchanski   Info   note on the midas event buffer code, part 6, reading events through the mserver
> > > > In this technical note, I write down the workings of the midas event buffer code
> > > > we need to understand and write down how the event buffer code works.
> > > > bm_send_event() does this ...
> > > rpc_send_event() does this ...
> > > mserver rpc_server_receive() does this ...
> bm_read_buffer() does this ...
> bm_dispatch_event() does this ...
> - bm_push_event(buffer name) ...
> - bm_check_buffers() - call bm_push_event() for all open buffers ...
> - bm_receive_event() ...
> - bm_receive_event_alloc() ...

There is only one path for midas programs (analyzers, mdump, etc.) connected remotely
through the mserver to receive event data - by using bm_receive_event(). Unlike the mfe
frontend, where two paths are possible for sending data - bm_send_event() and rpc_send_event() -
there is no "rpc_receive_event" alternate data path for receiving data.
Event data can only travel through the RPC_BM_RECEIVE_EVENT RPC call.

So how do the event request callbacks work in a remote-connected client?

a) cm_yield() always calls bm_poll_event() which loops over all requests, calls bm_receive_event() and bm_dispatch_event().
b) ss_suspend() calls rpc_client_dispatch() to receive an MSG_BM message from the mserver and call bm_poll_event(), then as above.

So new events can show up at the user event handler each time we call cm_yield() (poll bm_receive_event()) and
each time we call ss_suspend() (check for MSG_BM message from mserver).

This is how the mserver generates the MSG_BM messages:
- cm_dispatch_ipc receives a "B" message (from the writer to the event buffer)
- calls bm_notify_client(buffer name) (instead of calling bm_push_event() for a normal direct-attached client)
- bm_notify_client() sends the MSG_BM message to the remote client, unless:
-- the remote client did not specify any event handler callbacks (pbuf->callback is false) (such clients poll for data)
-- the previous MSG_BM message was sent less than 500 ms ago.

As best I understand, in the end this scheme works like this:

- low frequency events (< 1/sec) will always generate an MSG_BM message and cause
the remote client to receive and dispatch this event almost immediately (as soon as it
does the next cm_yield() or ss_suspend()).

- high frequency events (> 2/sec) will have the MSG_BM messages throttled by the 500 ms blank-off
in bm_notify_client() and will be processed almost exclusively by polling via cm_yield().

Additional gotchas.

a) polling for events via bm_receive_event() without BM_NO_WAIT will cause serious problems. Because
internally bm_receive_event() does not have a timeout, it will wait for new data forever, stalling
the RPC_BM_RECEIVE_EVENT request. Normally rpc_call(RPC_BM_RECEIVE_EVENT) would time out,
but bm_receive_event() disables this timeout, so for the remotely connected client,
the rpc call will hang forever. This creates two problems:
a1) if this is a multithreaded client and another thread tries to do another RPC call (i.e. access odb, etc.), there will be a crash on waiting for the RPC mutex (the timeout is 10 sec, see
rpc_call(); in this case rpc_timeout is zero)
a2) if a run stop is attempted, the RPC call from cm_transition() will time out and cause this client to be killed. This is because while waiting for an RPC reply we do not listen for and process
incoming RPC requests (the waiting is done inside ss_socket_wait() through recv_tcp2() and ss_recv_net_command()).
aa) because of this, clients connected remotely should always call bm_receive_event() with BM_NO_WAIT.

All of the above applies only to clients connected remotely via the mserver.
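
In code, gotcha (aa) amounts to a polling loop of this shape (a sketch, not midas code; the buffer handle and destination buffer are assumed to be set up elsewhere, and the buffer must be at least ODB /Experiment/MAX_EVENT_SIZE bytes):

<pre>
/* Sketch: drain pending events without ever blocking the RPC layer.
   bm_receive_event() is always called with BM_NO_WAIT, so the
   RPC_BM_RECEIVE_EVENT call cannot stall. */
#include <stdio.h>
#include "midas.h"

void poll_events_once(INT hbuf, char *buf, INT max_size)
{
   for (;;) {
      INT size = max_size;    /* in: size of buf, out: size of the received event */
      INT status = bm_receive_event(hbuf, buf, &size, BM_NO_WAIT);
      if (status == BM_ASYNC_RETURN)
         break;               /* no event available right now - try again later */
      if (status != BM_SUCCESS)
         break;               /* real error - handle as appropriate */
      EVENT_HEADER *pheader = (EVENT_HEADER *) buf;
      printf("received event id %d, %u bytes\n",
             pheader->event_id, pheader->data_size);
   }
}
</pre>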

to be continued,
K.O.
1432   03 Jan 2019   Konstantin Olchanski   Info   note on the midas event buffer code, part 7, event buffer polling frequencies
> > > > > In this technical note, I write down the workings of the midas event buffer code
> > > > > we need to understand and write down how the event buffer code works.
> > > > > bm_send_event() does this ...
> > > > rpc_send_event() does this ...
> > > > mserver rpc_server_receive() does this ...
> > bm_read_buffer() does this ...
> > bm_dispatch_event() does this ...
> > - bm_push_event(buffer name) ...
> > - bm_check_buffers() - call bm_push_event() for all open buffers ...
> > - bm_receive_event() ...
> > - bm_receive_event_alloc() ...
> remote client: cm_yield() -> bm_poll_event() -> bm_receive_event() -> bm_dispatch_event()
> remote client: ss_suspend() -> receive MSG_BM -> rpc_client_dispatch() -> bm_poll_event() -> ...
> mserver: cm_dispatch_ipc -> bm_notify_client() -> send MSG_BM

We now understand that cm_yield() polls the data buffers for new events, adding to buffer lock congestion. What is the
polling frequency in different programs:

mlogger, no run: 1000 ms cm_yield() split by firing of cm_watchdog() (average sleep time is 500ms), poll frequency 2 Hz
mlogger, when running at high data rate: most time is spent looping inside bm_check_buffers(), poll frequency is ~1 Hz
mdump: same thing, 1000 ms cm_yield() is split by cm_watchdog(), poll frequency is 2 Hz
odbedit: 100 ms cm_yield(), poll frequency is 10 Hz, normally it polls just the SYSMSG buffer.
mfe.c frontend: 100 ms cm_yield(), but it makes no event requests, so no polling of event buffers
mhttpd: 0 ms (!!!) cm_yield(), period 10 ms (there is a 10 ms ss_suspend() somewhere there), polls SYSMSG at crazy frequency of 100 Hz.
mserver: 5000 ms cm_yield() no polling for events

hmm... perhaps the logic of cm_yield() should be changed to only poll for new events once per second -
rare events we receive through the "B" messages from the buffer writer, and high rate
events we process in a batch via the read cache and the loop in bm_check_buffers().

Next is to write down the communication between buffer writers and buffer readers.

to be continued,
K.O.
1433   03 Jan 2019   Konstantin Olchanski   Info   note on the midas event buffer code, part 8, writer and reader communications
> > > > > > In this technical note, I write down the workings of the midas event buffer code

Event buffer readers and writers need to communicate the following information (a toy sketch of the notification handshake follows after the two paths below):

- writer to reader: when new events are written to the buffer, send a message to wake up any readers that have matching requests
- reader to writer: when the buffer is full and we are waiting for free space, readers need to send us a message after they free up some space

Writer to reader path:
writer - bm_send_event() -> bm_wake_up_client_locked() -> for all readers with matching requests -> if "read_wait" is set, send "B" message
reader - ... -> bm_read_buffer() -> bm_wait_for_more_events() -> set "read_wait" to TRUE -> ss_suspend(1000, MSG_BM) - wait for "B" message, also poll the 
buffer every 1000 msec -> when new event is received, clear "read_wait".

Reader to writer path:
writer - bm_send_event() -> bm_wait_for_free_space() -> set "write_wait" to the amount of free space requested -> ss_suspend(1000, MSG_BM) - wait for "B" 
message, also poll every 1000 msec -> clear "write_wait" to zero.
reader - ... -> bm_read_buffer() (also via bm_fill_read_cache()) -> bm_wakeup_producers_locked() -> if we are a GET_ALL reader -> if buffer is 50% empty ->
if write_wait < free_space -> send "B" message.
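
The essence of the writer-to-reader handshake can be shown with a toy program (illustration only, not midas code: the "B" message is replaced by a pthread condition variable, and the buffer is reduced to an event counter). The key point is the "read_wait" flag - the reader sets it before going to sleep, the writer only notifies readers that have it set, and the reader clears it once it has data:

<pre>
/* Toy model of the reader/writer notification handshake described above. */
#include <stdio.h>
#include <stdbool.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wake = PTHREAD_COND_INITIALIZER; /* stands in for the "B" message */
static bool read_wait;   /* "please notify me when there is new data" */
static int  num_events;  /* number of events available in the "buffer" */

static void *writer(void *arg)
{
   for (int i = 0; i < 5; i++) {
      pthread_mutex_lock(&lock);
      num_events++;                   /* "write an event" */
      if (read_wait)                  /* notify only readers that asked for it */
         pthread_cond_signal(&wake);
      pthread_mutex_unlock(&lock);
   }
   return NULL;
}

static void *reader(void *arg)
{
   int received = 0;
   pthread_mutex_lock(&lock);
   while (received < 5) {
      while (num_events == 0) {
         read_wait = true;            /* ask for a notification, then sleep */
         pthread_cond_wait(&wake, &lock);
      }
      read_wait = false;              /* got data - no notification needed */
      num_events--;                   /* "read an event" */
      received++;
   }
   pthread_mutex_unlock(&lock);
   printf("reader received %d events\n", received);
   return NULL;
}

int main(void)
{
   pthread_t w, r;
   pthread_create(&r, NULL, reader, NULL);
   pthread_create(&w, NULL, writer, NULL);
   pthread_join(w, NULL);
   pthread_join(r, NULL);
   return 0;
}
</pre>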

N.B.1. There is a fly in the ointment for the writer-to-reader path: a function called
bm_mark_read_waiting() is called from several places to set and clear the reader "read_wait" flag.
I think this causes several logic errors: "read_wait" is mistakenly set for buffers where we have
not requested anything; and "read_wait" is forcibly cleared when we are actually waiting
for a "B" message - with "read_wait" cleared, this message will not be sent. But because we also
poll for new events (nominally every 1000 msec, with cm_watchdog() cutting it down to an average of 500 msec),
we will not see this fault unless we look carefully for any extra delays between sending and receiving events.

N.B.2. There is a mistake in bm_wakeup_producers(), it should not send "B" messages to clients with zero "write_wait". (fixed in the new code).

and that's all she wrote,
K.O.
1434   10 Jan 2019   Konstantin Olchanski   Info   note on the midas event buffer code, part 8, writer and reader communications
> > > > > > > In this technical note, I write down the workings of the midas event buffer code
> Event buffer readers and writers need to communicate following information:
> 
> N.B.1. There is a fly in the ointment for the writer-to-reader path: a function called
> bm_mark_read_waiting() is called from several places to set and clear the reader "read_wait" flag.
> I think this causes several logic errors...

bm_mark_read_waiting() is removed in the new code; it is unnecessary. This uncovered a number of problems
in the writer-to-reader communications, which are fixed in the new code:

- bm_poll_event() did not poll anything because internal flags were not set right
- reader "read_wait" now means "I am waiting for new data, please send me a notification "B" message".
- writer sends a notification "B" message and clears "read_wait" to stop sending more notifications until the reader asks for more data.
- for remote clients, there is interplay between the 500 ms blank-off in sending MSG_BM notifications from the mserver and the 1000 ms timeout in the loop of bm_poll_event(). Do not
change one of them without thinking how it will affect the other one and how they interact.

K.O.
1435   10 Jan 2019   Konstantin Olchanski   Info   removal of cm_watchdog()
cm_watchdog() has been removed from the latest midas sources. The watchdog functions performed by cm_watchdog() were
moved to cm_yield() - those are: maintaining the odb and event buffer "last active" timestamps, and checking for and removing
timed-out clients.

Those who write midas programs should ensure that they call cm_yield() at least every 1-2 seconds (for the normal 10 second
timeout settings). As always, before calling potentially time-consuming operations, such as accessing slow-responding
hardware, one should increase or disable the watchdog timeout.

Those who write midas programs that do not use cm_yield() (i.e. the mserver), should call cm_periodic_tasks() instead with 
the same period as described for cm_yield() above.
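
A hedged sketch of what this means for a typical midas program's main loop (connection setup omitted; the 1000 ms argument is an example well inside the normal 10 second watchdog timeout):

<pre>
/* Sketch: keep the watchdog happy by calling cm_yield() regularly. */
#include "midas.h"

INT main_loop(void)
{
   INT status;
   do {
      /* returns after at most ~1000 ms; also performs the periodic/watchdog tasks */
      status = cm_yield(1000);

      /* ... the program's own work goes here; if a single step can take
         longer than the watchdog timeout, adjust or disable the timeout,
         or call cm_periodic_tasks() from inside that step ... */

   } while (status != RPC_SHUTDOWN && status != SS_ABORT);
   return status;
}
</pre>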

Removal of cm_watchdog() solves many problems in the midas code base:
- firing of cm_watchdog() at random times in random places makes it difficult to do static code path analysis (call paths, etc) - 
this is needed to ensure correct multithread locking, etc
- ditto caused trouble with multithread locking when cm_watchdog() fires *inside* the pthread mutex locking library itself 
causing mutexes to be in an inconsistent state. This had to be kludged against in the ODB multithread locks - now this kludge 
can be removed.
- much non-midas code in experiment frontends, etc. was not expecting and did not correctly handle the firing of the
cm_watchdog() SIGALRM at random times (i.e. select() with a timeout returned too soon, etc.)
- today, UNIX signals are pretty obscure and best avoided (i.e. the interaction of signals and threads is not super well defined, etc.).

commits up to (and including) the merge of branch feature/remove_cm_watchdog:
https://bitbucket.org/tmidas/midas/commits/9f1775d2fc75d0de0b9d4ef1abc7b2fb9bacca28

K.O.
1438   18 Jan 2019   Konstantin Olchanski   Info   bitbucket issue tracker "feature"
It turns out the bitbucket issue tracker has a feature - I cannot make it automatically add me 
to the watcher list of all new issues.

So when you create a new issue, I think I get one message about it, and no more.

If there is some other activity on the issue (Stefan answers, Thomas answers),
I most definitely will not see any of it.

So in case you wondered why I sometimes completely do not react to some bug reports, this 
is why.

So much for "advanced" and "highly automated" and "intelligent" bug tracking.

K.O.
1439   21 Jan 2019   Konstantin Olchanski   Info   removal of cm_watchdog()
> cm_watchdog() has been removed from the latest midas sources
> Removal of cm_watchdog() solves many problems in the midas code base:

Removal of cm_watchdog() creates new problems:

a) the waits for free space in bm_send_event(BM_WAIT) and for new events in bm_receive_event(BM_WAIT) do not update the timeouts (need to add a call
to cm_periodic_tasks())
b) frontends that talk to slow external equipment now die unless they have their timeout adjusted to be longer than the longest equipment operation (they
were already supposed to do this, but...)
c) mhttpd sometimes dies from an odb timeout (with the default 10 sec timeout).

As one solution, we may bring an automatic cm_watchdog() back, but running from a thread instead of from SIGALRM.

K.O.
1442   24 Jan 2019   Konstantin Olchanski   Info   removal of cm_watchdog()
> > cm_watchdog() has been removed from the latest midas sources
> > Removal of cm_watchdog() solves many problems in the midas code base:
> Removal of cm_watchdog() creates new problems:
> a) the bm_send_event(BM_WAIT) and bm_receive_event(BM_WAIT)
> b) frontends that talk to slow external equipment
> c) mhttpd sometimes dies from from an odb timeout (with the default 10 sec timeout).

The watchdog is back, in a "light" form. Added:

- cm_watchdog_thread() - runs every 2 seconds and updates the timestamps on ODB and all open event buffers (SYSMSG, SYSTEM, etc).
- cm_start_watchdog_thread() - added to mfe.c and mhttpd - so user frontends work the same as before cm_watchdog() removal
- cm_stop_watchdog_thread() - added to cm_disconnect_experiment() to avoid leaving the thread running after we closed odb and all event buffers.

As before, the watchdog only runs on locally attached midas programs. For programs attached remotely via the mserver, the mserver handles the watchdog functions.

This new light-weight watchdog thread only updates the timestamps, it does not check and remove dead clients, it does not check the alarms. These functions are now performed 
by cm_yield() and cm_periodic_tasks(). At least some program in an experiment should call them periodically. (normally, at least mlogger and mhttpd will do that).

Programs that accidentally relied on SIGALRM firing at 1 Hz may still be affected - i.e. with the old cm_watchdog(), ::sleep(1000) would only sleep for 1 second (interrupted by
SIGALRM); now it will sleep for the full 1000 seconds. Other syscalls, i.e. select(), are similarly affected.

For now, I think only mfe.c frontends and mhttpd need the watchdog thread. With luck all the other midas programs (mlogger, mdump, etc) will run fine without it.

K.O.
1445   07 Feb 2019   Stefan Ritt   Info   History panels in custom pages
A new tag has been implemented to display history panels in custom pages, integrated in the 
new custom page design from 2017. The full documentation can be found at 

https://midas.triumf.ca/MidasWiki/index.php/New_Custom_Pages_(2017)#mhistory

Attached is a simple example of such a panel and the result. You have to rename triggerrate.txt to triggerrate.html if you want to use it (can't do it here since otherwise the browser wants to render it which does not work outside of midas).

Stefan
Attachment 1: Screenshot_2019-02-07_at_10.39.44_.png
Attachment 2: triggerrate.txt
<html>
<head>
   <title>Trigger Rate Page</title>
   <link rel="stylesheet" href="midas.css">
   <script src="controls.js"></script>
   <script src="midas.js"></script>
   <script src="mhttpd.js"></script>
</head>
<body class="mcss" onload="mhttpd_init('TriggerRate');">

<div id="mheader"></div>
<div id="msidenav"></div>

<div id="mmain">
   <table class="mtable">
      <tr>
         <th colspan="2" class="mtableheader">
            Trigger Rate
         </th>
      </tr>
      <tr>
         <td>Current event rate:</td>
         <td> <span name="modbvalue" data-odb-path="/Equipment/Trigger/Statistics/Events per sec." data-format="f1">.</span> events per sec.</td>
      </tr>
      <tr>
         <td>Current data rate:</td>
         <td> <span name="modbvalue" data-odb-path="/Equipment/Trigger/Statistics/kBytes per sec." data-format="f1">.</span> kBytes per sec.</td>
      </tr>
      <tr>
         <td colspan="2">
            <div name="mhistory" data-group="Default" data-panel="Trigger rate" data-scale="10m" style="width:600px;border:1px solid black;"></div>
         </td>
      </tr>
   </table>
</div>
</body>
</html>
1446   08 Feb 2019   Thomas Lindner   Info   History panels in custom pages
> A new tag has been implemented to display history panels in custom pages, integrated in the 
> new custom page design from 2017. The full documentation can be found at 
> 

As part of consolidating/cleaning the MIDAS Wiki documentation, the "New Custom Pages" page was folded into the main "Custom Page" page. So to see a
description of Stefan's new functionality, please go to

https://midas.triumf.ca/MidasWiki/index.php/Custom_Page#mhistory
1447   11 Feb 2019   Konstantin Olchanski   Info   json-rpc request for ODB /Script and /CustomScript.
I added json-rpc requests for ODB /Script and /CustomScript (the first one shows up on the status page in the left hand side menu, the 
second one is "hidden", intended for use by custom pages).

To invoke the RPC method, use the function below (from mhttpd.js). Use the parameter "customscript" instead of "script" to execute scripts from ODB
/CustomScript.

One can identify a version of MIDAS that has this function implemented by looking at the left hand side menu - the script links are replaced by script
buttons.

<pre>
function mhttpd_exec_script(name)
{
   //console.log("exec_script: " + name);
   var params = new Object;
   params.script = name;
   mjsonrpc_call("exec_script", params).then(function(rpc) {
      var status = rpc.result.status;
      if (status != 1) {
         dlgAlert("Exec script \"" + name + "\" status " + status);
      }
   }).catch(function(error) {
      mjsonrpc_error_alert(error);
   });
}
</pre>

The underlying code was moved from mhttpd.cxx to the midas library as cm_exec_script(odb_path).

K.O.