Back Midas Rome Roody Rootana
  Midas DAQ System  Not logged in ELOG logo
Entry  28 Dec 2018, Konstantin Olchanski, Info, note on the midas event buffer code, part 1 
    Reply  28 Dec 2018, Konstantin Olchanski, Info, note on the midas event buffer code, part 2, bm_send_event() 
       Reply  28 Dec 2018, Konstantin Olchanski, Info, note on the midas event buffer code, part 3, rpc_send_event() 
          Reply  28 Dec 2018, Konstantin Olchanski, Info, note on the midas event buffer code, part 4, reading from event buffer 
             Reply  02 Jan 2019, Konstantin Olchanski, Info, note on the midas event buffer code, part 5, bm_read_buffer() 
                Reply  02 Jan 2019, Konstantin Olchanski, Info, note on the midas event buffer code, part 6, reading events through the mserver 
                   Reply  03 Jan 2019, Konstantin Olchanski, Info, note on the midas event buffer code, part 7, event buffer polling frequencies 
                      Reply  03 Jan 2019, Konstantin Olchanski, Info, note on the midas event buffer code, part 8, writer and reader communications 
                         Reply  10 Jan 2019, Konstantin Olchanski, Info, note on the midas event buffer code, part 8, writer and reader communications 
Message ID: 1431     Entry time: 02 Jan 2019     In reply to: 1430     Reply to this: 1432
Author: Konstantin Olchanski 
Topic: Info 
Subject: note on the midas event buffer code, part 6, reading events through the mserver 
> > > > In this technical note, I write down the workings of the midas event buffer code
> > > > we need to understand and write down how the event buffer code works.
> > > > bm_send_event() does this ...
> > > rpc_send_event() does this ...
> > > mserver rpc_server_receive() does this ...
> bm_read_buffer() does this ...
> bm_dispatch_event() does this ...
> - bm_push_event(buffer name) ...
> - bm_check_buffers() - call bm_push_event() for all open buffers ...
> - bm_receive_event() ...
> - bm_receive_event_alloc() ...

There is only one path for midas programs (analyzers, mdumps, etc) connected remotely
through the mserver to receive event data - by using bm_receive_event(). Unlike the mfe
frontend where two paths are possible for sending data - bm_send_event() and rpc_send_event() -
there is no "rpc_recieve_event" alternate data path for receiving data.
Event data can only travel through the RPC_BM_RECEIVE_EVENT RPC call.

So how do the event request callbacks work in a remote-connected client?

a) cm_yield() always calls bm_poll_event() which loops over all requests, calls bm_receive_event() and bm_dispatch_event().
b) ss_suspend() calls rpc_client_dispatch() to receive an MSG_BM message from the mserver and call bm_poll_event(), then as above.

So new events can show up at the user event handler each time we call cm_yield() (poll bm_receive_event()) and
each time we call ss_suspend() (check for MSG_BM message from mserver).

This is how does the mserver generates the MSG_BM messages:
- cm_dispatch_ipc receives a "B" message (from the writer to the event buffer)
- calls bm_notify_client(buffer name) (instead of calling bm_push_event() for a normal direct-attached client)
- bm_notify_client() sends the MSG_BM message to the remote client, unless:
-- the remote client did not specify and event handler callbacks (pbuf->callback is false) (they poll for data)
-- previous MSG_BM message was sent less than 500 ms ago.

The best I understand, at the end, this scheme works like this:

- low frequency events (< 1/sec) will always generate an MSG_BM message and cause
the remote client to receive and dispatch this event almost immediately (as soon as they
do the next cm_yield() or ss_suspend().

- high frequency events (> 2/sec) will have the MSG_BM messages throttled by the 500 ms blank-off
in bm_notify_client() and will be processed almost exclusively by polling via cm_yield().

Additional gotchas.

a) polling for events via bm_receive_event() without BM_NO_WAIT will cause serious problems. Because
internally, bm_receive_event() does not have a timeout, it will wait for new data forever, stalling
the RPC_BM_RECEIVE_EVENT request. Normally, rpc_call(RPC_BM_RECEIVE_EVENT) will timeout,
but the bm_receive_event() function disables this timeout, so for the remotely connected client,
the rpc call will hang forever. This creates two problems:
a1) if this is a multithreaded client and from another thread, it tries to do another RPC call (i.e. access odb, etc), there will be a crash on waiting for the RPC mutex (timeout is 10 sec, see 
rpc_call(), in this case, rpc_timeout is zero)
a2) if a run stop is attempted, the RPC call from cm_transition() will timeout and cause this client to be killed. This is because while waiting for an RPC reply, we do not listen for and process 
incoming RPC requests. (waiting is done inside ss_socket_wait() through recv_tcp2() and ss_recv_net_command()).
aa) because of this, clients connected remotely should always call bm_receive_event() with BM_NO_WAIT.

All of the above applies only to clients connected remotely via the mserver.

to be continued,
K.O.
ELOG V3.1.4-2e1708b5