Back Midas Rome Roody Rootana
  Midas DAQ System, Page 92 of 150  Not logged in ELOG logo
ID Date Author Topicdown Subject
  Draft   04 Jul 2024 Nick HastingsForummfe.cxx with RO_STOPPED and EQ_POLLED
I just discovered https://bitbucket.org/tmidas/midas/issues/338/mfec-ro_stopped-is-now-forbidden

> Dear Midas experts,
> 
> I noticed that a check was added to mfe.cxx in 1961af0d6:
> 
> +      /* check for consistent common settings */
> +      if ((eq_info->read_on & RO_STOPPED) &&
> +          (eq_info->eq_type == EQ_POLLED ||
> +           eq_info->eq_type == EQ_INTERRUPT ||
> +           eq_info->eq_type == EQ_MULTITHREAD ||
> +           eq_info->eq_type == EQ_USER)) {
> +         cm_msg(MERROR, "register_equipment", "Events \"%s\" cannot be read when run is stopped (RO_STOPPED flag)", equipment[idx].name);
> +         return 0;
> +      }
> 
> This commit was by Stefan in May 2022.
> 
> A commit few days later, 28d9c96bd, removed the "return 0;", and updated the
> error message to:
> 
> "Equipment \"%s\" contains RO_STOPPED or RO_ALWAYS. This can lead to undesired side-effect and should be removed."
> 
> So such FEs can run but there is still an error at start up. The 
> documentation at https://daq00.triumf.ca/MidasWiki/index.php/ReadOn_Flags
> states with RO_STOPPED "Readout Occurs" "Before stopping run".
> Which seems to indicate that the removing the RO_STOPPED bit from a SC FE
> would just result in an additional read not happening just prior to a run
> stop. However reading scheduler() in mfe.cxx I see in the the main loop:
> 
>  if (run_state == STATE_STOPPED && (eq_info->read_on & RO_STOPPED) == 0) 
>     continue;
> 
> So it seems to me that the a EQ_PERIODIC equipment needs RO_STOPPED to be set
> otherwise it will not read out data while there is no DAQ run.
> 
> Can someone explain the purpose of this check and error message? Perhaps it
> was put in place with only DAQ FEs, not SC FEs in mind? And should the 
> documentation in the wiki actually be "s/Before stopping run/While run is stopped/"?
> 
> Thanks,
> 
> Nick.
  2796   06 Aug 2024 Stefan RittForummfe.cxx with RO_STOPPED and EQ_POLLED
> I noticed that a check was added to mfe.cxx in 1961af0d6:
> 
> +      /* check for consistent common settings */
> +      if ((eq_info->read_on & RO_STOPPED) &&
> +          (eq_info->eq_type == EQ_POLLED ||
> +           eq_info->eq_type == EQ_INTERRUPT ||
> +           eq_info->eq_type == EQ_MULTITHREAD ||
> +           eq_info->eq_type == EQ_USER)) {
> +         cm_msg(MERROR, "register_equipment", "Events \"%s\" cannot be read when run is stopped (RO_STOPPED flag)", equipment[idx].name);
> +         return 0;
> +      }
> 
> 
> Can someone explain the purpose of this check and error message? Perhaps it
> was put in place with only DAQ FEs, not SC FEs in mind? And should the 
> documentation in the wiki actually be "s/Before stopping run/While run is stopped/"?

Indeed you have two types of events handled by mfe.cxx: Slow control events (EQ_SLOW or EQ_PRIODIC) and triggered events (EQ_POLLED or 
EQ_INTERRUPT or EQ_MULTITHREAD or EQ_USER). For slow control events it can make sense to read them also when the run is stopped, that's why you 
can specify RO_STOPPED or RO_ALWAYS. This does however not make sense for triggered events. Reading triggered events when the run is stopped 
invalidates the concept of runs (= read triggered events only during a run). We had cases where people mixed this up, so the warning was added. 
If you have a slow control event you want to read when the run is stopped, make sure it is of type EQ_SLOW or EQ_PERIODIC.

Stefan
  2804   15 Aug 2024 Scott OserForum"Safe" abort of sequencer scripts
We often use the MIDAS sequencer to temporarily control detector settings, such as:

* <change some setting>
* WAIT 60 seconds
* <revert setting to original value>

The question arises of what happens if the sequencer scripts gets aborted during that wait, preventing the value from being reset.  Depending on the setting, this could be undesirable or even damage something if left uncorrected for too long.

Is there any way to have a "safe abort" from the sequencer so that the "Stop immediately" button will call some cleanup script to leave things in a safe state?  Or what about if the sequencer process itself gets killed in the middle of a script?

How have other experiments using MIDAS protected themselves from unplanned terminations of sequencer scripts?
  2805   19 Aug 2024 Stefan RittForum"Safe" abort of sequencer scripts
This request came more than once in the past. One thing I could implement is a "atexit" function similarly to the C funciton atexit().

Then we would have a function in the script which gets called whenever one does "stop immediately". This function can then restore
some ODB values or do whatever is necessary. 

If the sequencer gets killed in the middle, it can safely be restarted since the complete sequencer state is kept in the ODB under
/Sequencer/State. After the restart, the sequencer continues exactly where it has been killed before.

Would that solve your problem?

Stefan
  2809   22 Aug 2024 Scott OserForum"Safe" abort of sequencer scripts
> This request came more than once in the past. One thing I could implement is a "atexit" function similarly to the C funciton atexit().
> 
> Then we would have a function in the script which gets called whenever one does "stop immediately". This function can then restore
> some ODB values or do whatever is necessary. 
> 
> If the sequencer gets killed in the middle, it can safely be restarted since the complete sequencer state is kept in the ODB under
> /Sequencer/State. After the restart, the sequencer continues exactly where it has been killed before.
> 
> Would that solve your problem?
> 
> Stefan

Yes, an "atexit" functionality within the Midas Sequencer Language would be useful for us with this issue.  Is this easy for you to implement?

Thanks,
Scott Oser
  2825   05 Sep 2024 Jack CarltonForumPython frontend rate limitations?
I'm trying to get a sense of the rate limitations of a python frontend. I 
understand this will vary from system to system.

I adapted two frontends from the example templates, one in C++ and one in python. 
Both simply fill a midas bank with a fixed length array of zeros at a given polled 
rate. However, the C++ frontend is about 100 times faster in both data and event 
rates. This seems slow, even for an interpreted language like python. Furthermore, 
I can effectively increase the maximum rate by concurrently running a second 
python frontend (this is not the case for the C++ frontend). In short, there is 
some limitation with using python here unrelated to hardware.

In my case, poll_func appears to be called at 100Hz at best. What limits the rate 
that poll_func is called in a python frontend? Is there a more appropriate 
solution for increasing the python frontend data/event rate than simply launching 
more frontends?

I've attached my C++ and python frontend files for reference.

Thanks,
Jack
  2826   05 Sep 2024 Ben SmithForumPython frontend rate limitations?
> What limits the rate that poll_func is called in a python frontend? 

First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. You can set it to 0 and we'll call it as often as possible. You can set this in the ODB at "/Equipment/Python Data Simulator/Common/Period"

If that's still not fast enough, then you can return a *list* of events from your readout_func. I've seen real-world cases of 25kHz+ of midas events generated in this fashion.


However in your case the limitation is likely that you're sending 1.25MB per event and we have a lot of data marshalling to do between the python and C++ layer. In particular it takes 15ms on my machine to just pack the data into a memory buffer (see timeit command below). I am sure there must be a faster way to do this packing, especially in the case where the bank contains a numpy array rather than a python list.

I'll add it to my to-do list to investigate improving the performance of medium-to-large events in the python code.


Cheers,
Ben


P.S. You may have a bug in your calculations (depending on how you did your testing). In poll_func I think you should be updating the stats every time the function is called, not just the times when you return True.


P.P.S. Command I used to test how slow it is to pack the data. One-time setup of creating the buffers, then multiple tests of the pack_into function:

python -m timeit -s "import struct;import ctypes;arr = [0]*1250001;buf = ctypes.create_string_buffer(10000000);fmt = \">1250000d\"" "struct.pack_into(fmt, buf, *arr)"
20 loops, best of 5: 15.3 msec per loop
  Draft   05 Sep 2024 Jack CarltonForumPython frontend rate limitations?
Thank you, this was very helpful.

> First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. You can set it to 0 and we'll call it as often as possible. You can set this in the ODB at "/Equipment/Python Data Simulator/Common/Period"

Thanks, I thought that was just for periodic triggering (or at least that's how I've used it in C++ frontends). Changing this allowed me to get past the 100Hz event rate cap I described.

> If that's still not fast enough, then you can return a *list* of events from your readout_func. I've seen real-world cases of 25kHz+ of midas events generated in this fashion.
> 
> 
> However in your case the limitation is likely that you're sending 1.25MB per event and we have a lot of data marshalling to do between the python and C++ layer. In particular it takes 15ms on my machine to just pack the data into a memory buffer (see timeit command below). I am sure there must be a faster way to do this packing, especially in the case where the bank contains a numpy array rather than a python list.
> 
> I'll add it to my to-do list to investigate improving the performance of medium-to-large events in the python code.
> 
> 
> Cheers,
> Ben


> P.S. You may have a bug in your calculations (depending on how you did your testing). In poll_func I think you should be updating the stats every time the function is called, not just the times when you return True.
I had tested the way you described at first, then later changed

> P.P.S. Command I used to test how slow it is to pack the data. One-time setup of creating the buffers, then multiple tests of the pack_into function:
> 
> python -m timeit -s "import struct;import ctypes;arr = [0]*1250001;buf = ctypes.create_string_buffer(10000000);fmt = \">1250000d\"" "struct.pack_into(fmt, buf, *arr)"
> 20 loops, best of 5: 15.3 msec per loop
  2828   05 Sep 2024 Stefan RittForumPython frontend rate limitations?
> First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. 
> You can set it to 0 and we'll call it as often as possible. You can set this in the ODB at "/Equipment/Python Data Simulator/Common/Period"

Just for your general understanding: The "period" i the C framework works differently. It calls the poll function with a number, 
and then that number is used in the poll function like (simplified):

poll(INT count) {
   for (i=0 ; i<count ; i++)
      if (new_event())
         return TRUE;
   return FALSE;
}

This ensures that polling is done as quickly as possible, even staying in the same function (poll) rather than called from the 
framework in a loop (which would require a function call to poll each time). The "count" is determined from the framework
during startup of the framework such that the execution time of the poll() routine equals the "period". Like if the period 
is 0.1, the count might be a few millions, so that the poll routine returns immediately when a new event occurs or when
100ms have expired. During the polling the frontend is "dead" meaning it cannot react on run transitions for example. That's
why most experiments use 0.1-0.5 seconds. But this does then NOT mean that you can only have 10-2 events per second, but that
the reaction time if the frontend is at maximum 0.1-0.5 seconds which is acceptable most of the case. 

Due to this design, the C frontend is capable of producing millions of events per second. It took me some while in the early 1990's
to work out that scheme sitting in the "R" trailer at TRIUMF (old guys will remember...).

Best,
Stefan
  2829   06 Sep 2024 Jack CarltonForumPython frontend rate limitations?
Thanks for the responses, they were very helpful.

>First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. You can set it to 0 and we'll 
call it as often as possible.

Thanks, this solves the event rate limitation I described. I didn't think to change this because the "period" did not affect the observed rate in C (and now 
I know why thanks to Stefan).

A couple more questions:

1. 
For me, 
python -m timeit -s "import struct;import ctypes;arr = [0]*1250001;buf = ctypes.create_string_buffer(10000000);fmt = \">1250000d\"" "struct.pack_into(fmt, 
buf, *arr)"
10 loops, best of 3: 43.7 msec per loop

which suggests my maximum data rate is about 1.25 MB * 1000/43.7 Hz = 23 MB/s (?). But I see data rates up to 60 MB/s with a python frontend. Am I 
misinterpreting the meaning of this result?


2. I can effectively bypass the rate limitations in python by running two concurrent frontends. For example, with one python frontend at best I can generate 
60 MB/s of data (setting "period" to 0 now); but with two frontends I can double this to 120 MB/s. This implies one python frontend is not bottlenecked by 
hardware limitations in my case.

Am I doing something wrong to artificially bottleneck my frontends? Perhaps there's a multi-threading solution I can implement to avoid needing multiple 
frontends?


Thanks,
Jack
  2831   11 Sep 2024 Konstantin OlchanskiForumPython frontend rate limitations?
> I'm trying to get a sense of the rate limitations of a python frontend.

1) python is single-threaded, for ultimate performance, a MIDAS frontend (or any DAQ 
application) has to be multithreaded:
a) thread with busy loop read the data and place it into a FIFO
b) thread to read data from FIFO and send it to SYSTEM buffer shared memory or to 
mserver
c) thread to respond to begin-run, end-run, etc RPCs
d) probably a thread to recycle memory from thread (b) back to thread (a) if per-event 
malloc()/free() adds too much overhead

2) data readout. C++ AXI bus access is compiled into 1 instruction and results in 1 AXI 
bus operation. comparable for python likely has much more overhead, slows you down.

3) event bank filling. C++ for() loop is compiled into very compact machine code, 
python loop cannot because each array element can be random data type, shows you down.

bottom line, there is a reason high speed data acquisitions are written in C/C++, not 
in shell, perl, tcl/tk, or (today's favourite) python.

> The C++ frontend is about 100 times faster in both data and event rates.

This is as expected. You can probably improve python code to get closer to 10 times 
slower than C++. But consider:

a) will it be "fast enough" for the task?
b) learning C++ and optimizing python to within "2-3-10x slower than C++" may involve a 
similar amount of time and effort.

And you have not looked at the real-time properties of your frontend. You may discover 
that it's actually faster than you think, but occasionally stops for a millisecond (or 
two or hundred). some applications a notorious for running memory garbage collection 
just at the wrong time.

I am working right now on exactly this problem, I have a 1 GHz ARM CPU (Cyclone-V FPGA) 
and I need to push data out at 100 Mbytes/sec while avoiding and bad-real-time dropouts  
that cause the FPGA data FIFO to overflow. And I only have 2 CPU cores, 1 to read the 
FPGA FIFO, 1 to run the TCP/IP stack and the ethernet driver. No this can be done with 
python.

K.O.
  2832   11 Sep 2024 Konstantin OlchanskiForumPython frontend rate limitations?
> 
> poll(INT count) {
>    for (i=0 ; i<count ; i++)
>       if (new_event())
>          return TRUE;
>    return FALSE;
> }

in the c++ frontend (tmfe.h) this loop usually runs in a separate thread, and I am now working on the linux magic to assign this thread maximum 
uninterruptible priority. otherwise on my Cyclone-V FPGA SoC I see 1-10 msec dropouts, I think from taking ethernet interrupts.

K.O.
  2833   11 Sep 2024 Konstantin OlchanskiForumPython frontend rate limitations?
> > I'm trying to get a sense of the rate limitations of a python frontend.

forgot one more:

c++ toolchain comes with extensive profiler tools aimed to answer the question "why is my 
program so slow, where is it spending all the time?". some of these tools go all the way to 
the hardware level and report CPU cache misses, TLB flushes, context switches and any other 
hardware events that interrupt or slow down computations. programmer than uses this 
information to restructure the code to avoid the worst slow downs (i.e. avoid branch mis-
predictions, avoid cache misses, etc).

I doubt the python toolchain will ever profiler tools as good.

K.O.
  2838   11 Sep 2024 Konstantin OlchanskiForum"Safe" abort of sequencer scripts
> We often use the MIDAS sequencer to temporarily control detector settings, such as:
> 
> * <change some setting>
> * WAIT 60 seconds
> * <revert setting to original value>
> 
> The question arises of what happens if the sequencer scripts gets aborted during that wait, preventing the value from being reset.

Common problem. Go have an elegant solution using the "defer" keyword.

https://go.dev/tour/flowcontrol/12

K.O.
  2860   27 Sep 2024 Ben SmithForumPython frontend rate limitations?
> in your case the limitation is likely that you're sending 1.25MB per event and we have a lot of data marshalling to do between the python and C++ layer.
> 
> I'll add it to my to-do list to investigate improving the performance of medium-to-large events in the python code.

I've now added better support for numpy arrays in the python code that encodes a `midas.event.Event` object. If you use the "correct" numpy data type then you can get vastly improved performance as numpy already stores the data in memory in the format that we need.

In your example, if you change
        self.zero_buffer = [0] * self.total_data_size
to 
        self.zero_buffer = np.ndarray(self.total_data_size, np.int16)

then the max data rate of the frontend goes from 330MB/s to 7600MB/s on my laptop (a factor 20 improvement from one line of code!) 

To ensure you're using the optimal numpy dtype for your bank, you can reference a dict called `midas.tid_np_formats`. For example `midas.tid_np_formats[midas.TID_SHORT]` is equivalent to `np.int16`. If you use an int16 array and write it as a TID_SHORT bank, then we'll use the fast path. If there is a mismatch, we'll have to do type conversions and will end up on the slow path.
  2885   05 Nov 2024 Jack CarltonForumHow to properly write a client listens for events on a given buffer?
If there's some template for writing a client to access event data, that would be 
very useful (and you can probably just ignore the context I gave below in that 
case).


Some context:

Quite a while ago, I wrote the attached "data pipeline" client whose job was to 
listen for events, copy their data, and pipe them to a python script. I believe I 
just stole bits and pieces from mdump.cxx to accomplish this. Later I wrote the 
attached wrapper class "MidasConnector.cpp" and a main.cpp to generalize
data_pipeline.cxx a bit. There were a lot of iterations to the code where I had the 
below problems; so don't take the logic in the attached code as the exact code that 
caused the issues below.

However, I'm unable to resolve a couple issues:

1. If a timeout is set, everything will work until that timeout is reached. Then 
regardless of what kind of logic I tried to implement (retry receiving event, 
disconnect and reconnect client, etc.) the client would refuse to receive more data.

2. When I ctrl-C main, it hangs; this is expected because it's stuck in a while 
loop. But because I can't set a timeout I have to ctrl-C twice; this would 
occasionally corrupt the ODB which was not ideal. I was able to get around this with 
some impractical solution involving ncurses I believe.


Thanks,
Jack
  2886   05 Nov 2024 Maia Henriksson-WardForumHow to properly write a client listens for events on a given buffer?
> If there's some template for writing a client to access event data, that would be 
> very useful (and you can probably just ignore the context I gave below in that 
> case).
> 
> 
> Some context:
> 
> Quite a while ago, I wrote the attached "data pipeline" client whose job was to 
> listen for events, copy their data, and pipe them to a python script. I believe I 
> just stole bits and pieces from mdump.cxx to accomplish this. Later I wrote the 
> attached wrapper class "MidasConnector.cpp" and a main.cpp to generalize
> data_pipeline.cxx a bit. There were a lot of iterations to the code where I had the 
> below problems; so don't take the logic in the attached code as the exact code that 
> caused the issues below.
> 
> However, I'm unable to resolve a couple issues:
> 
> 1. If a timeout is set, everything will work until that timeout is reached. Then 
> regardless of what kind of logic I tried to implement (retry receiving event, 
> disconnect and reconnect client, etc.) the client would refuse to receive more data.
> 
> 2. When I ctrl-C main, it hangs; this is expected because it's stuck in a while 
> loop. But because I can't set a timeout I have to ctrl-C twice; this would 
> occasionally corrupt the ODB which was not ideal. I was able to get around this with 
> some impractical solution involving ncurses I believe.
> 
> 
> Thanks,
> Jack

midas/examples/lowlevel/consume.cxx might be what you're looking for, but I think all 
you're missing is a call to cm_yield() in your loop, so your midas client doesn't get 
killed when the timeout is reached (and also so you can act on shutdown requests from 
midas)

Something like 
      int status = cm_yield(100);
      if (status == SS_ABORT || status == RPC_SHUTDOWN)
         break;

There might be a recommended way to handle the ctrl-c and disconnect from the ODB, but 
off the top of my head I don't remember it. 

Also check out Ben's new(ish) python library, midas/python/examples/event_receiver.py 
might be a much easier solution. And you can use the context manager, which will take 
care of safely disconnecting from midas after you ctrl-C.
  2914   02 Dec 2024 Stefan RittForum"Safe" abort of sequencer scripts
The atexit() function has been implemented in the current develop branch of midas, see

  https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#ATEXIT_subroutine


Stefan
  2926   29 Dec 2024 Pavel MuratForumtime ordering of run transition calls to TMFeEquipment things
Dear MIDAS experts, 

I have a question about "tmfe approach" to implementing MIDAS frontends. If I read the code correctly, 
within this approach it is the TMFeEquipment things, not the TMFrontend's themselves, 
which handle the run transitions - the TMFrontend class 

https://bitbucket.org/tmidas/midas/src/423082fb67c7711813fcda61f7cd03784c398f49/include/tmfe.h#lines-306:378

simply doesn't have methods to handle those directly. 

So how does a user control the sequence in which TMFeEquipment::HandleBeginRun functions of different 
TMFeEquipment pieces are called at begin run? - there are two cases to consider: TMFeEquipment things 
defined by the same TMFrontend and by different TMFrontend's.

Many thanks and happy holidays to everyone! 

-- regards, Pasha
 
  2927   01 Jan 2025 Konstantin OlchanskiForumtime ordering of run transition calls to TMFeEquipment things
> I have a question about "tmfe approach" to implementing MIDAS frontends. If I read the code correctly, 
> within this approach it is the TMFeEquipment things, not the TMFrontend's themselves, 
> which handle the run transitions - the TMFrontend class

that's correct and it is documented so in https://bitbucket.org/tmidas/midas/src/develop/tmfe.md

> So how does a user control the sequence in which TMFeEquipment::HandleBeginRun functions of different 
> TMFeEquipment pieces are called at begin run? - there are two cases to consider: TMFeEquipment things 
> defined by the same TMFrontend and by different TMFrontend's.

I am not sure what you are trying to do. It is always easier to suggest a solution to a specific problem.

But I will try to answer anyway:

1) "time ordering of run transitions" - of course midas transitions are ordered by transition sequence numbers 
and the tmfe class provides methods to control this. ditto for the mfe.cxx frontends.

2) for one TMFrontend, the order of calling HandleBeginRun() is the order in which equipments were added to the 
equipment using FeAddEquipment(). HandleEndRun() is called in reverse order. (I better check this).

3) to have multiple TMFrontends in one program would be unusual (mfe.cxx frontends completely do not support 
this), but should work. Everything was coded to support this, but it was never tested in practice because we 
cannot invent any useful use-case for it. HandleBeginRun() handlers are likely to be called in the frontends are 
created. (I could check this and confirm it works, as long as you have a valid use-case for this configuration).

4) Frontend X has EquipmentA and EquipmentB, you want EqA::HandleBeginRun() to be called at run transition 200 
and EqB::HandleBeginRun() to be called at run transition 400.

This is not directly supported by mfe.cxx frontends (the begin_run() handler is a global function) and I did not 
directly implement it in the TMFE frontend.

But I think this would be a useful improvement. I will look into this.

Likely I will add per-equipment data members fEqConfBeginRunSeqNo, fEqConfEndRunSeqno, etc. Value 0 would 
unregister the corresponding run transition handler. This would cleanup the code quite a bit, a bunch
of RegisterTranstionXXX functions could go away.

K.O.
ELOG V3.1.4-2e1708b5