ID |
Date |
Author |
Topic |
Subject |
Draft
|
04 Jul 2024 |
Nick Hastings | Forum | mfe.cxx with RO_STOPPED and EQ_POLLED | I just discovered https://bitbucket.org/tmidas/midas/issues/338/mfec-ro_stopped-is-now-forbidden
> Dear Midas experts,
>
> I noticed that a check was added to mfe.cxx in 1961af0d6:
>
> + /* check for consistent common settings */
> + if ((eq_info->read_on & RO_STOPPED) &&
> + (eq_info->eq_type == EQ_POLLED ||
> + eq_info->eq_type == EQ_INTERRUPT ||
> + eq_info->eq_type == EQ_MULTITHREAD ||
> + eq_info->eq_type == EQ_USER)) {
> + cm_msg(MERROR, "register_equipment", "Events \"%s\" cannot be read when run is stopped (RO_STOPPED flag)", equipment[idx].name);
> + return 0;
> + }
>
> This commit was by Stefan in May 2022.
>
> A commit few days later, 28d9c96bd, removed the "return 0;", and updated the
> error message to:
>
> "Equipment \"%s\" contains RO_STOPPED or RO_ALWAYS. This can lead to undesired side-effect and should be removed."
>
> So such FEs can run but there is still an error at start up. The
> documentation at https://daq00.triumf.ca/MidasWiki/index.php/ReadOn_Flags
> states with RO_STOPPED "Readout Occurs" "Before stopping run".
> Which seems to indicate that the removing the RO_STOPPED bit from a SC FE
> would just result in an additional read not happening just prior to a run
> stop. However reading scheduler() in mfe.cxx I see in the the main loop:
>
> if (run_state == STATE_STOPPED && (eq_info->read_on & RO_STOPPED) == 0)
> continue;
>
> So it seems to me that the a EQ_PERIODIC equipment needs RO_STOPPED to be set
> otherwise it will not read out data while there is no DAQ run.
>
> Can someone explain the purpose of this check and error message? Perhaps it
> was put in place with only DAQ FEs, not SC FEs in mind? And should the
> documentation in the wiki actually be "s/Before stopping run/While run is stopped/"?
>
> Thanks,
>
> Nick. |
2796
|
06 Aug 2024 |
Stefan Ritt | Forum | mfe.cxx with RO_STOPPED and EQ_POLLED | > I noticed that a check was added to mfe.cxx in 1961af0d6:
>
> + /* check for consistent common settings */
> + if ((eq_info->read_on & RO_STOPPED) &&
> + (eq_info->eq_type == EQ_POLLED ||
> + eq_info->eq_type == EQ_INTERRUPT ||
> + eq_info->eq_type == EQ_MULTITHREAD ||
> + eq_info->eq_type == EQ_USER)) {
> + cm_msg(MERROR, "register_equipment", "Events \"%s\" cannot be read when run is stopped (RO_STOPPED flag)", equipment[idx].name);
> + return 0;
> + }
>
>
> Can someone explain the purpose of this check and error message? Perhaps it
> was put in place with only DAQ FEs, not SC FEs in mind? And should the
> documentation in the wiki actually be "s/Before stopping run/While run is stopped/"?
Indeed you have two types of events handled by mfe.cxx: Slow control events (EQ_SLOW or EQ_PRIODIC) and triggered events (EQ_POLLED or
EQ_INTERRUPT or EQ_MULTITHREAD or EQ_USER). For slow control events it can make sense to read them also when the run is stopped, that's why you
can specify RO_STOPPED or RO_ALWAYS. This does however not make sense for triggered events. Reading triggered events when the run is stopped
invalidates the concept of runs (= read triggered events only during a run). We had cases where people mixed this up, so the warning was added.
If you have a slow control event you want to read when the run is stopped, make sure it is of type EQ_SLOW or EQ_PERIODIC.
Stefan |
2804
|
15 Aug 2024 |
Scott Oser | Forum | "Safe" abort of sequencer scripts | We often use the MIDAS sequencer to temporarily control detector settings, such as:
* <change some setting>
* WAIT 60 seconds
* <revert setting to original value>
The question arises of what happens if the sequencer scripts gets aborted during that wait, preventing the value from being reset. Depending on the setting, this could be undesirable or even damage something if left uncorrected for too long.
Is there any way to have a "safe abort" from the sequencer so that the "Stop immediately" button will call some cleanup script to leave things in a safe state? Or what about if the sequencer process itself gets killed in the middle of a script?
How have other experiments using MIDAS protected themselves from unplanned terminations of sequencer scripts? |
2805
|
19 Aug 2024 |
Stefan Ritt | Forum | "Safe" abort of sequencer scripts | This request came more than once in the past. One thing I could implement is a "atexit" function similarly to the C funciton atexit().
Then we would have a function in the script which gets called whenever one does "stop immediately". This function can then restore
some ODB values or do whatever is necessary.
If the sequencer gets killed in the middle, it can safely be restarted since the complete sequencer state is kept in the ODB under
/Sequencer/State. After the restart, the sequencer continues exactly where it has been killed before.
Would that solve your problem?
Stefan |
2809
|
22 Aug 2024 |
Scott Oser | Forum | "Safe" abort of sequencer scripts | > This request came more than once in the past. One thing I could implement is a "atexit" function similarly to the C funciton atexit().
>
> Then we would have a function in the script which gets called whenever one does "stop immediately". This function can then restore
> some ODB values or do whatever is necessary.
>
> If the sequencer gets killed in the middle, it can safely be restarted since the complete sequencer state is kept in the ODB under
> /Sequencer/State. After the restart, the sequencer continues exactly where it has been killed before.
>
> Would that solve your problem?
>
> Stefan
Yes, an "atexit" functionality within the Midas Sequencer Language would be useful for us with this issue. Is this easy for you to implement?
Thanks,
Scott Oser |
2825
|
05 Sep 2024 |
Jack Carlton | Forum | Python frontend rate limitations? | I'm trying to get a sense of the rate limitations of a python frontend. I
understand this will vary from system to system.
I adapted two frontends from the example templates, one in C++ and one in python.
Both simply fill a midas bank with a fixed length array of zeros at a given polled
rate. However, the C++ frontend is about 100 times faster in both data and event
rates. This seems slow, even for an interpreted language like python. Furthermore,
I can effectively increase the maximum rate by concurrently running a second
python frontend (this is not the case for the C++ frontend). In short, there is
some limitation with using python here unrelated to hardware.
In my case, poll_func appears to be called at 100Hz at best. What limits the rate
that poll_func is called in a python frontend? Is there a more appropriate
solution for increasing the python frontend data/event rate than simply launching
more frontends?
I've attached my C++ and python frontend files for reference.
Thanks,
Jack |
2826
|
05 Sep 2024 |
Ben Smith | Forum | Python frontend rate limitations? | > What limits the rate that poll_func is called in a python frontend?
First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. You can set it to 0 and we'll call it as often as possible. You can set this in the ODB at "/Equipment/Python Data Simulator/Common/Period"
If that's still not fast enough, then you can return a *list* of events from your readout_func. I've seen real-world cases of 25kHz+ of midas events generated in this fashion.
However in your case the limitation is likely that you're sending 1.25MB per event and we have a lot of data marshalling to do between the python and C++ layer. In particular it takes 15ms on my machine to just pack the data into a memory buffer (see timeit command below). I am sure there must be a faster way to do this packing, especially in the case where the bank contains a numpy array rather than a python list.
I'll add it to my to-do list to investigate improving the performance of medium-to-large events in the python code.
Cheers,
Ben
P.S. You may have a bug in your calculations (depending on how you did your testing). In poll_func I think you should be updating the stats every time the function is called, not just the times when you return True.
P.P.S. Command I used to test how slow it is to pack the data. One-time setup of creating the buffers, then multiple tests of the pack_into function:
python -m timeit -s "import struct;import ctypes;arr = [0]*1250001;buf = ctypes.create_string_buffer(10000000);fmt = \">1250000d\"" "struct.pack_into(fmt, buf, *arr)"
20 loops, best of 5: 15.3 msec per loop |
Draft
|
05 Sep 2024 |
Jack Carlton | Forum | Python frontend rate limitations? |
Thank you, this was very helpful.
> First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. You can set it to 0 and we'll call it as often as possible. You can set this in the ODB at "/Equipment/Python Data Simulator/Common/Period"
Thanks, I thought that was just for periodic triggering (or at least that's how I've used it in C++ frontends). Changing this allowed me to get past the 100Hz event rate cap I described.
> If that's still not fast enough, then you can return a *list* of events from your readout_func. I've seen real-world cases of 25kHz+ of midas events generated in this fashion.
>
>
> However in your case the limitation is likely that you're sending 1.25MB per event and we have a lot of data marshalling to do between the python and C++ layer. In particular it takes 15ms on my machine to just pack the data into a memory buffer (see timeit command below). I am sure there must be a faster way to do this packing, especially in the case where the bank contains a numpy array rather than a python list.
>
> I'll add it to my to-do list to investigate improving the performance of medium-to-large events in the python code.
>
>
> Cheers,
> Ben
> P.S. You may have a bug in your calculations (depending on how you did your testing). In poll_func I think you should be updating the stats every time the function is called, not just the times when you return True.
I had tested the way you described at first, then later changed
> P.P.S. Command I used to test how slow it is to pack the data. One-time setup of creating the buffers, then multiple tests of the pack_into function:
>
> python -m timeit -s "import struct;import ctypes;arr = [0]*1250001;buf = ctypes.create_string_buffer(10000000);fmt = \">1250000d\"" "struct.pack_into(fmt, buf, *arr)"
> 20 loops, best of 5: 15.3 msec per loop |
2828
|
05 Sep 2024 |
Stefan Ritt | Forum | Python frontend rate limitations? | > First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently.
> You can set it to 0 and we'll call it as often as possible. You can set this in the ODB at "/Equipment/Python Data Simulator/Common/Period"
Just for your general understanding: The "period" i the C framework works differently. It calls the poll function with a number,
and then that number is used in the poll function like (simplified):
poll(INT count) {
for (i=0 ; i<count ; i++)
if (new_event())
return TRUE;
return FALSE;
}
This ensures that polling is done as quickly as possible, even staying in the same function (poll) rather than called from the
framework in a loop (which would require a function call to poll each time). The "count" is determined from the framework
during startup of the framework such that the execution time of the poll() routine equals the "period". Like if the period
is 0.1, the count might be a few millions, so that the poll routine returns immediately when a new event occurs or when
100ms have expired. During the polling the frontend is "dead" meaning it cannot react on run transitions for example. That's
why most experiments use 0.1-0.5 seconds. But this does then NOT mean that you can only have 10-2 events per second, but that
the reaction time if the frontend is at maximum 0.1-0.5 seconds which is acceptable most of the case.
Due to this design, the C frontend is capable of producing millions of events per second. It took me some while in the early 1990's
to work out that scheme sitting in the "R" trailer at TRIUMF (old guys will remember...).
Best,
Stefan |
2829
|
06 Sep 2024 |
Jack Carlton | Forum | Python frontend rate limitations? | Thanks for the responses, they were very helpful.
>First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. You can set it to 0 and we'll
call it as often as possible.
Thanks, this solves the event rate limitation I described. I didn't think to change this because the "period" did not affect the observed rate in C (and now
I know why thanks to Stefan).
A couple more questions:
1.
For me,
python -m timeit -s "import struct;import ctypes;arr = [0]*1250001;buf = ctypes.create_string_buffer(10000000);fmt = \">1250000d\"" "struct.pack_into(fmt,
buf, *arr)"
10 loops, best of 3: 43.7 msec per loop
which suggests my maximum data rate is about 1.25 MB * 1000/43.7 Hz = 23 MB/s (?). But I see data rates up to 60 MB/s with a python frontend. Am I
misinterpreting the meaning of this result?
2. I can effectively bypass the rate limitations in python by running two concurrent frontends. For example, with one python frontend at best I can generate
60 MB/s of data (setting "period" to 0 now); but with two frontends I can double this to 120 MB/s. This implies one python frontend is not bottlenecked by
hardware limitations in my case.
Am I doing something wrong to artificially bottleneck my frontends? Perhaps there's a multi-threading solution I can implement to avoid needing multiple
frontends?
Thanks,
Jack |
2831
|
11 Sep 2024 |
Konstantin Olchanski | Forum | Python frontend rate limitations? | > I'm trying to get a sense of the rate limitations of a python frontend.
1) python is single-threaded, for ultimate performance, a MIDAS frontend (or any DAQ
application) has to be multithreaded:
a) thread with busy loop read the data and place it into a FIFO
b) thread to read data from FIFO and send it to SYSTEM buffer shared memory or to
mserver
c) thread to respond to begin-run, end-run, etc RPCs
d) probably a thread to recycle memory from thread (b) back to thread (a) if per-event
malloc()/free() adds too much overhead
2) data readout. C++ AXI bus access is compiled into 1 instruction and results in 1 AXI
bus operation. comparable for python likely has much more overhead, slows you down.
3) event bank filling. C++ for() loop is compiled into very compact machine code,
python loop cannot because each array element can be random data type, shows you down.
bottom line, there is a reason high speed data acquisitions are written in C/C++, not
in shell, perl, tcl/tk, or (today's favourite) python.
> The C++ frontend is about 100 times faster in both data and event rates.
This is as expected. You can probably improve python code to get closer to 10 times
slower than C++. But consider:
a) will it be "fast enough" for the task?
b) learning C++ and optimizing python to within "2-3-10x slower than C++" may involve a
similar amount of time and effort.
And you have not looked at the real-time properties of your frontend. You may discover
that it's actually faster than you think, but occasionally stops for a millisecond (or
two or hundred). some applications a notorious for running memory garbage collection
just at the wrong time.
I am working right now on exactly this problem, I have a 1 GHz ARM CPU (Cyclone-V FPGA)
and I need to push data out at 100 Mbytes/sec while avoiding and bad-real-time dropouts
that cause the FPGA data FIFO to overflow. And I only have 2 CPU cores, 1 to read the
FPGA FIFO, 1 to run the TCP/IP stack and the ethernet driver. No this can be done with
python.
K.O. |
2832
|
11 Sep 2024 |
Konstantin Olchanski | Forum | Python frontend rate limitations? | >
> poll(INT count) {
> for (i=0 ; i<count ; i++)
> if (new_event())
> return TRUE;
> return FALSE;
> }
in the c++ frontend (tmfe.h) this loop usually runs in a separate thread, and I am now working on the linux magic to assign this thread maximum
uninterruptible priority. otherwise on my Cyclone-V FPGA SoC I see 1-10 msec dropouts, I think from taking ethernet interrupts.
K.O. |
2833
|
11 Sep 2024 |
Konstantin Olchanski | Forum | Python frontend rate limitations? | > > I'm trying to get a sense of the rate limitations of a python frontend.
forgot one more:
c++ toolchain comes with extensive profiler tools aimed to answer the question "why is my
program so slow, where is it spending all the time?". some of these tools go all the way to
the hardware level and report CPU cache misses, TLB flushes, context switches and any other
hardware events that interrupt or slow down computations. programmer than uses this
information to restructure the code to avoid the worst slow downs (i.e. avoid branch mis-
predictions, avoid cache misses, etc).
I doubt the python toolchain will ever profiler tools as good.
K.O. |
2838
|
11 Sep 2024 |
Konstantin Olchanski | Forum | "Safe" abort of sequencer scripts | > We often use the MIDAS sequencer to temporarily control detector settings, such as:
>
> * <change some setting>
> * WAIT 60 seconds
> * <revert setting to original value>
>
> The question arises of what happens if the sequencer scripts gets aborted during that wait, preventing the value from being reset.
Common problem. Go have an elegant solution using the "defer" keyword.
https://go.dev/tour/flowcontrol/12
K.O. |
2860
|
27 Sep 2024 |
Ben Smith | Forum | Python frontend rate limitations? | > in your case the limitation is likely that you're sending 1.25MB per event and we have a lot of data marshalling to do between the python and C++ layer.
>
> I'll add it to my to-do list to investigate improving the performance of medium-to-large events in the python code.
I've now added better support for numpy arrays in the python code that encodes a `midas.event.Event` object. If you use the "correct" numpy data type then you can get vastly improved performance as numpy already stores the data in memory in the format that we need.
In your example, if you change
self.zero_buffer = [0] * self.total_data_size
to
self.zero_buffer = np.ndarray(self.total_data_size, np.int16)
then the max data rate of the frontend goes from 330MB/s to 7600MB/s on my laptop (a factor 20 improvement from one line of code!)
To ensure you're using the optimal numpy dtype for your bank, you can reference a dict called `midas.tid_np_formats`. For example `midas.tid_np_formats[midas.TID_SHORT]` is equivalent to `np.int16`. If you use an int16 array and write it as a TID_SHORT bank, then we'll use the fast path. If there is a mismatch, we'll have to do type conversions and will end up on the slow path. |
2885
|
05 Nov 2024 |
Jack Carlton | Forum | How to properly write a client listens for events on a given buffer? | If there's some template for writing a client to access event data, that would be
very useful (and you can probably just ignore the context I gave below in that
case).
Some context:
Quite a while ago, I wrote the attached "data pipeline" client whose job was to
listen for events, copy their data, and pipe them to a python script. I believe I
just stole bits and pieces from mdump.cxx to accomplish this. Later I wrote the
attached wrapper class "MidasConnector.cpp" and a main.cpp to generalize
data_pipeline.cxx a bit. There were a lot of iterations to the code where I had the
below problems; so don't take the logic in the attached code as the exact code that
caused the issues below.
However, I'm unable to resolve a couple issues:
1. If a timeout is set, everything will work until that timeout is reached. Then
regardless of what kind of logic I tried to implement (retry receiving event,
disconnect and reconnect client, etc.) the client would refuse to receive more data.
2. When I ctrl-C main, it hangs; this is expected because it's stuck in a while
loop. But because I can't set a timeout I have to ctrl-C twice; this would
occasionally corrupt the ODB which was not ideal. I was able to get around this with
some impractical solution involving ncurses I believe.
Thanks,
Jack |
2886
|
05 Nov 2024 |
Maia Henriksson-Ward | Forum | How to properly write a client listens for events on a given buffer? | > If there's some template for writing a client to access event data, that would be
> very useful (and you can probably just ignore the context I gave below in that
> case).
>
>
> Some context:
>
> Quite a while ago, I wrote the attached "data pipeline" client whose job was to
> listen for events, copy their data, and pipe them to a python script. I believe I
> just stole bits and pieces from mdump.cxx to accomplish this. Later I wrote the
> attached wrapper class "MidasConnector.cpp" and a main.cpp to generalize
> data_pipeline.cxx a bit. There were a lot of iterations to the code where I had the
> below problems; so don't take the logic in the attached code as the exact code that
> caused the issues below.
>
> However, I'm unable to resolve a couple issues:
>
> 1. If a timeout is set, everything will work until that timeout is reached. Then
> regardless of what kind of logic I tried to implement (retry receiving event,
> disconnect and reconnect client, etc.) the client would refuse to receive more data.
>
> 2. When I ctrl-C main, it hangs; this is expected because it's stuck in a while
> loop. But because I can't set a timeout I have to ctrl-C twice; this would
> occasionally corrupt the ODB which was not ideal. I was able to get around this with
> some impractical solution involving ncurses I believe.
>
>
> Thanks,
> Jack
midas/examples/lowlevel/consume.cxx might be what you're looking for, but I think all
you're missing is a call to cm_yield() in your loop, so your midas client doesn't get
killed when the timeout is reached (and also so you can act on shutdown requests from
midas)
Something like
int status = cm_yield(100);
if (status == SS_ABORT || status == RPC_SHUTDOWN)
break;
There might be a recommended way to handle the ctrl-c and disconnect from the ODB, but
off the top of my head I don't remember it.
Also check out Ben's new(ish) python library, midas/python/examples/event_receiver.py
might be a much easier solution. And you can use the context manager, which will take
care of safely disconnecting from midas after you ctrl-C. |
2914
|
02 Dec 2024 |
Stefan Ritt | Forum | "Safe" abort of sequencer scripts | The atexit() function has been implemented in the current develop branch of midas, see
https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#ATEXIT_subroutine
Stefan |
2926
|
29 Dec 2024 |
Pavel Murat | Forum | time ordering of run transition calls to TMFeEquipment things | Dear MIDAS experts,
I have a question about "tmfe approach" to implementing MIDAS frontends. If I read the code correctly,
within this approach it is the TMFeEquipment things, not the TMFrontend's themselves,
which handle the run transitions - the TMFrontend class
https://bitbucket.org/tmidas/midas/src/423082fb67c7711813fcda61f7cd03784c398f49/include/tmfe.h#lines-306:378
simply doesn't have methods to handle those directly.
So how does a user control the sequence in which TMFeEquipment::HandleBeginRun functions of different
TMFeEquipment pieces are called at begin run? - there are two cases to consider: TMFeEquipment things
defined by the same TMFrontend and by different TMFrontend's.
Many thanks and happy holidays to everyone!
-- regards, Pasha
|
2927
|
01 Jan 2025 |
Konstantin Olchanski | Forum | time ordering of run transition calls to TMFeEquipment things | > I have a question about "tmfe approach" to implementing MIDAS frontends. If I read the code correctly,
> within this approach it is the TMFeEquipment things, not the TMFrontend's themselves,
> which handle the run transitions - the TMFrontend class
that's correct and it is documented so in https://bitbucket.org/tmidas/midas/src/develop/tmfe.md
> So how does a user control the sequence in which TMFeEquipment::HandleBeginRun functions of different
> TMFeEquipment pieces are called at begin run? - there are two cases to consider: TMFeEquipment things
> defined by the same TMFrontend and by different TMFrontend's.
I am not sure what you are trying to do. It is always easier to suggest a solution to a specific problem.
But I will try to answer anyway:
1) "time ordering of run transitions" - of course midas transitions are ordered by transition sequence numbers
and the tmfe class provides methods to control this. ditto for the mfe.cxx frontends.
2) for one TMFrontend, the order of calling HandleBeginRun() is the order in which equipments were added to the
equipment using FeAddEquipment(). HandleEndRun() is called in reverse order. (I better check this).
3) to have multiple TMFrontends in one program would be unusual (mfe.cxx frontends completely do not support
this), but should work. Everything was coded to support this, but it was never tested in practice because we
cannot invent any useful use-case for it. HandleBeginRun() handlers are likely to be called in the frontends are
created. (I could check this and confirm it works, as long as you have a valid use-case for this configuration).
4) Frontend X has EquipmentA and EquipmentB, you want EqA::HandleBeginRun() to be called at run transition 200
and EqB::HandleBeginRun() to be called at run transition 400.
This is not directly supported by mfe.cxx frontends (the begin_run() handler is a global function) and I did not
directly implement it in the TMFE frontend.
But I think this would be a useful improvement. I will look into this.
Likely I will add per-equipment data members fEqConfBeginRunSeqNo, fEqConfEndRunSeqno, etc. Value 0 would
unregister the corresponding run transition handler. This would cleanup the code quite a bit, a bunch
of RegisterTranstionXXX functions could go away.
K.O. |
|