Back Midas Rome Roody Rootana
  Midas DAQ System, Page 125 of 146  Not logged in ELOG logo
New entries since:Wed Dec 31 16:00:00 1969
    Reply  05 Sep 2024, Jack Carlton, Forum, Python frontend rate limitations? 
Thank you, this was very helpful.

> First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. You can set it to 0 and we'll call it as often as possible. You can set this in the ODB at "/Equipment/Python Data Simulator/Common/Period"

Thanks, I thought that was just for periodic triggering (or at least that's how I've used it in C++ frontends). Changing this allowed me to get past the 100Hz event rate cap I described.

> If that's still not fast enough, then you can return a *list* of events from your readout_func. I've seen real-world cases of 25kHz+ of midas events generated in this fashion.
> 
> 
> However in your case the limitation is likely that you're sending 1.25MB per event and we have a lot of data marshalling to do between the python and C++ layer. In particular it takes 15ms on my machine to just pack the data into a memory buffer (see timeit command below). I am sure there must be a faster way to do this packing, especially in the case where the bank contains a numpy array rather than a python list.
> 
> I'll add it to my to-do list to investigate improving the performance of medium-to-large events in the python code.
> 
> 
> Cheers,
> Ben


> P.S. You may have a bug in your calculations (depending on how you did your testing). In poll_func I think you should be updating the stats every time the function is called, not just the times when you return True.
I had tested the way you described at first, then later changed

> P.P.S. Command I used to test how slow it is to pack the data. One-time setup of creating the buffers, then multiple tests of the pack_into function:
> 
> python -m timeit -s "import struct;import ctypes;arr = [0]*1250001;buf = ctypes.create_string_buffer(10000000);fmt = \">1250000d\"" "struct.pack_into(fmt, buf, *arr)"
> 20 loops, best of 5: 15.3 msec per loop
    Reply  06 Sep 2024, Jack Carlton, Forum, Python frontend rate limitations? 
Thanks for the responses, they were very helpful.

>First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. You can set it to 0 and we'll 
call it as often as possible.

Thanks, this solves the event rate limitation I described. I didn't think to change this because the "period" did not affect the observed rate in C (and now 
I know why thanks to Stefan).

A couple more questions:

1. 
For me, 
python -m timeit -s "import struct;import ctypes;arr = [0]*1250001;buf = ctypes.create_string_buffer(10000000);fmt = \">1250000d\"" "struct.pack_into(fmt, 
buf, *arr)"
10 loops, best of 3: 43.7 msec per loop

which suggests my maximum data rate is about 1.25 MB * 1000/43.7 Hz = 23 MB/s (?). But I see data rates up to 60 MB/s with a python frontend. Am I 
misinterpreting the meaning of this result?


2. I can effectively bypass the rate limitations in python by running two concurrent frontends. For example, with one python frontend at best I can generate 
60 MB/s of data (setting "period" to 0 now); but with two frontends I can double this to 120 MB/s. This implies one python frontend is not bottlenecked by 
hardware limitations in my case.

Am I doing something wrong to artificially bottleneck my frontends? Perhaps there's a multi-threading solution I can implement to avoid needing multiple 
frontends?


Thanks,
Jack
Entry  05 Nov 2024, Jack Carlton, Forum, How to properly write a client listens for events on a given buffer? data_pipeline_(2).cxxMidasConnector.cppmain.cpp
If there's some template for writing a client to access event data, that would be 
very useful (and you can probably just ignore the context I gave below in that 
case).


Some context:

Quite a while ago, I wrote the attached "data pipeline" client whose job was to 
listen for events, copy their data, and pipe them to a python script. I believe I 
just stole bits and pieces from mdump.cxx to accomplish this. Later I wrote the 
attached wrapper class "MidasConnector.cpp" and a main.cpp to generalize
data_pipeline.cxx a bit. There were a lot of iterations to the code where I had the 
below problems; so don't take the logic in the attached code as the exact code that 
caused the issues below.

However, I'm unable to resolve a couple issues:

1. If a timeout is set, everything will work until that timeout is reached. Then 
regardless of what kind of logic I tried to implement (retry receiving event, 
disconnect and reconnect client, etc.) the client would refuse to receive more data.

2. When I ctrl-C main, it hangs; this is expected because it's stuck in a while 
loop. But because I can't set a timeout I have to ctrl-C twice; this would 
occasionally corrupt the ODB which was not ideal. I was able to get around this with 
some impractical solution involving ncurses I believe.


Thanks,
Jack
Entry  23 Mar 2020, Ivo Schulthess, Forum, Save data to FTP 
Dear all
I try to save data to an FTP server but don't get any data on the server. Midas does not complain or message any error but also nothing gets saved. Does somebody have experience with this? I use the following settings for the ODB mlogger channel settings: Type: FTP, Filename: server.com, 21, user, pw, ., run%06d.mid, Format: MIDAS, Output: FILE. What would be the Output: FTP setting for? I tried this but it does not work at all. 
Thanks in advance,
Ivo
    Reply  24 Mar 2020, Ivo Schulthess, Forum, Save data to FTP 
> > I try to save data to an FTP server but don't get any data on the server. Midas does not complain or message any error but also nothing gets saved. Does somebody have experience with this? I use the following settings for the ODB mlogger channel settings: Type: FTP, Filename: server.com, 21, user, pw, ., run%06d.mid, Format: MIDAS, Output: FILE. What would be the Output: FTP setting for? I tried this but it does not work at all. 
> 
> Hi, Ivo, good to hear from a midas user in these difficult times.
> 
> We do not use FTP at TRIUMF, but Stefan asked us to keep FTP alive and working, so we should be able
> to get you going. I will try to find the FTP instructions for you, I am pretty sure I have them somewhere.
> 
> In the mean time, I am very curious why you are using a FTP to record data, is it some kind
> of data appliance where simplest input for data is FTP? Using NFS does not work or is too hard?
> 
> Also for example at CERN, we write data to Castor and EOS, for this mlogger writes data to local disk,
> then the lazylogger runs a script to move the data to Castor and EOS. The example lazylogger
> scripts for this are in the MIDAS "progs" directory. But maybe you do not have a local disk and this would
> not work for you.
> 
> In other news, I hope to work on mlogger and lazylogger support for cloud storage (swift and s3 apis?),
> would that be useful as replacement for FTP?
> 
> K.O.
>
Good Morning Konstantin

Thanks for the fast reply.  Yes, it is, Midas is one of the things we can at least improve from home. 

Our experiment is planned to measure (soon) at ILL. Now since we don't use the equipment/detector from the 
beamline but our own, all the data from Midas is saved on the local drive. This is fine in the first instance
but then we also need proper backup. Since our experiment is quite small, the easiest solution I came up with
is to copy all of our data to the ILL storage which has enough space and is properly backed up. The ILL data 
storage allows only SFTP connections, nothing else. Since Midas has the FTP feature, having a separate FTP 
logger channel seemed the easiest way to go. 

Thanks for your input, I will look into how to mount SFTP and then this would also be a solution. 

Since ILL only provides access via SFTP and everything else is not existent or blocked (not even ssh is possible),
this is the only thing we can work with by now. 

Best regards,
Ivo
    Reply  24 Mar 2020, Ivo Schulthess, Forum, Save data to FTP 
> Logging directly from the midas logger to FTP is a bit cumbersome. In case of delays during login etc. this can throttle the whole DAQ chain. 
> What we use in our lab is to write to local disk, then use the lazylogger (https://midas.triumf.ca/MidasWiki/index.php/Lazylogger) to copy the 
> local files to a remote FTP server. This way we de-couple data taking from backup, making the system much more swift.
> 
> Best,
> Stefan

Yes, see this now too. I will, therefore, try to set up the lazylogger properly. 
Entry  07 Apr 2020, Ivo Schulthess, Suggestion, Sequencer loop break 
I am using the Midas sequencer to run subsequent measurements in a loop, without 
knowing how many iterations in advance. Therefore, I am using the "infinity" 
option. Since I have other commands after the loop, it would be nice to have the 
possibility to break the loop, but let the sequencer then finish the rest of the 
commands. 
Cheers,
Ivo
    Reply  23 Apr 2020, Ivo Schulthess, Suggestion, Sequencer loop break 
> You can do that with the "GOTO" statement, jumping to the first line after the loop.
> 
> Here is a working example:
> 
> 
> LOOP runs, 5
>      WAIT Seconds 3
>      IF $runs > 2
>          GOTO 7
>      ENDIF
> ENDLOOP
> MESSAGE "Finished", 1
> 
> Best,
> Stefan

Hoi Stefan

Thanks for your answer. As I understand it, this has to be in the sequence script before 
running. So, in the end, it is not different than just saying "LOOP runs, 2" and 
therefore the number of runs has do be known in advance as well. Or is there an option to 
change the script on runtime? What I would like, is to start a sequence with "LOOP runs, 
infinite" and when I come back to the experiment after falling asleep being able to break 
the loop after the next iteration, but still execute everything after ENDLOOP, i.e. the 
MESSAGE statement in your example. Because if I do a "Stop after current run", this seems 
not to happen. 

Best, Ivo
Entry  10 Jun 2020, Ivo Schulthess, Forum, slow-control equipment crashes when running multi-threaded on a remote machine 
Dear all

To reduce the time needed by Midas between runs, we want to change some of our periodic equipment to multi-threaded slow-control equipment. To do that I wanted to start from 
the slowcont with the multi/hv class driver and the nulldev device driver and null bus driver. The example runs fine as it is on the local midas machine and also on remote 
machines. When adding the DF_MULTITHREAD flag to the device driver list, it does not run anymore on remote machines but aborts with the following assertion:

scfe: /home/neutron/packages/midas/src/midas.cxx:1569: INT cm_get_path(char*, int): Assertion `_path_name.length() > 0' failed.

Running the frontend with GDB and set a breakpoint at the exit leads to the following: 

(gdb) where
#0  0x00007ffff68d599f in raise () from /lib64/libc.so.6
#1  0x00007ffff68bfcf5 in abort () from /lib64/libc.so.6
#2  0x00007ffff68bfbc9 in __assert_fail_base.cold.0 () from /lib64/libc.so.6
#3  0x00007ffff68cde56 in __assert_fail () from /lib64/libc.so.6
#4  0x000000000041efbf in cm_get_path (path=0x7fffffffd060 "P\373g", path_size=256)
    at /home/neutron/packages/midas/src/midas.cxx:1563
#5  cm_get_path (path=path@entry=0x7fffffffd060 "P\373g", path_size=path_size@entry=256)
    at /home/neutron/packages/midas/src/midas.cxx:1563
#6  0x0000000000453dd8 in ss_semaphore_create (name=name@entry=0x7fffffffd2c0 "DD_Input", 
    semaphore_handle=semaphore_handle@entry=0x67f700 <multi_driver+96>)
    at /home/neutron/packages/midas/src/system.cxx:2340
#7  0x0000000000451d25 in device_driver (device_drv=0x67f6a0 <multi_driver>, cmd=<optimized out>)
    at /home/neutron/packages/midas/src/device_driver.cxx:155
#8  0x00000000004175f8 in multi_init(eqpmnt*) ()
#9  0x00000000004185c8 in cd_multi(int, eqpmnt*) ()
#10 0x000000000041c20c in initialize_equipment () at /home/neutron/packages/midas/src/mfe.cxx:827
#11 0x000000000040da60 in main (argc=1, argv=0x7fffffffda48)
    at /home/neutron/packages/midas/src/mfe.cxx:2757

I also tried to use the generic class driver which results in the same. I am not sure if this is a problem of the multi-threaded frontend running on a remote machine or is it 
something of our system which is not properly set up. Anyway I am running out of ideas how to solve this and would appreciate any input. 

Thanks in advance,
Ivo
    Reply  12 Jun 2020, Ivo Schulthess, Forum, slow-control equipment crashes when running multi-threaded on a remote machine 
Thanks you two once again for the very fast answers. I tested the example on the local machine and it works perfectly fine. In the meantime I also created two new drivers for our devices 
and everything works with them, the improvement in time is significant and I will create drivers for all our devices where possible. If they are in a working state I can also provide 
them to add to the Midas drivers. Of course if it would be possible to run the front-end also on our remote machines this would be even better. I am not experienced in any multi-threaded 
programming but if I can provide any help or input, please let me know. 

Have a great weekend,
Ivo
Entry  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
Dear all

We just started our beam time at ILL and just found yesterday that for certain 
settings of our detector the data is not saved into the .mid files. Running "mdump 
-l 10" online we see the data coming in as they should. Nevertheless, if we run 
"mdump -x runXXXXXX.mid" offline, the data file has no events and the banks are 
missing. Any ideas where the data could go lost?

Thanks in advance,
Ivo
    Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> > Dear all
> > 
> > We just started our beam time at ILL and just found yesterday that for certain 
> > settings of our detector the data is not saved into the .mid files. Running "mdump 
> > -l 10" online we see the data coming in as they should. Nevertheless, if we run 
> > "mdump -x runXXXXXX.mid" offline, the data file has no events and the banks are 
> > missing. Any ideas where the data could go lost?
> > 
> > Thanks in advance,
> > Ivo
> 
> Have you checked 
> 
> /Logger/Channels/0/Settings/Event ID = -1
> /Logger/Channels/0/Settings/Trigger mask = -1
> 
> If these settings are not -1, they filter the data stream for certain events and trigger 
> masks.
> 
> Stefan

Good morning Stefan

Both set to -1. We only have one logging channel. If we run a sequence with a few runs and the 
same settings, sometimes data is in the .mid file and sometimes it is not.

Best,
Ivo 
    Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> Then I'm running out of ideas. Things I would check:
> 
> - Are the file sizes about the same? 
> 
> - When you dump the .mid file, you do you see your bank names? 
> 
> This would tell you if the events are really missing or if mdump would just not find them.
> 
> But I guess without being able to debug the system at ILL I cannot be of any more help. You are the 
> first one reporting such a problem, so it must have to do with your local setup.
> 
> Stefan

So I did a quick check. The file size is about the same (322K and 329K). When I dump the .mid I don't see 
the banks. It only prints two lines with "------ Event# 0 ------" and "------ Event# 1 ------" whereas for 
the file with data I get the two banks with all the data. Our online analyzer also fails to see the banks. 
Is there another way to check what is in the .mid file?

Best,
Ivo
    Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> with "dump" I meant a true object dump like "hexdump -C run000001.mid". I produced a file with ADC0 and TDC0 
> banks (that's the example from the distribution under exampels/experiments/frontend.cxx), and I get
> 
> ....
> 00024220  01 00 00 00 41 44 43 30  04 00 08 00 eb 06 35 04  |....ADC0......5.|
> 00024230  31 09 4f 06 54 44 43 30  04 00 08 00 93 04 fb 07  |1.O.TDC0........|
> 00024240  5c 09 88 0b 01 00 00 00  01 00 00 00 2a 0b 31 5f  |\...........*.1_|
> 00024250  28 00 00 00 20 00 00 00  01 00 00 00 41 44 43 30  |(... .......ADC0|
> 00024260  04 00 08 00 c3 09 24 05  85 05 f3 06 54 44 43 30  |......$.....TDC0|
> 00024270  04 00 08 00 88 08 2d 03  3b 0d d6 02 01 00 00 00  |......-.;.......|
> 00024280  02 00 00 00 2a 0b 31 5f  28 00 00 00 20 00 00 00  |....*.1_(... ...|
> 00024290  01 00 00 00 41 44 43 30  04 00 08 00 a5 0a 69 09  |....ADC0......i.|
> 
> where you clearly see the ADC0 and TDC0 banks.
> 
> Stefan

So at least I learned something new. I tried it with the hexdump and the banks are not existent in the .mid file. I 
only have the ODB inside the file. The 7K difference in size is actually just about what I expect to be the data 
(1792 x 4 bytes)

Best, Ivo
    Reply  10 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> Have you tried longer files? Maybe a few 100 MB or so. Maybe a buffer is not flushed correctly at the end of a run.

Yes, I did. This 7 KB of the data bank is about the limit. If we go only 1 KB higher it seems that we save all data. In 
our specific case, this is the number of time bins (256 pixels with 7 time bins results in data loss, with 8 time bins it 
seems to be okay, data type is DWORD). 

Of course, a workaround for us is to save at least 8 time bins and throw 7 of them away later on. Nevertheless, since we 
are only in the commissioning phase now this is okay, I would just like to avoid data loss in the data taking phase of the 
experiment so knowing where the problem origins could help. 

I did another test with another FE running that produces a lot of data. The behavior is the same though. If the bank size 
is less than about 8 KB, the bank is not saved anymore. But probably this is anyway the expected behavior since it is a 
different FE that produces the data. 

So if it is coming from the buffer, is there something I could change to test or solve the problem?

Best, Ivo
    Reply  11 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid 
> It would be good to pin point there the data is lost. This is the sequence:
> 
> frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk
> 
> To see if correct data arrives to the SYSTEM buffer, run:
> mdump -z SYSTEM
> 
> To see if mlogger is receiving events from the SYSTEM buffer, run:
> mlogger -v ### mlogger should report all events, history and data
> 
> To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).
> 
> I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
> if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
> mlogger is misconfigured (but you already checked that).
> 
> If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
> to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).
> 
> K.O.

Good evening

I tried to reproduce the behavior in a very simple FE but it did not work out. The next thing for me would be to take the FE that is producing this behavior, replace all the device communication and data with dummies. If the problem is still there I would start to simplify as much as possible. 

Following the inputs of KO, I pin-pointed the data loss. The system buffer still gets the data but the mlogger does not write the data event. Then of course the data is also not anymore present in the data file. Therefore, I checked the logger settings again, Event ID and Trigger Mask still -1. Nothing else, at least from my point of view, that is misconfigured. Nevertheless, if it helps I can send my ODB settings. 

When doing the tests just before I found something else that probably can give a hint to the problem. The data is only lost if the time between two runs is long (a few seconds). As an example: If I run a sequence with a loop and after the FE stops the run the loop ends and the next run is started automatically, then only the first run has no data, which is the one after a longer time of no data taking. When I add a "WAIT Seconds 5" after the run before starting the next, not data is written to the disk for any run. I also found this once when adding a sleep(1) at the end of the FE readout function but back then did not think about it any further. 

Best, Ivo
    Reply  23 Mar 2022, Ivo Schulthess, Bug Fix, fix for event buffer corruption in bm_flush_cache() 
Thanks for the investigation. Back in 2020, we had some issues of losing data between the system buffer and the logger writing them to disk (https://daq00.triumf.ca/elog-midas/Midas/1966). This was polled equipment but we had a multithreaded FE running at the same time. Could this be related to the same problem?

Best, Ivo
Entry  13 Nov 2023, Ivo Schulthess, Forum, mlogger does not HAVE_ROOT 
Good evening, 

I am setting up Midas (v2.1) for a new experiment. We want to save the data in the ROOT format. We installed ROOT from source (v6.28/06), and ROOTSYS is set. When we compile Midas, it says that it found ROOT. We set up a second logger channel where we set the filename to run%05d.root, the format to ROOT, and the output to ROOT. Nevertheless, when starting a run, the logger writes the error that "channel '1' requested ROOT output, but mlogger is built without HAVE_ROOT". From the CMake file, I would assume that it is set automatically if ROOT is found. Do you have any idea why the mlogger does not find ROOT or save the data in the ROOT format?

Thanks in advance for your ideas, input, and help. 

Cheers,
Ivo
    Reply  14 Nov 2023, Ivo Schulthess, Forum, mlogger does not HAVE_ROOT 
> Stefan is right. I forgot this. As solution to our troubles, mlogger is built without root support. use rmlogger instead.
> 
> K.O.

Thanks, Stefan and Konstantin, for your feedback. 

So I checked the cmake file, and already the existence of the rmlogger shows that HAVE_ROOT was set. It was really only the problem of not being aware of rmlogger. This now works, and it produces root files that are readable. 

However, we encountered a new problem not that it does not find a bank that is produced by a multi-threaded slow-control frontend. The logger triggers the error "mlogger.cxx:3328:root_book_bank,ERROR] received unknown bank 'MSRD' in event #8". After this, we get a segmentation violation, but I guess this is then coming from the error. If we run only the polled FE, it works fine. If we run the polled and the multi-threaded FE with only the logger saving mid files, it works fine as well. Are you aware of issues with multi-threaded slow-control frontends and saving their banks in root format?

Cheers,
Ivo
    Reply  15 Nov 2023, Ivo Schulthess, Forum, mlogger does not HAVE_ROOT 
> No, I'm not aware of this problem, but I suspect that your events somehow got corrupted. You can try the mdump utility
> or the "Event Dump" web page to peek into your events, maybe you see an issue there. To give you more detailed information,
> I would have to reproduce your problem, which is probably hard without your hardware.
> 
> Stefan

Hi Stefan,

So I did a few things:
- I checked with mdump online, the data stream looks good, and I can see the bank name properly
- I checked with mdump offline the .mid files, the banks are there, and the data look good
- I removed the creating of the bank MSRD in the class driver. This stopped the writing of the data in the midas/root file but kept the stream to the history files. In principle, this is a quick and dirty fix because we still have all the data in the history files. Do you see any bigger problem with that solution?
- I tried to run the multi-threaded slow-control frontend with the generic class driver (generic.cxx) and the nulldev device driver (nulldev.cxx). This produces the DMND and MSRD bank with and also produces the error in the with the logger when trying to save in the root format (received unknown bank "DMND" in event #8). This means it is not related to the devices (maybe some other part of my user code of course). 

Cheers,
Ivo
ELOG V3.1.4-2e1708b5