In MEG II we also kind of achieved this rate. Marco F. will post an entry soon to describe the details. There is only one thing
I want to mention, which is our network switch. Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade
switch. We have "rack switches", which are collector switch for each rack receiving up to 10 x 1GBit inputs, and outputting 1 x
10 GBit to an "aggregation switch", which collects all 10 GBit lines form rack switches and forwards it with (currently a single
) 10 GBit line. For the rack switch we use a
MikroTik CRS354-48G-4S+2Q+RM 54 port
and for the aggregation switch
MikroTik CRS326-24S-2Q+RM 26 Port
both cost in the order of 500 US$. We were astonished that they don't loose UDP packets when all inputs send a packet at the
same time, and they have to pipe them to the single output one after the other, but apparently the switch have enough buffers
(which is usually NOT written in the data sheets).
To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is
completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the
details.
Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your
bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend
attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk. I got
Event size: 7 MB
No logging: 900 events/s = 6.7 GBytes/s
Logging with LZ4 compression: 155 events/s = 1.2 GBytes/s
Logging without compression: 170 events/s = 1.3 GBytes/s
So with this simple approach I got already more than 1 GByte of "dummy data" through midas, indicating that the buffer
management is not so bad. I did use the plain mfe.c frontend framework, no bm_send_event_sg() (but mfe.c uses rpc_send_event() which is an
optimized version of bm_send_event()).
Best,
Stefan |