Message ID: 815     Entry time: 25 Jun 2012     In reply to: 814
Author: Konstantin Olchanski 
Topic: Info 
Subject: midas vme benchmarks 
> > P.S. Observe the ever present unexplained event rate fluctuations between 130-140 event/sec.
> 
> An important aspect of optimizing your system is to keep the network traffic under control. I use GBit Ethernet between FE and BE, and make sure the switch 
> can accommodate all accumulated network traffic through its backplane. This way I do not have any TCP retransmits, which kill you. If a single low-level 
> Ethernet packet is lost due to a collision, the TCP stack retransmits it. Depending on the local settings, this can be after a timeout of one (!) second, which 
> already punches a hole in your data rate. On the MSCB system I actually use UDP packets, where I schedule the retransmits myself. For a LAN, a 10-100 ms timeout 
> is enough there. The one second timeout is optimized for a WAN (like between two continents), where it is fine, but it is not what you want on a LAN system. Also 
> make sure that the outgoing traffic (lazylogger) uses a different network card than the incoming traffic. I found that this also helps a lot.
> 
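
(For illustration, here is a minimal sketch of the kind of application-level UDP retransmission described
above, with a LAN-scale timeout instead of the ~1 second TCP default. This is a generic example, not the
actual MSCB or MIDAS code; the address, port and payload are placeholders.)

/*
 * Sketch: UDP request/response with application-controlled retransmission.
 * The retransmit timeout is tens of milliseconds, suitable for a LAN.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <sys/time.h>

#define RETRANS_TIMEOUT_MS 50   /* LAN-scale timeout, not the ~1 s TCP default */
#define MAX_RETRIES        5

/* Send "req" and wait for a reply; retransmit on timeout. Returns reply length or -1. */
static ssize_t udp_request(int sock, const struct sockaddr_in *dst,
                           const void *req, size_t req_len,
                           void *reply, size_t reply_max)
{
   for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
      if (sendto(sock, req, req_len, 0,
                 (const struct sockaddr *)dst, sizeof(*dst)) < 0)
         return -1;

      struct timeval tv = { 0, RETRANS_TIMEOUT_MS * 1000 };
      fd_set rfds;
      FD_ZERO(&rfds);
      FD_SET(sock, &rfds);

      int n = select(sock + 1, &rfds, NULL, NULL, &tv);
      if (n > 0)
         return recv(sock, reply, reply_max, 0);   /* got an answer */
      if (n < 0)
         return -1;                                /* select() error */
      /* n == 0: timeout expired, loop around and retransmit */
   }
   return -1;   /* gave up after MAX_RETRIES attempts */
}

int main(void)
{
   int sock = socket(AF_INET, SOCK_DGRAM, 0);
   if (sock < 0) { perror("socket"); return 1; }

   struct sockaddr_in dst;
   memset(&dst, 0, sizeof(dst));
   dst.sin_family = AF_INET;
   dst.sin_port   = htons(1177);                      /* placeholder port */
   inet_pton(AF_INET, "192.168.1.10", &dst.sin_addr); /* placeholder address */

   char reply[1500];
   ssize_t len = udp_request(sock, &dst, "ping", 4, reply, sizeof(reply));
   printf("reply length: %zd\n", len);

   close(sock);
   return 0;
}

The point is simply that with UDP the retransmission timeout is under application control and can be
set to tens of milliseconds on a LAN, instead of waiting for the TCP stack.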

In typical applications at TRIUMF we do not set up a private network for the data traffic - data from VME to the backend computer
and data from the backend computer to DCACHE all go through the TRIUMF network.

This is justified by the required data rates - the highest data rate experiment running right now is PIENU, which runs
at about 10 M/s sustained, nominally April through December. (This is 20% of the data rate of the present benchmark.)

The next highest data rate experiment is T2K/ND280 in Japan running at about 20 M/s (neutrino beam, data rate
is dominated by calibration events).

All other experiments at TRIUMF run at lower data rates (low intensity light ion beams), but we are planning for an experiment
that will run at 300 M/s sustained over 1 week of scheduled beam time.
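
(For scale, if "M/s" is read as MBytes/sec: 300 MBytes/sec sustained over 1 week of beam is roughly
300e6 bytes/sec x 604800 sec, or about 180 TB of raw data.)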

But we do have the technical capability to separate data traffic from the TRIUMF network - the VME processors and
the backend computers all have dual GigE NICs.

(I did not say so explicitly, but the present benchmark - 50 M/s from VME to the backend and 20-30 M/s from the backend to HDFS - obviously runs over a GigE network.)

(I am not monitoring the TCP loss and retransmit rates at the present time.)
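
(If one wanted to, on a Linux backend the kernel keeps a cumulative count of retransmitted TCP segments
in the "RetransSegs" field of /proc/net/snmp. Below is a minimal sketch that reads it; sampled periodically,
the difference between readings gives the retransmit rate. This is a generic Linux example, not part of MIDAS.)

/* Sketch: print the kernel's cumulative TCP RetransSegs counter (Linux). */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

int main(void)
{
   FILE *f = fopen("/proc/net/snmp", "r");
   if (!f) { perror("/proc/net/snmp"); return 1; }

   char names[1024], values[1024];

   /* The file has pairs of lines: "Tcp: <field names>" then "Tcp: <values>". */
   while (fgets(names, sizeof(names), f)) {
      if (strncmp(names, "Tcp:", 4) != 0)
         continue;
      if (!fgets(values, sizeof(values), f))
         break;

      /* Walk the two lines in parallel and print the RetransSegs column. */
      char *nsave, *vsave;
      char *n = strtok_r(names, " \n", &nsave);
      char *v = strtok_r(values, " \n", &vsave);
      while (n && v) {
         if (strcmp(n, "RetransSegs") == 0)
            printf("TCP segments retransmitted: %s\n", v);
         n = strtok_r(NULL, " \n", &nsave);
         v = strtok_r(NULL, " \n", &vsave);
      }
      break;
   }

   fclose(f);
   return 0;
}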

(The network switch between the VME processors and the backend is a "cheapest available" rack-mountable 8-port GigE switch. The network between
the backend and the HDFS nodes is mostly Nortel 48-port GigE edge switches with single-GigE uplinks to the core router.)

K.O.