28 Feb 2011, Konstantin Olchanski, Info, javascript example experiment
|
I just committed a MIDAS example for using most mhttpd html and javascript functions: ODBGet(),
ODBSet(), ODBRpc() & co.
Please checkout .../midas/examples/javascript1, and follow instructions in the README file.
For your enjoyment, the example html file is attached to this message.
(to use the function ODBRpc_rev1 - the midas rpc call that returns data to the web page - you need
mhttpd svn rev 4994 committed today. Other functions should work with any revision of mhttpd)
svn rev 4992.
K.O. |
20 Jun 2011, Stefan Ritt, Info, Javascript ODB interface revised
|
The Javascript interface to the ODB has been revised. This extends the capabilities of custom web pages requesting data from the ODB. By grouping several request together, the number of round-trips is minimized and the response time is reduced. Following functions are new or extended:
- ODBGet(path[, format]): This functions works now also with subdirectories in the ODB. The command ODBGet('/Runinfo') returns for example:
1
1
1024
0
0
0
Mon Jun 20 09:40:14 2011
1308588014
Mon Jun 20 09:40:46 2011
1308588046
- ODBGetRecord(path), ODBExtractRecord(key): While ODBGet can be used for subdirectories, an easier way is to use ODBGetRecord and ODBExtractRecord. The first function retrieves the subtree (record), while the second one can be used to extract individual items. Here is an example:
result = ODBGetRecord('/Runinfo');
run_number = ODBExtractRecord(result, 'Run number');
start_time = ODBExtractRecord(result, 'Start time');
- ODBMGet(paths[, callback, formats]): This function ("Multi-Get") can be used to obtain ODB values from different paths in one call. The ODB paths have to be supplied in an array, the result is again an array. An optional callback routine might be supplied for asynchronous operation. Optional formats might be supplied if the resulting number should be formatted in a specific way. Here is an example:
var req = new Array();
req[0] = "/Runinfo/Run number";
req[1] = "/Equipment/Trigger/Statistics/Events sent";
var result = ODBMGet(req);
run_number = result[0];
events_sent = result[1];
The new functions are implemented in mhttpd revision 5075. |
21 Jun 2011, Stefan Ritt, Info, New MIDAS sequencer
|
A new sequencer for starting and stopping runs has been implemented. Although it is till kind of in a preliminary phase, it is usable, so I would like to share the syntax with you.
The sequencer runs inside mhttpd, and creates a new ODB subdirectory "/Sequencer". There is a new button on the status page called "Sequencer". In can run scripts in XML format, which reside on the server (where mhttpd is running). The sequencer is stateless, that means even if mhttpd is stopped and restarted, it resumes operation from where it has been stopped. Following statements are implemented:
- <Comment>comment</Comment>
a comment for this XML file, for information only
- <ODBSet path="path">value</ODBSet>
to set a value in the ODB
- <ODBInc path="path">delta</ODBInc>
to increment a value in the ODB
- <RunDescription>Description</RunDescription>
a run description which is stored under /Experiment/Run Parameters/Run Description.
- <Transition>Start | Stop</Transition>
to start or stop a run
- <Loop n="n"> ... </Loop>
to execute a loop n times. For infinite loops, "infinit" can be specified as n
- <Wait for="events | ODBvalue | seconds" [path="ODB path"]>x</Wait>
wait until a number of events is acquired (testing /Equipment/Trigger/Statistics/Events sent), or until a value in the ODB exceeds x, or wait for x seconds.
- <Script [loop_counter="1"]>Script</Script>
to call a script on the server side. Optionally, the loop counter(s) are passed to the script
Attached is a simple script which can be used as a starting point. |
27 Jun 2011, Konstantin Olchanski, Info, updated mhttpd history "export" function
|
The mhttpd history "export" function has been converted to the new midas history
interface and should now work for SQL-based history systems. In the process,
improvements by Eoin Butler (CERN AD-5/ALPHA) were merged - adding a UNIX
timestamp and a better text timestamp. Also now "export" outputs the actual
values from the history file - the scaling values from the definition of the
history plot panel are no longer applied.
Here is an example of the new file format:
Time, Timestamp, Run, Run State, SLOW
2011.06.21 15:45:21, 1308696321, 13292, 3, -89.1007
svn rev 5104
K.O. |
27 Jun 2011, Konstantin Olchanski, Info, midas shared memory changes
|
A number of changes were made to the midas shared memory implementation for
Linux and MacOS:
1) SysV or POSIX shared memory compile-type choice is removed. Both shared
memory types are compiled-in and are selected at run time.
2) the shared memory type used by an experiment is recorded in the file
.SHM_TYPE.TXT. Currently implemented are "POSIXv2_SHM" (the new default for new
experiments), "POSIX_SHM", "MMAP_SHM" and "SYSV_SHM". (see system.c) (MMAP_SHM
is fully functional but is not recommended). The POSIXv2_SHM uses an improved
filename scheme (on Linux, see "ls -l /dev/shm") and permits multiple
experiments to coexist on a MacOS computer (where there is a severe limit on
shared memory filename length).
3) following a number of mishaps where "odbedit" has been run on the wrong
computer (causing havoc with ODB and .xxx.SHM files), for each experiment, the
hostname of the computer where the ODB shared memory is meant to reside is now
recorded in the file .SHM_HOST.TXT. Typically, this is the machine running
mserver, mhttpd and mlogger. If some client is accidentally started on the wrong
machine or if MIDAS_SERVER_HOST is accidentally left undefined, MIDAS will now
print a stern message reporting the hostname mismatch, tell the user to use the
mserver and refuse to run. The user has the choice of starting the client on the
correct computer (as reported in the error message), using the mserver (start
client with -H flag) or edit/delete the .SHM_HOST.TXT file (full pathname is
reported by the error message).
With this update, MIDAS on MacOS becomes fully functional (before, only one
experiment could be used at a time).
svn rev 5105
K.O. |
27 Jun 2011, Konstantin Olchanski, Info, mlogger lock for runNNN.mid.gz files
|
By popular request, Stefan R. implemented a locking scheme for mlogger output files.
To use this function, set the mlogger ODB /Logger/Channels/NNN/Settings/Filename
to ".run%05dsub%05d.mid.gz" (note the leading dot).
In this mode, active output files will have a filename with a leading dot
(.run00001sub00001.mid.gz) while the file is being written to. After the file is
closed, it is renamed and the leading dot is removed.
To use this function with the lazylogger, please set ODB
"/Lazy/Foo/Settings/Filename format" to "run*.mid.gz,run*.xml" (note the leading
text "run"). Set "stay behind" to 0.
svn rev 5080 (or so, checking by Stefan R.)
K.O. |
05 Jul 2011, Konstantin Olchanski, Info, midas shared memory changes
|
> 2) the shared memory type used by an experiment is recorded in the file .SHM_TYPE.TXT.
An error with creating the file .SHM_TYPE.TXT was corrected in system.c svn rev 5125 - if file did not exist, it is
created correctly, but MIDAS reports "cannot connect to ODB". Second try works correctly because the file exists
now.
> 3) the hostname of the computer where the ODB shared memory is meant to reside is now
> recorded in the file .SHM_HOST.TXT.
This is causing problems on mobile computers where "hostname" changes all the time (i.e. set according to
DHCP on whatever network happens to be connected).
If you run into this problem, keep deleting .SHM_HOST.TXT or use this workaround: disable the hostname check
by making the file .SHM_HOST.TXT empty (zero length).
K.O. |
11 Jul 2011, Konstantin Olchanski, Info, Make "STOP" run transition always succeed
|
Over the years, there was some back-and-forth changes in what happens to run transitions when some
of the participants misbehave (do not respond to RPC calls, timeout, crash, etc).
The very original behaviour was to ignore all errors. This resulted in user confusion when some clients
would start, some would not, data from frontends that missed the transition did not arrive, etc.
So it was changed to fail the transition if any client misbehaves.
This left mlogger (who is usually the first one to see the TR_START transition) in a funny state - output
file is open, etc, but there is no run active. This was fixed by adding a TR_STARTABORT transition to tell
mlogger, event builder & co that the just started run did not start after all.
Also at some point code was added to forcefully kill clients that do not respond to run transitions (do
not respond to RPC, timeout, etc).
Recently, it was observed how during unattended overnight operation of a MIDAS DAQ system, with the
logger set to "auto restart", some unnecessary clients misbehave during the run stop transition, and
prevent the run from stopping and restarting. The user comes in the morning and is unhappy that data
taking stopped some time during the night.
midas.c svn rev 5136 changes the TR_STOP transition to always succeed, even if some clients had
transition errors. If these clients are unnecessary for normal operation of the DAQ, the following run
"auto restart" will continue taking data. If those were important clients, data taking will continue the
best it can - it *is* unattended operation - nobody is looking - but users can always setup alarms for
checking that important clients are always running during data taking. (For very important clients, one
can setup alarms to send email, send SMS messages, etc).
K.O. |
30 Jan 2012, Stefan Ritt, Info, IEEE Real Time 2012 Call for Abstracts
|
Hello,
I'm co-organizing the upcoming Real Time Conference, which covers also the field of data acquisition, so it might be interesting for people working
with MIDAS. If you have something to report, you could also consider to send an abstract to this conference. It will be nicely located in Berkeley,
California. We plan excursions to San Francisco and to Napa Valley.
Best regards,
Stefan Ritt
---------------------------
18th Real Time Conference
June 11 – 15, 2012
Berkeley, CA
We invite you to the Hotel Shattuck Plaza in downtown Berkeley, California for
the 2012 Real-Time Conference (RT2012). It will take place Monday, June 11
through Friday, June 15, 2012, with optional pre-conference tutorials Saturday
and Sunday, June 9-10.
Like the previous editions, RT2012 will be a multidisciplinary conference
devoted to the latest developments on realtime techniques in the fields of
plasma and nuclear fusion, particle physics, nuclear physics and astrophysics,
space science, accelerators, medical physics, nuclear power instrumentation and
other radiation instrumentation.
Abstract submission is open as of 18 January (deadline 2 March). Please visit
http://www.npss-confs.org/rtc/welcome.asp?flag=44675.77&Retry=1 to submit an
abstract.
Call for Abstracts
RT 2012 is an interdisciplinary conference on realtime data acquisition and
computing applications in the physical sciences. These applications include:
* High energy physics
* Nuclear physics
* Astrophysics and astroparticle physics
* Nuclear fusion
* Medical physics
* Space instrumentation
* Nuclear power instrumentation
* Realtime security and safety
* General Radiation Instrumentation
Specific topics include (but are certainly not limited to) the list shown below.
We welcome correspondence to see how your research fits our venue.
Key Dates
* Abstract submission opened: January 18, 2012
* Abstract deadline: March 2, 2012
* Program available: April 2
Suggested Topics
* Realtime system architectures
* Intelligent signal processing
* Programmable devices
* Fast data transfer links and networks
* Trigger systems
* Data acquisition
* Processing farms
* Control, monitoring, and test systems
* Upgrades
* Emerging realtime technologies
* New standards
* Realtime safety and security
* Feedback on experiences
Contact Information
If you have a question or wish to opt in for occasional e-mail updates about
RT2012, send us a message at RT2012@lbl.gov. To view full conference
information, visit http://rt2012.lbl.gov/index.html |
20 Jun 2012, Konstantin Olchanski, Info, lazylogger write to HADOOP HDFS
|
I tried using the lazylogger "Disk" method to write into a HADOOP HDFS clustered filesystem and found a
number of problems. I ended up replacing the lazylogger lazy_copy() function that still uses former YBOS
code with a new lazy_disk_copy() function that uses generic fread/fwrite. Also fixed the situation where
lazylogger cannot cleanly stop from the mhttpd "programs/stop" button while it is busy writing (the fix
works only for the "Disk" method).
(Note that one can also use the "Script" method for writing into HDFS)
Anyhow, the new lazylogger writes into HDFS just fine and I expect that it would also work for writing into
DCACHE using PNFS (if ever we get the SL6 PNFS working with our DCACHE servers).
Writing into our test HDFS cluster runs at about 20 MiBytes/sec for 1GB files with replication set to 3.
svn rev 5295
K.O. |
20 Jun 2012, Konstantin Olchanski, Info, midas vme benchmarks   
|
I am recording here the results from a test VME system using two VF48 waveform digitizers and a 64-bit
dual-core VME processor (V7865). VF48 data suppression is off, VF48 modules set to read 48 channels,
1000 ADC samples each. mlogger data compression is enabled (gzip -1).
Event rate is about 200/sec
VME Data rate is about 40 Mbytes/sec
System is 100% busy (estimate)
System utilization of host computer (dual-core 2.2GHz, dual-channel DDR333 RAM):
(note high CPU use by mlogger for gzip compression of midas files)
top - 12:23:45 up 68 days, 20:28, 3 users, load average: 1.39, 1.22, 1.04
Tasks: 193 total, 3 running, 190 sleeping, 0 stopped, 0 zombie
Cpu(s): 32.1%us, 6.2%sy, 0.0%ni, 54.4%id, 2.7%wa, 0.1%hi, 4.5%si, 0.0%st
Mem: 3925556k total, 3797440k used, 128116k free, 1780k buffers
Swap: 32766900k total, 8k used, 32766892k free, 2970224k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5169 trinat 20 0 246m 108m 97m R 64.3 2.8 29:36.86 mlogger
5771 trinat 20 0 119m 98m 97m R 14.9 2.6 139:34.03 mserver
6083 root 20 0 0 0 0 S 2.0 0.0 0:35.85 flush-9:3
1097 root 20 0 0 0 0 S 0.9 0.0 86:06.38 md3_raid1
System utilization of VME processor (dual-core 2.16 GHz, single-channel DDR2 RAM):
(note the more than 100% CPU use of multithreaded fevme)
top - 12:24:49 up 70 days, 19:14, 2 users, load average: 1.19, 1.05, 1.01
Tasks: 103 total, 1 running, 101 sleeping, 1 stopped, 0 zombie
Cpu(s): 6.3%us, 45.1%sy, 0.0%ni, 47.7%id, 0.0%wa, 0.2%hi, 0.6%si, 0.0%st
Mem: 1019436k total, 866672k used, 152764k free, 3576k buffers
Swap: 0k total, 0k used, 0k free, 20976k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19740 trinat 20 0 177m 108m 984 S 104.5 10.9 1229:00 fevme_gef.exe
1172 ganglia 20 0 416m 99m 1652 S 0.7 10.0 1101:59 gmond
32353 olchansk 20 0 19240 1416 1096 R 0.2 0.1 0:00.05 top
146 root 15 -5 0 0 0 S 0.1 0.0 42:52.98 kslowd001
Attached are the CPU and network ganglia plots from lxdaq09 (VME) and ladd02 (host).
The regular bursts of "network out" on ladd02 is lazylogger writing mid.gz files to HADOOP HDFS.
K.O. |
20 Jun 2012, Konstantin Olchanski, Info, midas vme benchmarks
|
> I am recording here the results from a test VME system using two VF48 waveform digitizers
Note 1: data compression is about 89% (hence "data to disk" rate is much smaller than the "data from VME" rate)
Note 2: switch from VME MBLT64 block transfer to 2eVME block transfer:
- raises the VME data rate from 40 to 48 M/s
- event rate from 220/sec to 260/sec
- mlogger CPU use from 64% to about 80%
This is consistent with the measured VME block transfer rates for the VF48 module: MBLT64 is about 40 M/s, 2eVME is about 50 M/s (could be
80 M/s if no clock cycles were lost to sync VME signals with the VF48 clocks), 2eSST is implemented but impossible - VF48 cannot drive the
VME BERR and RETRY signals. Evil standards, grumble, grumble, grumble).
K.O. |
21 Jun 2012, Stefan Ritt, Info, midas vme benchmarks
|
Just for completeness: Attached is the VME transfer speed I get with the SIS3100/SIS1100 interface using
2eVME transfer. This curve can be explained exactly with an overhead of 125 us per DMA transfer and a
continuous link speed of 83 MB/sec. |
21 Jun 2012, Konstantin Olchanski, Info, midas vme benchmarks
|
> Just for completeness: Attached is the VME transfer speed I get with the SIS3100/SIS1100 interface using
> 2eVME transfer. This curve can be explained exactly with an overhead of 125 us per DMA transfer and a
> continuous link speed of 83 MB/sec.
What VME module is on the other end?
K.O. |
22 Jun 2012, Stefan Ritt, Info, midas vme benchmarks
|
> > Just for completeness: Attached is the VME transfer speed I get with the SIS3100/SIS1100 interface using
> > 2eVME transfer. This curve can be explained exactly with an overhead of 125 us per DMA transfer and a
> > continuous link speed of 83 MB/sec.
>
> What VME module is on the other end?
>
> K.O.
The PSI-built DRS4 board, where we implemented the 2eVME protocol in the Virtex II FPGA. The same speed can be obtained with the commercial
VME memory module CI-VME64 from Chrislin Industries (see http://www.controlled.com/vme/chinp1.html).
Stefan |
22 Jun 2012, Zisis Papandreou, Info, adding 2nd ADC and TDC to crate  
|
Hi folks:
we've been running midas-1.9.5 for a few years here at Regina. We are now
working on a larger cosmic ray testing that requires a second ADC and second TDC
module in our Camac crate (we use the hytek1331 controller by the way). We're
baffled as to how to set this up properly. Specifically we have tried:
frontend.c
/* number of channels */
#define N_ADC 12
(changed this from the old '8' to '12', and it seems to work for Lecroy 2249)
#define SLOT_ADC0 10
#define SLOT_TDC0 9
#define SLOT_ADC1 15
#define SLOT_TDC1 14
Is this the way to define the additional slots (by adding 0, 1 indices)?
Also, we were not able to get a new bank (ADC1) working, so we used a loop to
tag the second ADC values onto those of the first.
If someone has an example of how to handle multiple ADCs and TDCs and
suggestions as to where changes need to be made (header files, analyser, etc)
this would be great.
Thanks, Zisis...
P.S. I am attaching the relevant files. |
24 Jun 2012, Konstantin Olchanski, Info, midas vme benchmarks
|
> > > Just for completeness: Attached is the VME transfer speed I get with the SIS3100/SIS1100 interface using
> > > 2eVME transfer. This curve can be explained exactly with an overhead of 125 us per DMA transfer and a
> > > continuous link speed of 83 MB/sec.
>
> [with ...] the PSI-built DRS4 board, where we implemented the 2eVME protocol in the Virtex II FPGA.
This is an interesting hardware benchmark. Do you also have benchmarks of the MIDAS system using the DRS4 (measurements
of end-to-end data rates, maximum event rate, maximum trigger rate, any tuning of the frontend program
and of the MIDAS experiment to achieve those rates, etc)?
K.O. |
24 Jun 2012, Konstantin Olchanski, Info, midas vme benchmarks
|
> > I am recording here the results from a test VME system using two VF48 waveform digitizers
(I now have 4 VF48 waveform digitizers, so the event rates are half of those reported before. Date rate
is up to 51 M/s - event size has doubled, per-event overhead is the same, so the effective data rate goes
up).
This message demonstrates the effects of tuning the MIDAS system for high rate data taking.
Attached is the history plot of the event rate counters which show the real-time performance of the MIDAS
system with better detail compared to the average event rate reported on the MIDAS status page. For an
ideal real-time system, the event rate should be a constant, without any drop-outs.
Seen on the plot:
run 75: the periodic dropouts in the event rate correspond to the lazylogger writing data into HADOOP
HDFS. Clearly the host computer cannot keep up with both data taking and data archiving at the same
time. (see the output of "top" "with HDFS" and "without HDFS" below)
run 76: SYSTEM buffer size increased from 100Mbytes to 300Mbytes. Maybe there is an improvement.
run 77-78: "event_buffer_size" inside the multithreaded (EQ_MULTITHREAD) VME frontend increased from
100Mbytes to 300Mbytes. (6 seconds of data at 50M/s). Much better, yes?
Conclusion: for improved real-time performance, there should be sufficient buffering between the VME
frontend readout thread and the mlogger data compression thread.
For benchmark hardware, at 50M/s, 4 seconds of buffer space (100M in the SYSTEM buffer and 100M in
the frontend) is not enough. 12 seconds of buffer space (300+300) is much better. (Or buy a faster
backend computer).
P.S. HDFS data rate as measured by lazylogger is around 20M/s for CDH3 HADOOP and around 30M/s for
CDH4 HADOOP.
P.S. Observe the ever present unexplained event rate fluctuations between 130-140 event/sec.
K.O.
---- "top" output during normal data taking, notice mlogger data compression consumes 99% CPU at 51
M/s data rate.
top - 08:55:22 up 72 days, 17:00, 5 users, load average: 2.47, 2.32, 2.27
Tasks: 206 total, 2 running, 204 sleeping, 0 stopped, 0 zombie
Cpu(s): 52.2%us, 6.1%sy, 0.0%ni, 34.4%id, 0.8%wa, 0.1%hi, 6.2%si, 0.0%st
Mem: 3925556k total, 3064928k used, 860628k free, 3788k buffers
Swap: 32766900k total, 200704k used, 32566196k free, 2061048k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5826 trinat 20 0 437m 291m 287m R 97.6 7.6 636:39.63 mlogger
27617 trinat 20 0 310m 288m 288m S 24.6 7.5 6:59.28 mserver
1806 ganglia 20 0 415m 62m 1488 S 0.9 1.6 668:43.55 gmond
--- "top" output during lazylogger/HDFS activity. Observe high CPU use by lazylogger and fuse_dfs (the
HADOOP HDFS client). Observe that CPU use adds up to 167% out of 200% available.
top - 08:57:16 up 72 days, 17:01, 5 users, load average: 2.65, 2.35, 2.29
Tasks: 206 total, 2 running, 204 sleeping, 0 stopped, 0 zombie
Cpu(s): 57.6%us, 23.1%sy, 0.0%ni, 8.1%id, 0.0%wa, 0.4%hi, 10.7%si, 0.0%st
Mem: 3925556k total, 3642136k used, 283420k free, 4316k buffers
Swap: 32766900k total, 200692k used, 32566208k free, 2597752k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5826 trinat 20 0 437m 291m 287m R 68.7 7.6 638:24.07 mlogger
23450 root 20 0 1849m 200m 4472 S 64.4 5.2 75:35.64 fuse_dfs
27617 trinat 20 0 310m 288m 288m S 18.5 7.5 7:22.06 mserver
26723 trinat 20 0 38720 11m 1172 S 17.9 0.3 22:37.38 lazylogger
7268 trinat 20 0 1007m 35m 4004 D 1.3 0.9 187:14.52 nautilus
1097 root 20 0 0 0 0 S 0.8 0.0 101:45.55 md3_raid1 |
25 Jun 2012, Stefan Ritt, Info, midas vme benchmarks
|
> P.S. Observe the ever present unexplained event rate fluctuations between 130-140 event/sec.
An important aspect of optimizing your system is to keep the network traffic under control. I use GBit Ethernet between FE and BE, and make sure the switch
can accomodate all accumulated network traffic through its backplane. This way I do not have any TCP retransmits which kill you. Like if a single low-level
ethernet packet is lost due to collision, the TCP stack retransmits it. Depending on the local settings, this can be after a timeout of one (!) second, which
punches already a hole in your data rate. On the MSCB system actually I use UDP packets, where I schedule the retransmit myself. For a LAN, 10-100ms timeout
is there enough. The one second is optimized for a WAN (like between two continents) where this is fine, but it is not what you want on a LAN system. Also
make sure that the outgoing traffic (lazylogger) uses a different network card than the incoming traffic. I found that this also helps a lot.
- Stefan |
25 Jun 2012, Konstantin Olchanski, Info, midas vme benchmarks
|
> > P.S. Observe the ever present unexplained event rate fluctuations between 130-140 event/sec.
>
> An important aspect of optimizing your system is to keep the network traffic under control. I use GBit Ethernet between FE and BE, and make sure the switch
> can accomodate all accumulated network traffic through its backplane. This way I do not have any TCP retransmits which kill you. Like if a single low-level
> ethernet packet is lost due to collision, the TCP stack retransmits it. Depending on the local settings, this can be after a timeout of one (!) second, which
> punches already a hole in your data rate. On the MSCB system actually I use UDP packets, where I schedule the retransmit myself. For a LAN, 10-100ms timeout
> is there enough. The one second is optimized for a WAN (like between two continents) where this is fine, but it is not what you want on a LAN system. Also
> make sure that the outgoing traffic (lazylogger) uses a different network card than the incoming traffic. I found that this also helps a lot.
>
In typical applications at TRIUMF we do not setup a private network for the data traffic - data from VME to backend computer
and data from backend computer to DCACHE all go through the TRIUMF network.
This is justified by the required data rates - the highest data rate experiment running right now is PIENU - running
at about 10 M/s sustained, nominally April through December. (This is 20% of the data rate of the present benchmark).
The next highest data rate experiment is T2K/ND280 in Japan running at about 20 M/s (neutrino beam, data rate
is dominated by calibration events).
All other experiments at TRIUMF run at lower data rates (low intensity light ion beams), but we are planning for an experiment
that will run at 300 M/s sustained over 1 week of scheduled beam time.
But we do have the technical capability to separate data traffic from the TRIUMF network - the VME processors and
the backend computers all have dual GigE NICs.
(I did not say so, but obviously the present benchmark at 50 M/s VME to backend and 20-30 M/s from backend to HDFS is a GigE network).
(I am not monitoring the TCP loss and retransmit rates at present time)
(The network switch between VME and backend is a "the cheapest available" rackmountable 8-port GigE switch. The network between
the backend and the HDFS nodes is mostly Nortel 48-port GigE edge switches with single-GigE uplinks to the core router).
K.O. |
|