Back Midas Rome Roody Rootana
  Midas DAQ System  Not logged in ELOG logo
Entry  01 May 2020, Joseph McKenna, Forum, Taking MIDAS beyond 64 clients 
    Reply  01 May 2020, Stefan Ritt, Forum, Taking MIDAS beyond 64 clients 
       Reply  01 May 2020, Pierre Gorel, Forum, Taking MIDAS beyond 64 clients 
          Reply  02 May 2020, Stefan Ritt, Forum, Taking MIDAS beyond 64 clients 
             Reply  02 May 2020, Joseph McKenna, Forum, Taking MIDAS beyond 64 clients 
                Reply  02 May 2020, Stefan Ritt, Forum, Taking MIDAS beyond 64 clients 
    Reply  02 May 2020, Konstantin Olchanski, Forum, Taking MIDAS beyond 64 clients 
       Reply  02 May 2020, Konstantin Olchanski, Forum, Taking MIDAS beyond 64 clients 
          Reply  02 May 2020, Konstantin Olchanski, Forum, Taking MIDAS beyond 64 clients 
Message ID: 1891     Entry time: 01 May 2020     Reply to this: 1892   1897
Author: Joseph McKenna 
Topic: Forum 
Subject: Taking MIDAS beyond 64 clients 

Hi all,

I have been experimenting with a frontend solution for my experiment 
(ALPHA). The intention to replace how we log data from PCs running LabVIEW. 

I am at the proof of concept stage. So far I have some promising 
performance, able to handle 10-100x more data in my test setup (current 
limitations now are just network bandwith, MIDAS is impressively efficient). 

==========================================================================

Our experiment has many PCs using LabVIEW which all log to MIDAS, the 
experiment has grown such that we need some sort of load balancing in our 
frontend.

The concept was to have a 'supervisor frontend' and an array of 'worker 
frontend' processes. 
-A LabVIEW client would connect to the supervisor, then be referred to a 
worker frontend for data logging. 
-The supervisor could start a 'worker frontend' process as the demand 
required.

To increase accountability within the experiment, I intend to have a 'worker 
frontend' per PC connecting. Then any rouge behavior would be clear from the 
MIDAS frontpage.

Presently there around 20-30 of these LabVIEW PCs, but given how the group 
is growing, I want to be sure that my data logging solution will be viable 
for the next 5-10 years. With the increased use of single board computers, I 
chose the target of benchmarking upto 1000 worker frontends... but I quickly 
hit the '64 MAX CLIENTS' and '64 RPC CONNECTION' limit. Ok...

branching and updating these limits:
https://bitbucket.org/tmidas/midas/branch/experimental-beyond_64_clients

I have two commits. 
1. update the memory layout assertions and use MAX_CLIENTS as a variable
https://bitbucket.org/tmidas/midas/commits/302ce33c77860825730ce48849cb810cf
366df96?at=experimental-beyond_64_clients
2. Change the MAX_CLIENTS and MAX_RPC_CONNECTION
https://bitbucket.org/tmidas/midas/commits/f15642eea16102636b4a15c8411330969
6ce3df1?at=experimental-beyond_64_clients

Unintended side effects:
I break compatibility of existing ODB files... the database layout has 
changed and I read my old ODB as corrupt. In my test setup I can start from 
scratch but this would be horrible for any existing experiment.

Edit: I noticed 'make testdiff' pipeline is failing... also fails locally... 
investigating

Early performance results:
In early tests, ~700 PCs logging 10 unique arrays of 10 doubles into 
Equipment variables in the ODB seems to perform well... All transactions 
from client PCs are finished within a couple of ms or less

==========================================================================

Questions:

Does the community here have strong opinions about increasing the 
MAX_CLIENTS and MAX_RPC_CONNECTION limits? 
Am I looking at this problem in a naive way?


Potential solutions other than increasing the MAX_CLIENTS limit:
-Make worker threads inside the supervisor (not a separate process), I am 
using TMFE, so I can dynamically create equipment. I have not yet taken a 
deep dive into how any multithreading is implemented
-One could have a round robin system to load balance between a limited pool 
of 'worker frontend' proccesses. I don't like this solution as I want to 
able to clearly see which client PCs have been setup to log too much data

==========================================================================
ELOG V3.1.4-2e1708b5