Hi all,
I have been experimenting with a frontend solution for my experiment
(ALPHA). The intention to replace how we log data from PCs running LabVIEW.
I am at the proof of concept stage. So far I have some promising
performance, able to handle 10-100x more data in my test setup (current
limitations now are just network bandwith, MIDAS is impressively efficient).
==========================================================================
Our experiment has many PCs using LabVIEW which all log to MIDAS, the
experiment has grown such that we need some sort of load balancing in our
frontend.
The concept was to have a 'supervisor frontend' and an array of 'worker
frontend' processes.
-A LabVIEW client would connect to the supervisor, then be referred to a
worker frontend for data logging.
-The supervisor could start a 'worker frontend' process as the demand
required.
To increase accountability within the experiment, I intend to have a 'worker
frontend' per PC connecting. Then any rouge behavior would be clear from the
MIDAS frontpage.
Presently there around 20-30 of these LabVIEW PCs, but given how the group
is growing, I want to be sure that my data logging solution will be viable
for the next 5-10 years. With the increased use of single board computers, I
chose the target of benchmarking upto 1000 worker frontends... but I quickly
hit the '64 MAX CLIENTS' and '64 RPC CONNECTION' limit. Ok...
branching and updating these limits:
https://bitbucket.org/tmidas/midas/branch/experimental-beyond_64_clients
I have two commits.
1. update the memory layout assertions and use MAX_CLIENTS as a variable
https://bitbucket.org/tmidas/midas/commits/302ce33c77860825730ce48849cb810cf
366df96?at=experimental-beyond_64_clients
2. Change the MAX_CLIENTS and MAX_RPC_CONNECTION
https://bitbucket.org/tmidas/midas/commits/f15642eea16102636b4a15c8411330969
6ce3df1?at=experimental-beyond_64_clients
Unintended side effects:
I break compatibility of existing ODB files... the database layout has
changed and I read my old ODB as corrupt. In my test setup I can start from
scratch but this would be horrible for any existing experiment.
Edit: I noticed 'make testdiff' pipeline is failing... also fails locally...
investigating
Early performance results:
In early tests, ~700 PCs logging 10 unique arrays of 10 doubles into
Equipment variables in the ODB seems to perform well... All transactions
from client PCs are finished within a couple of ms or less
==========================================================================
Questions:
Does the community here have strong opinions about increasing the
MAX_CLIENTS and MAX_RPC_CONNECTION limits?
Am I looking at this problem in a naive way?
Potential solutions other than increasing the MAX_CLIENTS limit:
-Make worker threads inside the supervisor (not a separate process), I am
using TMFE, so I can dynamically create equipment. I have not yet taken a
deep dive into how any multithreading is implemented
-One could have a round robin system to load balance between a limited pool
of 'worker frontend' proccesses. I don't like this solution as I want to
able to clearly see which client PCs have been setup to log too much data
========================================================================== |