| ID | Date | Author | Topic | Subject |
1932 | 04 Jun 2020 | Stefan Ritt | Forum | Template of slow control frontend |
> I'm a beginner with Midas, and am trying to develop a slow control front-end with the latest Midas.
> I found scfe.cxx in the "examples", but it is not enough of a reference for writing a front-end for my own devices
> because it contains only the nulldevice and null bus driver case...
> (I did succeed in running the HV front-end for the ISEG MPod, because its device driver is available...)
>
> Can I get some frontend examples for simple TCP/IP and/or RS232 devices?
> Ideally, I would like to have examples of both the frontend and the device driver.
> (If any device driver included in the package is similar, please tell me.)
Have you checked the documentation?
https://midas.triumf.ca/MidasWiki/index.php/Slow_Control_System
Basically you have to replace the nulldevice driver with a "real" driver. You find all existing drivers under
midas/drivers/device. If your favourite is not there, you have to write it. Use one which is close to the one
you need and modify it.
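For illustration only (a sketch, not an official MIDAS example): the driver table from scfe.cxx might then be adapted as below, where "mydev" stands for the hypothetical driver you create by copying one from midas/drivers/device, and "rs232" for the serial bus driver under midas/drivers/bus.
   /* Sketch only: "mydev" is a hypothetical device driver adapted from
      midas/drivers/device (e.g. starting from the nulldev template), and
      "rs232" is the serial bus driver under midas/drivers/bus. Check the
      exact prototypes and DEVICE_DRIVER layout in your MIDAS version. */
   #include "midas.h"

   INT mydev(INT cmd, ...);    /* hypothetical device driver entry point */
   INT rs232(INT cmd, ...);    /* bus driver entry point                 */

   /* device driver list: name, driver, number of channels, bus driver */
   DEVICE_DRIVER hv_driver[] = {
      {"My HV Device", mydev, 16, rs232},
      {""}
   };
The table is then referenced from the equipment list exactly as in scfe.cxx; only the driver and bus driver entries change.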
Best,
Stefan |
1933 | 04 Jun 2020 | Hisataka YOSHIDA | Forum | Template of slow control frontend |
Dear Stefan,
Thank you for your quick reply.
> Have you checked the documentation?
>
> https://midas.triumf.ca/MidasWiki/index.php/Slow_Control_System
Yes, I have read the wiki, but it is not easy to figure out how to treat my individual case.
> Basically you have to replace the nulldevice driver with a "real" driver. You find all existing drivers under
> midas/drivers/device. If your favourite is not there, you have to write it. Use one which is close to the one
> you need and modify it.
Okay, I will try to write drivers for my own devices using the existing drivers as a starting point.
(maybe I can find some device drivers which use TCP/IP or RS232)
Best regards,
Hisataka Yoshida |
1934 | 04 Jun 2020 | Hisataka YOSHIDA | Forum | Template of slow control frontend |
Dear Giorgio,
Thank you very much for your kind and quick reply!
I appreciate you giving me such a nice explanation, sharing your experience, and providing great sample code (this is just what I wanted!).
It is all very useful to me. I will try to write my frontend code starting from what you have given me.
Thank you again!
Best regards,
Hisataka Yoshida |
145 | 12 Jun 2003 | Pierre-André Amaudruz | | Tape handling |
- remove ss_tape_get_blockn from lazylogger.c
- add ss_tape_get_blockn to system.c
- add ss_tape_get_blockn prototype into midas.h
- fix buffer size for "dir" in mtape.c
- add block# for "dir" in mtape if the command is successful.
- handle the TID_STRUCT bank type by displaying it as 8-bit in ybos.c (mdump) |
1891 | 01 May 2020 | Joseph McKenna | Forum | Taking MIDAS beyond 64 clients |
Hi all,
I have been experimenting with a frontend solution for my experiment
(ALPHA). The intention is to replace how we log data from PCs running LabVIEW.
I am at the proof of concept stage. So far I have some promising
performance, able to handle 10-100x more data in my test setup (current
limitations now are just network bandwidth, MIDAS is impressively efficient).
==========================================================================
Our experiment has many PCs using LabVIEW which all log to MIDAS, the
experiment has grown such that we need some sort of load balancing in our
frontend.
The concept was to have a 'supervisor frontend' and an array of 'worker
frontend' processes.
-A LabVIEW client would connect to the supervisor, then be referred to a
worker frontend for data logging.
-The supervisor could start a 'worker frontend' process as the demand
required.
To increase accountability within the experiment, I intend to have a 'worker
frontend' per PC connecting. Then any rogue behavior would be clear from the
MIDAS frontpage.
Presently there are around 20-30 of these LabVIEW PCs, but given how the group
is growing, I want to be sure that my data logging solution will be viable
for the next 5-10 years. With the increased use of single board computers, I
chose the target of benchmarking up to 1000 worker frontends... but I quickly
hit the '64 MAX CLIENTS' and '64 RPC CONNECTION' limit. Ok...
branching and updating these limits:
https://bitbucket.org/tmidas/midas/branch/experimental-beyond_64_clients
I have two commits.
1. update the memory layout assertions and use MAX_CLIENTS as a variable
https://bitbucket.org/tmidas/midas/commits/302ce33c77860825730ce48849cb810cf366df96?at=experimental-beyond_64_clients
2. Change the MAX_CLIENTS and MAX_RPC_CONNECTION
https://bitbucket.org/tmidas/midas/commits/f15642eea16102636b4a15c84113309696ce3df1?at=experimental-beyond_64_clients
Unintended side effects:
I break compatibility of existing ODB files... the database layout has
changed and I read my old ODB as corrupt. In my test setup I can start from
scratch but this would be horrible for any existing experiment.
Edit: I noticed the 'make testdiff' pipeline is failing... it also fails locally... investigating.
Early performance results:
In early tests, ~700 PCs logging 10 unique arrays of 10 doubles into
Equipment variables in the ODB seems to perform well... All transactions
from client PCs are finished within a couple of ms or less
==========================================================================
Questions:
Does the community here have strong opinions about increasing the
MAX_CLIENTS and MAX_RPC_CONNECTION limits?
Am I looking at this problem in a naive way?
Potential solutions other than increasing the MAX_CLIENTS limit:
-Make worker threads inside the supervisor (not a separate process), I am
using TMFE, so I can dynamically create equipment. I have not yet taken a
deep dive into how any multithreading is implemented
-One could have a round robin system to load balance between a limited pool
of 'worker frontend' processes. I don't like this solution as I want to be
able to clearly see which client PCs have been set up to log too much data
========================================================================== |
1892 | 01 May 2020 | Stefan Ritt | Forum | Taking MIDAS beyond 64 clients |
Hi Joseph,
here some thoughts from my side:
- Breaking ODB compatibility in the master/develop midas branch is very bad, since almost all experiments worldwide are affected if they blindly do a pull and then want to recompile and rerun. Currently,
even during our Corona crisis, some experiments are still running and being monitored remotely.
- On the other hand, if we have to break compatibility, now is maybe a good time since most accelerators worldwide are off. But before doing so, I would like to get feedback from the main experiments
around the world (MEG, T2K, g-2, DEAP besides ALPHA).
- Having a maximum of 64 clients was originally decided when memory was scarce. In the early days one had just a couple of megabytes of shared memory. Now this is not an issue any more, but I see
another problem. The main status page gives a nice overview of the experiment. This only works because there is a limited number of midas clients and equipments. If we blow up to 1000+, the status
page would be rather long and we would have to scroll up and down forever. In such a scenario one would at least have to redesign the status and program pages. To start your experiment, you would have to
click 1000 times to start each front-end, which is also not very practical.
- Having 100's or 1000's of front-ends calls rather for a hierarchical design, like the LHC experiments have. That would be a major change of midas and cannot be done quickly. It would also result in
much slower run start/stops.
- If you see limitations with your LabVIEW PCs, have you considered multi-threading on your front-ends? Note that the standard midas slow control system supports multithreaded devices
(DF_MULTITHREAD). In MEG, we use about 800 microcontrollers via the MSCB protocol. They are grouped together and each group is a multithreaded device in the midas slow control lingo, meaning the
group gets its own thread for control and readout in the midas frontend. This way, one group cannot slow down all other groups. There is one front-end for all groups, which can be started/stopped with
a single click; it shows up as just one line in the status page, and it is still pretty fast. Have you considered such a scheme? Your LabVIEW PCs would then not be individual front-ends, but would just make a
network connection to the midas front-end, which then manages all LabVIEW PCs. The midas slow control system allows you to define custom commands (besides the usual read/write commands for slow
control data), so you could maybe integrate all you need into that scheme.
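To sketch the grouping idea (this is an illustration, not MEG's actual frontend code; the group names and channel counts are placeholders, and mscbdev is the MSCB device driver shipped in midas/drivers/device):
   /* Each group of devices becomes one DEVICE_DRIVER entry flagged with
      DF_MULTITHREAD, so it gets its own control/readout thread, while the
      whole thing remains a single midas frontend: one line on the status
      page, one click to start/stop. Channel counts here are placeholders. */
   #include "midas.h"

   INT mscbdev(INT cmd, ...);   /* MSCB device driver entry point */

   DEVICE_DRIVER sc_driver[] = {
      {"Group 1", mscbdev, 100, NULL, DF_MULTITHREAD},
      {"Group 2", mscbdev, 100, NULL, DF_MULTITHREAD},
      /* ... one entry per group of microcontrollers ... */
      {""}
   };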
Best,
Stefan
>
> Hi all,
> I have been experimenting with a frontend solution for my experiment
> (ALPHA). The intention is to replace how we log data from PCs running LabVIEW.
> I am at the proof of concept stage. So far I have some promising
> performance, able to handle 10-100x more data in my test setup (current
> limitations now are just network bandwidth, MIDAS is impressively efficient).
> ==========================================================================
> Our experiment has many PCs using LabVIEW which all log to MIDAS, the
> experiment has grown such that we need some sort of load balancing in our
> frontend.
> The concept was to have a 'supervisor frontend' and an array of 'worker
> frontend' processes.
> -A LabVIEW client would connect to the supervisor, then be referred to a
> worker frontend for data logging.
> -The supervisor could start a 'worker frontend' process as the demand
> required.
> To increase accountability within the experiment, I intend to have a 'worker
> frontend' per PC connecting. Then any rogue behavior would be clear from the
> MIDAS frontpage.
> Presently there are around 20-30 of these LabVIEW PCs, but given how the group
> is growing, I want to be sure that my data logging solution will be viable
> for the next 5-10 years. With the increased use of single board computers, I
> chose the target of benchmarking up to 1000 worker frontends... but I quickly
> hit the '64 MAX CLIENTS' and '64 RPC CONNECTION' limit. Ok...
> branching and updating these limits:
> https://bitbucket.org/tmidas/midas/branch/experimental-beyond_64_clients
> I have two commits.
> 1. update the memory layout assertions and use MAX_CLIENTS as a variable
> https://bitbucket.org/tmidas/midas/commits/302ce33c77860825730ce48849cb810cf366df96?at=experimental-beyond_64_clients
> 2. Change the MAX_CLIENTS and MAX_RPC_CONNECTION
> https://bitbucket.org/tmidas/midas/commits/f15642eea16102636b4a15c84113309696ce3df1?at=experimental-beyond_64_clients
> Unintended side effects:
> I break compatibility of existing ODB files... the database layout has
> changed and I read my old ODB as corrupt. In my test setup I can start from
> scratch but this would be horrible for any existing experiment.
> Edit: I noticed the 'make testdiff' pipeline is failing... it also fails locally... investigating.
> Early performance results:
> In early tests, ~700 PCs logging 10 unique arrays of 10 doubles into
> Equipment variables in the ODB seems to perform well... All transactions
> from client PCs are finished within a couple of ms or less
> ==========================================================================
> Questions:
> Does the community here have strong opinions about increasing the
> MAX_CLIENTS and MAX_RPC_CONNECTION limits?
> Am I looking at this problem in a naive way?
>
> Potential solutions other than increasing the MAX_CLIENTS limit:
> -Make worker threads inside the supervisor (not a separate process), I am
> using TMFE, so I can dynamically create equipment. I have not yet taken a
> deep dive into how any multithreading is implemented
> -One could have a round robin system to load balance between a limited pool
> of 'worker frontend' processes. I don't like this solution as I want to be
> able to clearly see which client PCs have been set up to log too much data
> ========================================================================== |
1893 | 01 May 2020 | Pierre Gorel | Forum | Taking MIDAS beyond 64 clients |
> - On the other hand, if we have to break compatibility, now is maybe a good time since most accelerators worldwide are off. But before doing so, I would like to get feedback from the main experiments
> around the world (MEG, T2K, g-2, DEAP besides ALPHA).
Hello Stefan,
For what it's worth, DEAP will not be impacted: as we have been taking data around the clock for the last few years, we froze the code running on the computers. We may have a window of opportunity for an upgrade in a few months, but such a move has not been discussed yet.
Best regards,
Pierre |
1894 | 02 May 2020 | Stefan Ritt | Forum | Taking MIDAS beyond 64 clients |
TRIUMF stayed quiet, probably they have other things to do.
I allowed myself to move the maximum number of clients back to its original value, in order not to break running experiments.
This does not mean that the increase is a bad idea, we just have to be careful not to break running experiments. Let's discuss it
more thoroughly here before we make a decision in that direction.
Best regards,
Stefan |
1895 | 02 May 2020 | Joseph McKenna | Forum | Taking MIDAS beyond 64 clients |
Thank you very much for the feedback.
I am satisfied with not changing the 64 client limit. I will look at re-writing my frontend to spawn threads rather than
processes. The load of my frontend is low, so I do not anticipate issues with a threaded implementation.
In this threaded scenario, it will be a reasonable amount of time until ALPHA bumps into the 64 client limit.
If it avoids confusion, I am happy for my experimental branch 'experimental-beyond_64_clients' to be deleted.
Perhaps an item for future discussion would be for the odbinit program to be able to 'upgrade' the ODB and enable some backwards
compatibility.
Thanks again
Joseph |
1896 | 02 May 2020 | Stefan Ritt | Forum | Taking MIDAS beyond 64 clients |
> Perhaps an item for future discussion would be for the odbinit program to be able to 'upgrade' the ODB and enable some backwards
> compatibility.
We had this discussion already a few times. There is an ODB version number (DATABASE_VERSION 3 in midas.h) which is intended for that. If we break the
binary compatibility, programs should complain "ODB version has changed, please run ...", then odbinit (written by KO) should have a well-defined
procedure to upgrade existing ODBs by re-creating them, but keeping all the old contents. This should be tested on a few systems.
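A minimal sketch of the kind of check meant here (not the actual midas implementation; apart from DATABASE_VERSION, the names below are hypothetical):
   #include <stdio.h>

   #define DATABASE_VERSION 3          /* value defined in midas.h */

   typedef struct {
      int version;                     /* hypothetical on-disk header field */
      /* ... */
   } odb_header_t;

   /* refuse to attach to an ODB whose version does not match this build */
   static int check_odb_version(const odb_header_t *pheader)
   {
      if (pheader->version != DATABASE_VERSION) {
         fprintf(stderr, "ODB version %d does not match this MIDAS build "
                 "(version %d), please run odbinit to upgrade\n",
                 pheader->version, DATABASE_VERSION);
         return 0;
      }
      return 1;
   }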
Stefan |
1897 | 02 May 2020 | Konstantin Olchanski | Forum | Taking MIDAS beyond 64 clients |
>
> Does the community here have strong opinions about increasing the
> MAX_CLIENTS and MAX_RPC_CONNECTION limits?
> Am I looking at this problem in a naive way?
>
I think MAX_CLIENTS set at 64 is on the low side for today.
And in the past, we did have experiments that did not work without increasing MAX_CLIENTS. I
think T2K/ND280 needed MAX_CLIENTS bumped to about 100 (200?).
If ALPHA needs MAX_CLIENTS bigger than the default 64, nothing stops the experiment
from changing this number in the local copy of MIDAS.
It is not necessary to change it in the central repository for everybody.
K.O. |
1898 | 02 May 2020 | Konstantin Olchanski | Forum | Taking MIDAS beyond 64 clients |
> >
> > Does the community here have strong opinions about increasing the
> > MAX_CLIENTS and MAX_RPC_CONNECTION limits?
> > Am I looking at this problem in a naive way?
> >
The issue is binary compatibility.
MIDAS has been binary compatible with itself for a long time, 20 years now, easily.
If we are to give this up, we must gain more than we lose.
On the technical level, bumping MAX_CLIENTS from 64 to 100 gives us nothing. Tomorrow an experiment
will come along asking for 101 clients. Any number you pick will be too small for somebody. And MIDAS
already has a solution for this: edit midas.h, hit make, done.
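For illustration, that local edit amounts to something like the following in midas.h (the exact default values and locations depend on the MIDAS revision):
   /* in the experiment's local copy of midas.h (sketch; the defaults in the
      MIDAS discussed here are 64) */
   #define MAX_CLIENTS        100   /* max clients attached to ODB/buffers */
   #define MAX_RPC_CONNECTION 100   /* max simultaneous RPC connections    */
After such a change, MIDAS and every client of the experiment must be rebuilt and the old .SHM files recreated, since the shared memory layout changes; that is exactly the binary-compatibility issue discussed in this thread.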
If we are to break binary compatibility, we should go big. Remove these limits completely!
Move the MAX_CLIENTS & co fixed size arrays out of the headers in ODB and in event buffers, put
them where they can be resized as needed.
That's a binary-compatibility breaking solution I would vote for.
K.O. |
1899 | 02 May 2020 | Konstantin Olchanski | Forum | Taking MIDAS beyond 64 clients |
> > >
> > > Does the community here have strong opinions about increasing the
> > > MAX_CLIENTS and MAX_RPC_CONNECTION limits?
> > > Am I looking at this problem in a naive way?
> > >
The issue is: how to organize an experiment? How many frontends should I have?
There are two extremes:
- collect all data in 1 frontend (and today, with C++ threads and C++ ring buffers, this is trivial; see the sketch after this list)
- instantiate 1 frontend for each data source (for example, the ALPHA-g detector has 8 ADCs and 64 PWBs plus some
small fish. No, that's wrong: each ADC looks like 48 individual data sources and each PWB looks like 4 data sources,
so this would be 8*48+4*64=640 data sources, which could easily be 640 frontends, plus small fish).
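A minimal sketch of the first extreme in plain C++, with no MIDAS calls (the data sources and readings are hypothetical): one reader thread per data source pushes into a shared queue, and a single frontend loop would drain it and pack the readings into events.
   // Sketch only: generic C++, not MIDAS code. One thread per data source
   // pushes readings into a shared queue; a single sender loop drains it.
   #include <condition_variable>
   #include <mutex>
   #include <queue>
   #include <thread>
   #include <vector>

   struct Reading { int source_id; double value; };

   class ReadingQueue {
      std::mutex m;
      std::condition_variable cv;
      std::queue<Reading> q;
   public:
      void push(Reading r) {
         { std::lock_guard<std::mutex> lock(m); q.push(r); }
         cv.notify_one();
      }
      Reading pop() {                 // blocks until a reading is available
         std::unique_lock<std::mutex> lock(m);
         cv.wait(lock, [this] { return !q.empty(); });
         Reading r = q.front(); q.pop();
         return r;
      }
   };

   int main() {
      const int nsources = 640;       // e.g. 8*48 + 4*64 from the example above
      ReadingQueue queue;
      std::vector<std::thread> readers;
      for (int id = 0; id < nsources; id++)
         readers.emplace_back([id, &queue] {
            double value = 0;         // placeholder for real hardware readout
            queue.push({id, value});
         });
      for (int i = 0; i < nsources; i++) {
         Reading r = queue.pop();     // a real frontend would pack these into events
         (void) r;
      }
      for (auto &t : readers) t.join();
      return 0;
   }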
Which way is best? Every experiment is different, but consider simple things:
640 frontends writing into 1 event buffer will probably cause large contention for the event buffer lock. bad.
640 frontends running on a 4 core CPU will probably cause unhappiness in the OS. bad.
starting and stopping 640 frontends requires some scripting, monitoring that they all still run, etc. extra work. bad.
640 frontends on the midas status page? your cell phone web browser will explode. bad.
What I am saying is: arbitrary limits are good for you. They make you think about what is going on before throwing
resources at the problem.
K.O. |
163 | 13 Oct 2004 | Konstantin Olchanski | Bug Report | TWIST upgrade bombed... |
The upgrade of TWIST to the latest midas has bombed - we see mevb and mlogger
crashes during shared memory data buffer accesses. I am looking into it and I
will add information as I figure things out. K.O. |
164 | 13 Oct 2004 | Pierre-Andre Amaudruz | Bug Report | TWIST upgrade bombed... |
> The upgrade of TWIST to the latest midas has bombed - we see mevb and mlogger
> crashes during shared memory data buffer accesses. I am looking into it and I
> will add information as I figure things out. K.O.
Since 1.9.5 the EventBuilder has been modified. Please consult the documentation
where the new mevb scheme is explained.
The mevb has been tested successfully with up to 16 frontends (15 different CPUs).
Data rates at the EventBuilder were measured at about 50 MB/s without the
logger and ~30 MB/s with the logger. |
165 | 13 Oct 2004 | Konstantin Olchanski | Bug Report | TWIST upgrade bombed... |
> The upgrade of TWIST to the latest midas has bombed - we see mevb and mlogger
> crashes during shared memory data buffer accesses. I am looking into it and I
> will add information as I figure things out. K.O.
I traced buffer memory corruption to a logic error in system.c::ss_shm_open(). If
a .SHM file exists, its size is used as the size of the SysV shared memory
segment, even if the requested shared memory size is bigger, but the caller of
ss_shm_open() thinks it got all the requested memory. Eventually we try to use
the unallocated memory and crash. This is the proposed fix and I will commit it
after I retest the upgrade during the next few days.
[olchansk@send src]$ cvs diff -u system.c
olchansk@midas.psi.ch's password:
Index: system.c
===================================================================
RCS file: /usr/local/cvsroot/midas/src/system.c,v
retrieving revision 1.83
diff -u -r1.83 system.c
--- system.c 4 Oct 2004 07:04:01 -0000 1.83
+++ system.c 14 Oct 2004 05:51:16 -0000
@@ -544,8 +544,14 @@
    } else {
       /* if file exists, retrieve its size */
       file_size = (INT) ss_file_size(file_name);
-      if (file_size > 0)
+      if (file_size > 0) {
+         if (file_size < size) {
+            cm_msg(MERROR, "ss_shm_open", "Shared memory segment \'%s\' size %d is smaller than requested size %d. Please remove it and try again", file_name, file_size, size);
+            return SS_NO_MEMORY;
+         }
+
          size = file_size;
+      }
    }

    /* get the shared memory, create if not existing */
K.O. |
166 | 13 Oct 2004 | Konstantin Olchanski | Bug Report | TWIST upgrade bombed... |
> > The upgrade of TWIST to the latest midas has bombed - we see mevb and mlogger
> > crashes during shared memory data buffer accesses. I am looking into it and I
> > will add information as I figure things out. K.O.
>
> Since 1.9.5 the EventBuilder has been modified. Please consult the documentation
> where the new mevb scheme is explained.
> Test of the mevb with up to 16 frontends (15 different CPUs) has been tested
> successfully. Data rate at the EventBuilder were measured about 50MB/s without the
> logger and ~30MB/s with the logger.
It turns out that TWIST uses a private mevb.c. We will consider upgrading to the
standard one.
K.O. |
167 | 14 Oct 2004 | Stefan Ritt | Bug Report | TWIST upgrade bombed... |
Agree.
Once you have done the modification, please check the following situation: create a fresh
ODB with an increased size ("odbedit -s 2000000" for example). Then check that the
other clients "adopt" this increased size. Note that some experiments need a
bigger ODB, and I don't want to have them recompile all clients; that's why the
code in ss_shm_open() can attach to a *larger* shared memory. However, it should
not matter to the process, since the ODB (or SYSTEM) shared memory size is
stored in the pheader->key_size and pheader->data_size of each participating
process. So they should never write beyond the limits defined in that header.
The size passed to ss_shm_open() is only a "hint" if the shared memory does not exist,
and is not used anywhere later in the code. |
169 | 14 Oct 2004 | Konstantin Olchanski | Bug Report | TWIST upgrade bombed... |
> The upgrade of TWIST to the latest midas has bombed - we see mevb and mlogger
> crashes during shared memory data buffer accesses. I am looking into it and I
> will add information as I figure things out. K.O.
On the second try, it looks like we are in business - the first try did not work
because of two mistakes:
1) I did not delete *all* old .SHM files (.ODB.SHM, .SYSTEM.SHM, .YBUF1.SHM,
.YBUF2.SHM). I deleted ODB.SHM, so the ODB worked, but forgot about the data buffers
SYSTEM.SHM & co and ended up with segmentation faults and core dumps in the buffer
management code, caused by a mismatch between the old-midas buffers and the new-midas code.
2) while debugging these core dumps, I made an error in my test code, so even
after I deleted the old data buffers, things still did not work. Talk about
over-debugging a problem...
K.O. |