17 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB
|
Today I addressed a topic which has bugged me for a long time. The ODB contains
settings under /Equipment/<name>/Common which are a "mirror" of the equipment[]
setting in a frontend (using the mfe.cxx framework). If the "Common" entry in
the ODB is not present (fresh experiment), the equipment[] settings from the
frontend are copied to the ODB. But if it exists, it takes precedence over the
equipment[] entries, which is wrong in my opinion. For example, if you change some
settings in equipment[] (like the logging period of the history), then recompile
and restart the frontend, the old values in the ODB are kept and your
modification in the frontend code has no effect.
Starting with commit c3017c6c on Nov. 17th 2020 I reversed the precedence: Now, on
each start of the frontend program, the values from equipment[] are written to
the ODB. They are still "live". If one changes them when the frontend is
running, that change takes effect immediately. But on the next restart of the
frontend, the old values from equipment[] are put back there.
I fell too many times into this trap, and I hope the modification helps
everybody. If there are however experiments which rely on the fact that the
common settings in the ODB are NOT overwritten by the frontend, please let me
know and I can put a flag "EQUIPMENT_FE_PRECEDENCE = FALSE" somewhere to restore
the old behaviour.
Stefan |
19 Nov 2020, Stefan Ritt, Forum, History plot consuming too much memory
|
The history code is currently programmed in such a way that when you request
an old time window, all data from that window until the present date
gets loaded. When we implemented that, this worked fine for data ranges of
several years with a delay of just a few seconds. Of course one could load
only that specific window, but when the user then scrolls right, one has to
append new data to the "right side" of the array stored in the browser. If the
user jumps to another location, then the browser has to keep track of which
windows are loaded and which are not, making the history code much more
complicated. Therefore I'm only willing to spend a few days of solid work
if this really becomes a problem.
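To give an idea of the bookkeeping involved, here is a minimal sketch of tracking loaded time windows as merged intervals and working out which parts of a newly requested window are still missing. This is not the actual mhttpd or browser code; the type Interval and the functions missing_windows() and add_window() are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical illustration only: half-open time interval [start, end) in seconds.
using Interval = std::pair<std::int64_t, std::int64_t>;

// Return the parts of `request` that are not covered by `loaded`.
// `loaded` must be sorted by start time and must not contain overlaps.
std::vector<Interval> missing_windows(const std::vector<Interval> &loaded, Interval request)
{
   std::vector<Interval> missing;
   std::int64_t cursor = request.first;
   for (const Interval &w : loaded) {
      if (w.second <= cursor)
         continue;                               // window ends before the part still needed
      if (w.first >= request.second)
         break;                                  // window starts after the request
      if (w.first > cursor)
         missing.push_back({cursor, w.first});   // gap in front of this window
      cursor = std::max(cursor, w.second);
      if (cursor >= request.second)
         break;
   }
   if (cursor < request.second)
      missing.push_back({cursor, request.second});
   return missing;
}

// Record a newly loaded window and merge overlapping or touching neighbours.
void add_window(std::vector<Interval> &loaded, Interval w)
{
   loaded.push_back(w);
   std::sort(loaded.begin(), loaded.end());
   std::vector<Interval> merged;
   for (const Interval &cur : loaded) {
      if (!merged.empty() && cur.first <= merged.back().second)
         merged.back().second = std::max(merged.back().second, cur.second);
      else
         merged.push_back(cur);
   }
   loaded = merged;
}
```

Every scroll or jump would first call missing_windows() and request only the returned gaps, then call add_window() for each gap once its data has arrived. The data arrays themselves would need the same kind of splicing, which is where most of the extra complexity comes from.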
Are you sure that the delay comes from the browser, and not actually from mhttpd
digging through GBytes of history data? I realized that you need solid state
disks to get a really quick response.
Stefan |
20 Nov 2020, Stefan Ritt, Forum, History plot consuming too much memory
|
> Taking this down a tangent, I have a mild concern that a user could temporarily
> flood our gigabit network if we do have faster disks to read the history data. Have
> there been any plans or thoughts on limiting the bandwidth users can pull from
> mhttpd?
I guess this will not be limited by the network but by the CPU load of the mhttpd process. But I'm
not 100% sure, it depends on the actual hardware. But even if we improve the history
retrieval to "window only", the user could request all data from 2010 to 2020. So one
would need some code which estimates the amount of data, then asks the user "do you really
want that?". But still, a novice user can simply click "yes" without much of a thought. So
in conclusion I believe proper user training is better than software limits. Like the
other guy "I did 'rm -rf /', and now nothing works any more, can you help?".
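Just to illustrate what such an estimate could look like, here is a rough sketch. It assumes a constant logging period and a fixed number of bytes per stored sample; both are simplifications, and the function names and the byte count are hypothetical, not taken from the midas history format.

```cpp
#include <cstdint>

// Hypothetical estimate of the data volume of a history request:
// `n_vars` variables logged every `period_s` seconds between t_start and
// t_end (Unix seconds), with an ASSUMED size of `bytes_per_sample` per value.
std::uint64_t estimate_history_bytes(std::int64_t t_start, std::int64_t t_end,
                                     std::uint32_t n_vars, std::uint32_t period_s,
                                     std::uint32_t bytes_per_sample)
{
   if (t_end <= t_start || period_s == 0)
      return 0;
   std::uint64_t samples = static_cast<std::uint64_t>(t_end - t_start) / period_s;
   return samples * n_vars * bytes_per_sample;
}

// True if the request is larger than the given limit, so that the user
// should be asked for confirmation first.
bool needs_confirmation(std::uint64_t bytes, std::uint64_t limit_bytes)
{
   return bytes > limit_bytes;
}
```

For example, ten variables logged every 10 seconds over one hour at 8 bytes per sample give 360 * 10 * 8 = 28800 bytes.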
Stefan |
27 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB
|
Ok, so what about the following proposal:
- I change back the mfe.cxx code to behave like before (ODB has precedence and does not get overwritten when the
front-end restarts)
- I add a global flag
BOOL equipment_common_overwrite;
and pre-set it to FALSE;
- So if nothing is changed the flag stays false and ODB keeps precedence
- If a frontend wants to overwrite equipment/common on each start, the user sets
BOOL equipment_common_overwrite = TRUE;
near the equipment[] structure in the front-end code.
- If the flag is true, the mfe.cxx init code copies the equipment[] structure to the ODB on each frontend start
I believe this way we can keep backward compatibility, and add the new way with minimal effort. The only downside
is that all frontends on this planet have to add at least "BOOL equipment_common_overwrite = FALSE;" in their
code.
I know global variables are evil, but this way the user can just add the line right above the equipment[] array, so
one sees this when one edits the equipment[] array, giving motivation to change as needed. So the code would be
BOOL equipment_common_overwrite = TRUE;
EQUIPMENT equipment[] = {
....
};
An alternative way would be to add a function
set_equipment_common_overwrite(TRUE);
into the frontend_init() code. That's somehow cleaner (still needs an internal global variable), but it has to go
into frontend_init() so won't be at the same place as the EQUIPMENT list in the frontend.
Thoughts?
Best,
Stefan |
30 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB
|
Ok, I implemented it the following way:
- Added a boolean flag "equipment_common_overwrite", which must be contained in EACH frontend, preferably just
before the EQUIPMENT structure, such as:
BOOL equipment_common_overwrite = TRUE;
EQUIPMENT equipment[] = {
...
};
- If that flag is TRUE, then the contents of the "equipment" structure are copied to the ODB on each start of the
front-end
- If the flag is FALSE, then the ODB values are kept on the start of the front-end
The setting of the flag depends now on the philosophy of the experiment. Some experiments say that everything
needed should be in the front-end code, so when it starts everything gets set correctly. They don't change the
values in the ODB, but in the frontend code, which then goes into their repository. These experiments should set
the flag to TRUE. Other experiments just need some default values from the frontend code, and then fine-tune
things by changing values in the ODB. These experiments should set this flag to FALSE.
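The resulting precedence can be summarized in a small stand-alone sketch. This is not the mfe.cxx code; CommonSettings is a simplified stand-in for a few fields of the Common subtree, and common_after_start() is a hypothetical helper that only models the decision.

```cpp
#include <string>

// Simplified stand-in for a few "Common" fields (not the real midas structure).
struct CommonSettings {
   std::string buffer;
   int period_ms;
   int log_history;
};

// Model of what ends up in /Equipment/<name>/Common when a frontend starts.
//   odb       : existing ODB values, or nullptr on a fresh experiment
//   frontend  : values compiled into equipment[]
//   overwrite : value of the equipment_common_overwrite flag
CommonSettings common_after_start(const CommonSettings *odb,
                                  const CommonSettings &frontend, bool overwrite)
{
   if (odb == nullptr)
      return frontend;   // fresh experiment: frontend values are copied to the ODB
   if (overwrite)
      return frontend;   // flag TRUE: frontend code wins on each start
   return *odb;          // flag FALSE: existing ODB values are kept
}
```

With the flag FALSE, a history logging period changed in the ODB survives a frontend restart; with the flag TRUE, the value from equipment[] comes back on each start.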
*****
Please note that EVERY frontend now needs this flag, so all of you have to add it to all of your front-ends,
otherwise the front-end will not compile! I could not figure out how this could be done without this
requirement, since you can define a global variable only once.
*****
Stefan |
30 Nov 2020, Stefan Ritt, Suggestion, ODBSET wildcards with array keys in Sequencer files
|
Hi Konstantin,
we are considering making the range selection uniform among json, sequencer and
odbedit "set" command. Having multiple ranges like [1,4-5] will be quite some work, so
my question is: did you just implement it on the json side because it was easy, or are
there experiments that really need it? Wouldn't it be enough to have
[*]
[n]
[n-m]
This way we always have only one db_set_data() call behind that. Any arbitrary set of indices
would have to be split into several db_set_data() calls, which especially for the front-end
configuration can cause trouble by triggering a hot link on each access.
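A parser restricted to these three forms stays small, because every accepted spec maps to exactly one contiguous range. The following is only a sketch under the assumption of zero-based indices; IndexRange and parse_index_range() are hypothetical names and not part of the midas API.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical result type: one contiguous index range, `last` inclusive.
struct IndexRange {
   bool valid;
   int first;
   int last;
};

// Parse "[*]", "[n]" or "[n-m]" for an array with `array_size` elements.
// Anything else (for example "[1,4-5]") is rejected.
IndexRange parse_index_range(const std::string &spec, int array_size)
{
   IndexRange bad{false, 0, 0};
   if (spec.size() < 3 || spec.front() != '[' || spec.back() != ']' || array_size <= 0)
      return bad;
   std::string body = spec.substr(1, spec.size() - 2);
   if (body == "*")
      return {true, 0, array_size - 1};
   char *end = nullptr;
   long first = std::strtol(body.c_str(), &end, 10);
   if (end == body.c_str())
      return bad;                 // no number found
   long last = first;
   if (*end == '-') {
      const char *second = end + 1;
      last = std::strtol(second, &end, 10);
      if (end == second)
         return bad;              // nothing after the dash
   }
   if (*end != '\0' || first < 0 || last < first || last >= array_size)
      return bad;                 // trailing characters or range out of bounds
   return {true, static_cast<int>(first), static_cast<int>(last)};
}
```

Since the result is always a single range [first, last], the caller can write all affected elements in one operation instead of once per index.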
Stefan |
30 Nov 2020, Stefan Ritt, Info, Equipment "common" settings in ODB
|
One more change:
After using the new code for some hours, we realized that the "enabled" flag should not come from the frontend code,
but always be defined by the ODB. So if you quickly have to disable some equipment because the associated hardware is
off, you want to change this flag only in the ODB and not have to recompile the frontend. So we exclude that flag from
being set by the frontend. It is anyhow special, because one sees all disabled equipment in the main midas status page,
so one knows what's on and what's off.
Please comment here if you think that change causes problems. Anyhow, the enabled flag now works as it did before
all these changes.
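In other words, when the frontend values are copied to the ODB, the "enabled" field is the one exception. A minimal stand-alone sketch of that rule follows; CommonFields is a simplified stand-in and overwrite_keep_enabled() a hypothetical helper, not the mfe.cxx code.

```cpp
// Simplified stand-in for a few "Common" fields (not the real midas structure).
struct CommonFields {
   bool enabled;
   int period_ms;
   int log_history;
};

// Copy the frontend values over the ODB values, but keep "enabled" from the ODB.
CommonFields overwrite_keep_enabled(const CommonFields &odb, const CommonFields &frontend)
{
   CommonFields result = frontend;   // frontend values win ...
   result.enabled = odb.enabled;     // ... except "enabled", which stays as set in the ODB
   return result;
}
```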
Stefan |
01 Dec 2020, Stefan Ritt, Forum, subrun
|
There is no "mechanism" foreseen to be executed after each subrun. But you could
run a shell script after each run which loops over all subruns and converts them
one after the other.
Stefan
> Hi,
>
> I was wondering if there is a "mechanism" to run an executable
> file after each subrun is closed...
>
> I need to convert .mid.lz4 subrun files to ROOT (TTree) files;
>
> Thanks,
> Gennaro |
09 Dec 2020, Stefan Ritt, Forum, history and variables confusion
|
First, the writing of banks is completely independent of the history system. Banks go to the log file only,
while the history is only linked to the "Variables" section in the ODB.
Second, it's advisable to group similar equipment into one. Like if you have five power supplies powering
an experiment, you don't want to have five equipments Supply1, Supply2, ..., but only one equipment
"Power Supplies". In the frontend belonging to that equipment, you define a DEVICE_DRIVER list with
one entry for each power supply. If you interact with an mscb device, there are some helper functions
which simplify the definition of the equipment and which I can send you privately. So your device
driver looks a bit like the one attached.
If you cannot do that and absolutely want separate equipments, please post a complete ODB subtree of your
settings, and I can try to reproduce your problem.
Stefan
======================
DEVICE_DRIVER power_driver[] = {
   {"Power Supply 1", mscbdev, 0, NULL, DF_INPUT | DF_MULTITHREAD},
   {"Power Supply 2", mscbdev, 0, NULL, DF_INPUT | DF_MULTITHREAD},
   {"Power Supply 3", mscbdev, 0, NULL, DF_INPUT | DF_MULTITHREAD},
   {""}
};
...
INT frontend_init()
{
   mscb_define("mscbxxx.psi.ch", "Power Supplies", "Power Supply 1", power_driver, 1, 0, "Output 1", 0.1);
   mscb_define("mscbxxx.psi.ch", "Power Supplies", "Power Supply 1", power_driver, 1, 1, "Output 2", 0.1);
   ...
}
/*-- Function to define MSCB variables in a convenient way ---------*/
void mscb_define(const char *submaster, const char *equipment, const char *devname,
                 DEVICE_DRIVER *driver, int address, unsigned char var_index,
                 const char *name, double threshold)
{
   int i, dev_index, chn_index, chn_total;
   char str[256];
   float f_threshold;
   HNDLE hDB;

   cm_get_experiment_database(&hDB, NULL);

   if (submaster && submaster[0]) {
      sprintf(str, "/Equipment/%s/Settings/Devices/%s/Device", equipment, devname);
      db_set_value(hDB, 0, str, submaster, 32, 1, TID_STRING);
      sprintf(str, "/Equipment/%s/Settings/Devices/%s/Pwd", equipment, devname);
      db_set_value(hDB, 0, str, "meg", 32, 1, TID_STRING);
   }

   /* find device in device driver */
   for (dev_index=0 ; driver[dev_index].name[0] ; dev_index++)
      if (equal_ustring(driver[dev_index].name, devname))
         break;

   if (!driver[dev_index].name[0]) {
      cm_msg(MERROR, "mscb_define", "Device \"%s\" not present in device driver list", devname);
      return;
   }

   /* count total number of channels */
   for (i=chn_total=0 ; i<=dev_index ; i++)
      if (((driver[dev_index].flags & DF_INPUT) > 0 && (driver[i].flags & DF_INPUT)) ||
          ((driver[dev_index].flags & DF_OUTPUT) > 0 && (driver[i].flags & DF_OUTPUT)))
         chn_total += driver[i].channels;

   chn_index = driver[dev_index].channels;

   sprintf(str, "/Equipment/%s/Settings/Devices/%s/MSCB Address", equipment, devname);
   db_set_value_index(hDB, 0, str, &address, sizeof(int), chn_index, TID_INT, TRUE);
   sprintf(str, "/Equipment/%s/Settings/Devices/%s/MSCB Index", equipment, devname);
   db_set_value_index(hDB, 0, str, &var_index, sizeof(char), chn_index, TID_BYTE, TRUE);

   if (threshold != -1 && (driver[dev_index].flags & DF_INPUT) > 0) {
      sprintf(str, "/Equipment/%s/Settings/Update Threshold", equipment);
      f_threshold = (float) threshold;
      db_set_value_index(hDB, 0, str, &f_threshold, sizeof(float), chn_total, TID_FLOAT, TRUE);
   }

   if (name && name[0]) {
      sprintf(str, "/Equipment/%s/Settings/Names %s", equipment, devname);
      db_set_value_index(hDB, 0, str, name, 32, chn_total, TID_STRING, TRUE);
   }

   /* increment number of channels for this driver */
   driver[dev_index].channels++;
} |
16 Dec 2020, Stefan Ritt, Forum, Issues building banks.
|
> This is very hard to do using the mfe.c frontend. (the main reason I wrote the TMFE C++ frontend class).
Actually that's not true. Just look at
midas/examples/mtfe/mtfe.c
This is an example of a frontend with equipment using the EQ_USER flag, which allows you to easily run a separate
thread (or more) for event collection and processing. Of course it is all old-fashioned C style (the code is from 2007), but it
works.
Stefan |
18 Dec 2020, Stefan Ritt, Suggestion, Code formatting
|
May I ask for your quick opinion on code formatting. MIDAS had a coding style
which pretty much followed the ROOT coding style described at
https://root.cern/contribute/coding_conventions/
so we followed the "3 spaces indent" convention, braces according to Kernighan &
Ritchie and a few other things. I see however that code written by different
people is still formatted differently, like spaces before and after comparators
etc. I wonder if it would make sense to keep a consistent code formatting through
the whole midas repository.
Looking again at what the ROOT guys do (see link above), they have a ClangFormat
file, which I attached to this post. Putting this file into the root of midas
ensures that all files are formatted in exactly the same way, which would increase
readability considerably.
The nice thing about ClangFormat is that it can be integrated into my editor (CLion) as
well as into emacs and vim:
https://clang.llvm.org/docs/ClangFormat.html
This would also make the emacs settings in our files obsolete:
/* emacs
* Local Variables:
* tab-width: 8
* c-basic-offset: 3
* indent-tabs-mode: nil
* End:
*/
I don't like these because they are only for people using emacs. If everybody
put statements for their favourite editor into the files, all our source files
would be cluttered quite a bit.
So the question is now which style to use? I attached different trials with a simple
file from the distribution, so you can see the differences. They use the style from
- LLVM
- ROOT
- GNU
- Google
I consciously skipped the "Microsoft" style ;-)
Which one should we settle on? Any opinion? If I don't hear anything, I will pick a
style at the end of this year 2020. I slightly favour the ROOT style, although
I don't like that the "case" is not indented there under the opening brace of the
switch statement, which seems inconsistent to me. The only one doing that right is the
Google format, but that one has an indentation of 2 chars instead of our usual 3 chars.
At the end of the day I think it's not so important on which style we agree, as long
as we DO have a common style for all midas files.
Best,
Stefan |
04 Jan 2021, Stefan Ritt, Suggestion, Code formatting
|
After pondering over the holidays, I decided to use the widely used LLVM code formatting,
just adapted slightly for 3 spaces and "case" indentation in a "switch" statement. This
formatting is now very close to our original one. Nevertheless, I did not reformat all
existing code, since that would screw up the git repository, and you could then no longer see
who wrote which line of code. But having the .clang-format file now in the midas root, all
NEW files will follow that standard.
The CLion editor automatically picks up the .clang-format file if you enable ClangFormat
via Preferences -> Code Style -> General -> Enable ClangFormat.
EMACS can also use this file by adding the following lines to your .emacs:
(load "<path-to-clang>/tools/clang-format/clang-format.el")
(global-set-key [C-M-tab] 'clang-format-region)
One problem left is that if you check out midas on a new machine, you might not have your
personal .emacs file there. If there is a way to ship a .emacs with midas, which gets
automatically loaded, I would be happy to put this into the distribution.
Stefan |
06 Jan 2021, Stefan Ritt, Bug Report, Logger: Disk nearly full.
|
The logger simply requests the disk free space level from the operating system in the same
way as the "df" command does. Can you do a "df" on your system? I have seen that some file
systems do not free up space immediately if you delete files, but some time later (like 24h).
Stefan |
06 Jan 2021, Stefan Ritt, Suggestion, Improving variable functionality in Sequencer?
|
I guess you are using the wrong pattern here. There is no need to copy ODB values to local variables,
then change them, then write them back. Rather, you can write values directly to the ODB. We run
all our experiments in that way and we can do what we want. So most of our scripts have sections
like
ODBSUBDIR "/Equipment/Laser/Variables"
ODBSET "Setting[*]", 0, 0
ODBSET "Output[1]", 0, 0
ODBSET "Output[2]", 1, 0
ODBSET "Output[3]", 0, 0
ODBSET "Output[4]", 1, 1
ENDODBSUBDIR
Note that both the path and the indices can contain wild cards, making this pattern more
flexible. Wildcards are however not (yet) supported for local variables, that's why we use
the ODBSET directive directly.
I attach a larger example from the MEG experiment here for your reference.
Stefan |
08 Jan 2021, Stefan Ritt, Forum, history and variables confusion
|
We kind of agreed to rewrite the slow control system in C++. Each device will have its own driver derived from a common base class implementing the general communication. The reason we need a "system" and not only a "hand-written" driver is because we want to:
- glue many device drivers together for a single equipment
- have a dedicated readout thread for every device, in order not to block other devices
- have a common error reporting scheme working with several threads
- be able to disable/enable individual devices without changing the history system each time
- have a common naming scheme for all devices (like "enforce" /Equipment/<name>/Settings/Names xxx) which is needed by the history system
- ...
Will see when we have time for that.
Stefan |
13 Jan 2021, Stefan Ritt, Forum, poll_event() is very slow.
|
Something must be wrong on your side. If you take the example frontend under
midas/examples/experiment/frontend.cxx
and let it run to produce dummy events, you get about 90 Hz. This is because we have a
ss_sleep(10);
in the read_trigger_event() routine to throttle things down. If you remove that sleep,
you get an event rate of about 500'000 Hz. So the framework is really quick.
Probably your routine which looks for a 'lam' takes really long and should be fixed.
Stefan |
25 Jan 2021, Stefan Ritt, Suggestion, mhttpd browser caching
|
Let me first explain a bit why caching is there. Once we had the case that someone from
TRIUMF opened a midas custom page at T2K. It took about one minute (!) to load the page.
When we looked at it, we found that the custom page pulled about 100 items with individual
HTTP requests from Japan, each taking about one second for the roundtrip. Then we redesigned
the custom page communication so that many ODB entries could be retrieved in one operation,
which improved the loading time from 100s to about 2s.
With the buttons we will have to make the same compromise. If we do not cache anything,
loading the midas status page over the Pacific takes many seconds. If we cache all, any
change on the midas side will not be reflected on the web page. So there is a compromise
to be made. I thought I designed it such that the side menu is cached locally, but when
the user presses "reload", then the full menu is fetched from the server. Of course one
has to remember this, so changing the ELOG URL or other things on the menu requires a
reload (or wait a certain time for the cache to expire). So try again if that's working
for you. If not, I can visit it again and check if there is any bug.
If we go the route to disable the cache, better try this with T2K and see what you get before
we commit ourselves to that. Last time TRIUMF people were complaining a lot about long
load times.
Best,
Stefan |
08 Feb 2021, Stefan Ritt, Suggestion, mhttpd browser caching
|
> It seems that the only reliable way to bypass the browser cache is to add
> a tag with a random number to the URL ("&ts=currenttime").
Indeed that's the only reliable way to avoid caching across browsers. An alternative is
("&r=" + Math.random())
to add a random number.
> BTW, things like midas.js are also cached, and it is common to see problems
> after updating midas, where status.html is newly loaded, but midas.js is an old
> stale version from cache.
Reloading a JavaScript file NOT from the cache is really tricky these days. I added a
special Google Chrome extension to clear my browser cache, which works reliably:
https://chrome.google.com/webstore/detail/clear-cache/cppjkneekbjaeellbfkmgnhonkkjfpdn
Stefan |
25 Feb 2021, Stefan Ritt, Bug Report, history reload
|
I have to reproduce the problem. Can you please send me the full link by direct email? As you know, I'm also at PSI.
Stefan |
02 Mar 2021, Stefan Ritt, Info, shortest possible sleep
|
Why do you need that? Periodic equipment typically runs every ten seconds or so, meaning one can do this easily in a scheduler.
For polled equipment, you don't want to sleep at all. Because if you sleep, you might miss an event. That's why I put my poll in mfe.c into a for() loop. No
sleep, maximum polling rate. I just double checked on my MacBook Air.
- If poll is always false (no event available), the loop executes 50M times in 100ms (calibrated during startup of the frontend). That means one iteration
takes 2ns (!). So if an event occurs, the readout is started with a 2ns overhead. No sleep can beat that. In a real world application, one has to add of course
the VME access or so to poll for the event.
- If poll is always true, the framework generates about 700k events each second (returning just a few bytes of event data).
So if one adds any sleep here, things can only get worse, so I don't see the point for that. Of course polling eats one core at 100%, but these days every
CPU has more than one, even my 800 MHz Xilinx embedded ARM CPU (Zynq).
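To make the calibration and the 2ns figure concrete, here is a small stand-alone sketch. It shows one possible way to calibrate (doubling the loop count until the loop runs long enough), which is not necessarily how mfe.c does it; PollFn, calibrate_poll_count() and ns_per_iteration() are hypothetical names.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical poll function: returns true if an event is available.
using PollFn = bool (*)();

// Find a loop count for which `count` consecutive polls take at least
// `target_ms` milliseconds, by doubling the count until the target is reached.
std::uint64_t calibrate_poll_count(PollFn poll, int target_ms)
{
   using clock = std::chrono::steady_clock;
   std::uint64_t count = 1;
   for (;;) {
      volatile bool sink = false;   // keeps the compiler from removing the loop
      auto t0 = clock::now();
      for (std::uint64_t i = 0; i < count; i++)
         sink = poll();
      auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(clock::now() - t0);
      if (elapsed.count() >= target_ms || count > (1ull << 40))
         return count;              // long enough, or safety limit reached
      count *= 2;
   }
}

// Average time per poll iteration in nanoseconds for `count` iterations
// executed within `window_ms` milliseconds.
double ns_per_iteration(std::uint64_t count, double window_ms)
{
   return window_ms * 1e6 / static_cast<double>(count);
}
```

With the numbers above, ns_per_iteration(50000000, 100.0) gives 100 ms / 50M = 2 ns per iteration.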
Best,
Stefan |
|