01 Feb 2013, Stefan Ritt, Forum, analyzer cannot connect to the statistics database
|
> The simplest thing is probably to delete all files .[A-Z]*.SHM in the odb directory (the
> one you specified in /etc/exptab).
> This wipes the ODB, shared memory and all the other obscure stuff, giving you a clean,
> fresh start.
>
> Of course it wipes all the valuable stuff, too. That's why it's handy to sometimes open
> odbedit and "save odb_<yyyymmdd>.odb". You can reload the thing after such a fatal
> "rm .[A-Z]*.SHM"
Thanks Randolf for helping out, I was not in the office this week.
In addition to deleting the *.SHM files, it is sometimes necessary to delete the shared memory segments themselves. You do this with the
command line tools
ipcs -m
ipcrm -m <shmid>
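ipcs -m lists the existing shared memory segments together with their shmid, owner and size; the segments
belonging to your experiment can then be removed one by one with "ipcrm -m <shmid>". Leftover semaphores can
be listed and removed the same way with "ipcs -s" and "ipcrm -s <semid>".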
/Stefan |
13 May 2022, Konstantin Olchanski, Info, analysis of corner cases in event buffer write cache
|
introduction:
to remember, bm_send_event() writes an event to the write cache, bm_flush_cache()
writes the contents of the write cache into the shared memory event buffer, buffer
free space is consumed. in the usual case, mlogger is reading events from the shared
memory event buffer, buffer free space is released. there is also a read cache, not
part of this discussion.
the purpose of the write cache is to reduce contention for the shared memory
semaphore. in the case of a large number of small events, the semaphore is locked once per
cache flush instead of once per event. correct tuning of write cache and event size can
reduce lock rate from >100 kHz to around 100 Hz or lower.
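for orientation, a minimal frontend-side sketch of this flow (sketch only, not the actual
midas code; the exact signatures of these functions have changed between midas versions,
this follows the older int-based api):

#include "midas.h"

int main()
{
   cm_connect_experiment("", "", "cache_demo", NULL);

   INT hbuf;
   bm_open_buffer("SYSTEM", DEFAULT_BUFFER_SIZE, &hbuf);
   bm_set_cache_size(hbuf, 0, 10*1024*1024);         // no read cache, 10 MB write cache

   char buf[sizeof(EVENT_HEADER) + 1024];
   EVENT_HEADER* pheader = (EVENT_HEADER*)buf;

   for (int i = 0; i < 100000; i++) {
      bm_compose_event(pheader, 1, 0, 1024, i);      // event_id 1, trigger_mask 0, 1 kbyte of data
      // ... fill buf + sizeof(EVENT_HEADER) with 1024 bytes of data ...
      bm_send_event(hbuf, buf, sizeof(buf), BM_WAIT);   // normally lands in the write cache
   }

   bm_flush_cache(hbuf, BM_WAIT);                    // push cached events into the shared memory buffer
   cm_disconnect_experiment();
   return 0;
}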
analysis:
for correct operation of bm_send_event() under all conditions we need to consider
all corner cases:
1) no write cache: (cache size set to 0)
- event_size > buffer_size -> reject the event (obviously)
- event_size > 0.5 * buffer_size -> only 1 event fits into the buffer, next write
will stall until mlogger reads the previous event (sequential operation, bad)
- event_size < 0.3 * buffer_size -> at least 2 events fit into the buffer (good)
decision: limit event size to 0.5 to 0.3 * buffer_size (current limit is 0.5 *
buffer_size, I think).
consequence: buffer size limit is 2 Gbytes (32-bit byte offsets, code is only 31-
bit-clean), max event size is between 1 Gbytes and 0.6 Gbytes.
2) writing to write cache:
- event_size > cache_size -> flush cache, write event directly to buffer
- event_size > 0.5 * cache_size -> inefficient use of cache: write to cache, next
event does not fit, flush to buffer, repeat. no gain in semaphore locking (bad), one
additional memcpy() (event to cache and cache to buffer) (bad)
- event_size < 0.3 * cache_size -> multiple events fit into cache, but probably no
gain in semaphore locking
decision: events that are bigger than 0.3 to 0.1 * cache_size should not go through
the cache. (flush cache, write directly to buffer).
3) flush write cache to buffer:
- cache_size > buffer_size -> cannot flush in 1 operation, must have a loop and
flush the cache in pieces
- cache_size between 0.5 and 1.0 * buffer_size -> can flush in 1 operation, but must
wait for mlogger to fully empty the buffer (sequential operation, bad)
- cache size < 0.3 * buffer_size -> can flush in 1 operation, at least 2 "flushes"
fit inside the buffer (good)
decision: limit write cache size to 0.3 * buffer_size. (current limit is
0.25*buffer_size).
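in code form, the three decisions above amount to something like this (illustrative sketch
only, using the 0.3 thresholds; not the actual midas implementation):

#include <cstddef>

enum class Route { Reject, DirectToBuffer, ViaWriteCache };

Route route_event(size_t event_size, size_t cache_size, size_t buffer_size)
{
   if (event_size > 0.3 * buffer_size)    // (1) too big: would serialize frontend and mlogger
      return Route::Reject;               //     (event_size > buffer_size is rejected in any case)
   if (cache_size == 0 ||                 // no write cache configured
       event_size > 0.3 * cache_size)     // (2) big event: no gain from caching
      return Route::DirectToBuffer;       //     (flush the cache first, then write directly)
   return Route::ViaWriteCache;           // small event: append to the write cache
}
// per (3), cache_size itself is limited to about 0.3 * buffer_size.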
consequences:
- write cache size limit is 0.3..0.25 * 2GB = 0.6..0.5 Gbytes
- cached event size limit is 0.3..0.1 * 0.5 GBytes = 150..50 Mbytes
- minimum number of cached events: 3 to 10
- semaphore locks reduced: 3 to 10 locks become 1 lock (all events cached),
4 to 11 locks become 2 locks (big event causes cache flush).
4) complications:
- there is a periodic 1/second bm_flush_cache() that flushes the cache early and
reduces its efficiency (but needed to avoid having data stuck in the cache for a long
time)
- if multiple frontends use large write cache (~ 0.3..0.5 * buffer_size), again,
sequential operation can happen (bad)
- write cache is per-frontend, not per-equipment. if different equipments request
different cache sizes, mfe.c and tmfe c++ frontends complain about this, but the
user has to sort it out.
K.O. |
16 May 2022, Konstantin Olchanski, Info, analysis of corner cases in event buffer write cache
|
> for correct operation of bm_send_event() under all conditions we need to ...
to continue computation from last message:
default SYSTEM buffer size: 32 MiBytes
default max event size: 4 MiBytes
hard max buffer size: 2 Gbytes (code is only 31-bit-clean)
hard max event size: 2 Gbytes (code is only 31-bit-clean)
max event size currently: 32 Mbytes (same as buffer size)
max event size per (1) in previous post: 32*0.5..0.3 = 16..9 MiBytes
number of default-max-size events buffered: 32/4 = 8.
number of per (1) max-size events buffered: 2 or 3
number of current max-size events buffered: 0 (bad, frontend is serialized with mlogger)
default write cache size: 100 kbytes
max write cache size currently: buffer size / 4 = 32/4 = 8 MiBytes
max write cache size per (3) in previous post: buffer_size / 3 = 10 Mbytes
hard max write cache size per (3): 2 Gbytes/3 = 600 Mbytes
max size of cached events:
current: 100 kbytes (same as cache size)
per (2) in previous post: 0.1..0.3 * cache size = 10..30 kbytes
per (2), 1 Mbyte cache: 0.1..0.3 * cache size = 100..300 kbytes
hard max size: 0.1..0.3 * hard_max_cache_size = 0.1..0.3 * 600 = 60..180 Mbytes.
max data rate before event buffer semaphore locking rate exceeds 100 Hz:
1 kbyte events, no write cache: 100 kbytes/sec
1 kbyte events, 100 kbyte cache: 100 events cached, cache flush rate 100 Hz -> 100*1kbyte*100Hz -> 10 Mbytes/sec
1 kbyte events, 1 Mbyte cache: 1000 events cached, cache flush rate 100 Hz -> 100 Mbytes/sec (1gige ethernet)
N kbyte events, 1 Mbyte cache: same thing (data rate is limited by cache flush rate 100 Hz)
100 kbyte events, 1 Mbyte cache, not cached per (2): 100kbyte*100Hz = 10 Mbytes/sec
300 kbyte events, 1 Mbyte cache, not cached per (2): 300kbyte*100Hz = 30 Mbytes/sec
N00 kbyte events: N0 Mbytes/sec (500->50, etc)
1 kbyte events, 10 Mbyte cache: 10000 events cached, cache flush rate 100 Hz -> 1000 Mbytes/sec (10gige ethernet)
N kbyte events, 10 Mbyte cache: same thing (data rate is limited by cache flush rate 100 Hz)
1000 kbyte events, 10 Mbyte cache, not cached per (2): 1000kbyte*100Hz = 100 Mbytes/sec
3000 kbyte events, 10 Mbyte cache, not cached per (2): 3000kbyte*100Hz = 300 Mbytes/sec
N000 kbyte events: N00 Mbytes/sec (4000->400, 5000->500, etc)
default max event size: 4 Mibytes*100Hz = 400 Mbytes/sec (exceeds 1gige ethernet)
hard max event size (divided by 10 to buffer 10 events): 200 Mbytes*100Hz -> 20 Gbytes/sec
max event rate before event buffer semaphore locking rate exceeds 100 Hz:
1 kbyte events, no write cache: 100 Hz (obviously)
1 kbyte events, 100 kbyte cache: 100 events cached, cache flush rate 100 Hz -> 10 kHz
1 kbyte events, 1 Mbyte cache: 1000 events cached, cache flush rate 100 Hz -> 100 kHz
N kbyte events, 1 Mbyte cache: 1000/N events cached, cache flush rate 100 Hz -> 100/N kHz
1 kbyte events, 10 Mbyte cache: 10000 events cached, cache flush rate 100 Hz -> 1000 kHz
N kbyte events, 10 Mbyte cache: 10000/N events cached, cache flush rate 100 Hz -> 1000/N kHz
100 kbyte events, not cached per (2): 100 Hz (obviously)
300 kbyte events, not cached per (2): 100 Hz (obviously)
default max event size: 100 Hz (obviously)
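to summarize both tables as formulas (back-of-the-envelope estimates, not measurements;
using the 0.1*cache_size end of the threshold from (2)):

double max_data_rate(double event_size, double cache_size, double lock_rate /* ~100 Hz */)
{
   if (cache_size > 0 && event_size < 0.1 * cache_size)
      return cache_size * lock_rate;      // cached: one semaphore lock per cache flush
   return event_size * lock_rate;         // uncached: one semaphore lock per event
}

double max_event_rate(double event_size, double cache_size, double lock_rate)
{
   return max_data_rate(event_size, cache_size, lock_rate) / event_size;
}

// e.g. 1 kbyte events, 1 Mbyte cache, 100 Hz -> 100 Mbytes/sec and 100 kHz, as above.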
K.O. |
16 May 2022, Konstantin Olchanski, Info, analysis of corner cases in event buffer write cache
|
> > for correct operation of bm_send_event() under all conditions we need to ...
> to continue computation from last message:
if I got my numbers right, for present-day hardware (1gige/10gige data rates, 100 Hz max locking rate), we should
increase the default buffer write cache size from 100 kbytes to 10 Mbytes.
this cache size will permit processing of the full mix of small/big events
at the full mix of event rates without exceeding the 100 Hz semaphore locking rate.
with the 10 Mbyte write cache, default event buffer size should be 30-40 Mbytes (current size is 33 Mbytes, so does
not need to change).
this computation is for 1 writer (1 reader, mlogger). it is a typical case for our experiments.
multiple writers can run into contention for event buffer space.
consider 10 writers want to flush their 10 Mbyte write cache all at the same time:
if the buffer size is the default 33 Mbytes, the first 3 writers will flush their write caches successfully,
but the other 7 will stall: there is no space left in the buffer, and they have to wait for mlogger to free
some (mlogger writing X Mbytes/sec will take Y milliseconds to liberate 10 Mbytes of space for the 4th writer
to flush successfully, while writers 5..10 remain stalled).
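(for illustration only: if mlogger drains the buffer at, say, 100 Mbytes/sec, liberating 10 Mbytes for the next
writer takes about 100 ms, so the last of the 7 stalled writers waits on the order of 0.7 seconds)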
but a system with 10 writers each writing at 10 Mbytes/sec (at the 1 Hz default cache flush rate), i.e. 100 Mbytes/sec total,
will likely have a SYSTEM buffer size of at least 200-300 Mbytes (to buffer 1-2 seconds of data against
any delays in writing to disk/network storage).
so there should be no problem in practice.
K.O. |
12 Aug 2020, Yan Liu, Suggestion, adding db_get_mode to check access mode for keys
|
Hello,
I am wondering if there is a function that checks the access mode for a key? I
found the db_set_mode() function that allows me to set the access mode for a key,
but failed to find its counterpart get function.
Thanks in advance,
Yan |
13 Aug 2020, Stefan Ritt, Suggestion, adding db_get_mode to check access mode for keys
|
> Hello,
>
> I am wondering if there is a function that checks the access mode for a key? I
> found the db_set_mode() function that allows me to set the access mode for a key,
> but failed to find its counterpart get function.
>
> Thanks in advance,
> Yan
KEY k;
db_get_key(hDB, handle, &k);
std::cout << k.access_mode << std::endl;
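The access_mode field is a bit mask, so individual permissions can be tested against the
MODE_* flags from midas.h, e.g.:

if (k.access_mode & MODE_WRITE)
   std::cout << "key is writable" << std::endl;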
/Stefan |
13 Aug 2020, Yan Liu, Suggestion, adding db_get_mode to check access mode for keys
|
Thank you!
Yan
> > Hello,
> >
> > I am wondering if there is a function that checks the access mode for a key? I
> > found the db_set_mode() function that allows me to set the access mode for a key,
> > but failed to find its counterpart get function.
> >
> > Thanks in advance,
> > Yan
>
>
> KEY k;
> db_get_key(hDB, handle, &k);
> std::cout << k.access_mode << std::endl;
>
> /Stefan |
22 Jun 2012, Zisis Papandreou, Info, adding 2nd ADC and TDC to crate  
|
Hi folks:
we've been running midas-1.9.5 for a few years here at Regina. We are now
working on a larger cosmic ray testing that requires a second ADC and second TDC
module in our Camac crate (we use the hytek1331 controller by the way). We're
baffled as to how to set this up properly. Specifically we have tried:
frontend.c
/* number of channels */
#define N_ADC 12
(changed this from the old '8' to '12', and it seems to work for Lecroy 2249)
#define SLOT_ADC0 10
#define SLOT_TDC0 9
#define SLOT_ADC1 15
#define SLOT_TDC1 14
Is this the way to define the additional slots (by adding 0, 1 indices)?
Also, we were not able to get a new bank (ADC1) working, so we used a loop to
tag the second ADC values onto those of the first.
If someone has an example of how to handle multiple ADCs and TDCs and
suggestions as to where changes need to be made (header files, analyser, etc)
this would be great.
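For concreteness, here is roughly what we are trying to do in the read routine (untested
sketch, assuming the standard mcstd cam16i() call and the midas bank functions; CRATE is
our crate number define):

WORD *pdata;

bk_init(pevent);

bk_create(pevent, "ADC0", TID_WORD, (void **)&pdata);
for (int a = 0; a < N_ADC; a++)
   cam16i(CRATE, SLOT_ADC0, a, 0, pdata++);   /* F=0 read, subaddress a */
bk_close(pevent, pdata);

bk_create(pevent, "ADC1", TID_WORD, (void **)&pdata);
for (int a = 0; a < N_ADC; a++)
   cam16i(CRATE, SLOT_ADC1, a, 0, pdata++);
bk_close(pevent, pdata);

(presumably the new ADC1 bank also has to be declared in the analyzer's bank list)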
Thanks, Zisis...
P.S. I am attaching the relevant files. |
30 Apr 2022, Konstantin Olchanski, Info, added web pages for "show odb clients" and "show open records"
|
for a long time, midas web pages have been missing the equivalent of odbedit
"scl" and "sor" to display current odb clients and current odb open records.
this is now added as buttons "show open records" and "show odb clients" in the
odb editor page.
as in odbedit, "sor" shows open records under the current subtree, i.e. if you
are looking at /equipment, you will not see open records for /experiment. to see
all open records, go to "/".
commit b1ab7e67ecf785744fff092708d8389f222b14a4
K.O. |
04 May 2022, Stefan Ritt, Info, added web pages for "show odb clients" and "show open records"
|
Concerning the "scl" page, we are currently having a discussion. At the moment, one can
see midas clients in three different places:
1) the main status page at the bottom, only names and hosts are there
2) the programs page, where one can also start/stop program
3) now the new page "Show ODB clients" in the ODB editor page, which shows also the
alive status, PID and timeout
I'm thinking that three locations are two too many, so we are considering merging the
three pages into one. That would mean that 1) goes away, and the "Programs" page will
show more information. We have some rare cases where programs are removed from
/System/Clients in the ODB but are still attached to the ODB. For those "zombies" we would
add a "hard kill" function.
I would like to hear feedback from the midas community before we proceed with the
plans. Anybody desperately in need of the programs shown on the status page?
Best,
Stefan |
01 May 2022, Konstantin Olchanski, Info, added web page for "mdump"
|
added JSON RPC for bm_receive_event() and added a web page for "mdump".
the event dump is a hex dump for now.
if somebody can contribute a javascript decoder for midas bank format, it would be greatly appreciated.
otherwise, I will eventually write my own decoder library patterned on midasio.h and midasio.cxx.
as of commit 5882d55d1f5bbbdb0d9238ada639e63ac27d8825
K.O. |
01 May 2022, Konstantin Olchanski, Info, added web page for "mdump"
|
> added JSON RPC for bm_receive_event()
there is a number of problems with implementing bm_receive_event() as a RPC:
1) mhttpd has only 1 event buffer read pointer for all javascript connections; if two browser tabs are
running mdump, they will "steal" events from each other.
2) javascript connections are state-less and we cannot specify per-connection event_id and trigger_mask
filters to bm_receive_event(). our bm_request_event() has to be for all event_id and all trigger_mask.
3) for same reason, we cannot have some requests to be GET_ALL, some to be GET_RECENT and some to be
GET_OLD (if GET_OLD is ever implemented).
Problem (1) is hard to fix. The only solution I can see is for mhttpd to have its own event buffer that can
somehow track which events have been sent to which javascript connection.
The same scheme allows implementing GET_ALL and per-connection event_id and trigger_mask filters.
The difficulty is in detecting javascript connections that are no longer active, so that their event request and the
events we have buffered for them can be deleted. Unlike proper rpc clients, javascript browser tabs can be
closed without warning and without an opportunity to tell the rpc server that they are closed, gone.
K.O. |
01 May 2022, Konstantin Olchanski, Info, added web page for "mdump"
|
> added a web page for "mdump".
missing functions:
- get a list of existing event buffers (should read event buffer names from /Experiment/Buffer sizes)
- selector box to select event buffer
- button for "get next" and "get new" (should call bm_skip_event() before bm_receive_event())
- entry fields for event_id and trigger_mask event filter
- check box for "keep getting new data" and entry field for update frequency
- (eventually) entry field for bank name filter
K.O. |
02 May 2022, Stefan Ritt, Info, added web page for "mdump"
|
Here are some of my thoughts:
- I volunteer to write the JavaScript midas bank decoder. Just a couple of pure javascript functions, no
midasio.cxx library needed.
- If different javascript connections "steal" events from each other, I would not be concerned. Actually I
would rather like that all connections see the SAME event. So mhttpd keeps one event, serves it to all
links, so displays are consistent. If a browser wants to see the "next" event, it sends the old serial
number and says "please send next event AFTER serial number". If the serial number is larger than that of the
event in the buffer, mhttpd fetches a new event and puts it into its buffer.
- Since javascript connections are connectionless, I would rather pass event_id and trigger_mask with each
request. Then mhttpd can retrieve events until event_id and trigger_mask match, then serve that event.
Since reading events from a midas buffer is fast (many 10'000s of events per second), there won't be much of
a delay.
- GET_ALL does not make sense for browsers, you don't want to slow down any frontend. If someone wants to
do histogramming in the browser, then GET_SOME (which is kind of GET_OLD) would make sense, but in most
cases we have some single event display, and there GET_RECENT is most appropriate. |
14 Feb 2024, Konstantin Olchanski, Bug Fix, added ubuntu-22 to nightly build on bitbucket, now need python!
|
> Are we running these tests as part of the nightly build on bitbucket? They would be part of
> the "make test" target. Correct python dependancies may need to be added to the bitbucket OS
> image in bitbucket-pipelines.yml. (This is a PITA to get right).
I added ubuntu-22 to the nightly builds.
but I notice the build says "no python" and I am not sure what packages I need to install for
midas python to work.
Ben, can you help me with this?
https://bitbucket.org/tmidas/midas/pipelines/results/1106/steps/%7B9ef2cf97-bd9f-4fd3-9ca2-9c6aa5e20828%7D
K.O. |
24 May 2024, Konstantin Olchanski, Info, added ubuntu 22 arm64 cross-compilation
|
Ubuntu 22 has almost everything necessary to cross-build arm64 MIDAS frontends:
# apt install g++-12-aarch64-linux-gnu gcc-12-aarch64-linux-gnu-base libstdc++-12-dev-arm64-cross
$ aarch64-linux-gnu-gcc-12 -o ttcp.aarch64 ttcp.c -static
to cross-build MIDAS:
make arm64_remoteonly -j
run programs from $MIDASSYS/linux-arm64-remoteonly/bin
link frontends to libraries in $MIDASSYS/linux-arm64-remoteonly/lib
Ubuntu 22 does not provide an arm64 libz.a; as a workaround, I build a fake one (we do not have HAVE_ZLIB anymore...), or you
can link to libz.a from your arm64 linux image, assuming include/zlib.h is compatible.
K.O. |
11 Oct 2017, Konstantin Olchanski, Info, added support for ucLinux
|
Support for building for ucLinux was added to MIDAS. I use the emcraft toolchain and userland on
some kind of embedded ARM CPU that does not have an MMU. See the Makefile for details. The
main difference of ucLinux is lack of fork(), which cannot be done without an MMU. Not everything
works, but at the least I can run a frontend and connect to an experiment on a remote host
computer (mserver connection). K.O. |
02 May 2017, Konstantin Olchanski, Info, added db_resize_string()
|
> Since we have been regularly running into problems with db_get_xxx(TID_STRING) and string buffers of mismatched size,
> I now implemented db_get_value_string(hdb, hkey, key_name, index, &string, create).
I run into problems with string arrays - non-array strings have unlimited length, but string arrays have fixed string length, usually set at creation time.
This causes a problem when growing arrays using db_get_value_string(): when converting a non-array variable to an array, the wrong
string length gets used, and one ends up with an array with a useless string length. There is no way to specify the correct array string length
without adding more parameters to db_get_value_string() and confusing and complicating it for the typical case where it is used
against simple (non-array) odb entries.
To clarify the situation, db_get_value_string() was changed to reject attempts to resize an array and
calls of db_get_value_string(index>0 and create==TRUE) now return an error.
To create and resize string arrays, I added a new function - db_resize_string(hdb, hkey, key_name, num_values, max_string_size).
Here,
num_values is the new array size, making it possible to grow or shrink an array
max_string_size is the new string size, making it possible to change the array string length after the array was created (there was no midas function to do this before now).
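A typical call would look like this (sketch; the odb path is just an example, hDB is the main odb handle):

// grow the array to 16 entries of 32 characters each
db_resize_string(hDB, 0, "/Experiment/Some Names", 16, 32);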
I added a json-rpc call for db_resize_string().
But it still needs to be added to odbedit and mhttpd.
K.O. |
26 Apr 2017, Konstantin Olchanski, Info, added db_get_value_string()
|
Since we have been regularly running into problems with db_get_xxx(TID_STRING) and string buffers of mismatched size,
I now implemented db_get_value_string(hdb, hkey, key_name, index, &string, create).
It works the same as db_get_value(TID_STRING), except that the string value is returned into an std::string object,
memory allocation is handled by std::string and there is no string length limit (other than std::string limits).
Accessing string arrays is done explicitly via an "index" parameter; if index is bigger than the odb array size, DB_OUT_OF_RANGE is returned
without logging an error message (e.g. db_get_data_index() will log an error). This makes it safe to iterate over array entries with a simple
loop of index from 0 and up until db_get returns an error.
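For example, to iterate over all entries of a string array (sketch; the odb path is just an example):

HNDLE hDB;
cm_get_experiment_database(&hDB, NULL);

std::string s;
for (int index = 0; ; index++) {
   int status = db_get_value_string(hDB, 0, "/Experiment/Names", index, &s, FALSE);
   if (status != DB_SUCCESS)
      break;                  // DB_OUT_OF_RANGE past the last array element, no error logged
   std::cout << index << ": " << s << std::endl;
}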
As before, if the odb entry does not exist, it will be created (if create==true) and initialized with the value of the string parameter (zero-terminated in odb).
There is also newly added db_set_value_string() and cm_get_path_string(). if you want more of these, please ask, or send patches.
K.O. |
26 Apr 2017, Stefan Ritt, Info, added db_get_value_string()
|
Just some thought for discussion:
Rather than "spicing up" the MIDAS library here and there with C++ objects such as std::string, wouldn't it make more sense to "cleanly" wrap an ODB value in a C++ class? We could use then
both APIs in parallel, and encourage the C++ API for new developments. We could then write things like:
ODBKEY<std::string> name("/Experiment/Name"); // constructor calls automatically db_get_value
name = "New Name"; // overloading the "=" operator, will call db_set_value()
or even
ODBKEY<std::vector, std::string> nameArray("...");
for (auto &s : nameArray)
std::cout << s << std::endl; // print all elements of string array
so we treat ODB arrays as vectors, which fixes array boundary violations nicely.
If the key does not exist, we could properly throw exceptions and forget about tons of nested return parameters for error conditions.
Many nice things could be done, common errors could be prevented, and we can do a "smooth" migration: We don't have to change the whole library completely, just where we feel it's currently
needed. So over time the code would be "objectified". Would be nice if we could rely on C++11 (like the "auto" feature above). Not sure about VxWorks, but every other OS should be fine.
Stefan
> Since we have been regularly running into problems with db_get_xxx(TID_STRING) and string buffers of mismatched size,
> I now implemented db_get_value_string(hdb, hkey, key_name, index, &string, create).
>
> It works the same as db_get_value(TID_STRING), except that the string value is returned into an std::string object,
> memory allocation is handled by std::string and there is no string length limit (other than std::string limits).
>
> Accessing string arrays is done explicitly via an "index" parameter, if index is bigger than odb array size DB_OUT_OF_RANGE is returned
> without logging an error message (e.g. db_get_data_index() will log an error). This makes is safe to iterate over array entries with a simple
> loop of index from 0 and up until db_get returns an error.
>
> As before, if the odb entry does not exist, it will be created (if create==true) and initialized with the value of the string parameter (zero-terminated in odb).
>
> There is also newly added db_set_value_string() and cm_get_path_string(). if you want more of these, please ask, or send patches.
>
> K.O. |
|