ID | Date | Author | Topic | Subject
3019 | 01 Apr 2025 | Konstantin Olchanski | Suggestion | Sequencer ODBSET feature requests |
> ODBSET "/Path/value[1,3,5]"
> ODBSET "/Path/value[1-5,7-9]"
we support this array index syntax in several places,
specifically, in javascript odb get and set mjsonrpc RPCs.
> SET GOODCHANNELS, "1-5,7,9"; ODBSET "/Path/value[$GOODCHANNELS]"
> SET BADCHANNELS, "6,8"; ODBSET "/Path/value[!$BADCHANNELS]"
> ODBSET "/Path/value[0-100, except $BADCHANNELS]"
this is very clever syntax, but I have not seen any programming
language actually implement it (not even perl).
there must be a good reason why nobody does this. probably we should not do it either.
but as Stefan said (and this is my opinion too), the route of extending the MIDAS sequencer
language until it becomes a superset of python, perl, tcl, bash, javascript
and algol is not a sustainable approach. I once looked at using LUA for this,
but I think basing off a full-featured programming language like python
is better.
K.O. |
3020 | 01 Apr 2025 | Konstantin Olchanski | Bug Report | MIDAS history system not using the event timestamps ? |
> I confirm that when writing out history files corresponding to the slow control event data,
> MIDAS history system timestamps the data not with the event time coming from the event data,
> but with the current time determined by [mlogger].
This is correct. The timestamp in the history file is the mlogger timestamp.
In theory we could use the ODB "last_written" timestamp, but in practice,
timestamps have 1-second granularity and the difference between the two
timestamps would normally be less than 1 second (the time it takes to react to db_watch()).
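For reference, a minimal sketch of reading that "last_written" timestamp (the hDB/hKey handles are illustrative assumptions, taken from cm_get_experiment_database() and db_find_key() on the slow-control variable of interest):

   KEY key;
   if (db_get_key(hDB, hKey, &key) == DB_SUCCESS)
      printf("last_written = %u (UNIX time, 1 s granularity)\n", (unsigned) key.last_written);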
But ODB last_written also is not the data timestamp. For remote connected clients
it includes the mserver communication delay.
What the data timestamp is, only the user knows - for some FPGA-based equipments,
I can see the data timestamp being read from an FPGA register together with the data.
But back to earth.
For making history plots, 1-second granularity with a small (a few seconds) delay should be okay,
and I think the mlogger timestamp is good enough.
For data analysis, you are reading history data from a history data file and you are
not constrained to using the MIDAS timestamp.
You can always include your "true" data timestamp as the first value in your data.
We do this in felaview for writing labview data to midas history in the ALPHA antihydrogen experiment at CERN.
This also anticipates your next request, "can we have millisecond, microsecond, nanosecond history timestamps?":
since you define your "true" data timestamp, you can make it anything you want. (I use "double" time in seconds;
a 64-bit IEEE-754 "double" has enough precision for microsecond granularity. FPGA based devices can have timestamps
with 10 ns or 8 ns granularity, in which case a uint64_t clock counter could be more appropriate).
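A minimal sketch of what this can look like in a frontend (the ODB path "/Equipment/labview/Variables/DATA" and the two readings are illustrative assumptions, not felaview code):

   #include <sys/time.h>
   #include "midas.h"

   // Store the "true" data timestamp as element [0] of the history variable,
   // as a double in seconds (enough precision for microsecond granularity).
   void write_with_timestamp(HNDLE hDB, double reading1, double reading2)
   {
      struct timeval tv;
      gettimeofday(&tv, NULL);

      double v[3];
      v[0] = tv.tv_sec + tv.tv_usec * 1e-6; // true data timestamp
      v[1] = reading1;                      // actual slow-control readings
      v[2] = reading2;

      db_set_value(hDB, 0, "/Equipment/labview/Variables/DATA",
                   v, sizeof(v), 3, TID_DOUBLE);
   }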
K.O. |
3023 | 02 Apr 2025 | Konstantin Olchanski | Bug Report | MIDAS history system not using the event timestamps ? |
> > You can always include your "true" data timestamp as the first value in your data.
>
> Are you saying that if the first data word of a history event were a timestamp,
> the MIDAS history system, when plotting the time dependencies, would use that timestamp
> instead of the mlogger timestamp?
>
you are correct, midas knows nothing about what you put in the history data.
what I suggested is: if you want your true data timestamp recorded in the history,
you can put it into the history data yourself, and I suggested using the 1st value,
but you can also make it the last value or the 10th value, it is up to you.
for making history plots, the history timestamp is used, as you wrote and I confirmed,
this timestamp is generated by mlogger.
what is not clear to me is why this is a problem. do you see a big difference between the
true data timestamp and the mlogger data timestamp? bigger than 1 second? (this would change
the shape of "last 10 minutes" plots (600 seconds)). bigger than 1 minute? (this would change
the shape of "last 1 hour" plots (60 minutes, 3600 seconds)).
that said, note that we currently store the timestamp as a DWORD 32-bit UNIX time value,
which will overflow in 2038 and which is quickly becoming incompatible with the ongoing
switch to 64-bit time_t. Ubuntu-24 already builds a large number of system libraries with 64-
bit time_t, and building MIDAS with 32-bit time_t may soon become as difficult as building
32-bit MIDAS for 32-bit i686 VME processors. we have to move with the times.
what it means is that the history system data format will have to be updated to 64-bit
time_t and at the same time, we may try to change the timestamp from mlogger-generated to
frontend-generated.
but it is still not clear to me how that helps you, because the frontend-generated timestamp
is not the true data timestamp that you wanted. (and only you know what the true data
timestamp is and where it comes from and how to tell it to MIDAS).
K.O. |
3024 | 02 Apr 2025 | Konstantin Olchanski | Suggestion | Sequencer ODBSET feature requests |
> I once looked at using LUA for this
>
> > but I think basing off an full featured programming language like python
> > is better.
>
> if it came to a vote, my vote would go to Lua: it would allow to do everything needed,
> with many fewer external dependencies and with much less motivation to over-use the interpreter.
> The CMS experience was very instructive in this respect...
Unfortunately I am only slightly aware of Lua, too little to say how nice or how bad it is. And we are
not sure how well it supports the single-line stepping that permits the nice graphical
visualization of Stefan's sequencer.
It looks like python has single-line stepping built in as a standard feature,
and python is a more popular and more versatile language, so to me python looks
like a better choice compared to lua (obscure), perl ("nobody uses it anymore")
or bash (ugly syntax).
K.O. |
3034 | 05 May 2025 | Konstantin Olchanski | Bug Fix | Bug fix in SQL history |
A bug was introduced to the SQL history in 2022 that made renaming of variable names not work. This is now fixed.
break commit:
54bbc9ed5d65d8409e8c9fe60b024e99c9f34a85
fix commit:
159d8d3912c8c92da7d6d674321c8a26b7ba68d4
P.S.
This problem was caused by an unfortunate design of the c++ class system. If I want to add more data to an existing
class, I write this:
class old_class {
   int i,j,k;
};

class bigger_class: public old_class {
   int additional_variable;
};
But if I have this:
struct x { int i,j; };

class y {
   std::vector<x> array_of_x;
};

and I want to add "k" to "x", c++ has no way to do this. The history code has this workaround:
class bigger_y: public y
{
   std::vector<int> array_of_k;
};

int bigger_y::foo(int n) {
   printf("%d %d %d\n", array_of_x[n].i, array_of_x[n].j, array_of_k[n]);
}
problem is that it is not obvious that "array_of_x" and "array_of_k" are connected
and they can easily get out of sync (if elements are added or removed). this is the
bug that happened in the history code. I now added assert(array_of_x.size()==array_of_k.size())
to offer at least some protection going forward.
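A self-contained sketch of that protection (illustrative names, not the actual history_schema.cxx code):

   #include <cassert>
   #include <cstdio>
   #include <vector>

   struct x { int i, j; };

   struct bigger_y {
      std::vector<x>   array_of_x;
      std::vector<int> array_of_k; // "k" lives in a parallel vector

      void print(size_t n) {
         // the two parallel vectors must never get out of sync
         assert(array_of_x.size() == array_of_k.size());
         printf("%d %d %d\n", array_of_x[n].i, array_of_x[n].j, array_of_k[n]);
      }
   };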
P.S. As a final solution I think I want to completely separate the file history and sql history code;
they have more things different than in common.
K.O. |
3035 | 05 May 2025 | Konstantin Olchanski | Info | db_delete_key(TRUE) |
I was working on an odb corruption crash inside db_delete_key() and I noticed
that I did not test db_delete_key() with follow_links set to TRUE. Then I noticed
that nobody anywhere seems to use db_delete_key() with follow_links set to TRUE.
Instead of testing it, can I just remove it?
This feature has existed since day 1 (1st commit) and it does something unexpected
compared to filesystem "/bin/rm": as best I can tell, it removes the link
*and* whatever the link points to. For people familiar with "/bin/rm", this is
somewhat unexpected and by my thinking, if nobody ever added such a feature to
"/bin/rm", it is probably not considered generally useful or desirable. (I would
think it dangerous, it removes not 1 but 2 files, the 2nd file would be in some
other directory far away from where we are).
By this thinking, I should remove "follow_links" (actually just make it do nothing,
to reduce the disturbance to other source code). db_delete_key() should work
similarly to /bin/rm, aka the unlink() syscall.
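A minimal sketch of the /bin/rm-like behaviour (the link path "/Alias/Settings" is an illustrative assumption):

   // With follow_links == FALSE, db_delete_key() behaves like unlink():
   // only the link itself is removed, the key it points to stays in ODB.
   HNDLE hLink;
   if (db_find_link(hDB, 0, "/Alias/Settings", &hLink) == DB_SUCCESS)
      db_delete_key(hDB, hLink, FALSE);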
K.O. |
3036 | 05 May 2025 | Konstantin Olchanski | Bug Report | abort and core dump in cm_disconnect_experiment() |
I noticed that for some programs like mhist, if they take too long, there is an abort and core dump at the very end. This is because they forgot to
set/disable the watchdog timeout, and they got removed from odb and from the SYSMSG event buffer.
mhist is easy to fix, just add the missing call to disable the watchdog, but I also see a similar crash in the mserver which of course requires
the watchdog.
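A minimal sketch of the mhist-style fix (an assumption on my part that disabling the watchdog right after connecting is acceptable for a long-running utility):

   cm_connect_experiment("", "", "mhist", NULL);
   cm_set_watchdog_params(FALSE, 0); // do not time this client out of ODB/SYSMSG
   // ... long-running history extraction ...
   cm_disconnect_experiment();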
In either case, the crash is in cm_disconnect_experiment() where we know we are shutting down and we know there is no useful information in the
core dump.
I think I will fix it by adding a flag to bm_close_buffer() to bypass/avoid the crash from "we are already removed from this buffer".
Stack trace from mhist:
[mhist,ERROR] [midas.cxx:5977:bm_validate_client_index,ERROR] My client index 6 in buffer 'SYSMSG' is invalid: client name '', pid 0 should be my pid 3113263
[mhist,ERROR] [midas.cxx:5980:bm_validate_client_index,ERROR] Maybe this client was removed by a timeout. See midas.log. Cannot continue, aborting...
bm_validate_client_index: My client index 6 in buffer 'SYSMSG' is invalid: client name '', pid 0 should be my pid 3113263
bm_validate_client_index: Maybe this client was removed by a timeout. See midas.log. Cannot continue, aborting...
Program received signal SIGABRT, Aborted.
Download failed: Invalid argument. Continuing without source file ./nptl/./nptl/pthread_kill.c.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
warning: 44 ./nptl/pthread_kill.c: No such file or directory
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffff71df27e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffff71c28ff in __GI_abort () at ./stdlib/abort.c:79
#5 0x00005555555768b4 in bm_validate_client_index_locked (pbuf_guard=...) at /home/olchansk/git/midas/src/midas.cxx:5993
#6 0x000055555557ed7a in bm_get_my_client_locked (pbuf_guard=...) at /home/olchansk/git/midas/src/midas.cxx:6000
#7 bm_close_buffer (buffer_handle=1) at /home/olchansk/git/midas/src/midas.cxx:7162
#8 0x000055555557f101 in cm_msg_close_buffer () at /home/olchansk/git/midas/src/midas.cxx:490
#9 0x000055555558506b in cm_disconnect_experiment () at /home/olchansk/git/midas/src/midas.cxx:2904
#10 0x000055555556d2ad in main (argc=<optimized out>, argv=<optimized out>) at /home/olchansk/git/midas/progs/mhist.cxx:882
(gdb)
Stack trace from mserver:
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=138048230684480) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=138048230684480) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=138048230684480) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=138048230684480, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007d8ddbc4e476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007d8ddbc347f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000059beb439dab0 in bm_validate_client_index_locked (pbuf_guard=...) at /home/dsdaqdev/packages_common/midas/src/midas.cxx:5993
#6 0x000059beb43a859c in bm_get_my_client_locked (pbuf_guard=...) at /home/dsdaqdev/packages_common/midas/src/midas.cxx:6000
#7 bm_close_buffer (buffer_handle=<optimized out>) at /home/dsdaqdev/packages_common/midas/src/midas.cxx:7162
#8 0x000059beb43a89af in bm_close_all_buffers () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:7256
#9 bm_close_all_buffers () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:7243
#10 0x000059beb43afa20 in cm_disconnect_experiment () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:2905
#11 0x000059beb43afdd8 in rpc_check_channels () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:16317
#12 0x000059beb43b0cf5 in rpc_server_loop () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:15858
#13 0x000059beb4390982 in main (argc=9, argv=0x7ffc07e5bed8) at /home/dsdaqdev/packages_common/midas/progs/mserver.cxx:387
K.O. |
3040 | 16 May 2025 | Konstantin Olchanski | Bug Report | history_schema.cxx fails to build |
> we have a CI setup which fails since 06.05.2025 to build the history_schema.cxx.
> There was a major change in this code in the commits fe7f6a6 and 159d8d3.
Missing from this report is critical information: HAVE_PGSQL is set.
I will have to check why it is not set in my development account.
I will have to check why it is not set in our bitbucket build.
Thank you for reporting this problem.
K.O. |
3041 | 16 May 2025 | Konstantin Olchanski | Bug Report | history_schema.cxx fails to build |
> > we have a CI setup which fails since 06.05.2025 to build the history_schema.cxx.
> > There was a major change in this code in the commits fe7f6a6 and 159d8d3.
>
> Missing from this report is critical information: HAVE_PGSQL is set.
>
> I will have to check why it is not set in my development account.
>
The following packages are needed to build MySQL and PgSQL support in MIDAS;
they were missing on my development machine. MySQL support was enabled
by accident because kde-bloat packages pull in the MySQL (not the MariaDB)
client and server. Fixed now, added to the standard list of Ubuntu packages:
https://daq00.triumf.ca/DaqWiki/index.php/Ubuntu#install_missing_packages
apt -y install mariadb-client libmariadb-dev ### mysql client for MIDAS
apt -y install postgresql-common libpq-dev ### postgresql client for MIDAS
>
> I will have to check why it is not set in our bitbucket build.
>
Added MySQL and PgSQL to bitbucket Ubuntu-24 build (sqlite was already enabled).
>
> Thank you for reporting this problem.
>
Fix committed. Sorry about this problem.
K.O. |
3050 | 04 Jun 2025 | Konstantin Olchanski | Bug Report | Memory leak in mhttpd binary RPC code |
Noted. I will look at this asap. K.O.
[quote="Mark Grimes"]Hi,
During an evening of running we noticed that memory usage of mhttpd grew to
close to 100Gb. We think we've traced this to the following issue when making
RPC calls.
[LIST]
[*] The brpc method allocates memory for the response at
[URL=https://bitbucket.org/tmidas/midas/src/67db8627b9ae381e5e28800dfc4c350c5bd05e3f/src/mjsonrpc.cxx#lines-3449]src/mjsonrpc.cxx#lines-3449[/URL].
[*] It then makes the call at
[URL=https://bitbucket.org/tmidas/midas/src/67db8627b9ae381e5e28800dfc4c350c5bd05e3f/src/mjsonrpc.cxx#lines-3460]src/mjsonrpc.cxx#lines-3460[/URL], which may
set `buf_length` to zero if the response was empty.
[*] It then uses `MJsonNode::MakeArrayBuffer` to pass ownership of the memory to
an `MJsonNode`, providing `buf_length` as the size.
[*] When the `MJsonNode` is destructed at
[URL=https://bitbucket.org/tmidas/mjson/src/9d01b3f72722bbf7bcec32ae218fcc0825cc9e7f/mjson.cxx#lines-657]mjson.cxx#lines-657[/URL], it only calls `free` on the
buffer if the size is greater than zero.
[/LIST]
Hence, mhttpd will leak at least 1024 bytes for every binary RPC call that
returns an empty response.
I tried to submit a pull request to fix this but I don't have permission to push
to https://bitbucket.org/tmidas/mjson.git. Could somebody take a look?
Thanks,
Mark.[/quote] |
3055 | 10 Jun 2025 | Konstantin Olchanski | Bug Report | Memory leak in mhttpd binary RPC code |
I confirm that MJSON_ARRAYBUFFER does not work correctly for zero-size buffers,
the buffer is leaked in the destructor and copied as NULL in MJsonNode::Copy().
I also confirm memory leak in mjsonrpc "brpc" error path (already fixed).
Affected by the MJSON_ARRAYBUFFER memory leak are "brpc" (where user code returns
a zero-size data buffer) and "js_read_binary_file" (if reading from an empty
file, return of "new char[0]" is never freed).
"receive_event" and "read_history" RPCs never use zero-size buffers and are not
affected by this bug.
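For reference, a self-contained sketch of the leak pattern (illustrative class and member names, not the actual mjson.cxx code):

   #include <cstdlib>

   struct ArrayBufferNode {
      char  *ptr;
      size_t size;

      // takes ownership of a malloc()'ed buffer, possibly of size zero
      ArrayBufferNode(char *p, size_t s) : ptr(p), size(s) {}

      ~ArrayBufferNode() {
         // buggy version:  if (size > 0) free(ptr);   -> leaks when size == 0
         if (ptr)
            free(ptr); // fixed: free whenever a buffer was attached
      }
   };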
mjson commit c798c1f0a835f6cea3e505a87bbb4a12b701196c
midas commit 576f2216ba2575b8857070ce7397210555f864e5
rootana commit a0d9bb4d8459f1528f0882bced9f2ab778580295
Please post bug reports as plain text so I can quote from them.
K.O. |
1833 | 14 Feb 2020 | Konrad Briggl | Forum | Writing Midas Events via FPGAs |
Hello Stefan,
is there a difference for the later data processing (after writing the ring buffer blocks)
if we write single events or multiple in one rb_get_wp - memcopy - rb_increment_wp cycle?
Both Marius and I have seen some inconsistencies in the number of events produced, as reported on the status page, when writing multiple events in one go,
so I was wondering if this is due to us treating the buffer badly or due to the way midas handles the events after that.
Given that we produce the full event in our (FPGA) domain, an option would be to always copy one event from the dma to the midas-system buffer in a loop.
The question is if there is a difference (for midas) between
[pseudo code, much simplified]
while(dma_read_index < last_dma_write_index) {
   if(rb_get_wp(rbh, (void **)&pdata, 10) != DB_SUCCESS) {
      dma_read_index += event_size;
      continue;
   }
   copy_n(dma_buffer, pdata, event_size);
   rb_increment_wp(rbh, event_size);
   dma_read_index += event_size;
}

and

while(dma_read_index < last_dma_write_index) {
   if(rb_get_wp(rbh, (void **)&pdata, 10) != DB_SUCCESS) {
      ...
   }
   total_size = max_n_events_that_fit_in_rb_block();
   copy_n(dma_buffer, pdata, total_size);
   rb_increment_wp(rbh, total_size);
   dma_read_index += total_size;
}
Cheers,
Konrad
> The rb_xxx functions are (thoroughly tested!) robust against high data rates, given that you use them as intended:
>
> 1) Once you create the ring buffer via rb_create(), specify the maximum event size (overall event size, not bank size!). Later there is no protection any more, so if you obtain pdata from rb_get_wp, you can of course write 4GB to pdata, overwriting everything in your memory, causing a total crash. It's your responsibility to not write more bytes into pdata than
> what you specified as max event size in rb_create()
>
> 2) Once you obtain a write pointer to the ring buffer via rb_get_wp, this function might fail when the receiving side reads data slower than the producing side, simply because the buffer is full. In that case the producing side has to wait until space is freed up in the buffer by the receiving side. If your call to rb_get_wp returns DB_TIMEOUT, it means that the
> function did not obtain enough free space for the next event. In that case you have to wait (like ss_sleep(10)) and try again, until you succeed. Only when rb_get_wp() returns DB_SUCCESS, you are allowed to write into pdata, up to the maximum event size specified in rb_create of course. I don't see this behaviour in your code. You would need something
> like
>
> do {
> status = rb_get_wp(rbh, (void **)&pdata, 10);
> if (status == DB_TIMEOUT)
> ss_sleep(10);
> } while (status == DB_TIMEOUT);
>
> Best,
> Stefan
>
>
> > Dear all,
> >
> > we create Midas events directly inside an FPGA and send them off via DMA into the PC RAM. For reading out this RAM via Midas, the FPGA sends a pointer to where it has written the last 4kB of data. We use this pointer for telling the ring buffer of midas where the new events are. The buffer looks something like:
> >
> > // event 1
> > dma_buf[0] = 0x00000001; // Trigger and Event ID
> > dma_buf[1] = 0x00000001; // Serial number
> > dma_buf[2] = TIME; // time
> > dma_buf[3] = 18*4-4*4; // event size
> > dma_buf[4] = 18*4-6*4; // all bank size
> > dma_buf[5] = 0x11; // flags
> > // bank 0
> > dma_buf[6] = 0x46454230; // bank name
> > dma_buf[7] = 0x6; // bank type TID_DWORD
> > dma_buf[8] = 0x3*4; // data size
> > dma_buf[9] = 0xAFFEAFFE; // data
> > dma_buf[10] = 0xAFFEAFFE; // data
> > dma_buf[11] = 0xAFFEAFFE; // data
> > // bank 1
> > dma_buf[12] = 0x46454231; // bank name
> > dma_buf[13] = 0x6; // bank type TID_DWORD
> > dma_buf[14] = 0x3*4; // data size
> > dma_buf[15] = 0xAFFEAFFE; // data
> > dma_buf[16] = 0xAFFEAFFE; // data
> > dma_buf[17] = 0xAFFEAFFE; // data
> >
> > // event 2
> > .....
> >
> > dma_buf[fpga_pointer] = 0xXXXXXXXX;
> >
> >
> > And we do something like:
> >
> > while(true)
> > // obtain buffer space
> > status = rb_get_wp(rbh, (void **)&pdata, 10);
> > fpga_pointer = fpga.read_last_data_add();
> >
> > wlen = fpga_pointer - last_fpga_pointer; // in 32 bit words
> > copy_n(&dma_buf[last_fpga_pointer], wlen, pdata);
> > rb_status = rb_increment_wp(rbh, wlen * 4); // in bytes
> >
> > last_fpga_pointer = fpga_pointer;
> >
> > Leaving out the case where the dma_buf wraps around, this works fine for a small data rate. But if we increase the rate, the fpga_pointer also increases really fast and wlen gets quite big. Actually it gets bigger than max_event_size, which is checked in rb_increment_wp, leading to an error.
> >
> > The problem now is that the event size is actually not too big; rather, we have multiple events in the buffer which are read by midas in one step. So we think that in this case the function rb_increment_wp is actually comparing the wrong thing. Also, increasing the max_event_size does not help.
> >
> > Remark: dma_buf is volatile so memcpy is not possible here.
> >
> > Cheers,
> > Marius |
357 | 02 Mar 2007 | Kevin Lynch | Forum | event builder scalability |
> Hi there:
> I have a question if there's anybody out there running MIDAS with event builder
> that assembles events from more that just a few front ends (say on the order of
> 0x10 or more)?
> Any experiences with scalability?
>
> Cheers
> Piotr
Mulan (which you hopefully remember with great fondness :-) is currently running
around ten frontends, six of which produce data at any rate. If I'm remembering
correctly, the event builder handles about 30-40MB/s. You could probably ping Tim
Gorringe or his current postdoc Volodya Tishenko (tishenko@pa.uky.edu) if you want
more details. Volodya solved a significant number of throughput related
bottlenecks in the year leading up to our 2006 run. |
1225 | 15 Dec 2016 | Kevin Giovanetti | Bug Report | midas.h error |
creating a frontend on MAC Sierra OSX 10
I include the midas.h file, and when compiling with Xcode I get an error based on
this entry in the midas.h include:
#if !defined(OS_IRIX) && !defined(OS_VMS) && !defined(OS_MSDOS) && !defined(OS_UNIX) && !defined(OS_VXWORKS) && !defined(OS_WINNT)
#error MIDAS cannot be used on this operating system
#endif
Perhaps I should not use Xcode?
Perhaps I won't need Midas.h?
The MIDAS system is running on my MAC but I need to add a very simple front end
for testing and I encountered this error. |
1404 | 30 Oct 2018 | Joseph McKenna | Bug Report | Side panel auto-expands when history page updates |
One can collapse the side panel when looking at history pages with the button in
the top left, great! We want to see many pages, so screen real estate is important.
The issue we face is that when the page refreshes, the side panel expands. Can
we make the panel state more 'sticky'?
Many thanks
Joseph (ALPHA)
Version: 2.1
Revision: Mon Mar 19 18:15:51 2018 -0700 - midas-2017-07-c-197-g61fbcd43-dirty
on branch feature/midas-2017-10 |
1406 | 31 Oct 2018 | Joseph McKenna | Bug Report | Side panel auto-expands when history page updates |
> >
> >
> > One can collapse the side panel when looking at history pages with the button in
> > the top left, great! We want to see many pages so screen real estate is important
> >
> > The issue we face is that when the page refreshes, the side panel expands. Can
> > we make the panel state more 'sticky'?
> >
> > Many thanks
> > Joseph (ALPHA)
> >
> > Version: 2.1
> > Revision: Mon Mar 19 18:15:51 2018 -0700 - midas-2017-07-c-197-g61fbcd43-dirty
> > on branch feature/midas-2017-10
>
> Hi Joseph,
>
> In principle a page refresh should now not be necessary, since pages should reload automatically
> the contents which changes. If a custom page needs a reload, it is not well designed. If necessary, I
> can explain the details.
>
> Anyhow I implemented your "stickyness" of the side panel in the last commit to the develop branch.
>
> Best regards,
> Stefan
Hi Stefan,
I apologise for misusing the word refresh. The re-appearing sidebar was also seen with the automatic
reload. I have implemented your fix here and it now works great!
Thank you very much!
Joseph |
Draft | 14 Oct 2019 | Joseph McKenna | Forum | tmfe.cxx - Future frontend design |
Hi,
I have been looking at the 2019 workshop slides, I am interested in the C++ future of MIDAS.
I am quite interested in using the object oriented
ALPHA will start data taking in 2021 |
1727 | 18 Oct 2019 | Joseph McKenna | Info | sysmon: New system monitor and performance logging frontend added to MIDAS |
I have written a system monitor tool for MIDAS, that has been merged in the develop branch today: sysmon
https://bitbucket.org/tmidas/midas/pull-requests/8/system-monitoring-a-new-frontend-to-log/diff
To use it, simply run the new program
sysmon
on any host that you want to monitor, no configuring required.
The program is a frontend for MIDAS, there is no need for configuration, as upon initialisation it builds a history display for you. Simply run one instance per machine you want to monitor. By default, it only logs once per 10 seconds.
The equipment name is derived from the hostname, so multiple instances can be run across multiple machines without conflict. A new history display will be created for each host.
sysmon uses the /proc pseudo-filesystem, so unfortunately only linux is supported. It does however work with multiple architectures, so x86 and ARM processors are supported.
If the build machine has NVIDIA drivers installed, there is an additional version of sysmon that gets built: sysmon-nvidia. This will log the GPU temperature and usage, as well as CPU, memory and swap. A host should only run either sysmon or sysmon-nvidia
elog:1727/1 shows the History Display generated by sysmon-nvidia. sysmon would only generate the first two displays (sysmon/localhost and sysmon/localhost-CPU) |
Attachment 1: sysmon-gpu.png |
1746 | 03 Dec 2019 | Joseph McKenna | Info | mfe.c: MIDAS frontend's 'Equipment name' can embed hostname, determined at run-time |
A little-advertised feature of the modifications needed to support the msysmon program is
that MIDAS equipment names can support injecting the hostname of the system
running the frontend at run-time (in register_equipment(void)).
https://midas.triumf.ca/MidasWiki/index.php/Equipment_List_Parameters#Equipment_Name
A special string ${HOSTNAME} can be put in any position in the equipment name. It will
be replaced with the hostname of the computer running the frontend at run-time. Note,
the frontend_name string will be trimmed down to 32 characters.
Example usage: msysmon
EQUIPMENT equipment[] = {
   { "${HOSTNAME}_msysmon",    /* equipment name */
     {
        EVID_MONITOR, 0,       /* event ID, trigger mask */
        "SYSTEM",              /* event buffer */
        EQ_PERIODIC,           /* equipment type */
        0,                     /* event source */
        "MIDAS",               /* format */
        TRUE,                  /* enabled */
        RO_ALWAYS,             /* read when running */
        10000,                 /* poll every so many milliseconds */
        0,                     /* stop run after this event limit */
        0,                     /* number of sub events */
        1,                     /* history period */
        "", "", ""
     },
     read_system_load,         /* readout routine */
   },
   { "" }
};
1891 | 01 May 2020 | Joseph McKenna | Forum | Taking MIDAS beyond 64 clients |
Hi all,
I have been experimenting with a frontend solution for my experiment
(ALPHA). The intention is to replace how we log data from PCs running LabVIEW.
I am at the proof of concept stage. So far I have some promising
performance, able to handle 10-100x more data in my test setup (current
limitations now are just network bandwidth; MIDAS is impressively efficient).
==========================================================================
Our experiment has many PCs using LabVIEW which all log to MIDAS, the
experiment has grown such that we need some sort of load balancing in our
frontend.
The concept was to have a 'supervisor frontend' and an array of 'worker
frontend' processes.
-A LabVIEW client would connect to the supervisor, then be referred to a
worker frontend for data logging.
-The supervisor could start a 'worker frontend' process as the demand
required.
To increase accountability within the experiment, I intend to have a 'worker
frontend' per PC connecting. Then any rogue behavior would be clear from the
MIDAS frontpage.
Presently there around 20-30 of these LabVIEW PCs, but given how the group
is growing, I want to be sure that my data logging solution will be viable
for the next 5-10 years. With the increased use of single board computers, I
chose the target of benchmarking up to 1000 worker frontends... but I quickly
hit the '64 MAX CLIENTS' and '64 RPC CONNECTION' limit. Ok...
branching and updating these limits:
https://bitbucket.org/tmidas/midas/branch/experimental-beyond_64_clients
I have two commits.
1. update the memory layout assertions and use MAX_CLIENTS as a variable
https://bitbucket.org/tmidas/midas/commits/302ce33c77860825730ce48849cb810cf366df96?at=experimental-beyond_64_clients
2. Change the MAX_CLIENTS and MAX_RPC_CONNECTION
https://bitbucket.org/tmidas/midas/commits/f15642eea16102636b4a15c84113309696ce3df1?at=experimental-beyond_64_clients
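For illustration, this is the kind of change in commit 2 (the new values shown here are placeholders, see the linked commit for the real numbers):

   /* midas.h (illustrative values only) */
   #define MAX_CLIENTS        1024   /* was 64 */
   #define MAX_RPC_CONNECTION 1024   /* was 64 */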
Unintended side effects:
I break compatibility of existing ODB files... the database layout has
changed and I read my old ODB as corrupt. In my test setup I can start from
scratch but this would be horrible for any existing experiment.
Edit: I noticed 'make testdiff' pipeline is failing... also fails locally...
investigating
Early performance results:
In early tests, ~700 PCs logging 10 unique arrays of 10 doubles into
Equipment variables in the ODB seems to perform well... All transactions
from client PCs are finished within a couple of ms or less
==========================================================================
Questions:
Does the community here have strong opinions about increasing the
MAX_CLIENTS and MAX_RPC_CONNECTION limits?
Am I looking at this problem in a naive way?
Potential solutions other than increasing the MAX_CLIENTS limit:
-Make worker threads inside the supervisor (not a separate process), I am
using TMFE, so I can dynamically create equipment. I have not yet taken a
deep dive into how any multithreading is implemented
-One could have a round robin system to load balance between a limited pool
of 'worker frontend' processes. I don't like this solution as I want to be
able to clearly see which client PCs have been set up to log too much data.
========================================================================== |