ID |
Date |
Author |
Topic |
Subject |
2587
|
16 Aug 2023 |
Konstantin Olchanski | Bug Report | excessive logging of http requests | > > Our default configuration of apache httpd logs every request.
> > MIDAS custom web pages can easily make a huge number of RPC calls creating a
> > huge log file and filling system disk to 100% capacity.
added "daily" to /etc/logrotate.d/httpd, default was "weekly", not often enough.
K.O. |
2591
|
17 Aug 2023 |
Konstantin Olchanski | Bug Report | Error accessing history files | Confirmed. The error message is wrong. It is printed after a short read(), but short read() does not
set errno, and errno reported by the error message is from some previous syscall. Corrected error
message is already committed. K.O.
> Tonight we got another error of that type after the update:
>
> 04:17 - [mhttpd,ERROR] [history_schema.cxx:2913:FileHistory::read_data,ERROR] Cannot read
> '/data2/history/mhf_1692128214_20230815_gassystem.dat', read() errno 2 (No such file or directory)
>
> This morning I looked at the file, and it was there:
>
> [meg@megon02 history]$ ls -alg mhf_1692128214_20230815_gassystem.dat
> -rw-rw-r--. 1 meg 4663228 Aug 17 08:50 mhf_1692128214_20230815_gassystem.dat
> [meg@megon02 history]$
>
>
> Stefan |
2592
|
17 Aug 2023 |
Konstantin Olchanski | Bug Report | excessive logging of http requests | > > > Our default configuration of apache httpd logs every request.
> > > MIDAS custom web pages can easily make a huge number of RPC calls creating a
> > > huge log file and filling system disk to 100% capacity.
> added "daily" to /etc/logrotate.d/httpd, default was "weekly", not often enough.
this should fix it good, make /var/log bigger:
[root@mpmt-test ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdc2 52403200 52296356 106844 100% /
[root@mpmt-test ~]#
[root@mpmt-test ~]# xfs_growfs /
data blocks changed from 13107200 to 106367750
[root@mpmt-test ~]#
[root@mpmt-test ~]# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdc2 425445400 52300264 373145136 13% /
K.O. |
2602
|
14 Sep 2023 |
Konstantin Olchanski | Forum | Hide start and stop buttons | I believe the original "hide run start / stop" was added specifically for ND280 GSC MIDAS. I do not know
why it was removed. "hide pause / resume" is still there. I will restore them. Hiding logger channel
section should probably be automatic of there is no /logger/channels, I can check if it works and what
happens if there is more than one logger channel. K.O. |
2614
|
03 Oct 2023 |
Konstantin Olchanski | Bug Fix | wrong array size after loading xml or json file | both the xml and the json decoders have a bug (fix pending). loading saved odb
from xml and json file did not truncate arrays in odb to the size of arrays in
the file. for example, if /example/double_array has size 20 in odb, but size 5
in xml or json file, after loading the file, array size is still 20.
this is unexpected: after loading an odb save file we expect odb to return to
same state as when odb save file was created. we do not expect some arrays to
have half of their elements restored from file and half their elements left
unchanged.
save and restore from .odb file does not have this problem.
I think this is a bug and I committed (but did not yet push) a fix for both xml
and json odb decoder.
I have run this problem while writing the new history panel editor, where
deleting variables did not work because json rpc db_paste() was not truncating
any arrays.
I am still finishing up the last few bits of the new history panel editor, and
there is a bit of time to discuss and comment this odb change before I push it
to midas.
K.O. |
2615
|
06 Oct 2023 |
Konstantin Olchanski | Bug Report | Error accessing history files | > Still get the same error with the latest version:
> 3:28 [mhttpd,ERROR] [history_schema.cxx:2913:FileHistory::read_data,ERROR] Cannot read
> '/data2/history/mhf_1692391703_20230818_hv_tc.dat', read() errno 2 (No such file or directory)
I figured it out. I claim defense of temporary insanity and old age senility.
1) I added the "short read" check in one place, missed the second place
2) writes of history were meant to be atomic, and they are atomic in my head, but not in the midas
code:
history_schema.cxx:HsFileSchema::write_event()
...
status = write(s->writer_fd, &t, 4);
if (status != 4) {
cm_msg(MERROR, "FileHistory::write_event", "Cannot write to \'%s\', write(timestamp) errno
%d (%s)", s->file_name.c_str(), errno, strerror(errno));
return HS_FILE_ERROR;
}
status = write(s->writer_fd, data, expected_size);
if (status != expected_size) {
cm_msg(MERROR, "FileHistory::write_event", "Cannot write to \'%s\', write(%d) errno %d
(%s)", s->file_name.c_str(), data_size, errno, strerror(errno));
return HS_FILE_ERROR;
}
...
that's not atomic, that's two separate writes. history reader hits the history file between the
two writes and gets a short read of 4 bytes timestamp instead of full record size. that's the
error message reported by mhttpd.
two fixes forthcoming:
a) check for short read in the 2nd place that I missed
b) two write() are replaced by 2 memcpy() to a preallocated buffer and 1 write()
Overall, I am pretty happy that this is the only bug in the FILE history code found in N years,
and it does not even cause data corruption...
K.O. |
2616
|
06 Oct 2023 |
Konstantin Olchanski | Bug Report | Error accessing history files | > two fixes forthcoming:
> a) check for short read in the 2nd place that I missed
> b) two write() are replaced by 2 memcpy() to a preallocated buffer and 1 write()
commit 713ec4a583365d57ffcd700ceeb09dcc14518295
K.O. |
2617
|
06 Oct 2023 |
Konstantin Olchanski | Info | default midas history switched to "FILE" and "PerVariable" history | We are very happy with the "FILE" implementation of MIDAS history and it is time
to make it the default for new experiments. This history driver works best if
"per variable" history is alos enabled. (SQL history already only works in "per-
variable" mode). commit 676051b3024965bd8a04da112965a141d5f61a39
K.O. |
2618
|
06 Oct 2023 |
Konstantin Olchanski | Info | new history panel editor | the new history panel editor has been activated. it is meant to work the same as
the old editor, with some improvements to the history variables selection page.
this new version is written in html+javascript and it will be easier to improve,
update and maintain compared to the old version written in c++. the old history
panel editor is still usable and accessible by pressing the "edit in old editor"
button. please report any problem, quirks and improvements in this thread or in
the bitbucket bug reports. K.O. |
2625
|
13 Nov 2023 |
Konstantin Olchanski | Forum | mlogger does not HAVE_ROOT | > I am setting up Midas (v2.1) for a new experiment. We want to save the data in the ROOT format. We installed ROOT from source (v6.28/06), and ROOTSYS is set. When we compile Midas, it says that it found ROOT. We set up a second logger channel where we set the filename to run%05d.root, the format to ROOT, and the output to ROOT. Nevertheless, when starting a run, the logger writes the error that "channel '1' requested ROOT output, but mlogger is built without HAVE_ROOT". From the CMake file, I would assume that it is set automatically if ROOT is found. Do you have any idea why the mlogger does not find ROOT or save the data in the ROOT format?
when you build midas using "make cmake", it prints information about packages that it finds (or does not). please post this here. it would be even more helpful if you post the whole output of "make cmake" (make cmake >& make.log, post make.log here as attachment).
historically, this problem has been a major annoyance over the years, mlogger would not find ROOT when needed, will find the wrong ROOT when not needed or ROOT at run time will be different from ROOT at build time. "cmake" has been of no help in improving on this, only made all debugging more difficult.
K.O. |
2627
|
14 Nov 2023 |
Konstantin Olchanski | Forum | mlogger does not HAVE_ROOT | > Finally, make sure you start "rmlogger" and not "mlogger". Only "rmlogger" contains the ROOT binding.
Stefan is right. I forgot this. As solution to our troubles, mlogger is built without root support. use rmlogger instead.
K.O. |
2661
|
27 Dec 2023 |
Konstantin Olchanski | Forum | MidasWiki updated to 1.39.6 | MidasWiki was updated to current mediawiki LTS 1.39.6 supported until Nov 2025,
see https://www.mediawiki.org/wiki/Version_lifecycle
as downside, after this update, I see large amounts of "account request" spam,
something that did not exist before. I suspect new mediawiki phones home to
subscribe itself to some "please spam me" list.
if you want a user account on MidasWiki, please email me or Stefan directly, we
will make it happen.
K.O. |
2662
|
29 Dec 2023 |
Konstantin Olchanski | Bug Report | Compilation error on RPi | > git pull
> git submodule update
confirmed. just run into this myself. I think "make" should warn about out of
date git modules. Also check that the build git version is tagged with "-dirty".
K.O. |
2663
|
02 Jan 2024 |
Konstantin Olchanski | Forum | midas.triumf.ca alias moved to daq00.triumf.ca | the DNS alias for midas.triumf.ca moved from old ladd00.triumf.ca to new
daq00.triumf.ca. same as before it redirects to the MidasWiki and to the midas
forum (elog) that moved from ladd00 to daq00 quite some time ago. if you see any
anomalies in accessing them (broken links, bad https certificates), please report
them to this forum or to me directly at olchansk@triumf.ca. K.O. |
2666
|
03 Jan 2024 |
Konstantin Olchanski | Forum | midas.triumf.ca alias moved to daq00.triumf.ca | > I found the first issue: The link to
> https://midas.triumf.ca/MidasWiki/index.php/Custom_plots_with_mplot
fixed.
https://midas.triumf.ca/Custom_plots_with_mplot
also works.
I tried to get rid of redirect to daq00 completely and make the whole MidasWiki show up
under midas.triumf.ca, but discovered/remembered that I cannot do this without changing
MidasWiki config [$wgServer = "https://daq00.triumf.ca";] which causes mediawiki to
redirect everything to daq00 (using the 301 "moved permanently" reply, ouch!). In theory,
if I change it to "https://midas.triumf.ca" it will redirect everything there instead,
but I am hesitant to make this change. It has been like this since forever and I have no
idea what else will break if I change it.
K.O. |
2687
|
28 Jan 2024 |
Konstantin Olchanski | Forum | number of entries in a given ODB subdirectory ? | Very good question. It exposes a very nasty problem, the race condition between "ls" and "rm". While you are
looping over directory entries, somebody else is completely permitted to remove one of the files (or add more
files), making the output of "ls" incorrect (contains non-existant/removed files, does not contain newly added
files). even the simple count of number of files can be wrong.
Exactly the same problem exists in ODB. As you loop over directory entries, some other ODB client can remove or
add new entries.
To help with this, I considered adding an db_ls() function that would take the odb lock, atomically iterate over
a directory and return an std::vector<std::string> with names of all entries. (current odb iterator returns ODB
handles that may be invalid if corresponding entry was removed while we were iterating). Unfortunately the
delete/add race condition remains, some returned entries may be invalid or missing.
For your specific application, you can swear that you will never add/delete files "at the wrong time", and you
will not see this problem until one of your users writes a script that uses odbedit to add/remove subdirectory
entries exactly at the wrong time. (you run your "ls" in the BeginRun() handler of your frontend, they run their
"rm" from their's, so both run at the same time, a race condition.
Closer to your question, I think it is simplest to always iterate over the subdirectory, collect names of all
entries, then work with them:
std::vector<std::string> names;
iterate over odb {
names.push_back(name);
}
foreach (name in names)
work_on(name);
instead of:
size_t n = db_get_num_entries();
for (size_t i=0; i<n; i++) {
std::string name = sprintf("FPGA%d", i);
work_on(name);
}
K.O.
> Dear MIDAS experts,
>
> - I have a detector configuration with a variable number of hardware components - FPGA's receiving data
> from the detector. They are described in ODB using a set of keys ranging
> from "/Detector/FPGAs/FPGA00" .... to "/Detector/FPGAs/FPGA68".
> Each of "FPGAxx" corresponds to an ODB subdirectory containing parameters of a given FPGA.
>
> The number of FPGAs in the detector configuration is variable - [independent] commissioning
> of different detector subsystems involves different number of FPGAs.
>
> In the beginning of the data taking one needs to loop over all of "FPGAxx",
> parse the information there and initialize the corresponding FPGAs.
>
> The actual question sounds rather trivial - what is the best way to implement a loop over them?
>
> - it is certainly possible to have the number of FPGAs introduced as an additional configuration parameter,
> say, "/Detector/Number_of_FPGAs", and this is what I have resorted to right now.
>
> However, not only that loooks ugly, but it also opens a way to make a mistake
> and have the Number_of_FPGAs, introduced separately, different from the actual number
> of FPGA's in the detector configuration.
>
> I therefore wonder if there could be a function, smth like
>
> int db_get_n_keys(HNDLE hdb, HNDLE hKeyParent)
>
> returning the number of ODB keys with a common parent, or, to put it simpler,
> a number of ODB entries in a given subdirectory.
>
> And if there were a better solution to the problem I'm dealing with, knowing it might be helpful
> for more than one person - configuring detector readout may require to deal with a variable number
> of very different parameters.
>
> -- many thanks, regards, Pasha |
2688
|
28 Jan 2024 |
Konstantin Olchanski | Bug Report | Warnings about ODB keys that haven't been touched for 10+ years | > I don't immediately see a reason for saying that if a DB key is older than 10 yrs, it may not be valid.
>
> However, it would be worth learning what was the logic behind choosing 10 yrs as a threshold.
> If 10 is just a more or less arbitrary number, changing 10 --> 100 seems to be the way to go.
Please run "git blame" to find out who added that check.
If I remember right, it was added to complain/correct dates in the future.
I think the oldest experiment at TRIUMF where we still can load an odb into current MIDAS is TWIST,
now about 25 years old. the purpose of loading odb would be to test the history function
to see if we can look at 10-15 year old histories. (TWIST history is in the latest FILE format,
so it will load).
I think this age check should be removed, but there must be *some* check for invalid/bogus timestamps. Or
not, we should check if MIDAS cares about timestamps at all, if ODB functions never use/look at timestamp,
maybe we are okey with bogus timestamps. They may look funny in the odb editor, but that's it.
K.O. |
2689
|
28 Jan 2024 |
Konstantin Olchanski | Forum | History tags | > This part of the system has been designed by KO, so he should reply here.
That's right. Some of this stuff is historical gibberish that is no longer needed
for FILE and SQL histories.
/History/Events is needed to create persistent mapping between history event names
and history event id's (at some point history event id was same equipment event
id, with the obvious problems when equipment event ids are duplicated, reused,
renamed, deleted).
/History/Tags was used by the history editor to speed up "give me all tag names
for this history event name". With the "MIDAS" history storage this required
reading a lot of data from disk. With the "FILE" history and cached ZFS SSD, disk
access is much cheaper and caching history event names and tags in odb is no
longer necessary.
/History/Tags should probably be removed (be check that nobody uses it first).
/History/Events has to remain as long as "MIDAS" history storage is still used.
K.O. |
2690
|
28 Jan 2024 |
Konstantin Olchanski | Forum | a scroll option for "add history variables" window? | > > Have you updated to the current midas version? This issue has been fixed a while ago.
>
> Hi Stefan, thanks a lot! I pulled from the head, and the scrolling works now. -- regards, Pasha
Right, I remember running into this problem, too.
If you have some ideas on how to better present 100500 history variables, please shout out!
K.O. |
2691
|
28 Jan 2024 |
Konstantin Olchanski | Forum | dump history FILE files | $ cat mhf_1697445335_20231016_run_transitions.dat
event name: [Run transitions], time [1697445335]
tag: tag: /DWORD 1 4 /timestamp
tag: tag: UINT32 1 4 State
tag: tag: UINT32 1 4 Run number
record size: 12, data offset: 1024
...
data is in fixed-length record format. from the file header, you read "record size" is 12 and data starts at offset 1024.
the 12 bytes of the data record are described by the tags:
4 bytes of timestamp (DWORD, unix time)
4 bytes of State (UINT32)
4 bytes of "Run number" (UINT32)
endianess is "local endian", which means "little endian" as we have no big-endian hardware anymore to test endian conversions.
file format is designed for reading using read() or mmap().
and you are right mhdump, does not work on these files, I guess I can write another utility that does what I just described and spews the numbers to stdout.
K.O. |
|