ID | Date | Author | Topic | Subject
2703 | 05 Feb 2024 | Ben Smith | Bug Fix | string --> int64 conversion in the python interface ?
> The symptoms are consistent with a string --> int64 conversion not happening
> where it is needed.
Thanks for the report, Pasha. Indeed I was missing a conversion in one place. Fixed now!
Ben

2710 | 13 Feb 2024 | Konstantin Olchanski | Bug Fix | string --> int64 conversion in the python interface ?
> > The symptoms are consistent with a string --> int64 conversion not happening
> > where it is needed.
>
> Thanks for the report Pasha. Indeed I was missing a conversion in one place. Fixed now!
>
Are we running these tests as part of the nightly build on bitbucket? They would be part of
the "make test" target. The correct python dependencies may need to be added to the bitbucket OS
image in bitbucket-pipelines.yml. (This is a PITA to get right.)
K.O.

2711 | 14 Feb 2024 | Konstantin Olchanski | Bug Fix | added ubuntu-22 to nightly build on bitbucket, now need python!
> Are we running these tests as part of the nightly build on bitbucket? They would be part of
> the "make test" target. The correct python dependencies may need to be added to the bitbucket OS
> image in bitbucket-pipelines.yml. (This is a PITA to get right.)
I added ubuntu-22 to the nightly builds, but I notice the build says "no python" and I am not
sure which packages I need to install for midas python to work.
Ben, can you help me with this?
https://bitbucket.org/tmidas/midas/pipelines/results/1106/steps/%7B9ef2cf97-bd9f-4fd3-9ca2-9c6aa5e20828%7D
K.O.

2792 | 26 Jul 2024 | Lukas Gerritzen | Bug Fix | strlcpy and strlcat added to glibc 2.38
A year ago, these two functions were included in glibc. When trying to compile midas with a recent version of
Ubuntu or Fedora, one gets errors like this:
/usr/include/string.h:506:15: error: declaration of ‘size_t strlcpy(char*, const char*, size_t) noexcept’ has a
different exception specifier
506 | extern size_t strlcpy (char *__restrict __dest,
| ^~~~~~~
In file included from /home/luk/midas/src/midas.cxx:14:
/home/luk/midas/include/midas.h:2190:17: note: from previous declaration ‘size_t strlcpy(char*, const
char*, size_t)’
My proposed solution is a check in midas.h around line 248:
#if (__GLIBC__ > 2) || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 38)
#ifndef HAVE_STRLCPY
#define HAVE_STRLCPY 1
#endif
#endif

2793 | 26 Jul 2024 | Stefan Ritt | Bug Fix | strlcpy and strlcat added to glibc 2.38
Good catch. I added your code to the current develop branch of MIDAS.
Stefan

2839 | 12 Sep 2024 | Konstantin Olchanski | Bug Fix | bitbucket builds repaired
bitbucket builds work again, also added ubuntu-24 and almalinux-9.
two problems fixed:
- cmake file in examples/experiment was replaced by a non-working version
- unannounced change of strlcpy() to mstrlcpy() broke "make remoteonly"
P.S. I should also fix the rootana and the roody bitbucket builds.
K.O.

2840 | 13 Sep 2024 | Konstantin Olchanski | Bug Fix | rootana bitbucket build fixed
The rootana bitbucket build is fixed; there were only a few minor build problems. I am using the
official root docker image (which turned out to not work right out of the box
because of a missing libvdt-dev package). K.O.

2842 | 13 Sep 2024 | Konstantin Olchanski | Bug Fix | mstrcpy, was: strlcpy and strlcat added to glibc 2.38
For the record, as the ultimate solution, strlcpy() and strlcat() were wholesale
replaced by mstrlcpy() and mstrlcat(). This should fix the "missing strlcpy()"
problem for good and make midas more consistent across all platforms (including
non-linux, non-unix). On my side, I continue replacing these functions with proper
std::string operations. K.O.

2967 | 20 Mar 2025 | Konstantin Olchanski | Bug Fix | bitbucket builds fixed
bitbucket automatic builds were broken after mfe.cxx started printing some additional messages added in commit
https://bitbucket.org/tmidas/midas/commits/0ae08cd3b96ebd8e4f57bfe00dd45527d82d7a38
This is now fixed. To check if your changes will break the automatic builds, before the final push please do:
make clean
make mini -j
make cmake -j
make test
K.O.

2988 | 21 Mar 2025 | Stefan Ritt | Bug Fix | bitbucket builds fixed
> bitbucket automatic builds were broken after mfe.cxx started printing some additional messages added in commit
> https://bitbucket.org/tmidas/midas/commits/0ae08cd3b96ebd8e4f57bfe00dd45527d82d7a38
>
> this is now fixed. to check if your changes will break automatic builds, before final push, please do:
>
> make clean
> make mini -j
> make cmake -j
> make test
Unfortunately we will break the automatic build each time a program outputs one different character, which might even happen if we add a line of code and
a cm_msg() gets produced with a different line number. Is there a standard way to update testexpt.example (like "make testexpt" or so)? Should we trigger
the update of testexpt.example before each commit via a hook?
Stefan

2992 | 21 Mar 2025 | Konstantin Olchanski | Bug Fix | bitbucket builds fixed
> > bitbucket automatic builds
>
> Unfortunately we will break the automatic build each time a program outputs one different character, which might even happen if we add a line of code and
> a cm_msg() gets produced with a different line number. Is there a standard way to update testexpt.example (like "make testexpt" or so)? Should we trigger
> the update of testexpt.example before each commit via a hook?
>
Actually, line numbers are not logged in the messages printed by "make test", so moving code around does not break the test.
Changing what programs output does break the test, and this is intentional - somebody must look and confirm
that the program output was changed on purpose or because a bug was introduced (or fixed).
Most "make test" things work this way - run programs, compare output to what is expected. Discrepancies are flagged for human examination.
K.O.

3008 | 28 Mar 2025 | Konstantin Olchanski | Bug Fix | midas cmake update
MIDAS git tag midas-2025-01-a introduced an incompatible change to "include midas-targets.cmake". Instead of "midas" one now has to
say "midas::midas", as updated below. K.O.
>
> #
> # CMakeLists.txt for alpha-g frontends
> #
>
> cmake_minimum_required(VERSION 3.12)
> project(agdaq_frontends)
>
> include($ENV{MIDASSYS}/lib/midas-targets.cmake)
>
> add_compile_options("-O2")
> add_compile_options("-g")
> #add_compile_options("-std=c++11")
> add_compile_options(-Wall -Wformat=2 -Wno-format-nonliteral -Wno-strict-aliasing -Wuninitialized -Wno-unused-function)
> add_compile_options("-DTMFE_REV0")
> add_compile_options("-DOS_LINUX")
>
> add_executable(feevb feevb.cxx TsSync.cxx)
> target_link_libraries(feevb midas::midas)
>
> add_executable(fectrl fectrl.cxx GrifComm.cxx EsperComm.cxx JsonTo.cxx KOtcp.cxx $ENV{MIDASSYS}/src/tmfe_rev0.cxx)
> target_link_libraries(fectrl midas::midas)
>
> #end

3010 | 28 Mar 2025 | Konstantin Olchanski | Bug Fix | manalyzer -R8082 --jsroot
When processing MIDAS files offline, JSROOT did not work: -Rxxx worked and the http
connection would open, but it would not serve any histograms. This should now be
fixed.
In addition, normally after processing all input MIDAS files, manalyzer would
exit and JSROOT would abruptly stop. To look at the final results one had to open the
ROOT files using some other method (roody, TBrowser, mjsroot, etc.).
I now added a command line switch "--jsroot", if supplied, after processing all
input MIDAS files, manalyzer will keep running in the JSROOT server mode (same as
mjsroot).
"manalyzer -R8082 --jsroot run*.mid.lz4" now does something useful: open
http://localhost:8082 (or an ssh tunnel or mhttpd proxy per my mjsroot message) and
watch the histograms fill in real time; after the analysis finishes, keep looking at the
final results until bored. Stop manalyzer using Ctrl-C. (We should add a "Stop
JSROOT" button to the JSROOT main page.)
MIDAS commit 1d0d6448c3ec4ffd225b8d2030fe13e379fcd007
K.O.

3011 | 30 Mar 2025 | Konstantin Olchanski | Bug Fix | manalyzer improvements
updated manalyzer:
- similar to the --jsroot switch, in online mode the ROOT output file remains open after the run is stopped. Previously, after the run was
stopped, all histograms etc. would disappear from JSROOT, making it hard to look at the full collected and analyzed data.
- there was a buglet in the multithreading code: if some module cannot analyze flow events as fast as we can read data from disk,
the flow event queue of the first module thread would grow without bound, potentially consuming lots of RAM. This is
because the queue size check for the first module thread was disabled to avoid a deadlock. I have now added the queue size check to the
main event loop (both offline mode and online mode) and this problem should now be fixed.
- also adjusted the default queue size from 100 to 1000 and the queue-full wait sleep time from 100 us to 10 us.
- another buglet was in the flow event processing: per the README, module EndRun() should not generate flow events (they
should be generated in PreEndRun() instead). Previously this was not enforced; now there is an error message about this and the offending
flow events are deleted (they were not being processed anyway).
K.O.

3017 | 01 Apr 2025 | Konstantin Olchanski | Bug Fix | ODB and event buffer - release semaphore before abort() and core dump
There is a long-standing problem with ODB and event buffers. If they detect an
internal data inconsistency and cannot continue running, they call abort() to
dump core and stop.
The problem is that in some code paths they do this while holding the ODB or event
buffer semaphore. (The Linux kernel automatically releases SYSV semaphores after the
core dump is finished and the program holding them is stopped.)
If core dump takes longer than 10 seconds (for whatever reason, but we see this
often enough), all other programs that wait for ODB or event buffer access, will
also timeout and also crash (with core dump). Result is a core dump storm, at
the end all MIDAS programs are crashed. (Luckily recovery is easy, simply
restart everything).
Now I realize that in many situations we do not need to hold the semaphore while
dumping core - the content of the ODB and event buffer shared memories is not
important for debugging the crash - and it is safe to release the semaphore
before calling abort().
This is now implemented for ODB and event buffers. Hopefully core dump storms
will not happen again.
commit 96369c29deba1752fd3d25bed53e6594773d7e1a
release ODB semaphore before calling abort() to dump core. if core dump takes
longer than 10 sec all other midas programs will timeout and crash.
commit 2506406813f1e7581572f0d5721d3761b7c8e8dd
unlock event buffer before calling abort() in bm_validate_client_index_locked(),
refactor bm_get_my_client_locked()
K.O.

3034 | 05 May 2025 | Konstantin Olchanski | Bug Fix | Bug fix in SQL history
A bug was introduced to the SQL history in 2022 that made renaming of variable names not work. This is now fixed.
break commit:
54bbc9ed5d65d8409e8c9fe60b024e99c9f34a85
fix commit:
159d8d3912c8c92da7d6d674321c8a26b7ba68d4
P.S.
This problem was caused by an unfortunate design of the c++ class system. If I want to add more data to an existing
class, I write this:
class old_class {
   int i, j, k;
};

class bigger_class : public old_class {
   int additional_variable;
};
But if I have this:

struct x { int i, j; };

class y {
public:
   std::vector<x> array_of_x;
};

and I want to add "k" to "x", c++ has no way to do this. The history code has this workaround:

class bigger_y : public y {
public:
   std::vector<int> array_of_k;
   void foo(int n);
};

void bigger_y::foo(int n) {
   printf("%d %d %d\n", array_of_x[n].i, array_of_x[n].j, array_of_k[n]);
}
problem is that it is not obvious that "array_of_x" and "array_of_k" are connected
and they can easily get out of sync (if elements are added or removed). this is the
bug that happened in the history code. I now added assert(array_of_x.size()==array_of_k.size())
to offer at least some protection going forward.
P.S. As a final solution I think I want to completely separate the file history and SQL history code;
they have more things different than common.
K.O.

3067 | 24 Jul 2025 | Konstantin Olchanski | Bug Fix | support for large history files
The FILE history code (mhf_*.dat files) did not support reading history files bigger than about 2GB; this is now
fixed on branch "feature/history_off64_t" (in final testing, to be merged ASAP).
History files were never meant to get bigger than about 100 MBytes, but it turns out large files can still
happen:
1) files are rotated only when history is closed and reopened
2) we removed history close and open on run start
3) so files are rotated only when mlogger is restarted
In the old code, large files would still happen if some equipment writes a lot of data (I have a file from
Stefan with a history record size of about 64 kbytes, written at 1/second; MIDAS handles this just fine) or if
no runs are started and stopped for a long time.
There are reasons for keeping file size smaller:
a) I would like to use mmap() to read history files, and mmap() of a 100 Gbyte file on a 64 Gbyte RAM
machine would not work very well.
b) I would like to implement compressed history files and decompression of a 100 Gbyte file takes much
longer than decompression of a 100 Mbyte file. it is better if data is in smaller chunks.
(it is easy to write a utility program to break-up large history files into smaller chunks).
Why use mmap()? I note that the current code does 1 read() syscall per history record (it is much better to
read data in bigger chunks) and does multiple seek()/read() syscalls to find the right place in the history
file (which plays silly buggers with the OS read-ahead and data caching). mmap() eliminates all syscalls and has
the potential to speed things up quite a bit.
K.O.

3124 | 20 Nov 2025 | Konstantin Olchanski | Bug Fix | ODB update, branch feature/db_delete_key merged into develop
In the darkside vertical slice midas daq we observed ODB corruption which I
traced to db_delete_key(). The cause of the corruption is not important; what is important is to
have a robust ODB where small corruption stays localized and does not
require erasing the corrupt ODB and reloading it from a backup file.
To help debug such corruption one can try to set ODB "/Experiment/Protect ODB"
to "yes". This will make ODB shared memory read-only and user code scribbling
into the wrong memory address will cause a seg fault and core dump instead of
silent ODB corruption. This feature is not enabled by default because changing the
ODB shared memory mapping from "read-only" to "writable" (and back) is not very
fast and it slows down MIDAS noticeably.
MIDAS right before this merge was tagged "midas-2025-11-a", if you see this ODB
update cause trouble, please report it here and revert to this tagged version.
Updates:
- harden db_delete_key() against internal corruption, if odb inconsistency is
detected, do a clean crash instead of trying to delete stuff and corrupting odb
to the point where it has to be erased and reloaded from a backup file.
- additional refactoring to separate read-locked and write-locked code.
- merge of a missing patch to avoid ODB corruption when the key area becomes 100% full
(or was it the data area? I forget now; I fixed one of them a long time ago, now
both are fixed).
- remove the "follow_links" argument from db_delete_key(), see separate
discussion on this.
- add db_delete() to delete things by ODB path not by hkey (atomic fused
together db_find_link() and db_delete_key()).
- fixes for incorrect use of db_find_key() and db_delete_key(), this
unexpectedly follows symlinks and deletes the wrong ODB entry. (should have been
db_find_link(), now replaced with atomic db_delete()).
K.O.

3136 | 25 Nov 2025 | Stefan Ritt | Bug Fix | ODB update, branch feature/db_delete_key merged into develop
Thanks for the fixes, all of which I approve.
There is still a "follow_links" in midas_c_compat.h line 70 for Python. Probably Ben has to look into that.
client.py has it as well.
Stefan

3137 | 25 Nov 2025 | Konstantin Olchanski | Bug Fix | fixed db_find_keys()
Function db_find_keys(), added by a person unnamed in April 2020, never worked correctly. It is now
fixed, and an unsafe strcpy() was replaced by mstrlcpy().
This function is used by msequencer ODBSet function and by odbedit "set" command.
In many conditions it returned DB_NO_KEY; only two use cases actually worked:
set runinfo/state 1 <--- no match pattern - works
set run*/state 1 <--- match multiple subdirectories - works
set runinfo/stat* 1 <--- bombs out with DB_NO_KEY
set run*/stat* 1 <--- bombs out with DB_NO_KEY
All four use cases now work.
commit b5b151c9bc174ca5fd71561f61b4288c40924a1a
K.O.