Back Midas Rome Roody Rootana
  Midas DAQ System, Page 134 of 150  Not logged in ELOG logo
ID Date Author Topicdown Subject
  2989   21 Mar 2025 Stefan RittBug Reportplease fix mscb compiler warning
I hopefully fixed the waring (narrowing down from size_t to int). Please double check with your compiler.

Stefan
  2998   25 Mar 2025 Konstantin OlchanskiBug Reportmidas equipment "format"
> In the PSI muSR laboratory, we are running about 140 slow control devices across six instruments using Format FIXED.
> Could you please wait a bit with removing support for it so that we can assess if/how this will affect us?

Roger. I believe as long as these data do not go into the MIDAS event data stream, you should not see any difference from 
changing "FIXED" to "MIDAS", I hope you can test it and report how it works for you. If you see that things change in ODB 
or in history, perhaps we can implement some kind of workaround to make the transition transparent.

K.O.
  2999   25 Mar 2025 Konstantin OlchanskiBug Reportmanalyzer module init order problem
> Andrea Capra reported a problem with manalyzer module initialization order.

Permanent solution is now implemented.

In the analyzer module constructor (TARunObject), please set the value of fModuleOrder (default value is 0).

Modules with smaller fModuleOrder will run first, modules with bigger fModuleOrder will run last.

Please use the value range between -1 (built-in EventDump module) and 9999 (built-in InteractiveModule).

Modules with equal fModuleOrder will run in the same order as they were registered (std::stable_sort).

Example manalyzer modules and documentation README.md have been updated.

Linker command line and GCC attribute methods should also work, but I thought it better to provide explicit 
programmatic control of module ordering.

P.S. the c++ tmfe frontend module design is newer and the problem of module (equipment) ordering is solved by 
constructing them explicitly in main().

manalyzer commit:

commit 6ca130808cd05ead734450391bf6defc8335c04a (HEAD -> master, origin/master, origin/HEAD)
Author: Konstantin Olchanski <olchansk@daq00.triumf.ca>
Date:   Tue Mar 25 15:53:13 2025 -0700

    implement TARunObject::fModuleOrder to specify required ordering of analyzer modules

K.O.
  3002   25 Mar 2025 Konstantin OlchanskiBug Reportplease fix compiler warning
Confirming warning is fixed. Thanks! K.O.

> > Unnamed person who added this clever bit of c++ coding, please fix this compiler warning. Stock g++ on Ubuntu LTS 24.04. Thanks in advance!
> > 
> > /home/olchansk/git/midas/src/system.cxx: In function ‘std::string ss_execs(const char*)’:
> > /home/olchansk/git/midas/src/system.cxx:2256:43: warning: ignoring attributes on template argument ‘int (*)(FILE*)’ [-Wignored-attributes]
> >  2256 |    std::unique_ptr<FILE, decltype(&pclose)> pipe(popen(cmd, "r"), pclose);
> > 
> > K.O.
> 
> Replace the code with:
> 
>    auto pclose_deleter = [](FILE* f) { pclose(f); };
>    auto pipe = std::unique_ptr<FILE, decltype(pclose_deleter)>(
>       popen(cmd, "r"),
>       pclose_deleter
>    );
> 
> 
> Hope this is now warning-free.
> 
> Stefan
  3003   25 Mar 2025 Konstantin OlchanskiBug Reportplease fix mscb compiler warning
> I hopefully fixed the waring (narrowing down from size_t to int). Please double check with your compiler.

Nope, it turns out complain is about the read() size argument, they really really really want it to be 
size_t.

Looks like correct sequence is:

size_t size = file_size_somehow();
char* buf = (char*)malloc(size+1);
ssize_t rd = read(fd, buf, size);
if (rd < 0) { complain(); return; } // read() error
if ((size_t)rd != size) { complain(); return; } // cast to size_t is safe after we know rd >= 0
buf[rd] = 0; // make sure data is NUL terminated

What a mess. I want my UNIX V7 and K&R C back!

Perhaps I should add in system.cxx - ss_read_file(const char* filename, std::vector<char> &data);

Fixed it in mscb.cxx, odbedit.cxx, mhttpd.cxx and msequencer.cxx, commited, pushed.

U-24 builds without warnings. fwew...

K.O.
  3004   25 Mar 2025 Konstantin OlchanskiBug ReportDefault write cache size for new equipments breaks compatibility with older equipments
>  > All this is kind of reasonable, as only two settings of write cache size are useful: 0 to 
> > disable it, and 10 Mbytes to limit semaphore locking rate to reasonable value for all event 
> > rate and size values practical on current computers.
> 
> Indeed KO is correct that only 0 and 10MB make sense, and we cannot mix it. Having the cache setting in the equipment table is 
> cumbersome. If you have 10 slow control equipment (cache size zero), you need to add many zeros at the end of 10 equipment 
> definitions in the frontend. 
> 
> I would rather implement a function or variable similar to fEqConfWriteCacheSize in the tmfe framework also in the mfe.cxx 
> framework, then we need only to add one line llike
> 
> gEqConfWriteCacheSize = 0;
> 
> in the frontend.cxx file and this will be used for all equipments of that frontend. If nobody complains, I will do that in April when I'm 
> back from Japan.

Cache size is per-buffer. If different equipments write into different event buffers, should be possible to set different cache sizes.

Perhaps have:

set_write_cache_size("SYSTEM", 0);
set_write_cache_size("BUF1", bigsize);

with an internal std::map<std::string,size_t>; for write cache size for each named buffer

K.O.
  3012   01 Apr 2025 Pavel MuratBug ReportMIDAS history system not using the event timestamps ?
Dear MIDAS experts, 

I confirm that when writing out history files corresponding to the slow control event data, 
MIDAS history system timestamps the data not with the event time coming from the event data, 
but with the current time determined by the program - 

https://bitbucket.org/tmidas/midas/src/293d27fad0c87c80c4ed7b94b5c40ba1e150bea4/progs/mlogger.cxx#lines-5321

where 'now' is defined as  

time_t now = time(NULL);

I'm looking for a way to timestamp the history data with the event time - that is important 
for HEP applications outside the DAQ domain. Yes, MIDAS infrastructure is very well suited for that, 
there could have a number of such applications, and experiments could significantly benefit from that.

So I'm wondering whether the implementation is a design choice made or it could be changed. 

The change itself and especially its validation may require a non-negligible amount of work - I'd be happy to contribute.

Any insight much appreciated. 

-- thanks, regards, Pasha
  3018   01 Apr 2025 Konstantin OlchanskiBug ReportODB corruption
We see ODB corruption crashes in the DS20k vertical slice MIDAS instance.

Crash is memset() called by db_delete_key1() called by cm_connect_experiment().

I look at the source code and I see that ODB pkey and hkey validation is absent
from most iterators and it is possible for "bad" pkey to cause corruption. Many
other places in the ODB code use db_get_pkey() and db_validate_hkey() to prevent
invalid data from causing further corruption and breakage.

Also db_delete_key1() needs to be refactored and renamed db_delete_key_wlocked().

I will not do this immediately today, but hopefully next week or so.

Stack trace is attached, observe how free_data() was called on a completely invalid pkey,
bad pkey->type, bad pkey sizes, etc.

#0  __memset_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:250
250	../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memset_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:250
#1  0x00005ad4102b4217 in memset (__len=<optimized out>, __ch=0, __dest=<optimized out>) at /usr/include/x86_64-linux-
gnu/bits/string_fortified.h:59
#2  free_data (pheader=pheader@entry=0x75aaea4f4000, address=0x75aaed4cea50, size=<optimized out>, caller=caller@entry=0x5ad4102ffb6c 
"db_delete_key1") at /home/dsdaqdev/packages_common/midas/src/odb.cxx:513
#3  0x00005ad4102b6a5b in free_data (caller=0x5ad4102ffb6c "db_delete_key1", size=<optimized out>, address=<optimized out>, 
pheader=0x75aaea4f4000) at /home/dsdaqdev/packages_common/midas/src/odb.cxx:453
#4  db_delete_key1 (hDB=1, hKey=<optimized out>, level=<optimized out>, follow_links=0) at 
/home/dsdaqdev/packages_common/midas/src/odb.cxx:3789
#5  0x00005ad4102b6979 in db_delete_key1 (hDB=1, hKey=288672, level=0, follow_links=0) at 
/home/dsdaqdev/packages_common/midas/src/odb.cxx:3731
#6  0x00005ad4102cc923 in db_create_record (hDB=hDB@entry=1, hKey=hKey@entry=0, orig_key_name=orig_key_name@entry=0x7ffd75987280 
"/Programs/ODBEdit", init_str=<optimized out>) at /home/dsdaqdev/packages_common/midas/src/odb.cxx:12916
#7  0x00005ad4102cca73 in db_create_record (hDB=hDB@entry=1, hKey=hKey@entry=0, orig_key_name=orig_key_name@entry=0x7ffd75987280 
"/Programs/ODBEdit", init_str=<optimized out>) at /home/dsdaqdev/packages_common/midas/src/odb.cxx:12942
#8  0x00005ad4102a00ba in cm_set_client_info (hDB=1, hKeyClient=0x7ffd75987420, host_name=0x5ad412262ee0 "dsdaqgw.triumf.ca", 
client_name=0x7ffd759874c0 "ODBEdit", hw_type=<optimized out>, password=<optimized out>, watchdog_timeout=<optimized out>)
    at /usr/include/c++/11/bits/basic_string.h:194
#9  0x00005ad4102a902b in cm_connect_experiment1 (host_name=<optimized out>, host_name@entry=0x7ffd759876c0 "", 
default_exp_name=default_exp_name@entry=0x7ffd759876a0 "vslice", client_name=client_name@entry=0x5ad4102f28fa "ODBEdit", 
func=func@entry=0x0, 
    odb_size=odb_size@entry=1048576, watchdog_timeout=<optimized out>, watchdog_timeout@entry=10000) at 
/usr/include/c++/11/bits/basic_string.h:194
#10 0x00005ad41027e58d in main (argc=3, argv=0x7ffd759881e8) at /home/dsdaqdev/packages_common/midas/progs/odbedit.cxx:3025
(gdb) up
...
#4  db_delete_key1 (hDB=1, hKey=<optimized out>, level=<optimized out>, follow_links=0) at 
/home/dsdaqdev/packages_common/midas/src/odb.cxx:3789
3789	            free_data(pheader, (char *) pheader + pkey->data, pkey->total_size, "db_delete_key1");
(gdb) p pkey
$1 = (KEY *) 0x75aaea53b400
(gdb) p *pkey
$2 = {type = 1684370529, num_values = 0, name = '\000' <repeats 16 times>, "xQ\375\002\004\000\000\000\004\000\000\000\a\000\000", data = 0, 
total_size = 290944, item_size = 1743544378, access_mode = 0, notify_count = 0, next_key = 15, parent_keylist = 1, 
  last_written = 1953785965}
(gdb) 

K.O.
  3020   01 Apr 2025 Konstantin OlchanskiBug ReportMIDAS history system not using the event timestamps ?
> I confirm that when writing out history files corresponding to the slow control event data, 
> MIDAS history system timestamps the data not with the event time coming from the event data, 
> but with the current time determined by [mlogger].

This is correct. The timestamp in the history file is the mlogger timestamp.

In theory we could use the ODB "last_written" timestamp, but in practice,
timestamps are 1 second granularity and the difference between the two
timestamps normally would be less than 1 second. (time to react to db_watch()).

But ODB last_written also is not the data timestamp. For remote connected clients
it includes the mserver communication delay.

What is the data timestamp, only the user knows - for some FPGA based equipments,
I can see the data timestamp being read from an FPGA register together with the data.

But back to earth.

For making history plots, 1 second granularity with a small (a few seconds) delay should be okey,
and I think the mserver timestamp is good enough.

For data analysis, you are reading history data from a history data file and you are
not constrained to using the MIDAS timestamp.

You can always include your "true" data timestamp as the first value in your data.

We do this in felaview for writing labview data to midas history in the ALPHA antihydrogen experiment at CERN.

This also anticipates your next request, can we have millisecond, microsecond, nanosecond history timestamps:
since you define your "true" data timestamp, you an make it anything you want. (I use "double" time in seconds,
64-bit IEEE-754 "double" has enough precision for microsecond granularity. FPGA based devices can have timestamps
with 10 ns or 8 ns granularity, in this case a uint64_t clock counter could be more appropriate).

K.O.
  3022   02 Apr 2025 Pavel MuratBug ReportMIDAS history system not using the event timestamps ?
> You can always include your "true" data timestamp as the first value in your data.

Are you saying that if the first data word of a history event were a timestamp, 
the MIDAS history system, when plotting the time dependencies, would use that timestamp 
instead of the mlogger timestamp?  

if that is true, what tells MIDAS that the first data word is the timestamp? 

I couldn't find a discussion of that on the page describing the history system - 

 https://daq00.triumf.ca/MidasWiki/index.php/History_System#Frontend_history_event

- perhaps I should be looking at a different page?

-- thanks again, regards, Pasha 
  3023   02 Apr 2025 Konstantin OlchanskiBug ReportMIDAS history system not using the event timestamps ?
> > You can always include your "true" data timestamp as the first value in your data.
> 
> Are you saying that if the first data word of a history event were a timestamp, 
> the MIDAS history system, when plotting the time dependencies, would use that timestamp 
> instead of the mlogger timestamp?
>

you are correct, midas knows nothing about what you put in the history data.

what I suggested is: if you want your true data timestamp recorded in the history,
you can put it into the history data yourself, and I suggested using the 1st value,
but you can also make it the last value or the 10th value, it is up to you.

for making history plots, the history timestamp is used, as you wrote and I confirmed,
this timestamp is generated by mlogger.

what is not clear to me is why this is a problem? do you see a big difference between the 
true data timestamp and the mlogger data timestamp? bigger than 1 second? (this would change 
the shape of "last 10 minutes" plots (600 seconds). bigger than 1 minute? (this would change 
the shape of "last 1 hour plots" (60 minutes, 3600 seconds).

that said, note that we currently store the timestamp as a DWORD 32-bit UNIX time value 
which will overflow in 2038 and which is quickly becoming incompatible with the ongoing 
switch to 64-bit time_t. Ubuntu-24 already build a large number of system libraries with 64-
bit time_t and building MIDAS with 32-bit time_t may soon become as difficult as building 
32-bit MIDAS for 32-bit i686 VME processors. we have to move with the times.

what it means is that the history system data format will have to be updated to 64-bit 
time_t and at the same time, we may try to change the timestamp from mlogger-generated to 
frontend-generated.

but it is still not clear to me how that helps you, because the frontend-generated timestamp 
is not the true data timestamp that you wanted. (and only you know what the true data 
timestamp is and where it comes from and how to tell it to MIDAS).

K.O.
  170   22 Oct 2004 Konstantin OlchanskiBug Fixmhttpd message colouring
I commited a fix to mhttpd logic that decides which messages should be shown in
"red" colour- before, any message with square brackets and colons would be
highlighted in red. Now only messages matching the pattern [...:...] are
highlighted. The decision logic was moved into a function message_red(). K.O.
  174   09 Nov 2004 Pierre-Andre AmaudruzBug FixNew transition scheme
Problem:
If cm_set_transition_sequence() is used for changing the sequence number, the 
command odbedit> start/stop/resume/pause -v report the propre sequence but the
action on the client side is actually not performed!

Fix:
Local transition table updated in midas.c (1.226)

Note:
The transition number under /system/clients/<pid>/transition...
is used internally. Changing it won't have any effect on the client action
if sequence number is not registered.
  200   25 Feb 2005 Konstantin OlchanskiBug Fixfixed: double free in FORMAT_MIDAS ybos.c causing lazylogger crashes
We stumbled upon and fixed a "double free" bug in src/ybos.c causing crashes in
lazylogger writing .mid files in the FORMAT_MIDAS format (why does it use
ybos.c? Pierre says- for generic file i/o). Why this code had ever worked before
remains a mystery. K.O.
  211   05 May 2005 Konstantin OlchanskiBug Fixfix: minor bit rot in the example experiment
I fixed some minor bit rot in the example experiment: a few minor Makefile
problems, make the analyzer use the current histogram creation macros, etc. I
also added startup and shutdown scripts. These will be documented as we work
through them with our Summer student. K.O.
  212   02 Aug 2005 Konstantin OlchanskiBug Fixfix odb corruption when running analzer for the first time
I have been plagued by ODB corruption when I run the analyzer for the first time
after setting up the new experiment. Some time ago, I traced this to
mana.c::book_ttree() and now I found and fixed the bug, fix now commited to
midas cvs. In book_ttree(), db_find("/Analyzer/Bank switches") was returning an
error and setting hkey to zero. Then we called db_open_record() with hkey==0,
which cased ODB corruption later on. The normal db_validate_hkey() did not catch
this because it considers hkey==0 to be valid (when most likely it is not). K.O.
  216   18 Aug 2005 Konstantin OlchanskiBug Fixfix race condition between clients on run start/stop, pause/resume
It turns out that the new priority sequencing of run state transitions had a
flaw: the frontends, the analyzer and the logger all registered at priority 500
and were invoked in essentially a random order. For example the frontend could
get a begin-run transition before the logger and so start sending data before
the logger opened the output file. Same for the analyzer and same for the end of
run. Also the sequencing for pause/resume run and begin/end run was different
when the two pairs ought to have identical sequencing.

I now commited changes to mana.c and mlogger.c changing their transition sequencing:

start and resume:
200 - logger (mlogger.c, no change)
300 - analyzer (mana.c, was 500)
500 - frontends (mfe.c, no change)

stop and pause:
500 - frontends (mfe.c, no change)
700 - analyzer (mana.c, was 500)
800 - mlogger (mlogger.c, was 500)

P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
see an anomaly in the analyzer, where it receives data events after the
end-of-run transition.

K.O.
  219   01 Sep 2005 Stefan RittBug Fixfix race condition between clients on run start/stop, pause/resume
> It turns out that the new priority sequencing of run state transitions had a
> flaw: the frontends, the analyzer and the logger all registered at priority 500
> and were invoked in essentially a random order. For example the frontend could
> get a begin-run transition before the logger and so start sending data before
> the logger opened the output file. Same for the analyzer and same for the end of
> run. Also the sequencing for pause/resume run and begin/end run was different
> when the two pairs ought to have identical sequencing.
> 
> I now commited changes to mana.c and mlogger.c changing their transition sequencing:
> 
> start and resume:
> 200 - logger (mlogger.c, no change)
> 300 - analyzer (mana.c, was 500)
> 500 - frontends (mfe.c, no change)
> 
> stop and pause:
> 500 - frontends (mfe.c, no change)
> 700 - analyzer (mana.c, was 500)
> 800 - mlogger (mlogger.c, was 500)
> 
> P.S. However, even after this change, the TRIUMF ISAC/Dragon experiment still
> see an anomaly in the analyzer, where it receives data events after the
> end-of-run transition.
> 
> K.O.

Thanks for fixing that bug. It happend because during the implementatoin of the priority
sequencing we have up the pre/post tansition, which took care of the proper sequence
between the logger, frontend and analyzer. The way you modified the sequence is
absolutely correct. It is important to have >10 numbers "around" the frontends (like
450...550) in case one has an experiment with >10 frontends which need to make a
transition in a certain sequence (like the DANCE experiment in Los Alamos).
  229   17 Oct 2005 Exaos LeeBug Fix"make install" error under MacOS X
Under MacOS X, "make install" will cours an error like this:
...
install: darwin/bin/dio: No such file or directory
make: *** [install] Error 71

This can be fixed as the following diff:
404,405c404,405
< $(BIN_DIR)/mcnaf: $(UTL_DIR)/mcnaf.c $(DRV_DIR)/camac/camacrpc.c
<       $(CC) $(CFLAGS) $(OSFLAGS) -o $@ $(UTL_DIR)/mcnaf.c $(DRV_DIR)/camac/camacrpc.c $(LIB) $(LIBS)
---
> $(BIN_DIR)/mcnaf: $(UTL_DIR)/mcnaf.c $(DRV_DIR)/bus/camacrpc.c
>       $(CC) $(CFLAGS) $(OSFLAGS) -o $@ $(UTL_DIR)/mcnaf.c $(DRV_DIR)/bus/camacrpc.c $(LIB) $(LIBS)
438c438,439
<       @for i in mserver mhttpd odbedit mlogger ; \
---
> 
>       @for i in mserver mhttpd odbedit mlogger dio ; \
444,447d444
<       chmod +s $(SYSBIN_DIR)/mhttpd
< 
< ifeq ($(OSTYPE),linux)
<       install -v -m 755 $(BIN_DIR)/dio $(SYSBIN_DIR)
449c446
< endif
---
>       chmod +s $(SYSBIN_DIR)/mhttpd
  235   23 Nov 2005 Stefan RittBug FixEndian swapping in mana.c
It was reported that following code in mana.c :
  /* swap event header if in wrong format */
  if (pevent->serial_number > 0x1000000) {
     WORD_SWAP(&pevent->event_id);
     WORD_SWAP(&pevent->trigger_mask);
     DWORD_SWAP(&pevent->serial_number);
     DWORD_SWAP(&pevent->time_stamp);
     DWORD_SWAP(&pevent->data_size);
  }

does not work correctly for events having a true serial number above 16777216 (=0x10000000). After some considerations, I concluded that there is no good way to determine automatically the endian format of midas events, without adding another field in the header, which would break the compatibility with all recorded data up to date. I therefore changed the above code to
  /* swap event header if in wrong format */
#ifdef SWAP_EVENTS
  WORD_SWAP(&pevent->event_id);
  WORD_SWAP(&pevent->trigger_mask);
  DWORD_SWAP(&pevent->serial_number);
  DWORD_SWAP(&pevent->time_stamp);
  DWORD_SWAP(&pevent->data_size);
#endif

So if one wants to analyze events with the midas analyzer on a PC system for example where the events come from a VxWorks system with the opposite endian encoding, one has to set the flag -DSWAP_EVENTS when compiling the analyzer for that type of analysis.
ELOG V3.1.4-2e1708b5