ID | Date | Author | Topic | Subject

3007 | 28 Mar 2025 | Konstantin Olchanski | Info | mjsroot added

I need to look at histograms inside a ROOT file, but all the old ways of doing this no longer work. (In theory I can scp the ROOT file to the computer I am sitting in front of, but this assumes I have a working ROOT there. Anyhow, it is pointless to fight this; all modern packages are written to only work on the developer's laptop.)
- "root" + "new TBrowser" starts a web server and tries to open firefox (and fails)
- "root --web=off" + "new TBrowser" over an ssh X11 tunnel no longer works, ROOT X11 graphics refresh is broken
- the macos ROOT binary kit is built without X11 support, "root --web=off" does not work at all
- the root7-recommended "rootssh" prints an error message (and fails)
What does work well is JSROOT, which we use to look at manalyzer live histograms (through the apache and mhttpd web proxies). So I wrote mjsroot.exe. It opens a ROOT file and starts JSROOT to look at it (plus a bit of dancing around to make it actually work):
mjsroot.exe -R8082 root_output_files/output00371.root
To actually see the histograms:
a) if you are sitting in front of the same computer, open http://localhost:8082
b) if you are somewhere else, start an ssh tunnel: ssh daq13 -L8082:localhost:8082, then open http://localhost:8082
c) if daq13 is running mhttpd, set up an http proxy:
set ODB /webserver/proxy/mjsroot to http://localhost:8082
open https://daq13.triumf.ca/proxy/mjsroot/
also:
set ODB /alias/mjsroot to "/proxy/mjsroot/"
reload the MIDAS status page, observe "mjsroot" listed on the left-hand side, and open it.
K.O.
3008 | 28 Mar 2025 | Konstantin Olchanski | Bug Fix | midas cmake update

MIDAS git tag midas-2025-01-a introduced an incompatible change to "include midas-targets.cmake". Instead of "midas" one now has to say "midas::midas", as updated below. K.O.
>
> #
> # CMakeLists.txt for alpha-g frontends
> #
>
> cmake_minimum_required(VERSION 3.12)
> project(agdaq_frontends)
>
> include($ENV{MIDASSYS}/lib/midas-targets.cmake)
>
> add_compile_options("-O2")
> add_compile_options("-g")
> #add_compile_options("-std=c++11")
> add_compile_options(-Wall -Wformat=2 -Wno-format-nonliteral -Wno-strict-aliasing -Wuninitialized -Wno-unused-function)
> add_compile_options("-DTMFE_REV0")
> add_compile_options("-DOS_LINUX")
>
> add_executable(feevb feevb.cxx TsSync.cxx)
> target_link_libraries(feevb midas::midas)
>
> add_executable(fectrl fectrl.cxx GrifComm.cxx EsperComm.cxx JsonTo.cxx KOtcp.cxx $ENV{MIDASSYS}/src/tmfe_rev0.cxx)
> target_link_libraries(fectrl midas::midas)
>
> #end

3009 | 28 Mar 2025 | Konstantin Olchanski | Suggestion | improved find_package behaviour for Midas

I figured out the breakage, added a git tag to identify (roughly) where the cmake-incompatible change was made, and posted a note on how to fix it. Please reimburse me for the 2 hours I had to spend on this instead of doing useful work. K.O.
3010 | 28 Mar 2025 | Konstantin Olchanski | Bug Fix | manalyzer -R8082 --jsroot

When processing MIDAS files offline, JSROOT did not work: -Rxxx worked and the http connection would open, but it would not serve any histograms. This should now be fixed.
In addition, after processing all input MIDAS files, manalyzer would normally exit and JSROOT would abruptly stop. To look at the final results one had to open the ROOT files using some other method (roody, TBrowser, mjsroot, etc.).
I have now added a command line switch "--jsroot"; if supplied, after processing all input MIDAS files, manalyzer keeps running in JSROOT server mode (same as mjsroot).
"manalyzer -R8082 --jsroot run*.mid.lz4" now does something useful: open http://localhost:8082 (or an ssh tunnel or an mhttpd proxy per my mjsroot message) and watch histograms fill in real time; after the analysis finishes, keep looking at the final results until bored, then stop manalyzer using Ctrl-C. (We should add a "Stop JSROOT" button to the JSROOT main page.)
MIDAS commit 1d0d6448c3ec4ffd225b8d2030fe13e379fcd007
K.O.

3011 | 30 Mar 2025 | Konstantin Olchanski | Bug Fix | manalyzer improvements

Updated manalyzer:
- similar to the --jsroot switch, in online mode the ROOT output file now remains open after the run is stopped. Previously, after the run was stopped, all histograms etc. would disappear from JSROOT, making it hard to look at the full collected and analyzed data.
- there was a buglet in the multithreading code: if some module cannot analyze flow events as fast as we can read data from disk, the flow event queue of the first module thread would grow and grow without bound, potentially consuming lots of RAM. This is because the queue size check for the first module thread was disabled to avoid a deadlock. I have now added the queue size check to the main event loop (both offline mode and online mode), so this problem should be fixed; see the sketch after this list.
- also adjusted the default queue size from 100 to 1000 and queue-full wait sleep time from 100 us to 10 us.
- another buglet was in the flow event processing: per the README, module EndRun() should not generate flow events (they should be generated in PreEndRun() instead). Previously this was not enforced; now there is an error message about it and the offending flow events are deleted (they were not being processed anyway).
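The back-pressure idea, as a minimal sketch (illustrative only; the function and constant names are invented, the actual manalyzer code differs):

#include <unistd.h> // usleep()

const size_t kMaxQueueSize     = 1000; // default queue size, was 100
const int    kQueueFullSleepUs = 10;   // queue-full wait, was 100 us

// before handing the next flow event to the first module thread,
// wait for its queue to drain below the limit instead of letting it grow forever
void queue_with_backpressure(TAFlowEvent* flow)
{
   while (first_module_queue_size() >= kMaxQueueSize)
      usleep(kQueueFullSleepUs);
   push_to_first_module_queue(flow);
}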
K.O.

3017 | 01 Apr 2025 | Konstantin Olchanski | Bug Fix | ODB and event buffer - release semaphore before abort() and core dump

There is a long-standing problem with ODB and event buffers. If they detect an internal data inconsistency and cannot continue running, they call abort() to dump core and stop.
The problem is that in some code paths they do this while holding the ODB or event buffer semaphore. (The Linux kernel automatically releases SYSV semaphores after the core dump is finished and the program holding them has stopped.)
If the core dump takes longer than 10 seconds (for whatever reason, but we see this often enough), all other programs that wait for ODB or event buffer access will also time out and also crash (with core dumps). The result is a core dump storm; at the end, all MIDAS programs have crashed. (Luckily recovery is easy, simply restart everything.)
Now I realize that in many situations we do not need to hold the semaphore while dumping core - the content of the ODB and event buffer shared memories is not important for debugging the crash - so it is safe to release the semaphore before calling abort().
This is now implemented for ODB and event buffers. Hopefully core dump storms
will not happen again.
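The pattern, as a minimal self-contained sketch (std::mutex standing in for the SYSV semaphore; not the actual MIDAS code):

#include <cstdlib>
#include <mutex>

std::mutex shm_lock; // stand-in for the ODB or event buffer semaphore

[[noreturn]] void fatal_inconsistency(std::unique_lock<std::mutex>& guard)
{
   // the shared memory content is not needed to debug the crash,
   // so release the lock first: other clients will not time out and crash
   guard.unlock();
   abort(); // core dump proceeds without holding the lock
}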
commit 96369c29deba1752fd3d25bed53e6594773d7e1a
release ODB semaphore before calling abort() to dump core. if core dump takes
longer than 10 sec all other midas programs will timeout and crash.
commit 2506406813f1e7581572f0d5721d3761b7c8e8dd
unlock event buffer before calling abort() in bm_validate_client_index_locked(),
refactor bm_get_my_client_locked()
K.O.

3018 | 01 Apr 2025 | Konstantin Olchanski | Bug Report | ODB corruption

We see ODB corruption crashes in the DS20k vertical slice MIDAS instance. The crash is a memset() called by db_delete_key1() called by cm_connect_experiment().
Looking at the source code, I see that ODB pkey and hkey validation is absent from most iterators, so it is possible for a "bad" pkey to cause corruption. Many other places in the ODB code use db_get_pkey() and db_validate_hkey() to prevent invalid data from causing further corruption and breakage; an illustrative sketch of that defensive pattern is below.
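An illustrative example of such a check (simplified; the real db_get_pkey()/db_validate_hkey() checks are more thorough):

// refuse to operate on an obviously corrupted key before touching pkey->data;
// the pkey in the stack trace below fails this: type = 1684370529 is garbage
bool pkey_looks_valid(const KEY* pkey)
{
   if (!pkey)
      return false;
   if (pkey->type == 0 || pkey->type >= TID_LAST)
      return false; // not a valid MIDAS TID
   if (pkey->total_size < 0 || pkey->item_size < 0)
      return false;
   return true;
}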
Also, db_delete_key1() needs to be refactored and renamed db_delete_key_wlocked(). I will not do this immediately today, but hopefully next week or so.
The stack trace is attached; observe how free_data() was called on a completely invalid pkey: bad pkey->type, bad pkey sizes, etc.
#0 __memset_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:250
250 ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memset_avx512_unaligned_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:250
#1  0x00005ad4102b4217 in memset (__len=<optimized out>, __ch=0, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:59
#2  free_data (pheader=pheader@entry=0x75aaea4f4000, address=0x75aaed4cea50, size=<optimized out>, caller=caller@entry=0x5ad4102ffb6c "db_delete_key1") at /home/dsdaqdev/packages_common/midas/src/odb.cxx:513
#3  0x00005ad4102b6a5b in free_data (caller=0x5ad4102ffb6c "db_delete_key1", size=<optimized out>, address=<optimized out>, pheader=0x75aaea4f4000) at /home/dsdaqdev/packages_common/midas/src/odb.cxx:453
#4  db_delete_key1 (hDB=1, hKey=<optimized out>, level=<optimized out>, follow_links=0) at /home/dsdaqdev/packages_common/midas/src/odb.cxx:3789
#5  0x00005ad4102b6979 in db_delete_key1 (hDB=1, hKey=288672, level=0, follow_links=0) at /home/dsdaqdev/packages_common/midas/src/odb.cxx:3731
#6  0x00005ad4102cc923 in db_create_record (hDB=hDB@entry=1, hKey=hKey@entry=0, orig_key_name=orig_key_name@entry=0x7ffd75987280 "/Programs/ODBEdit", init_str=<optimized out>) at /home/dsdaqdev/packages_common/midas/src/odb.cxx:12916
#7  0x00005ad4102cca73 in db_create_record (hDB=hDB@entry=1, hKey=hKey@entry=0, orig_key_name=orig_key_name@entry=0x7ffd75987280 "/Programs/ODBEdit", init_str=<optimized out>) at /home/dsdaqdev/packages_common/midas/src/odb.cxx:12942
#8  0x00005ad4102a00ba in cm_set_client_info (hDB=1, hKeyClient=0x7ffd75987420, host_name=0x5ad412262ee0 "dsdaqgw.triumf.ca", client_name=0x7ffd759874c0 "ODBEdit", hw_type=<optimized out>, password=<optimized out>, watchdog_timeout=<optimized out>) at /usr/include/c++/11/bits/basic_string.h:194
#9  0x00005ad4102a902b in cm_connect_experiment1 (host_name=<optimized out>, host_name@entry=0x7ffd759876c0 "", default_exp_name=default_exp_name@entry=0x7ffd759876a0 "vslice", client_name=client_name@entry=0x5ad4102f28fa "ODBEdit", func=func@entry=0x0, odb_size=odb_size@entry=1048576, watchdog_timeout=<optimized out>, watchdog_timeout@entry=10000) at /usr/include/c++/11/bits/basic_string.h:194
#10 0x00005ad41027e58d in main (argc=3, argv=0x7ffd759881e8) at /home/dsdaqdev/packages_common/midas/progs/odbedit.cxx:3025
(gdb) up
...
#4  db_delete_key1 (hDB=1, hKey=<optimized out>, level=<optimized out>, follow_links=0) at /home/dsdaqdev/packages_common/midas/src/odb.cxx:3789
3789            free_data(pheader, (char *) pheader + pkey->data, pkey->total_size, "db_delete_key1");
(gdb) p pkey
$1 = (KEY *) 0x75aaea53b400
(gdb) p *pkey
$2 = {type = 1684370529, num_values = 0, name = '\000' <repeats 16 times>, "xQ\375\002\004\000\000\000\004\000\000\000\a\000\000", data = 0, total_size = 290944, item_size = 1743544378, access_mode = 0, notify_count = 0, next_key = 15, parent_keylist = 1, last_written = 1953785965}
(gdb)
K.O.

3019 | 01 Apr 2025 | Konstantin Olchanski | Suggestion | Sequencer ODBSET feature requests

> ODBSET "/Path/value[1,3,5]"
> ODBSET "/Path/value[1-5,7-9]"
we support this array index syntax in several places, specifically in the javascript odb get and set mjsonrpc RPCs.
> SET GOODCHANNELS, "1-5,7,9"; ODBSET "/Path/value[$GOODCHANNELS]"
> SET BADCHANNELS, "6,8"; ODBSET "/Path/value[!$BADCHANNELS]"
> ODBSET "/Path/value[0-100, except $BADCHANNELS]"
this is very clever syntax, but I have not seen any programming language actually implement it (not even perl). there must be a good reason why nobody does this; probably we should not do it either.
but as Stefan said (and it is my opinion too), the route of extending the MIDAS sequencer language until it becomes a superset of python, perl, tcl, bash, javascript and algol is not a sustainable approach. I once looked at using Lua for this, but I think basing it on a full-featured programming language like python is better.
K.O.

3020 | 01 Apr 2025 | Konstantin Olchanski | Bug Report | MIDAS history system not using the event timestamps ?

> I confirm that when writing out history files corresponding to the slow control event data,
> MIDAS history system timestamps the data not with the event time coming from the event data,
> but with the current time determined by [mlogger].
This is correct. The timestamp in the history file is the mlogger timestamp.
In theory we could use the ODB "last_written" timestamp, but in practice timestamps have 1-second granularity and the difference between the two timestamps would normally be less than 1 second (the time to react to db_watch()).
But ODB last_written is also not the data timestamp: for remotely connected clients it includes the mserver communication delay.
Only the user knows what the true data timestamp is - for some FPGA-based equipment I can see the data timestamp being read from an FPGA register together with the data.
But back to earth.
For making history plots, 1-second granularity with a small (a few seconds) delay should be okay, and I think the mlogger timestamp is good enough.
For data analysis, you are reading history data from a history data file and you are not constrained to using the MIDAS timestamp. You can always include your "true" data timestamp as the first value in your data; see the sketch below. We do this in felaview for writing labview data to the midas history in the ALPHA antihydrogen experiment at CERN.
This also anticipates your next request (can we have millisecond, microsecond, nanosecond history timestamps): since you define your "true" data timestamp, you can make it anything you want. (I use "double" time in seconds; a 64-bit IEEE-754 "double" has enough precision for microsecond granularity. FPGA-based devices can have timestamps with 10 ns or 8 ns granularity, in which case a uint64_t clock counter could be more appropriate.)
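For example, a sketch (the structure and field names are illustrative):

#include <cstdint>

// slow control data with the "true" timestamp as the first value
struct SlowData {
   double t;           // true data time in seconds since the UNIX epoch;
                       // IEEE-754 double keeps microsecond precision
   double temperature; // the actual measurements follow
   double pressure;
};
// for an FPGA device with 10 ns or 8 ns ticks, "uint64_t clock_counter"
// as the first value would be the more appropriate choice.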
K.O.

3023 | 02 Apr 2025 | Konstantin Olchanski | Bug Report | MIDAS history system not using the event timestamps ?

> > You can always include your "true" data timestamp as the first value in your data.
>
> Are you saying that if the first data word of a history event were a timestamp,
> the MIDAS history system, when plotting the time dependencies, would use that timestamp
> instead of the mlogger timestamp?
>
you are correct, midas knows nothing about what you put in the history data.
what I suggested is: if you want your true data timestamp recorded in the history, you can put it into the history data yourself. I suggested using the 1st value, but you can also make it the last value or the 10th value; it is up to you.
for making history plots, the history timestamp is used; as you wrote and I confirmed, this timestamp is generated by mlogger.
what is not clear to me is why this is a problem. do you see a big difference between the true data timestamp and the mlogger timestamp? bigger than 1 second? (this would change the shape of "last 10 minutes" plots (600 seconds)). bigger than 1 minute? (this would change the shape of "last 1 hour" plots (60 minutes, 3600 seconds)).
that said, note that we currently store the timestamp as a DWORD 32-bit UNIX time value, which will overflow in 2038 and which is quickly becoming incompatible with the ongoing switch to 64-bit time_t. Ubuntu-24 already builds a large number of system libraries with 64-bit time_t, and building MIDAS with 32-bit time_t may soon become as difficult as building 32-bit MIDAS for 32-bit i686 VME processors. we have to move with the times.
what this means is that the history system data format will have to be updated to 64-bit time_t, and at the same time we may try to change the timestamp from mlogger-generated to frontend-generated.
but it is still not clear to me how that helps you, because the frontend-generated timestamp is still not the true data timestamp that you wanted. (and only you know what the true data timestamp is, where it comes from, and how to tell it to MIDAS).
K.O.

3024 | 02 Apr 2025 | Konstantin Olchanski | Suggestion | Sequencer ODBSET feature requests

> I once looked at using LUA for this
>
> > but I think basing off an full featured programming language like python
> > is better.
>
> if it came to a vote, my vote would go to Lua: it would allow to do everything needed,
> with much less external dependencies and with much less motivation to over-use the interpreter.
> The CMS experience was very teaching in this respect...
Unfortunately, I am not familiar enough with Lua to say how nice or how bad it is. And we are not sure how well it supports the single-line stepping that permits the nice graphical visualization of Stefan's sequencer.
It looks like python has single-line stepping built in as a standard feature, and python is a more popular and more versatile language, so to me python looks like a better choice compared to lua (obscure), perl ("nobody uses it anymore") or bash (ugly syntax).
K.O.

3034 | 05 May 2025 | Konstantin Olchanski | Bug Fix | Bug fix in SQL history

A bug was introduced into the SQL history in 2022 that made renaming of variable names not work. This is now fixed.
break commit:
54bbc9ed5d65d8409e8c9fe60b024e99c9f34a85
fix commit:
159d8d3912c8c92da7d6d674321c8a26b7ba68d4
P.S.
This problem was caused by an unfortunate design of the C++ class system. If I want to add more data to an existing class, I write this:

class old_class {
   int i, j, k;
};

class bigger_class: public old_class {
   int additional_variable;
};

But if I have this:

#include <cstdio>
#include <vector>

struct x { int i, j; };

class y {
public:
   std::vector<x> array_of_x;
};

and I want to add "k" to "x", C++ has no way to do this. The history code has this workaround:

class bigger_y: public y {
public:
   std::vector<int> array_of_k;
   void foo(int n);
};

void bigger_y::foo(int n) {
   printf("%d %d %d\n", array_of_x[n].i, array_of_x[n].j, array_of_k[n]);
}

The problem is that it is not obvious that "array_of_x" and "array_of_k" are connected, and they can easily get out of sync (if elements are added or removed). This is the bug that happened in the history code. I have now added assert(array_of_x.size()==array_of_k.size()) to offer at least some protection going forward.
P.P.S. As a final solution, I think I want to completely separate the file history and SQL history code; they have more things different than in common.
K.O.

3035 | 05 May 2025 | Konstantin Olchanski | Info | db_delete_key(TRUE)

I was working on an ODB corruption crash inside db_delete_key() and I noticed that I had not tested db_delete_key() with follow_links set to TRUE. Then I noticed that nobody anywhere seems to use db_delete_key() with follow_links set to TRUE. Instead of testing it, can I just remove it?
This feature has existed since day 1 (1st commit) and it does something unexpected compared to the filesystem's "/bin/rm": as best I can tell, it removes the link *and* whatever the link points to. For people familiar with "/bin/rm" this is somewhat unexpected, and by my thinking, if nobody ever added such a feature to "/bin/rm", it is probably not considered generally useful or desirable. (I would call it dangerous: it removes not 1 but 2 files, and the 2nd file could be in some other directory far away from where we are.)
By this thinking, I should remove "follow_links" (actually just make it do nothing, to reduce the disturbance to other source code). db_delete_key() should work similar to /bin/rm, aka the unlink() syscall.
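For comparison, this is what /bin/rm aka unlink() does with a symlink (a small standalone illustration):

#include <unistd.h>

int main()
{
   symlink("/tmp/target.txt", "/tmp/link.txt"); // create link -> target
   unlink("/tmp/link.txt"); // removes only the link; /tmp/target.txt is untouched
   return 0;
}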
K.O.

3036 | 05 May 2025 | Konstantin Olchanski | Bug Report | abort and core dump in cm_disconnect_experiment()

I noticed that some programs like mhist, if they take too long, abort and dump core at the very end. This is because they forgot to set/disable the watchdog timeout, and so they got removed from ODB and from the SYSMSG event buffer.
mhist is easy to fix, just add the missing call to disable the watchdog, but I also see a similar crash in the mserver, which of course requires the watchdog.
In either case, the crash is in cm_disconnect_experiment(), where we know we are shutting down and we know there is no useful information in the core dump.
I think I will fix it by adding a flag to bm_close_buffer() to bypass/avoid the crash from "we are already removed from this buffer"; see the hypothetical sketch below.
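Something along these lines (a hypothetical sketch of the proposed flag; the parameter name is invented, this is not the current MIDAS API):

// if our client slot was already removed by a watchdog timeout,
// skip the cleanup silently instead of calling abort()
INT bm_close_buffer(INT buffer_handle, bool ignore_removed_client);

// cm_disconnect_experiment() would then call:
//    bm_close_buffer(buffer_handle, /*ignore_removed_client=*/true);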
Stack trace from mhist:
[mhist,ERROR] [midas.cxx:5977:bm_validate_client_index,ERROR] My client index 6 in buffer 'SYSMSG' is invalid: client name '', pid 0 should be my pid 3113263
[mhist,ERROR] [midas.cxx:5980:bm_validate_client_index,ERROR] Maybe this client was removed by a timeout. See midas.log. Cannot continue, aborting...
bm_validate_client_index: My client index 6 in buffer 'SYSMSG' is invalid: client name '', pid 0 should be my pid 3113263
bm_validate_client_index: Maybe this client was removed by a timeout. See midas.log. Cannot continue, aborting...
Program received signal SIGABRT, Aborted.
Download failed: Invalid argument. Continuing without source file ./nptl/./nptl/pthread_kill.c.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
warning: 44 ./nptl/pthread_kill.c: No such file or directory
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffff71df27e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffff71c28ff in __GI_abort () at ./stdlib/abort.c:79
#5 0x00005555555768b4 in bm_validate_client_index_locked (pbuf_guard=...) at /home/olchansk/git/midas/src/midas.cxx:5993
#6 0x000055555557ed7a in bm_get_my_client_locked (pbuf_guard=...) at /home/olchansk/git/midas/src/midas.cxx:6000
#7 bm_close_buffer (buffer_handle=1) at /home/olchansk/git/midas/src/midas.cxx:7162
#8 0x000055555557f101 in cm_msg_close_buffer () at /home/olchansk/git/midas/src/midas.cxx:490
#9 0x000055555558506b in cm_disconnect_experiment () at /home/olchansk/git/midas/src/midas.cxx:2904
#10 0x000055555556d2ad in main (argc=<optimized out>, argv=<optimized out>) at /home/olchansk/git/midas/progs/mhist.cxx:882
(gdb)
Stack trace from mserver:
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=138048230684480) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=138048230684480) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=138048230684480) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=138048230684480, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007d8ddbc4e476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007d8ddbc347f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x000059beb439dab0 in bm_validate_client_index_locked (pbuf_guard=...) at /home/dsdaqdev/packages_common/midas/src/midas.cxx:5993
#6 0x000059beb43a859c in bm_get_my_client_locked (pbuf_guard=...) at /home/dsdaqdev/packages_common/midas/src/midas.cxx:6000
#7 bm_close_buffer (buffer_handle=<optimized out>) at /home/dsdaqdev/packages_common/midas/src/midas.cxx:7162
#8 0x000059beb43a89af in bm_close_all_buffers () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:7256
#9 bm_close_all_buffers () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:7243
#10 0x000059beb43afa20 in cm_disconnect_experiment () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:2905
#11 0x000059beb43afdd8 in rpc_check_channels () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:16317
#12 0x000059beb43b0cf5 in rpc_server_loop () at /home/dsdaqdev/packages_common/midas/src/midas.cxx:15858
#13 0x000059beb4390982 in main (argc=9, argv=0x7ffc07e5bed8) at /home/dsdaqdev/packages_common/midas/progs/mserver.cxx:387
K.O.

3040 | 16 May 2025 | Konstantin Olchanski | Bug Report | history_schema.cxx fails to build

> we have a CI setup which fails since 06.05.2025 to build the history_schema.cxx.
> There was a major change in this code in the commits fe7f6a6 and 159d8d3.
Missing from this report is critical information: HAVE_PGSQL is set.
I will have to check why it is not set in my development account.
I will have to check why it is not set in our bitbucket build.
Thank you for reporting this problem.
K.O.

3041 | 16 May 2025 | Konstantin Olchanski | Bug Report | history_schema.cxx fails to build

> > we have a CI setup which fails since 06.05.2025 to build the history_schema.cxx.
> > There was a major change in this code in the commits fe7f6a6 and 159d8d3.
>
> Missing from this report is critical information: HAVE_PGSQL is set.
>
> I will have to check why it is not set in my development account.
>
The following packages are needed to build MySQL and PgSQL support into MIDAS; they were missing on my development machine. MySQL support had been enabled by accident because the kde-bloat packages pull in the MySQL (not the MariaDB) client and server. Fixed now, and added to the standard list of Ubuntu packages:
https://daq00.triumf.ca/DaqWiki/index.php/Ubuntu#install_missing_packages
apt -y install mariadb-client libmariadb-dev ### mysql client for MIDAS
apt -y install postgresql-common libpq-dev ### postgresql client for MIDAS
>
> I will have to check why it is not set in our bitbucket build.
>
Added MySQL and PgSQL to bitbucket Ubuntu-24 build (sqlite was already enabled).
>
> Thank you for reporting this problem.
>
Fix committed. Sorry about this problem.
K.O.

3050 | 04 Jun 2025 | Konstantin Olchanski | Bug Report | Memory leak in mhttpd binary RPC code

Noted. I will look at this asap. K.O.
Mark Grimes wrote:

> Hi,
> During an evening of running we noticed that memory usage of mhttpd grew to close to 100Gb. We think we've traced this to the following issue when making RPC calls.
> - The brpc method allocates memory for the response at https://bitbucket.org/tmidas/midas/src/67db8627b9ae381e5e28800dfc4c350c5bd05e3f/src/mjsonrpc.cxx#lines-3449
> - It then makes the call at https://bitbucket.org/tmidas/midas/src/67db8627b9ae381e5e28800dfc4c350c5bd05e3f/src/mjsonrpc.cxx#lines-3460, which may set `buf_length` to zero if the response was empty.
> - It then uses `MJsonNode::MakeArrayBuffer` to pass ownership of the memory to an `MJsonNode`, providing `buf_length` as the size.
> - When the `MJsonNode` is destructed at https://bitbucket.org/tmidas/mjson/src/9d01b3f72722bbf7bcec32ae218fcc0825cc9e7f/mjson.cxx#lines-657, it only calls `free` on the buffer if the size is greater than zero.
> Hence, mhttpd will leak at least 1024 bytes for every binary RPC call that returns an empty response.
> I tried to submit a pull request to fix this but I don't have permission to push to https://bitbucket.org/tmidas/mjson.git. Could somebody take a look?
> Thanks,
> Mark.
3055 | 10 Jun 2025 | Konstantin Olchanski | Bug Report | Memory leak in mhttpd binary RPC code

I confirm that MJSON_ARRAYBUFFER does not work correctly for zero-size buffers: the buffer is leaked in the destructor and copied as NULL in MJsonNode::Copy().
I also confirm a memory leak in the mjsonrpc "brpc" error path (already fixed).
Affected by the MJSON_ARRAYBUFFER memory leak are "brpc" (where user code returns a zero-size data buffer) and "js_read_binary_file" (when reading from an empty file, the return of "new char[0]" is never freed).
The "receive_event" and "read_history" RPCs never use zero-size buffers and are not affected by this bug.
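The leak pattern, reconstructed as a simplified sketch (not the exact mjson/mjsonrpc code):

char* buf = (char*)malloc(1024); // response buffer allocated up front
size_t buf_length = 0;           // user code returned an empty response
MJsonNode* node = MJsonNode::MakeArrayBuffer(buf, buf_length); // node takes ownership
delete node; // destructor only calls free(buffer) if size > 0, so buf is leaked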
mjson commit c798c1f0a835f6cea3e505a87bbb4a12b701196c
midas commit 576f2216ba2575b8857070ce7397210555f864e5
rootana commit a0d9bb4d8459f1528f0882bced9f2ab778580295
Please post bug reports as plain text so I can quote from them.
K.O.

3065 | 23 Jul 2025 | Konstantin Olchanski | Suggestion | K.O.'s guide to new C/C++ data types

Over the last 10 years, the traditional C/C++ data types have been displaced by a hodgepodge of new data types that promise portability and generate useful (and not so useful) warnings. For example:
for (int i=0; i<array_of_10_elements.size(); i++)
is now a signed/unsigned comparison warning with a promise of a crash (in theory, even if "int" is 64 bit). "int" and "long" are dead; welcome "size_t", "off64_t" & co.
What to do, what to do? This is what I figured out:
1) for data returned from hardware: use uint16_t, uint32_t, uint64_t, uint128_t (u16, u32, u64 in the Linux kernel); they have well-defined widths to match hardware (FPGA, AXI, VME, etc.) data widths.
2) for variables used with strlen(), array.size(), etc.: use size_t, a data type wide enough to store the biggest data size possible on the hardware (32-bit on 32-bit machines, 64-bit on 64-bit machines). use with printf("%zu").
3) for return values of the read() and write() syscalls: use ssize_t and observe an inconsistency: read() and write() take a size_t count (32/64 bits) but return an ssize_t (31/63 bits), and the error check code cannot be written without having to defeat the C/C++ type system (a cast to size_t):
size_t s = 100;
void* ptr = malloc(s);
ssize_t rd = read(fd, ptr, s);
if (rd < 0)               { /* syscall error */ }
else if ((size_t)rd != s) { /* short read, important for TCP sockets */ }
else                      { /* good read */ }
use ssize_t with printf("%zd")
4) file access uses off64_t with lseek64() and ftruncate64(); this is a signed type (to avoid the cast in the error handling code) with a max file size of 2^63 (at current $/GB, storage for a file of max size costs $$$$$; you cannot have enough money to afford one). use with printf("%jd", (intmax_t)v): intmax_t is by definition big enough for all off64_t values and "%jd" is the corresponding printf() format. see the sketch after item 5.
5) there is no inconsistency between 32-bit size_t and 64-bit off64_t: on 32-bit systems you can only read files in small chunks, but you can lseek64() to any place in the file.
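A minimal sketch of (4), reusing the fd from the read() example above (Linux-specific; assumes the LFS lseek64() interface is available):

off64_t size = lseek64(fd, 0, SEEK_END); // seek to the end to learn the file size
if (size < 0) { /* syscall error; no cast needed, off64_t is signed */ }
printf("file size: %jd bytes\n", (intmax_t)size);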
BTW, 64-bit time_t has arrived with Ubuntu LTS 24.04; I will write about this some other time.

3066 | 24 Jul 2025 | Konstantin Olchanski | Suggestion | K.O.'s guide to new C/C++ data types

> for (int i=0; i<array_of_10_elements.size(); i++)
becomes
for (size_t i=0; i<array.size(); i++)
but for a reverse loop, replacing "int" with "size_t" introduces a bug:
for (size_t i=array.size()-1; i>=0; i--)
explodes: the last iteration should run with i set to 0, but then i-- wraps around to a very big positive value, the loop end condition (i>=0) is still true, and the loop never ends. (why is there no GCC warning that with "size_t i", "i>=0" is always true?)
a kludge solution is:
for (size_t i=array.size()-1; ; i--) {
do_stuff(i, array[i]);
if (i==0) break;
}
if you do not need the index variable, you can use a reverse iterator (which is missing from a few
container classes).
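For example, with std::vector (no index variable, so no wraparound hazard):

for (auto it = array.rbegin(); it != array.rend(); ++it)
   do_stuff(*it); // visits array.back() first, array.front() last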
K.O. |