Back Midas Rome Roody Rootana
  Midas DAQ System  Not logged in ELOG logo
Entry  07 Oct 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
    Reply  07 Oct 2024, Ben Smith, Bug Report, Difficulty running MIDAS on Rocky 9.4 
       Reply  08 Oct 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
          Reply  08 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
             Reply  08 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
    Reply  07 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
       Reply  08 Oct 2024, Mark Grimes, Bug Report, Difficulty running MIDAS on Rocky 9.4 
          Reply  08 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
       Reply  08 Oct 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
          Reply  08 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
             Reply  10 Oct 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
                Reply  13 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
                   Reply  16 Oct 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
                      Reply  18 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
                         Reply  18 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
                            Reply  28 Oct 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
    Reply  06 Nov 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
       Reply  22 Nov 2024, Konstantin Olchanski, Bug Report, ODB lock timeout, Difficulty running MIDAS on Rocky 9.4 
          Reply  24 Nov 2024, Pavel Murat, Bug Report, ODB lock timeout, Difficulty running MIDAS on Rocky 9.4 
Message ID: 2865     Entry time: 08 Oct 2024     In reply to: 2862     Reply to this: 2868
Author: Amy Roberts 
Topic: Bug Report 
Subject: Difficulty running MIDAS on Rocky 9.4 
> > We're trying to install the SuperCDMS version of MIDAS on a Rocky 9.4 Virtual 
> > Machine and are getting a persistent error when we run mserver.  As far as I 
> > know there are minimal changes between this and the MIDAS branch, but Ben Smith 
> > may have more to say on this.
> 
> For reference, "the SuperCDMS version of MIDAS" is just a fork that no longer has any meaningful differences vs the main MIDAS repo, but we only pull updates infrequently after testing a bunch. We last pulled from the develop branch in November 2023. But that should be irrelevant here as semaphore code hasn't been touched for a very long time.
> 
> We're running Alma 9.4 on a machine at TRIUMF and the same version of midas works fine there (Amy, you may already have access to scdms-zeus). I believe Alma and Rocky should be basically identical for this.
> 
> So the questions are:
> * Have you tried other midas programs, or only mserver? E.g. did odbedit and mhttpd work?
> * If other programs work, have you been running them all as the same user? In particular, if you ran one program as root and another as an unprivileged user, then you will likely get odd permissions issues.
> * What do you see if you run `ls -l /dev/shm` and `ls -l ~/packages/SuperCDMS_DAQ/MidasDAQ/online/.*SHM`? (Or wherever your online dir is for the 2nd one).
> * Did you follow the full instructions for recovering from a corrupt ODB? https://daq00.triumf.ca/MidasWiki/index.php/FAQ#How_to_recover_from_a_corrupted_ODB   In particular the bit about running odbinit with the --cleanup flag?

Here's what happens when I try to run odbedit:

[lekhraj@sdfcdmsdaq setup]$ odbedit
[ODBEdit,ERROR] [odb.cxx:2052:db_open_database,ERROR] Removed ODB client 'ODBEdit', index 0 because process pid 481823 does not exists
[ODBEdit,INFO] Removed open record flag from "/Experiment/Security/RPC hosts/Allowed hosts"
[ODBEdit,INFO] Removed exclusive access mode from "/Experiment/Security/RPC hosts/Allowed hosts"
[ODBEdit,INFO] Corrected 1 ODB entries
[ODBEdit,INFO] Deleted entry '/System/Clients/481823' for client 'ODBEdit' because it is not connected to ODB
[ODBEdit,INFO] Client 'ODBEdit' on buffer 'SYSMSG' removed by bm_open_buffer because process pid 481823 does not exist
[ODBEdit,ERROR] [odb.cxx:2498:db_lock_database,ERROR] cannot lock ODB semaphore, timeout 10000 ms, exiting...
[ODBEdit,ERROR] [midas.cxx:2205:cm_check_connect,ERROR] cm_disconnect_experiment not called at end of program
db_lock_database: Detected recursive call to db_{lock,unlock}_database() while already inside db_{lock,unlock}_database(). Maybe this is a call from a signal handler. Cannot continue, aborting...
Aborted (core dumped)
[lekhraj@sdfcdmsdaq setup]$

And mhttpd:

[lekhraj@sdfcdmsdaq setup]$ mhttpd
[mhttpd,ERROR] [odb.cxx:2052:db_open_database,ERROR] Removed ODB client 'ODBEdit', index 0 because process pid 601054 does not exists
[mhttpd,INFO] Removed open record flag from "/Experiment/Security/RPC hosts/Allowed hosts"
[mhttpd,INFO] Removed exclusive access mode from "/Experiment/Security/RPC hosts/Allowed hosts"
[mhttpd,INFO] Corrected 1 ODB entries
[mhttpd,INFO] Deleted entry '/System/Clients/601054' for client 'ODBEdit' because it is not connected to ODB
[mhttpd,INFO] Client 'ODBEdit' on buffer 'SYSMSG' removed by bm_open_buffer because process pid 601054 does not exist
[mhttpd,INFO] ODB subtree /Runinfo corrected successfully
Password protection is off
Hostlist off, connections from anywhere will be accepted
Listening on "http://localhost:8080", passwords OFF, hostlist OFF
Listening on "http://[::1]:8080", passwords OFF, hostlist OFF
bm_lock_buffer: Lock buffer "SYSMSG" is taking longer than 1 second!
[mhttpd,ERROR] [odb.cxx:2498:db_lock_database,ERROR] cannot lock ODB semaphore, timeout 10000 ms, exiting...
[mhttpd,ERROR] [midas.cxx:2205:cm_check_connect,ERROR] cm_disconnect_experiment not called at end of program
db_lock_database: Detected recursive call to db_{lock,unlock}_database() while already inside db_{lock,unlock}_database(). Maybe this is a call from a signal handler. Cannot continue, aborting...
Aborted (core dumped)
[lekhraj@sdfcdmsdaq setup]$

We have been running everything as a single user, the user who cloned the repositories and owns the directories.

We did follow the corrupted-ODB cleanup instructions.

[lekhraj@sdfcdmsdaq setup]$ ls -lh /dev/shm
total 1.3M
-rw------- 1 lekhraj dm 1.2M Oct  8 14:13 17468_test_ODB__sdf_home_l_lekhraj_packages_SuperCDMS_DAQ_MidasDAQ_online_
-rw------- 1 lekhraj dm 114K Oct  7 14:06 17468_test_SYSMSG__sdf_home_l_lekhraj_packages_SuperCDMS_DAQ_MidasDAQ_online_

[lekhraj@sdfcdmsdaq setup]$ ls -lh ~/packages/SuperCDMS_DAQ/MidasDAQ/online/.*SHM
-rw-r--r-- 1 lekhraj dm    0 Oct  3 08:46 /sdf/home/l/lekhraj/packages/SuperCDMS_DAQ/MidasDAQ/online/.ALARM.SHM
-rw-r--r-- 1 lekhraj dm    0 Oct  3 08:46 /sdf/home/l/lekhraj/packages/SuperCDMS_DAQ/MidasDAQ/online/.ELOG.SHM
-rw-r--r-- 1 lekhraj dm    0 Oct  3 08:46 /sdf/home/l/lekhraj/packages/SuperCDMS_DAQ/MidasDAQ/online/.HISTORY.SHM
-rw-r--r-- 1 lekhraj dm    0 Oct  3 08:46 /sdf/home/l/lekhraj/packages/SuperCDMS_DAQ/MidasDAQ/online/.LAZY.SHM
-rw-r--r-- 1 lekhraj dm    0 Oct  3 08:46 /sdf/home/l/lekhraj/packages/SuperCDMS_DAQ/MidasDAQ/online/.MSG.SHM
-rw-r--r-- 1 lekhraj dm 1.2M Oct  8 14:12 /sdf/home/l/lekhraj/packages/SuperCDMS_DAQ/MidasDAQ/online/.ODB.SHM
-rw-r--r-- 1 lekhraj dm    0 Oct  3 08:46 /sdf/home/l/lekhraj/packages/SuperCDMS_DAQ/MidasDAQ/online/.SYSMSG.SHM
-rw-r--r-- 1 lekhraj dm    0 Oct  3 08:46 /sdf/home/l/lekhraj/packages/SuperCDMS_DAQ/MidasDAQ/online/.SYSTEM.SHM
ELOG V3.1.4-2e1708b5