Back Midas Rome Roody Rootana
  Midas DAQ System  Not logged in ELOG logo
Entry  07 Oct 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
    Reply  07 Oct 2024, Ben Smith, Bug Report, Difficulty running MIDAS on Rocky 9.4 
       Reply  08 Oct 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
          Reply  08 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
             Reply  08 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
    Reply  07 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
       Reply  08 Oct 2024, Mark Grimes, Bug Report, Difficulty running MIDAS on Rocky 9.4 
          Reply  08 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
       Reply  08 Oct 2024, Amy Roberts, Bug Report, Difficulty running MIDAS on Rocky 9.4 
          Reply  08 Oct 2024, Konstantin Olchanski, Bug Report, Difficulty running MIDAS on Rocky 9.4 
Message ID: 2869     Entry time: 08 Oct 2024     In reply to: 2864
Author: Konstantin Olchanski 
Topic: Bug Report 
Subject: Difficulty running MIDAS on Rocky 9.4 
> Basically I think Rocky 9.4 is a red herring.

This is what likely happened:
- some program crashed while holding the ODB lock semaphore (by ctrl-C at the wrong time or by kill -KILL at the wring time)
- semaphore is locked with a flag "unlock if this program stops"
- this is supposed to ensure ODB lock semaphore never gets stuck in the locked state
- (there is no code in MIDAS to unlock the ODB lock semaphore without locking it first)
- we have observed a malfunction in the Linux kernel, where this automatic unlock does not happen.
- it is rare and so far cannot be reproduced. you can find more about it by searching this forum.
- I think it is a bug in the System-V semaphore code or in the Linux "program stop" code (a path where they fail to call the semaphore unlock handler, and who knows what 
other handlers).
- System-V semaphores are very obsolete, replaced by POSIX semaphores.
- POSIX semaphores do not have the "unlock if this program stops" magic, the user (MIDAS) is responsible with detecting that the program who locked the semaphore is gone 
and and with taking corrective action, i.e. release the lock, automatically.

I do not know if this problem with System-V semaphores is in the generic Linux kernel, or if it is specific
to the Red Hat kernels (they are known to have many patches, deviating quite far from vanilla kernels).

I do not know if this problem is somehow sensitive to virtual machines.

So yes/no, Red Hat derived linux on a virtual machine could be where this problem happens more often that elsewhere.

K.O.
ELOG V3.1.4-2e1708b5