Back Midas Rome Roody Rootana
  Midas DAQ System, Page 126 of 138  Not logged in ELOG logo
ID Date Author Topic Subjectup
  2560   21 Jul 2023 Gennaro TortoneForumpull request for PostgreSQL support









Hi Konstantin,

thanks a lot for your work on PostgreSQL and TimescaleDB integration...
and sorry for unrelated changes on source code !

I will return on this task at end of this year (maybe October or November) because
I'm working on different tasks... but I will keep in mind your suggestions in order
to provide good source code.

Thanks,
Gennaro


> 
> I merged the PgSql bits by hand - the automatic tools make a dog's breakfast from the history_schema.cxx diffs. Ouch.
> 
> history_schema.cxx merged pretty much cleanly, but I have one question about CreateSqlColumn() with sql_strict set to "true". Can you say 
> more why this is needed? Should this also be made the default for MySQL? The best I can tell the default values are only needed if we write 
> to SQL but forget to provide values that should not be NULL? But our code never does this? Or this is for reading from SQL, where NULL values 
> are replaced with the default values? I do not have time to look into this right now, I hope you can clarify it for me?
> 
> Also notice the fDownsample is set to zero and cannot be changed. I recommend we set it through the MakeMidasHistoryPgsql() factory method.
> 
> Please pull, merge, retest, update the pull request, check that there is no unrelated changes (changes in mlogger.cxx is a direct red flag!) 
> and we should be able to merge the rest of your stuff pronto.
> 
> K.O.
> 
> commit e85bb6d37c85f02fc4895cae340ba71ab36de906 (HEAD -> develop, origin/develop, origin/HEAD)
> Author: Konstantin Olchanski <olchansk@triumf.ca>
> Date:   Fri Jul 21 09:45:08 2023 -0700
> 
>     merge PQSQL history in history_schema.cxx
> 
> commit f254ebd60a23c6ee2d4870f3b6b5e8e95a8f1f09
> Author: Konstantin Olchanski <olchansk@triumf.ca>
> Date:   Fri Jul 21 09:19:07 2023 -0700
> 
>     add PGSQL Makefile bits
> 
> commit aa5a35ba221c6f87ae7a811236881499e3d8dcf7
> Author: Konstantin Olchanski <olchansk@triumf.ca>
> Date:   Fri Jul 21 08:51:23 2023 -0700
> 
>     merge PGSQL support from https://bitbucket.org/gtortone/midas/branch/feature/timescaledb_support except for history_schema.cxx
  2563   28 Jul 2023 Stefan RittForumpull request for PostgreSQL support
The compilation of midas was broken by the last modification. The reason is that 

   Pgsql *fPgsql = NULL;

was not protected by 

#ifdef HAVE_PGSQL

So I put all PGSQL code under a big #ifdef and now it compiles again. You might want to double check my modification at

https://bitbucket.org/tmidas/midas/commits/e3c7e73459265e0d7d7a236669d1d1f2d9292a74

Best,
Stefan
  2576   09 Aug 2023 Konstantin OlchanskiForumpull request for PostgreSQL support
> The compilation of midas was broken by the last modification. The reason is that 
>    Pgsql *fPgsql = NULL;
> was not protected by #ifdef HAVE_PGSQL

confirmed, my mistake, I forgot to test with "make cmake NO_PGSQL". your fix is correct, thanks.

K.O.
  423   05 Feb 2008 Denis BilenkoInfopymidas 0.6.0 released - python bindings for Midas
Hi!

I have released pymidas - Python binding to Midas.
It includes support for Online Database, Buffer, event
construction and parsing. 

We have used it for a couple years now here at CMD. (http://cmd.inp.nsk.su)
One of principal DAQ applications here (Slow Control Frontend) is
written in Python using pymidas.

http://cmd.inp.nsk.su/~bilenko/projects/pymidas/pymidas.html
  2493   01 May 2023 Giovanni MazzitelliBug Reportpython issue with mathplot lib vs odb query
Ciao, 
we have a very strange issue with python lib with client.odb_get("/") function 
when running as midas process and matplotlib is used.


we are developing a remote console by means of sending via kafka producer the odb, 
camera image and pmt waveforms, in the INFN cloud where grafana make available 
data for non expert shifters, as well as sending midas events for online 
reconstruction to the htcondr queue on cloud. The process work perfectly and allow 
use to parallelise to standard midas pipeline for file production, ecc the online 
monitoring and data processing where we have computing resources (our DAQ is 
underground at LNGS). Part of the work will be presented next weak at CHEP
the full code is available at https://github.com/CYGNUS-
RD/middleware/blob/master/dev/event_producer_s3.py
but to get the strange behaviour I report here a test script:

----
def main(verbose=False):
    from matplotlib import pyplot as plt

    import time

    import midas
    import midas.client

    
    client = midas.client.MidasClient("middleware")
    buffer_handle = client.open_event_buffer("SYSTEM",None,1000000000)
    request_id = client.register_event_request(buffer_handle, sampling_type = 2) 
    
    fpath = os.path.dirname(os.path.realpath(sys.argv[0]))
    
    
    while True:
        # 
        odb = client.odb_get("/")
        if verbose:
            print(odb)
        start1 = time.time()
              
        client.communicate(10)
        time.sleep(1)
        

    client.deregister_event_request(buffer_handle, request_id)

    client.disconnect()
----
if I run it as cli interactivity including or not matplotlib the everything si ok. 
As I run it as midas "program" I get: 
-----
Traceback (most recent call last):
  File "/home/standard/daq/middleware/dev/test_midas_error.py", line 48, in 
<module>
    main(verbose=options.verbose)
  File "/home/standard/daq/middleware/dev/test_midas_error.py", line 29, in main
    odb = client.odb_get("/")
  File "/home/standard/packages/midas/python/midas/client.py", line 354, in 
odb_get
    retval = midas.safe_to_json(buf.value, use_ordered_dict=True)
  File "/home/standard/packages/midas/python/midas/__init__.py", line 552, in 
safe_to_json
    return json.loads(decoded, strict=False, 
object_pairs_hook=collections.OrderedDict)
  File "/usr/lib/python3.8/json/__init__.py", line 370, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: 
line 300 column 26 (char 17535)
----
if I comment out the import of matplotlib every think works perfectly again also 
as midas program. 

it seams that there is a difference between the to way of use the code, and that 
is sufficient the call to matplotlib to corrupt in some way the odb. any ideas?
  2494   01 May 2023 Ben SmithBug Reportpython issue with mathplot lib vs odb query
> it seams that there is a difference between the to way of use the code, and that 
> is sufficient the call to matplotlib to corrupt in some way the odb. any ideas?

I can't reproduce this on my machines, so this is going to be fun to debug!

Can you try running the program below please? It takes the important bits from odb_get() but prints out the string before we try to parse it as JSON. Feel free to send me the output via email (bsmith@triumf.ca) if you don't want to post your entire ODB dump in the elog.




import sys
import os
import time
import midas
import midas.client
import ctypes

def debug_get(client):
    c_path = ctypes.create_string_buffer(b"/")
    hKey = ctypes.c_int()
    client.lib.c_db_find_key(client.hDB, 0, c_path, ctypes.byref(hKey))

    buf = ctypes.c_char_p()
    bufsize = ctypes.c_int()
    bufend = ctypes.c_int()

    client.lib.c_db_copy_json_save(client.hDB, hKey, ctypes.byref(buf), ctypes.byref(bufsize), ctypes.byref(bufend))

    print("-" * 80)
    print("FULL DUMP")
    print("-" * 80)
    print(buf.value)
    print("-" * 80)
    print("Chars 17000-18000")
    print("-" * 80)
    print(buf.value[17000:18000])
    print("-" * 80)

    as_dict = midas.safe_to_json(buf.value, use_ordered_dict=True)

    client.lib.c_free(buf)

    return as_dict

def main(verbose=False):
    client = midas.client.MidasClient("middleware")
    buffer_handle = client.open_event_buffer("SYSTEM",None,1000000000)
    request_id = client.register_event_request(buffer_handle, sampling_type = 2)

    fpath = os.path.dirname(os.path.realpath(sys.argv[0]))

    while True:
        # odb = client.odb_get("/")
        odb = debug_get(client)

        if verbose:
            print(odb)
        start1 = time.time()

        client.communicate(10)
        time.sleep(1)


    client.deregister_event_request(buffer_handle, request_id)

    client.disconnect()

if __name__ == "__main__":
    main()
  Draft   01 May 2023 Giovanni MazzitelliBug Reportpython issue with mathplot lib vs odb query
> > it seams that there is a difference between the to way of use the code, and that 
> > is sufficient the call to matplotlib to corrupt in some way the odb. any ideas?
> 
> I can't reproduce this on my machines, so this is going to be fun to debug!
> 
> Can you try running the program below please? It takes the important bits from odb_get() but prints out the string before we try to parse it as JSON. Feel free to send me the output via email (bsmith@triumf.ca) if you don't want to post your entire ODB dump in the elog.
> 
> 
> 
> 
> import sys
> import os
> import time
> import midas
> import midas.client
> import ctypes
> 
> def debug_get(client):
>     c_path = ctypes.create_string_buffer(b"/")
>     hKey = ctypes.c_int()
>     client.lib.c_db_find_key(client.hDB, 0, c_path, ctypes.byref(hKey))
> 
>     buf = ctypes.c_char_p()
>     bufsize = ctypes.c_int()
>     bufend = ctypes.c_int()
> 
>     client.lib.c_db_copy_json_save(client.hDB, hKey, ctypes.byref(buf), ctypes.byref(bufsize), ctypes.byref(bufend))
> 
>     print("-" * 80)
>     print("FULL DUMP")
>     print("-" * 80)
>     print(buf.value)
>     print("-" * 80)
>     print("Chars 17000-18000")
>     print("-" * 80)
>     print(buf.value[17000:18000])
>     print("-" * 80)
> 
>     as_dict = midas.safe_to_json(buf.value, use_ordered_dict=True)
> 
>     client.lib.c_free(buf)
> 
>     return as_dict
> 
> def main(verbose=False):
>     client = midas.client.MidasClient("middleware")
>     buffer_handle = client.open_event_buffer("SYSTEM",None,1000000000)
>     request_id = client.register_event_request(buffer_handle, sampling_type = 2)
> 
>     fpath = os.path.dirname(os.path.realpath(sys.argv[0]))
> 
>     while True:
>         # odb = client.odb_get("/")
>         odb = debug_get(client)
> 
>         if verbose:
>             print(odb)
>         start1 = time.time()
> 
>         client.communicate(10)
>         time.sleep(1)
> 
> 
>     client.deregister_event_request(buffer_handle, request_id)
> 
>     client.disconnect()
> 
> if __name__ == "__main__":
>     main()
Thank you!
if I added the mat
  2496   01 May 2023 Giovanni MazzitelliBug Reportpython issue with mathplot lib vs odb query
> > it seams that there is a difference between the to way of use the code, and that 
> > is sufficient the call to matplotlib to corrupt in some way the odb. any ideas?
> 
> I can't reproduce this on my machines, so this is going to be fun to debug!
> 
> Can you try running the program below please? It takes the important bits from odb_get() but prints out the string before we try to parse it as JSON. Feel free to send me the output via email (bsmith@triumf.ca) if you don't want to post your entire ODB dump in the elog.

Thank you!
if I added the matplotlib as follow:

#!/usr/bin/env python3

import sys
import os
import time
import midas
import midas.client
import ctypes
from matplotlib import pyplot as plt

def debug_get(client):
    c_path = ctypes.create_string_buffer(b"/")
    hKey = ctypes.c_int()
    client.lib.c_db_find_key(client.hDB, 0, c_path, ctypes.byref(hKey))

    buf = ctypes.c_char_p()
    bufsize = ctypes.c_int()
    bufend = ctypes.c_int()

    client.lib.c_db_copy_json_save(client.hDB, hKey, ctypes.byref(buf), ctypes.byref(bufsize), ctypes.byref(bufend))

    print("-" * 80)
    print("FULL DUMP")
    print("-" * 80)
    print(buf.value)
    print("-" * 80)
    print("Chars 17000-18000")
    print("-" * 80)
    print(buf.value[17000:18000])
    print("-" * 80)

    as_dict = midas.safe_to_json(buf.value, use_ordered_dict=True)

    client.lib.c_free(buf)

    return as_dict

def main(verbose=False):
    client = midas.client.MidasClient("middleware")
    buffer_handle = client.open_event_buffer("SYSTEM",None,1000000000)
    request_id = client.register_event_request(buffer_handle, sampling_type = 2)

    fpath = os.path.dirname(os.path.realpath(sys.argv[0]))

    while True:
        # odb = client.odb_get("/")
        odb = debug_get(client)

        if verbose:
            print(odb)
        start1 = time.time()

        client.communicate(10)
        time.sleep(1)


    client.deregister_event_request(buffer_handle, request_id)

    client.disconnect()

        
if __name__ == "__main__":
    from optparse import OptionParser
    parser = OptionParser(usage='usage: %prog\t ')
    parser.add_option('-v','--verbose', dest='verbose', action="store_true", default=False, help='verbose output;');
    (options, args) = parser.parse_args()
    main(verbose=options.verbose)


then tested the code in interactive mode without any error. as soon as I submit as midas "Program" I get the attached output.
thank you again, Giovanni
  2497   01 May 2023 Ben SmithBug Reportpython issue with mathplot lib vs odb query
Looks like a localisation issue. Your floats are formatted as "6,6584e+01", whereas the JSON decoder expects "6.6584e+01".

Can you run the following few lines please? Then I'll be able to write a test using the same setup as you:


import locale
print(locale.getlocale())
from matplotlib import pyplot as plt
print(locale.getlocale())
 
  2498   01 May 2023 Ben SmithBug Reportpython issue with mathplot lib vs odb query
> Looks like a localisation issue. Your floats are formatted as "6,6584e+01", whereas the JSON decoder expects "6.6584e+01".

This should be fixed in the latest commit to the midas develop branch. The JSON specification requires a dot for the decimal separator, so we must ignore the user's locale when formatting floats/doubles for JSON.

I've tested the fix on my machine by manually changing the locale, and also added an automated test in the python directory.
  2499   01 May 2023 Giovanni MazzitelliBug Reportpython issue with mathplot lib vs odb query
> > Looks like a localisation issue. Your floats are formatted as "6,6584e+01", whereas the JSON decoder expects "6.6584e+01".
> 
> This should be fixed in the latest commit to the midas develop branch. The JSON specification requires a dot for the decimal separator, so we must ignore the user's locale when formatting floats/doubles for JSON.
> 
> I've tested the fix on my machine by manually changing the locale, and also added an automated test in the python directory.

Thanks very macth Ben,
so if I understand correctly we have to update MIDAS to latest develop branch available? can you sand me the link to be sure of install the right update. 
can you also tell me how you fix manually? we are restarting and then well be difficult install and makes updete.
thank you again, regards, Giovanni
  737   26 Dec 2010 Konstantin OlchanskiBug Reportrace condition and deadlock between ODB lock and SYSMSG lock in cm_msg()
> 
> The only remaining problem when running my script is some kind of deadlock between the ODB and SYSMSG semaphores...
> 


In theory, we understand how programs that use 2 semaphores to protect 2 shared resources can deadlock
if there are mistakes in how locks are used.

For example, consider 2 semaphores A and B and 2 concurrent
subroutines foo() and bar() running at exactly the same time:

foo() { lock(A); lock(B); do stuff; unlock(B); unlock(A); } and
bar() { lock(B); lock(A); do stuff; unlock(A); unlock(B); }

This system will deadlock immediately with foo() taking semaphore A, bar() taking semaphore B,
then foo() waiting for B and bar() waiting for A forever.

This situation can also be described as a race condition where foo() and bar() are racing each
other to get the semaphores, with the result depending on who gets there first
and, in this case, sometimes the result is deadlock.

In this example, the size of the race condition time window is the wall clock time
between actually locking both semaphores in the sequence "lock(X); lock(Y);". While
locking a semaphore is "instantaneous", the actual function lock() takes time to call
and execute, and this time is not fixed - it can change if the CPU takes a hardware
interrupt (quick), a page fault (when we may have to wait until data is read from the swap file)
or a scheduler interrupt (when we are outright stopped for milliseconds while the CPU runs
some other process).

In reality, subroutines foo() and bar() do not run at exactly the same time, so the probability
of deadlock will depend on how often foo() and bar() are executed, the size of the race condition time window,
the number of processes executing foo() and bar(), and the amount of background activity
like swapping, hardware interrupts, etc.

(Also note that on a single-cpu system, we will probably never see a deadlock between foo() and bar()
because they will never be running at the same time. But the deadlock is still there, waiting
for the lucky moment when the scheduler switches from foo() to bar() just at the wrong place).

There is more on deadlocks and stuff written at:
http://en.wikipedia.org/wiki/Deadlock
http://en.wikipedia.org/wiki/Race_condition

In case of MIDAS, the 2 semaphores are the ODB lock and the SYSMSG lock (also remember about locks
for the shared memory event buffers, SYSTEM, etc, but they seem to be unlikely to deadlock).

The function foo() is any ODB function (db_xxx) that locks ODB and then calls cm_msg() (which locks SYSMSG).

The function bar() is cm_msg() which locks SYSMSG and then calls some ODB db_xxx() function which tries to lock ODB.

(This is made more interesting by cm_watchdog() periodically called by alarm(), where we alternately
take SYSMSG (via bm_cleanup) and ODB locks.)

I think this establishes a theoretical possibility for MIDAS to deadlock on the ODB and SYSMSG semaphores.

In practice, I think we almost never see this deadlock because cm_msg() is not called very often, and during normal
operation, is almost never called from inside ODB functions holding the ODB lock - almost all calls to cm_msg from
ODB functions are made to report some kind of problem with the ODB internal structure, something that "never"
happens.

By "luck" I stumbled into this deadlock when doing the "odbedit" fork-bomb torture tests, when high ODB lock
activity is combined with high cm_msg() activity reporting clients starting and stopping, combined with a large
number of MIDAS clients running, starting and stopping.

So a deadlock I see within 1 minute of running the torture test, other lucky people will see after running an experiment
for 1 year, or 1 month, or 1 day, depending.

In theory, this deadlock can be removed by establishing a fixed order of taking locks. There will never be a deadlock
if we always take the SYSMSG lock first, then ask for the ODB lock.

In practice, it means that using cm_msg() while holding an ODB lock is automatically dangerous
and should be avoided if not forbidden.

And it does work. By refactoring a few places in client startup, shutdown and cleanup code, I made the deadlock "go away",
and my test script (posted in my first message) no longer deadlocks, even if I run hundreds of odbedit's at the same time.

Unfortunately,  it is impractical to audit and refactor all of MIDAS to completely remove this problem. MIDAS call graphs
are sufficiently complicated for making manual analysis of lock sequences infeasible and
I expect any automatic lock analysis tool will be defeated by the cm_watchdog() periodic interrupt.

An improvement is possible if we make cm_msg() safe for calling from inside the ODB db_xxx() function. Instead
of immediately sending messages to SYSMSG (requiring a SYSMSG lock), if ODB is locked, cm_msg() could
save the messages in a buffer, which would be flushed when the ODB lock is released. (This does not fix
all the other places that take ODB and SYSMSG locks in arbitrary order, but I think those places are not as
likely to deadlock, compared to cm_msg()).

However, now that I have greatly reduced the probability of deadlock in the client startup/shutdown/cleanup code,
maybe there is no urgency for changing cm_msg() - remember that if we do not call cm_msg() we will never deadlock -
and during normal operation, cm_msg() is almost never called.

Investigation completed, I will now cleanup, retest and commit my changes to midas.c and odb.c. Looking into this
and writing it up was a good intellectual exercise.

P.S. Also remember that there are locks for shared memory event buffers (SYSTEM, etc), but those do not involve
lock inversion leading to deadlock. I think all lock sequences are like this: SYSTEM->ODB, SYSTEM->SYSMSG->ODB,
there are no inverted sequences SYSMSG->SYSTEM or ODB->SYSTEM and the only deadlocking
sequence SYSTEM->ODB->SYSMSG, does not really involve the SYSTEM lock.

K.O.
  2441   22 Oct 2022 Lars MartinSuggestionread_only odbxx?
I really like the concept of the odbxx interface.
I think it would be a nice feature if one could have a read_only connection, e.g. by declaring a "const midas::odb".
Just for fun I tried if this already works, but the compiler doesn't allow const midas::odb for e.g. the [] operator. I'm guessing this would be non-trivial to implement, but I like the idea of certain Midas clients being able to read the odb without risking corruption.
  2443   24 Oct 2022 Stefan RittSuggestionread_only odbxx?
> I really like the concept of the odbxx interface.
> I think it would be a nice feature if one could have a read_only connection, e.g. by declaring a "const midas::odb".
> Just for fun I tried if this already works, but the compiler doesn't allow const midas::odb for e.g. the [] operator. I'm guessing this would be non-trivial to implement, but I like the idea of certain Midas clients being able to read the odb without risking corruption.

Having a "const midas::odb" probably does not work (at least I would not know how to implement that).

But I could make an internal flag analog to the auto refresh flags. So you would have

    o.set_write_protect(true);

to turn on write protection. Would that work for you?

Best,
Stefan
  2444   26 Oct 2022 Lars MartinSuggestionread_only odbxx?
> Having a "const midas::odb" probably does not work (at least I would not know how to implement that).
> 
> But I could make an internal flag analog to the auto refresh flags. So you would have
> 
>     o.set_write_protect(true);
> 
> to turn on write protection. Would that work for you?

Absolutely. Looking at the underlying code I was also at a loss how const would work.
I'm mostly just interested in having small clients that only read from the odb (for whatever reason) without risking corrupting it by messing something 
up in the code, especially since such small clients are almost by definition hacked together quickly on the fly.
  2445   29 Oct 2022 Stefan RittSuggestionread_only odbxx?
Ok, I implemented and committed that. Just call

o.set_write_protect(true)

on a key you don't want to modify. If you do so, an exception gets thrown.

Best,
Stefan
  1701   23 Sep 2019 Frederik WautersSuggestionrecover daq and hardware safety.

We have encountered a safety issue with our HPGe HV and it's midas frontend. Turning off or changing HV unknowingly has to be avoided at all costs.

 

Current safety protection

We use the DF_REPORT_STATUS flag to give the hardware settings precedence over odb settings. This all takes place in the init.

 

DAQ recovery Issue?

In the setup / development state, we sometimes have to remove the SHM files and reload an odb dump to recover the DAQ. When the FE is running, this can modify hardware settings. E.g. change a voltage

 

Question

Is there a way one can let the frontend know the "load"  function is called in odbedit? Or other suggestions to build in this safety.

 

  1708   27 Sep 2019 Konstantin OlchanskiSuggestionrecover daq and hardware safety.
> We have encountered a safety issue with our HPGe HV and it's midas frontend.

At TRIUMF and other labs the words "safety issue" have very specific meaning and
we tend to follow this guidance: MIDAS is not certified for and is not intended for use with 
safety critical applications as defined here:
https://en.wikipedia.org/wiki/Safety-critical_system

> A safety-critical system ... malfunction may result in ... following outcomes:
> death or serious injury to people
> loss or severe damage to equipment/property
> environmental harm

If this is your case, you should use properly certified software *and hardware*. Safety 
officers at most institutions require certified hardware interlocks and other protections to 
prevent such undesirable outcomes. Use of certified PLCs is sometimes permitted.

But I suspect in your case, there is no "safety issue", you only want to protect some 
valuable but not critical equipment against accidental damage.

In this case, you can probably use midas, but if midas malfunction may result in destroying 
your experiment (i.e. accidentally set wrong voltage on 3000 phototubes), you should also 
have hardware based protections (hardware limits on max/min high voltage). Most HV 
power supplies implement such protections (screw-driver actuated max voltage limits).

If there is danger of destroying your experiment you should also have an independent 
review of your control system to avoid avoidable mistakes and obvious problems.

> Turning off or changing HV unknowingly has to be avoided at all costs

The function of changing high-voltage is implemented in your frontend program. Right in 
the place in this program where you transmit the voltage setting from ODB to the hardware 
is where you implement your protections (validate the voltage range, check that changing 
the voltage is permitted, etc). This protects you against unexpected/incorrect/erroneous
changes in ODB (wrong ODB is loaded, wrong values in ODB, ODB is corrupted, etc).

In addition, it is wise to set software based limits in the HV power supply (software 
controlled max high voltage, software controlled max current, etc). Most HV power supplies 
implement such functions.

To ensure high voltage cannot be changed at the wrong times, you can also implement 
procedural and hardware protections, such as unplug the power supply control connection 
(usually ethernet or serial or usb cable). This will prevent you from monitoring the high 
voltage currents and the only solution is to use a  power supply with a hardware "write 
protect" function (a key needs to be inserted and turned to allow changing anything).

All of this is generic and applies to any controls software, not just MIDAS.

Without at least some of these protections (especially protections in your frontend 
program), the questions you asked about loading ODB are insufficient.

K.O.
  1712   28 Sep 2019 Frederik WautersSuggestionrecover daq and hardware safety.
Dear Konstantin,

So let me retract the term "safety issue" then, it was more a request/question for this type of 
info between the fe and the odb.

We have most of what you mention:
* The HV hardware has current limits
* The Hardware has fixed ramping limits.

same for the software. 

The issue occurs when e.g. one channel can not be turned on and ramp for some temp/specific 
reason, and someone else is working on the daq and reloads the odb for e.g. 1h ago.  

> > We have encountered a safety issue with our HPGe HV and it's midas frontend.
> 
> At TRIUMF and other labs the words "safety issue" have very specific meaning and
> we tend to follow this guidance: MIDAS is not certified for and is not intended for use with 
> safety critical applications as defined here:
> https://en.wikipedia.org/wiki/Safety-critical_system
> 
> > A safety-critical system ... malfunction may result in ... following outcomes:
> > death or serious injury to people
> > loss or severe damage to equipment/property
> > environmental harm
> 
> If this is your case, you should use properly certified software *and hardware*. Safety 
> officers at most institutions require certified hardware interlocks and other protections to 
> prevent such undesirable outcomes. Use of certified PLCs is sometimes permitted.
> 
> But I suspect in your case, there is no "safety issue", you only want to protect some 
> valuable but not critical equipment against accidental damage.
> 
> In this case, you can probably use midas, but if midas malfunction may result in destroying 
> your experiment (i.e. accidentally set wrong voltage on 3000 phototubes), you should also 
> have hardware based protections (hardware limits on max/min high voltage). Most HV 
> power supplies implement such protections (screw-driver actuated max voltage limits).
> 
> If there is danger of destroying your experiment you should also have an independent 
> review of your control system to avoid avoidable mistakes and obvious problems.
> 
> > Turning off or changing HV unknowingly has to be avoided at all costs
> 
> The function of changing high-voltage is implemented in your frontend program. Right in 
> the place in this program where you transmit the voltage setting from ODB to the hardware 
> is where you implement your protections (validate the voltage range, check that changing 
> the voltage is permitted, etc). This protects you against unexpected/incorrect/erroneous
> changes in ODB (wrong ODB is loaded, wrong values in ODB, ODB is corrupted, etc).
> 
> In addition, it is wise to set software based limits in the HV power supply (software 
> controlled max high voltage, software controlled max current, etc). Most HV power supplies 
> implement such functions.
> 
> To ensure high voltage cannot be changed at the wrong times, you can also implement 
> procedural and hardware protections, such as unplug the power supply control connection 
> (usually ethernet or serial or usb cable). This will prevent you from monitoring the high 
> voltage currents and the only solution is to use a  power supply with a hardware "write 
> protect" function (a key needs to be inserted and turned to allow changing anything).
> 
> All of this is generic and applies to any controls software, not just MIDAS.
> 
> Without at least some of these protections (especially protections in your frontend 
> program), the questions you asked about loading ODB are insufficient.
> 
> K.O.
  1715   29 Sep 2019 Konstantin OlchanskiSuggestionrecover daq and hardware safety.
> 
> The issue occurs when e.g. one channel can not be turned on and ramp for some temp/specific 
> reason, and someone else is working on the daq and reloads the odb for e.g. 1h ago.  
> 

So you want to ensure that some HV channels are turned off and stay turned off. Yes?

Most effective solution will depend on the consequences of unwanted turning-on of your channels:

- if hardware is destroyed if turned on - I think you should have a hardware lock-out. (unplug the HV cable)
- if hardware malfunctions and will degrade if left turned on for long time (i.e. a hot phototube or sparking wire chamber) - your data 
monitoring software should detect the anomaly (it will show up as a hot channel, dead channel, etc) and the people running the 
experiment will realize the mistake and turn the channel back off. also hardware monitoring (HV currents, etc) should detect this, with 
same effect.
- if collected data becomes useless (the turned-off channel make big noise in all other channels), then same thing, your data 
monitoring should catch it.

The next consideration is what are you protecting against:

a) one person flags channel defective, turns it off, next person knows nothing, turns it back on - you need to work on documentation, 
shift hand-off and other human-level procedures
b) people running experiment load random odb files - same thing, from human-level procedures and documentation it should be made 
clear which odb files are correct and which should not be used
c) software malfunction (not human person) causes data change in odb causes turned-off channel to turn back on
d) hardware malfunction causes turned-off channel to turn back on (HV power supply hardware or firmware malfunctions and decides 
that all channels should be turned on at maximum high voltage)

In the experiments I am most familiar with, problem (b) is avoided by never loading/reloading odb files directly, most/all interaction
with the experiment is done through web pages, and these web pages are carefully coded to be safe against most user mistakes.

Cases (a), (b) and (c) you can protect against by changing the frontend code to refuse to turn on some channels:

int set_hv(int channel, int voltage) {
   if (channel == 35) return COMMAND_REFUSED;
   write_to_hardware(channel, voltage);
   return COMMAND_SUCCESS;
}

But in reality this solution only creates problem (e):

e) people running the experiment start random versions of the frontend program, make random changes to the frontend source code, 
multiple people working on the frontend have their own personal versions/copies of the source code, etc.

This is the worst-case scenario, meaning the experiment lost control of software configuration, and even basic software version 
control tools (like svn or git) are not being used. If your experiment gets that chaotic, all protections are likely to be ineffective - 
documentation will not work (people will ignore post-it notes "do not turn on!"), hardware protections will not work (unplugged cable 
labeled "do not plug in!" will be plugged back in and powered), etc. good luck, then.

K.O.
ELOG V3.1.4-2e1708b5