21 Apr 2023, Ben Smith, Forum, Setup Midas with Caen vx2740 - ask for help
|
> I'm not able to find documentation what is purpose of the RPC? Could someone give any indicators how I can start debug this behavior? Or there is some documentation about the RPC?
The RPC system allows midas clients to issue commands to each other. In the case of the VX2740 code we use it so the midas webserver (mhttpd) can tell the frontend to perform some actions when a user clicks a button on a webpage.
I've been writing most of the code for the VX2740 for Darkside, so will contact you directly to help debug the issue. |
01 May 2023, Ben Smith, Bug Report, python issue with mathplot lib vs odb query
|
> it seems that there is a difference between the two ways of using the code, and that
> the call to matplotlib is sufficient to corrupt the odb in some way. any ideas?
I can't reproduce this on my machines, so this is going to be fun to debug!
Can you try running the program below please? It takes the important bits from odb_get() but prints out the string before we try to parse it as JSON. Feel free to send me the output via email (bsmith@triumf.ca) if you don't want to post your entire ODB dump in the elog.
import sys
import os
import time
import midas
import midas.client
import ctypes

def debug_get(client):
    c_path = ctypes.create_string_buffer(b"/")
    hKey = ctypes.c_int()
    client.lib.c_db_find_key(client.hDB, 0, c_path, ctypes.byref(hKey))
    buf = ctypes.c_char_p()
    bufsize = ctypes.c_int()
    bufend = ctypes.c_int()
    client.lib.c_db_copy_json_save(client.hDB, hKey, ctypes.byref(buf), ctypes.byref(bufsize), ctypes.byref(bufend))
    print("-" * 80)
    print("FULL DUMP")
    print("-" * 80)
    print(buf.value)
    print("-" * 80)
    print("Chars 17000-18000")
    print("-" * 80)
    print(buf.value[17000:18000])
    print("-" * 80)
    as_dict = midas.safe_to_json(buf.value, use_ordered_dict=True)
    client.lib.c_free(buf)
    return as_dict

def main(verbose=False):
    client = midas.client.MidasClient("middleware")
    buffer_handle = client.open_event_buffer("SYSTEM",None,1000000000)
    request_id = client.register_event_request(buffer_handle, sampling_type = 2)
    fpath = os.path.dirname(os.path.realpath(sys.argv[0]))
    while True:
        # odb = client.odb_get("/")
        odb = debug_get(client)
        if verbose:
            print(odb)
        start1 = time.time()
        client.communicate(10)
        time.sleep(1)
    client.deregister_event_request(buffer_handle, request_id)
    client.disconnect()

if __name__ == "__main__":
    main() |
01 May 2023, Ben Smith, Bug Report, python issue with mathplot lib vs odb query
|
Looks like a localisation issue. Your floats are formatted as "6,6584e+01", whereas the JSON decoder expects "6.6584e+01".
Can you run the following few lines please? Then I'll be able to write a test using the same setup as you:
import locale
print(locale.getlocale())
from matplotlib import pyplot as plt
print(locale.getlocale())
|
01 May 2023, Ben Smith, Bug Report, python issue with mathplot lib vs odb query
|
> Looks like a localisation issue. Your floats are formatted as "6,6584e+01", whereas the JSON decoder expects "6.6584e+01".
This should be fixed in the latest commit to the midas develop branch. The JSON specification requires a dot for the decimal separator, so we must ignore the user's locale when formatting floats/doubles for JSON.
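To illustrate why the comma breaks the parsing, here is a minimal sketch that uses only Python's standard json module. The key name "Rate" is made up for the example, and this is not the actual midas code:

```python
import json

good = '{"Rate": 6.6584e+01}'  # dot as decimal separator: valid JSON
bad = '{"Rate": 6,6584e+01}'   # comma as decimal separator: not valid JSON

# The well-formed string decodes to a normal Python float.
parsed = json.loads(good)
print(parsed["Rate"])

# The comma is read as the separator between two members of the object,
# so the decoder gives up on what follows it.
try:
    json.loads(bad)
    bad_decoded = True
except json.JSONDecodeError as exc:
    bad_decoded = False
    print("Decode failed:", exc)

# Python's own encoder always writes a dot, whatever the locale is.
encoded = json.dumps({"Rate": 66.584})
print(encoded)
```
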
I've tested the fix on my machine by manually changing the locale, and also added an automated test in the python directory. |
31 May 2023, Ben Smith, Bug Report, Event builder fails at every 10 runs
|
> The event builder fails to initiate the 10th run since its startup,
> 'BM_NO_MEMORY: too many requests,'
Hi Kou,
It sounds like you might be calling bm_request_event() when starting a run, but not calling bm_delete_request() when the run stops. So you end up "leaking" event requests and eventually reach the limit of 10 open requests.
In examples/eventbuilder/mevb.c the request deletion happens in source_unbooking(), which is called as part of the "run stopping" logic. I've just updated the midas repository so the example compiles correctly, and was able to start/stop 15 runs without crashing.
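The leak described above can be illustrated with a toy model in plain Python. ToyBuffer and runs_before_failure are hypothetical names invented for this sketch and are not part of midas; the limit of 10 is taken from the description above, and the exact run at which a real event builder fails may differ:

```python
class ToyBuffer:
    """Toy stand-in for an event buffer with a fixed number of request slots."""

    def __init__(self, max_requests=10):
        self.max_requests = max_requests
        self.open_requests = set()
        self._next_id = 0

    def request_event(self):
        # Plays the role of bm_request_event(): takes one slot.
        if len(self.open_requests) >= self.max_requests:
            raise RuntimeError("too many requests")
        self._next_id += 1
        self.open_requests.add(self._next_id)
        return self._next_id

    def delete_request(self, request_id):
        # Plays the role of bm_delete_request(): frees the slot again.
        self.open_requests.discard(request_id)


def runs_before_failure(buffer, n_runs, delete_at_stop):
    """Start and stop up to n_runs runs; return how many started successfully."""
    started = 0
    for _ in range(n_runs):
        try:
            request_id = buffer.request_event()  # begin-of-run
        except RuntimeError:
            break
        started += 1
        if delete_at_stop:
            buffer.delete_request(request_id)    # end-of-run
    return started


leaky = runs_before_failure(ToyBuffer(), 15, delete_at_stop=False)
clean = runs_before_failure(ToyBuffer(), 15, delete_at_stop=True)
print(leaky, clean)
```

With the deletion in place every run releases its slot, so all 15 runs start; without it the slots run out after the first 10.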
Can you check the end-of-run logic in your version to ensure you're calling bm_delete_request()? |
16 Oct 2023, Ben Smith, Bug Report, Python midas.file_reader get_eor_odb_dump()
|
Thanks for the bug report Gennaro!
I've fixed the code so that we'll now find the end-of-run ODB dump even if the user is already at the end of the file when they call get_eor_odb_dump().
Ben |
22 Nov 2023, Ben Smith, Forum, run number from an external (*SQL) db?
|
> I wonder if there is a non-intrusive way to have an external (wrt MIDAS)*SQL database
> serving as a primary source of the run number information for a MIDAS-based DAQ system?
> - like a plugin with a getNextRunNumber() function, for example, or a special client?
One of my experiments has special rules for run numbering as well. I created a client that registers a begin-of-run transition handler with sequence 1 (so it's the first client to handle the begin-of-run transition). That client updates "/Runinfo/Run number" in the ODB.
This mostly works. mlogger will create .mid files based on the new run number, the ODB dumps within those files show the new run number etc.
But there are 2 quirks. Let's say your client changed the number from 11 to 400. The message log will say "Run #11 started" and "Run #400 stopped". And the history system will record the start/stop times the same way. That only matters for when you're viewing history plots on the webpage and zoom in far enough to see the run transitions (represented by green and red vertical dashed lines) - the green line will be labelled 11 and the red line 400.
Depending on the exact logic you need, you may be able to avoid these quirks by also recomputing the run number before the user even tries to start a run (e.g. after the end of the previous run, or when the user changes an important setting in the ODB). If you're changing the run number between runs, make sure to set it to "desired number - 1", as midas will increment the run number automatically before handling the next start run request. |
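A rough sketch of such a client in Python follows. The client name "run_numberer" and the function lookup_run_number() are placeholders invented for this example, and the midas calls (register_transition_callback, odb_set, communicate) are assumed to behave as in the midas Python client, so please check them against your midas version before relying on this:

```python
def odb_value_between_runs(desired_run_number):
    """Value to store between runs: midas adds 1 before the next start."""
    return desired_run_number - 1


def lookup_run_number():
    """Hypothetical placeholder for a query to the external SQL database."""
    return 400


def main():
    # Imported here so the helpers above can be used without midas installed.
    import midas
    import midas.client

    client = midas.client.MidasClient("run_numberer")

    def on_start(client, run_number):
        # Replace the run number chosen by midas with the external one.
        client.odb_set("/Runinfo/Run number", lookup_run_number())

    # Sequence 1, so this handler runs before those of the other clients.
    client.register_transition_callback(midas.TR_START, 1, on_start)

    try:
        while True:
            client.communicate(100)
    finally:
        client.disconnect()


print(odb_value_between_runs(400))
```

main() is deliberately not called here; it would be started from within a running experiment.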
22 Jan 2024, Ben Smith, Bug Report, Warnings about ODB keys that haven't been touched for 10+ years
|
We have an experiment that's been running for a long time and has some ODB keys that haven't been touched in ages. Mostly related to features that we don't use like the elog and lazylogger, or things that don't change often (like the logger data directory).
When we start any program, we now get dozens of error messages in the log with lines like:
hkey 297088, path "/Elog/Display run number", invalid pkey->last_written time 1377040124
That timestamp is reasonable though, as the experiment was set up in 2013!
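As a quick check of that timestamp (a small sketch using only the Python standard library, interpreting the value as seconds since the Unix epoch in UTC):

```python
from datetime import datetime, timezone

last_written = 1377040124  # value copied from the log message above

# Convert seconds since 1970-01-01 UTC into a calendar date.
when = datetime.fromtimestamp(last_written, tz=timezone.utc)
print(when.isoformat())
```
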
What's the best way to make these messages go away?
- Change the logic in db_validate_and_repair_key_wlocked() to not worry if keys are 10+ years old?
- Write a script to "touch" all the old keys so they've been modified recently?
- Something else? |
05 Feb 2024, Ben Smith, Bug Fix, string --> int64 conversion in the python interface ?
|
> The symptoms are consistent with a string --> int64 conversion not happening
> where it is needed.
Thanks for the report Pasha. Indeed I was missing a conversion in one place. Fixed now!
Ben |
05 Sep 2024, Ben Smith, Forum, Python frontend rate limitations?
|
> What limits the rate that poll_func is called in a python frontend?
First the general advice: if you reduce the "period" of your equipment, then your function will get called more frequently. You can set it to 0 and we'll call it as often as possible. You can set this in the ODB at "/Equipment/Python Data Simulator/Common/Period"
If that's still not fast enough, then you can return a *list* of events from your readout_func. I've seen real-world cases of 25kHz+ of midas events generated in this fashion.
However in your case the limitation is likely that you're sending 1.25MB per event and we have a lot of data marshalling to do between the python and C++ layer. In particular it takes 15ms on my machine to just pack the data into a memory buffer (see timeit command below). I am sure there must be a faster way to do this packing, especially in the case where the bank contains a numpy array rather than a python list.
I'll add it to my to-do list to investigate improving the performance of medium-to-large events in the python code.
Cheers,
Ben
P.S. You may have a bug in your calculations (depending on how you did your testing). In poll_func I think you should be updating the stats every time the function is called, not just the times when you return True.
P.P.S. Command I used to test how slow it is to pack the data. One-time setup of creating the buffers, then multiple tests of the pack_into function:
python -m timeit -s "import struct;import ctypes;arr = [0]*1250001;buf = ctypes.create_string_buffer(10000000);fmt = \">1250000d\"" "struct.pack_into(fmt, buf, *arr)"
20 loops, best of 5: 15.3 msec per loop |
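A note on the timeit command above: the list has 1250001 elements rather than 1250000 because struct.pack_into() takes the buffer offset as its third argument, so the first element of the unpacked list acts as the offset (0) and the rest are the values. A small-scale sketch of the same call pattern, with made-up values:

```python
import ctypes
import struct

values = [1.0, 2.5, -3.0]
fmt = ">%dd" % len(values)  # big-endian doubles, as in the command above

# Form 1: offset passed explicitly.
buf_a = ctypes.create_string_buffer(struct.calcsize(fmt))
struct.pack_into(fmt, buf_a, 0, *values)

# Form 2: offset hidden as the first element of the unpacked list.
args = [0] + values
buf_b = ctypes.create_string_buffer(struct.calcsize(fmt))
struct.pack_into(fmt, buf_b, *args)

# Both forms write identical bytes, and the values round-trip.
same_bytes = buf_a.raw == buf_b.raw
unpacked = struct.unpack_from(fmt, buf_b, 0)
print(same_bytes, unpacked)
```
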
27 Sep 2024, Ben Smith, Forum, Python frontend rate limitations?
|
> in your case the limitation is likely that you're sending 1.25MB per event and we have a lot of data marshalling to do between the python and C++ layer.
>
> I'll add it to my to-do list to investigate improving the performance of medium-to-large events in the python code.
I've now added better support for numpy arrays in the python code that encodes a `midas.event.Event` object. If you use the "correct" numpy data type then you can get vastly improved performance as numpy already stores the data in memory in the format that we need.
In your example, if you change
self.zero_buffer = [0] * self.total_data_size
to
self.zero_buffer = np.ndarray(self.total_data_size, np.int16)
then the max data rate of the frontend goes from 330MB/s to 7600MB/s on my laptop (a factor of more than 20 improvement from one line of code!)
To ensure you're using the optimal numpy dtype for your bank, you can reference a dict called `midas.tid_np_formats`. For example `midas.tid_np_formats[midas.TID_SHORT]` is equivalent to `np.int16`. If you use an int16 array and write it as a TID_SHORT bank, then we'll use the fast path. If there is a mismatch, we'll have to do type conversions and will end up on the slow path. |
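The idea behind the fast path can be sketched without midas or numpy. The example below uses the standard-library array module as a stand-in for a numpy int16 array, so it only illustrates the principle and is not the midas implementation:

```python
import array
import struct

values = [0, 1, -2, 300]

# Slow path: a plain Python list has to be converted element by element
# into 16-bit integers (native byte order and size).
slow = struct.pack("%dh" % len(values), *values)

# Fast path: a typed array already holds 16-bit integers contiguously,
# so its memory can be handed over as-is.
typed = array.array("h", values)
fast = typed.tobytes()

print(fast == slow)
```

Both paths produce the same bytes; the difference is only in how much work is needed to get there.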
07 Oct 2024, Ben Smith, Bug Report, Difficulty running MIDAS on Rocky 9.4
|
> We're trying to install the SuperCDMS version of MIDAS on a Rocky 9.4 Virtual
> Machine and are getting a persistent error when we run mserver. As far as I
> know there are minimal changes between this and the MIDAS branch, but Ben Smith
> may have more to say on this.
For reference, "the SuperCDMS version of MIDAS" is just a fork that no longer has any meaningful differences vs the main MIDAS repo, but we only pull updates infrequently after testing a bunch. We last pulled from the develop branch in November 2023. But that should be irrelevant here as semaphore code hasn't been touched for a very long time.
We're running Alma 9.4 on a machine at TRIUMF and the same version of midas works fine there (Amy, you may already have access to scdms-zeus). I believe Alma and Rocky should be basically identical for this.
So the questions are:
* Have you tried other midas programs, or only mserver? E.g. did odbedit and mhttpd work?
* If other programs work, have you been running them all as the same user? In particular, if you ran one program as root and another as an unprivileged user, then you will likely get odd permissions issues.
* What do you see if you run `ls -l /dev/shm` and `ls -l ~/packages/SuperCDMS_DAQ/MidasDAQ/online/.*SHM`? (Or wherever your online dir is for the 2nd one).
* Did you follow the full instructions for recovering from a corrupt ODB? https://daq00.triumf.ca/MidasWiki/index.php/FAQ#How_to_recover_from_a_corrupted_ODB In particular the bit about running odbinit with the --cleanup flag? |
30 Mar 2016, Belina von Krosigk, Forum, mserver ERR message saying data area 100% full, though it is free
|
Hi,
I have just installed Midas and set-up the ODB for a SuperCDMS test-facility (on
a SL6.7 machine). All works fine except that I receive the following error message:
[mserver,ERROR] [odb.c:944:db_validate_db,ERROR] Warning: database data area is
100% full
Which is puzzling for the following reason:
-> I have created the ODB with: odbedit -s 4194304
-> Checking the size of the .ODB.SHM it says: 4.2M
-> When I save the ODB as .xml and check the file's size it says: 1.1M
-> When I start odbedit and check the memory usage issuing 'mem', it says:
...
Free Key area: 1982136 bytes out of 2097152 bytes
...
Free Data area: 2020072 bytes out of 2097152 bytes
Free: 1982136 (94.5%) keylist, 2020072 (96.3%) data
So it seems like nearly all memory is still free. As a test I created more
instances of one of our front-ends and checked 'mem' again. As expected the free
memory was decreasing. I did this ten times in fact, reaching
...
Free Key area: 1440976 bytes out of 2097152 bytes
...
Free Data area: 1861264 bytes out of 2097152 bytes
Free: 1440976 (68.7%) keylist, 1861264 (88.8%) data
So I could use another >20% of the database data area, which is according to the
error message 100% (resp. >95%) full. Am I misunderstanding the error message?
I'd appreciate any comments or ideas on that subject!
Thanks, Belina |
14 Jan 2019, Becky Chislett, Bug Report, Custom script with new MIDAS
|
I am having difficulty getting the custom scripts to work within the updated MIDAS. Before the
update I was using something like :
<input type=submit name=customscript value="test">
on my custom page to run a script under /CustomScript/test, however, with the update to
MIDAS this is no longer working. I can't find any information about this functionality being
updated in the latest version - has this changed? Or should it still work?
Thanks,
Becky (g-2 DAQ) |
23 Jul 2006, Art Olin, Forum, File output for histories
|
The ALPHA experiment at CERN has recently adopted MIDAS, and the history data in numerical form is needed by the collaboration. Furthermore the DAQ is running under linux and most collaborators are windows or mac users, so it should be available in a platform independent way.
Basically we need the output from the mhist code. The most convenient, and possibly easiest implementation would be to select required data (ID, variable, time range) in the midas history display, click a button requesting file output and input a file name. One might also want to specify the interval time required.
A related nice feature would be like the root "view event status" , where text at the bottom of the history would display the position of the cursor in the history chart coordinates. Probably more work and less important to us.
Comments on the practicality? |
23 Jul 2006, Art Olin, Forum, File output for histories
|
Hi, Stefan,
Using mhist is how I'll start, but I'm getting substantial resistance. It's not so much the command line that's the problem. First I have to install an ssh client on their machines! Then they ssh to the server, pipe the result to a file, then ftp the file back to their machine.
A browser implementation of this is much simpler.
I agree that the "View Event Status" idea is not practical. I didn't know about the GIF implementation of the histories.
Art |
24 Jul 2006, Art Olin, Bug Report, Elog attachments
|
Hi. When I attach the file below, Mix+Positronorig.xls, to an elog, and then open it or download it to disk, the file, 060..., is severely truncated.
-rw-r--r-- 1 alpha users 17408 Jul 24 11:25 Mix+Positronorig.xls
-rw-r--r-- 1 alpha users 1 Jul 24 11:04 060724_100544_Mix+Positron Cabling 20060723.xls
It's something to do with long filenames or special characters in filenames. Worked OK when I renamed the original file to M1.xls. |
08 Aug 2019, Art Olin, Suggestion, midas cmake migration
|
I want to report a bug in the ROOT build process that might be relevant to the midas implementation. I had an annoying failure to build root 6.18 (current pro version) with a misleading error message about a fault in the root code. It turned out this was a cmake problem, and the error was from my cmake version being older than 3.14, which is quite recent. Took a bit of searching to find this.
I recommend when the cmake version is distributed that the instructions include the required cmake version. Developers are generally working well ahead of what is available in the older OS's. |
11 Jul 2023, Anubhav Prakash, Forum, Possible ODB corruption! Webpages https://midptf01.triumf.ca/?cmd=Programs not loading!
|
The ODB server seems to have crashed/corrupted. I tried reloading the previous
working version of the ODB (using the commands in the following image) but it didn't work.
I have also attached the screenshot of the site https://midptf01.triumf.ca/?cmd=Programs. Any help to resolve this would be appreciated! Normally Prof. Thomas Lindner would solve such issues, but he is busy working at CERN till 17th of July, and we cannot afford to wait until then.
The following is the error when I run bash /home/midptf/online/bin/start_daq.sh:
[ODBEdit1,INFO] Fixing ODB "/Programs/ODBEdit" struct size mismatch (expected
316, odb size 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/ODBEdit" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11381:db_get_record1,ERROR] after db_check_record()
still struct size mismatch (expected 316, odb size 92) of "/Programs/ODBEdit",
calling db_create_record()
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11387:db_get_record1,ERROR] repaired struct size
mismatch of "/Programs/ODBEdit"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/ODBEdit" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [alarm.cxx:702:al_check,ERROR] Cannot get program info record
for program "ODBEdit", db_get_record1() status 319
[ODBEdit1,INFO] Fixing ODB "/Programs/mhttpd" struct size mismatch (expected
316, odb size 60)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/mhttpd" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11381:db_get_record1,ERROR] after db_check_record()
still struct size mismatch (expected 316, odb size 92) of "/Programs/mhttpd",
calling db_create_record()
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11387:db_get_record1,ERROR] repaired struct size
mismatch of "/Programs/mhttpd"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/mhttpd" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [alarm.cxx:702:al_check,ERROR] Cannot get program info record
for program "mhttpd", db_get_record1() status 319
[ODBEdit1,INFO] Fixing ODB "/Programs/Logger" struct size mismatch (expected
316, odb size 60)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/Logger" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11381:db_get_record1,ERROR] after db_check_record()
still struct size mismatch (expected 316, odb size 92) of "/Programs/Logger",
calling db_create_record()
[ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot malloc_data(256),
called from db_set_link_data
[ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot reallocate
"/System/Tmp/140305391605888I/Start command" with new size 256 bytes, online
database full
[ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of zero, set
to 32, odb path "Start command"
[ODBEdit1,ERROR] [odb.cxx:11387:db_get_record1,ERROR] repaired struct size
mismatch of "/Programs/Logger"
[ODBEdit1,ERROR] [odb.cxx:11293:db_get_record,ERROR] struct size mismatch for
"/Programs/Logger" (expected size: 316, size in ODB: 92)
[ODBEdit1,ERROR] [alarm.cxx:702:al_check,ERROR] Cannot get program info record
for program "Logger", db_get_record1() status 319
14:54:29 [ODBEdit,ERROR] [odb.cxx:1763:db_validate_db,ERROR] Warning: database
data area is 100% full
14:54:29 [ODBEdit,ERROR] [odb.cxx:1283:db_validate_key,ERROR] hkey 643368, path
"/Alarms/Classes/<NULL>/Display BGColor", string value is not valid UTF-8
14:54:29 [ODBEdit1,ERROR] [odb.cxx:556:realloc_data,ERROR] cannot
malloc_data(256), called from db_set_link_data
14:54:29 [ODBEdit1,ERROR] [odb.cxx:6923:db_set_link_data,ERROR] Cannot
reallocate "/System/Tmp/140305391605888I/Start command" with new size 256 bytes,
online database full
14:54:29 [ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of
zero, set to 32, odb path "Start command"
14:54:29 [ODBEdit1,ERROR] [odb.cxx:8531:db_paste,ERROR] found string length of
zero, set to 32, odb path "Start command" |
28 Sep 2015, Anthony Villano, Suggestion, Feature Request: MIDAS sequencer abort.
|
I am working for the SuperCDMS collaboration on some DAQ issues for our upcoming
SNOLAB installation. So far, the MIDAS sequencer seems to be a good paradigm
for us to do procedural tasks for our detectors and data running interspersed
with other protocols.
In our testing we've found that the sequencer works very well for this kind of
activity, although it would be useful to have a kind of scripted "abort" for
when something goes wrong -- especially if the user selects to abort a run
sequence.
Because the sequencer is setting various detector parameters to a certain value
before performing the tasks, the values will never be restored if the user
aborts the sequence. Instead, perhaps there can be a portion of a MIDAS
sequence script which is instructed to happen on an abort. Perhaps something
like all commands after a given tag like:
ON ABORT:
get run on a user-initiated abort? |
|