16 Sep 2024, Stefan Ritt, Bug Report, Crash using ODB watch
|
The answer is in the error message: "Object went out of scope". When your frontend_init() exits, the odb objects are destroyed. When you get a callback, it's linked to the
destroyed object. This is like having a local string and returning a reference to that string from the function.
Use a global object (bad) or use "new" (potential memory leak). I would use a global structure which holds all odb objects.
Stefan
>
> last week I was running MIDAS with the commit 3ad98c5. Today I updated MIDAS and now all my watch functions are crashing. Attached I have a minimal example frontend of the problem.
>
> In our software we have two functions: one which sets up the ODB values of the frontend and another one which sets up all watch functions. So overall we connect to the ODB twice during frontend_init(), once to create the values and once to create the watch. A simple version of this setup is shown in the example code:
>
> INT frontend_init() {
>
> cm_msg(MINFO, "frontend_init() setup", "Test FE");
>
> odb settings = {
> {"Test", 123},
> {"sub", {}}
> };
> settings.connect_and_fix_structure("/Equipment/Test FE/Settings");
> // settings.watch(watch); <-- this works without segmentation fault
>
> odb new_settings("/Equipment/Test FE/Settings");
> new_settings.watch(watch); // <-- here I am getting a segmentation fault
>
> return CM_SUCCESS;
> }
>
> When I set the watch directly, everything runs fine. However, when I create a new ODB object and use this one to set a watch, I get the following segmentation fault:
>
> Process 18474 stopped
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x34)
> frame #0: 0x000000010004fa38 test_fe`midas::odb::watch_callback(hDB=<unavailable>, hKey=<unavailable>, index=0, info=0x00006000002001c0) at odbxx.cxx:96:25 [opt]
> 93 if (po->m_data == nullptr)
> 94 mthrow("Callback received for a midas::odb object which went out of scope");
> 95 midas::odb *poh = search_hkey(po, hKey);
> -> 96 poh->m_last_index = index;
> 97 po->m_watch_callback(*poh);
> 98 poh->m_last_index = -1;
> 99 }
>
> Best,
> Marius |
16 Sep 2024, Stefan Ritt, Bug Report, Crash using ODB watch
|
Well, the object *went* out of scope. For my code it's hard to detect this, so the error reporting is poor. The first object should have the same
problem as well; it is just by accident that it does not crash.
Stefan
> This is not the case here. Note that the error message "Callback received for a midas::odb object which went out of scope" is not thrown! The segmentation fault happens later, at line 96.
|
20 Sep 2024, Stefan Ritt, Bug Report, Crash using ODB watch
|
The problem has been fixed in the current version. Here is my analysis:
- The midas::odb object *can* go out of scope in the function, since the odb::watch() function creates a deep copy of the object.
This does not cause a memory leak if one calls odb::unwatch_all() at the end of the program.
- The creation from XML had a flaw where the ODB key handle ("hKey") was not initialized, since it was not passed by the db_copy_xml() function.
I added code to db_copy_xml() to also fetch the key handle in the XML file, which now fixes the issue. Please note that you have to
update both the server and client side of midas to get this functionality if you are using it from a remote client.
- I saw the flag MK added in his pull request to the constructor odb::odb(). This is a way to fight the symptoms (by creating an
object the "old" way if not otherwise needed), but now we have the cause cured. Nevertheless I added that parameter, but set it to true by default:
odb::odb(const std::string &str, bool init_via_xml = true);
since this should be fully working now and should always be faster than the old method. I only keep it for debugging should we observe
another flaw in odb_from_xml().
Best regards,
Stefan |
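The deep-copy behaviour described in the analysis above can be illustrated with a small stand-alone model. This is only a sketch: FakeOdb, notify_all() and watch_count() are hypothetical names invented for the illustration and are not part of odbxx; only the idea (watch() keeps its own copy, unwatch_all() releases the copies) is taken from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for midas::odb, NOT the real class.
class FakeOdb {
public:
   using Callback = std::function<void(FakeOdb &)>;

   explicit FakeOdb(std::string path) : m_path(std::move(path)) {}

   // Store a heap-allocated copy of *this together with the callback, so the
   // callback target outlives the (possibly local) original object.
   void watch(Callback cb) {
      auto copy = std::make_unique<FakeOdb>(*this);
      copy->m_callback = std::move(cb);
      registry().push_back(std::move(copy));
   }

   // Simulate a change notification for every watched copy.
   static void notify_all() {
      for (auto &w : registry()) {
         assert(w->m_callback);
         w->m_callback(*w);
      }
   }

   // Release all copies created by watch().
   static void unwatch_all() { registry().clear(); }

   static std::size_t watch_count() { return registry().size(); }

   const std::string &path() const { return m_path; }

private:
   static std::vector<std::unique_ptr<FakeOdb>> &registry() {
      static std::vector<std::unique_ptr<FakeOdb>> r;
      return r;
   }

   std::string m_path;
   Callback m_callback;
};

// The local object is destroyed when this function returns; the copy
// held in the registry stays valid until unwatch_all() is called.
inline void setup_watch(std::vector<std::string> &seen) {
   FakeOdb settings("/Equipment/Test FE/Settings");
   settings.watch([&seen](FakeOdb &o) { seen.push_back(o.path()); });
}
```

In this model, calling setup_watch() and then notify_all() invokes the callback on the stored copy, and unwatch_all() at the end of the program frees it.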
20 Sep 2024, Stefan Ritt, Suggestion, Clean up compiler warning in manalyzer
|
> I like the look of std::format, looks cleaner than string streams
I fully agree. String streams are a pain if you want to do zero-leading hex output mixed with decimal output. Yes, they are easier to read if you don't know printf syntax,
but they take 10-20 times more characters to write and are not necessarily cleaner.
The problem is that we would have to convert a few thousand sprintf() calls in midas.
Stefan |
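To make the comparison concrete, here is a minimal sketch that produces the same zero-padded hex plus decimal output once with printf-style formatting and once with string streams. The function names and the output layout are made up for the illustration; std::format is left out because it requires C++20.

```cpp
#include <cassert>
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// printf-style: zero-padded 4-digit hex followed by a decimal value.
inline std::string format_printf(unsigned reg, int value) {
   char buf[64];
   int n = std::snprintf(buf, sizeof(buf), "0x%04X %d", reg, value);
   assert(n > 0 && n < (int) sizeof(buf));
   return std::string(buf);
}

// Same output with string streams: each format change needs a manipulator,
// and base and fill character must be reset before the decimal value.
inline std::string format_stream(unsigned reg, int value) {
   std::ostringstream s;
   s << "0x" << std::hex << std::uppercase << std::setw(4) << std::setfill('0') << reg
     << std::dec << std::setfill(' ') << ' ' << value;
   return s.str();
}
```

Both functions return the same string for the same input; the stream version simply needs more characters to say it.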
24 Sep 2024, Stefan Ritt, Info, News MSCB++ API
|
> Where is the example of error handling?
#include "mscbxx.h"
#include "mexcept.h"
...
try {
// connect to node 10 at submaster mscb123
midas::mscb m("mscb123", 10);
// print a variable
std::cout << m["Input0"] << std::endl;
} catch (mexception e) {
std::cout << e << std::endl; // simply print exception
}
... |
09 Oct 2024, Stefan Ritt, Suggestion, odbedit minor quality of life
|
Ok, accepted, done and pushed.
Stefan
Lukas Gerritzen wrote: | I have made two minor quality of life changes to odbedit.
- cd command: Typing cd without arguments now changes the directory to /, similar to the behaviour of the cd command in Linux sending you to the home directory.
- Exit behavior: Upon exiting the program with Ctrl+C, a newline character is printed so that the command line starts on an empty line rather than the last line from odbedit.
Here's the diff:
@@ -1668,7 +1668,10 @@ int command_loop(char *host_name, char *exp_name, char *cmd, char *start_dir)
/* cd */
else if (param[0][0] == 'c' && param[0][1] == 'd') {
- compose_name(pwd, param[1], str);
+ if (strlen(param[1]) == 0)
+ strcpy(str, "/");
+ else
+ compose_name(pwd, param[1], str);
status = db_find_key(hDB, 0, str, &hKey);
@@ -2962,6 +2965,7 @@ void ctrlc_odbedit(INT i)
cm_disconnect_experiment();
+ printf("\n");
exit(EXIT_SUCCESS);
}
Please consider incorporating those changes to odbedit.
Lukas |
|
11 Oct 2024, Stefan Ritt, Bug Report, Frontend name must differ from others by more than the last three characters
|
Hi Denis,
indeed a bug. Will fix it next week.
Best,
Stefan
> Hi,
> I have developed two Midas front-end programs for different hardware. The frontend_name of the first one is "FSCD_SC" (slow control) and that of the second one is "FSCD_PS" (power supply).
>
> Each front-end program runs fine separately, but when attempting to start FSCD_SC while FSCD_PS is running, FSCD_PS is terminated and Midas indicates "Previous frontend stopped" in the window where it starts FSCD_SC.
>
> The problem is that these two frontend names only differ in their last two characters, and Midas currently does not distinguish them properly. |
18 Oct 2024, Stefan Ritt, Bug Report, Frontend name must differ from others by more than the last three characters
|
Fixed and committed.
Best,
Stefan
|
07 Nov 2024, Stefan Ritt, Suggestion, Stop run and sequencer button
|
I don't find this very useful. Some experiments do not only want to stop the run, but also do other cleanup things. To do that, I proposed an "atexit" function like C has it. Then the user can put a run stop there, plus any other cleanup. This will be much more flexible. Think about the "reset" script we have to manually run if we abort a sequencer. The atexit function will come next week, so you should consider using it instead of your additional button.
Stefan |
13 Nov 2024, Stefan Ritt, Info, New sequencer command ODBLOOKUP
|
A new sequencer command "ODBLOOKUP" has been implemented, which does a lookup of a string in a string
array in the ODB given by a path and returns its index as a number. If we have for example an array
/Examples/Names
[0] Hello
[1] Test
[2] Other
and do a
ODBLOOKUP "/Examples/Names", "Test", index
we get an index equal to 1.
/Stefan |
14 Nov 2024, Stefan Ritt, Suggestion, Issue with creating banks
|
All I can see is that your bank header gets corrupted along the way. The funny character reported by
cm_write_event_to_odb indicates that your original name "RPD0" got overwritten somewhere, but I could not spot any
mistake in your code.
I would play around: change max_event_size, produce dummy data of size N instead of the recv() and so on. Also monitor
the bank header to see when it gets overwritten. I guess you only write from one thread, so that should be safe, right?
Best,
Stefan |
18 Nov 2024, Stefan Ritt, Info, New sequencer command ODBLOOKUP
|
> "value not found" sets "index" to ?
It actually sets it to "not found". Since all variables are strings in the sequencer, you can then do a test like
ODBLOOKUP ..., index
if ($index == "not found")
...
> "odb key not found" sets "index" to ?
If the odb key is not found, the sequencer aborts.
> link to documentation?
The documentation is where it always has been:
https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#Sequencer_Commands
/Stefan |
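The lookup semantics described in this thread (index of the matching array element, or the string "not found") can be modelled in a few lines of C++. This is only a model of the behaviour, not the sequencer implementation; the function name is hypothetical, and exact, case-sensitive matching is an assumption. The case of a missing ODB key, where the sequencer aborts, is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Model of ODBLOOKUP: return the index of `value` in `names` as a string
// (all sequencer variables are strings), or "not found" if it is absent.
// Assumption: exact, case-sensitive comparison.
inline std::string odb_lookup_model(const std::vector<std::string> &names,
                                    const std::string &value) {
   for (std::size_t i = 0; i < names.size(); i++)
      if (names[i] == value)
         return std::to_string(i);
   return "not found";
}

// Usage with the array from the announcement.
inline void odb_lookup_demo() {
   const std::vector<std::string> names = {"Hello", "Test", "Other"};
   assert(odb_lookup_model(names, "Test") == "1");
   assert(odb_lookup_model(names, "Missing") == "not found");
}
```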
21 Nov 2024, Stefan Ritt, Info, What do the status numbers mean and where can I find more information about them?
|
> [RP Streaming Frontend,ERROR] [midas.cxx:17806:cm_write_event_to_odb,ERROR]
> cannot create key for bank "DATA" with tid 24 in ODB, db_create_key() status 309
>
>
>
> I just need more information on what the error message means. Which data type
> refers to tid 24 and what does status 309 indicate??
A tid (type identification) of 24 does not actually exist, see midas.h:327. This tells
me that your bank header got corrupted. Somewhere you write over your data.
Stefan |
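One way to find out where a bank header gets overwritten is to run a cheap sanity check on it at several points in the frontend. The sketch below is an assumption-laden illustration: both helper names are hypothetical, and the valid tid range is passed in by the caller (taken from the TID_xxx definitions in midas.h) rather than hard-coded here.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>

// Returns true if all four characters of the bank name are printable ASCII.
// A name like "RPD0" passes; a name containing a control byte does not.
inline bool bank_name_ok(const char name[4]) {
   assert(name != nullptr);
   for (int i = 0; i < 4; i++)
      if (!std::isprint(static_cast<unsigned char>(name[i])))
         return false;
   return true;
}

// Returns true if tid lies in [tid_min, tid_max]. The bounds are supplied by
// the caller from midas.h; no specific values are assumed here.
inline bool tid_in_range(std::uint32_t tid, std::uint32_t tid_min, std::uint32_t tid_max) {
   return tid >= tid_min && tid <= tid_max;
}
```

Calling both checks right after the bank is created, after the data is copied in, and before the event is sent narrows down the place where the header is overwritten.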
01 Dec 2024, Stefan Ritt, Bug Report, EQ_PERIODIC-only equipment ?
|
There is no requirement that you pair an EQ_PERIODIC with an EQ_TRIGGER. Take for example
midas/examples/experiment/frontend.cxx
and remove the triggered event there. The frontend runs happily with the periodic event only (I just tried that myself). You probably have some problem in
your event definition. Start with the running example frontend, and add your code line by line until you see the problem.
Stefan |
02 Dec 2024, Stefan Ritt, Bug Report, ODB key picker does not close when creating link / Edit-on-run string box too large
|
> Actual result:
> The key picker does not close.
Thanks for reporting that bug. It has been fixed in the current commit (installed already on megon02)
Stefan |
02 Dec 2024, Stefan Ritt, Bug Report, ODB key picker does not close when creating link / Edit-on-run string box too large
|
> Another more minor visual problem is the edit-on-start dialog. There seems to be no upper bound to the
> size of the text box. In the attached screenshot, ShortString has a maximum length of 32 characters,
> LongString has 255. Both are empty at the time of the screenshot. Maybe, the size should be limited to a
> reasonable width.
I limited the input size now to (arbitrarily) 100 chars. The string can still be longer than 100 chars, and you start then scrolling inside the input box. Let me know if
that's ok this way.
Stefan |
02 Dec 2024, Stefan Ritt, Forum, "Safe" abort of sequencer scripts
|
The atexit() function has been implemented in the current develop branch of midas, see
https://daq00.triumf.ca/MidasWiki/index.php/Sequencer#ATEXIT_subroutine
Stefan |
06 Dec 2024, Stefan Ritt, Info, New slow control framework "mdev"
|
A new slow control mini-framework has been developed for MIDAS and been successfully tested in the Mu3e experiment. It
might be suited for other experiments as well.
Background
Since the late ’90s we have had the three-tier slow control framework in MIDAS with class drivers, device drivers and bus
drivers. While it has been used successfully for many years, it is complicated to understand and limited in its flexibility. If we
have a HV device with a demand value, a measured voltage and a current it’s fine, but if we want to control more things like
trip voltage, temperature and status readout etc. it soon hits its limits. With the development of the new odbxx API
(https://daq00.triumf.ca/MidasWiki/index.php/Odbxx) there is now an opportunity to make everything much simpler.
Design principles
Instead of a three-tier system, the new “mdev” framework (“m”idas “dev”ices) uses a simple base class which is attached to
a certain MIDAS equipment. It implements five simple functions:
- odb_setup() to setup /Equipment/<name>/Settings and /Equipment/<name>/Variables to its desired structure
- init() to initialize the slow control device
- exit() to close the connection to the device
- loop() which is called periodically to read the device
- read_event() which returns a MIDAS event going to the data stream
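As a rough sketch of what such a base class with these five functions could look like: the class names and all signatures below are assumptions made for illustration, and the real interface in midas/drivers/mdev may differ.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical base class with the five hooks listed above.
class DeviceBase {
public:
   explicit DeviceBase(std::string equipment) : m_equipment(std::move(equipment)) {}
   virtual ~DeviceBase() = default;

   virtual void odb_setup() = 0;                            // create Settings/Variables structure
   virtual void init() = 0;                                 // initialize the device
   virtual void exit() = 0;                                 // close the connection
   virtual void loop() = 0;                                 // periodic readout
   virtual int read_event(char *pevent, int max_size) = 0;  // fill event, return size in bytes

   const std::string &equipment() const { return m_equipment; }

private:
   std::string m_equipment;
};

// Dummy driver: "reads" a counter instead of talking to hardware.
class DummyDevice : public DeviceBase {
public:
   using DeviceBase::DeviceBase;

   void odb_setup() override { m_setup_done = true; }
   void init() override {
      assert(m_setup_done);  // odb_setup() is expected to run first
      m_open = true;
   }
   void exit() override { m_open = false; }
   void loop() override {
      if (m_open)
         m_value++;
   }
   int read_event(char *pevent, int max_size) override {
      if (max_size < (int) sizeof(m_value))
         return 0;
      std::memcpy(pevent, &m_value, sizeof(m_value));
      return (int) sizeof(m_value);
   }
   bool is_open() const { return m_open; }

private:
   bool m_setup_done = false;
   bool m_open = false;
   int m_value = 0;
};
```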
A device driver inherits from this base class and implements the functions. A simple example can be found in
midas/drivers/mdev/mdev_mscb.[h,cxx]
for the MSCB field bus system used at TRIUMF and PSI. It basically boils down to two calls:
Init:
m_variables.connect("/Equipment/<name>/Variables");
m_variables["Output"].watch([this](midas::odb &o) {
   m_mscb["HV"] = o[0]; // transfer value from ODB to MSCB device
});
Reading a value in the loop function:
m_variables["Input"][0] = m_mscb["HVMeas"];
The member variable m_variables is a midas::odb variable attached to the “Input” and “Output” variables in the ODB. The
watch() function executes the lambda function whenever the “Output” in the ODB changes. It then simply transfers the new
value to the device. The reading of measured values just works in the other direction, from the device to the ODB.
If you look at the mdev_mscb.cxx code, you see of course some more things like connecting to the MSCB device with proper
error handling, looping over several devices and variables, setting up the “Setting” directory in the ODB to define labels for
all variables. In addition we have a mirror for output variables, so that new values are only sent to the device if they differ
from the previous value (needed to reduce some communication traffic).
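The mirror idea can be sketched independently of MSCB. The class below is a hypothetical illustration, not the code in mdev_mscb.cxx: it remembers the last value sent per variable name and forwards a new value to a user-supplied send function only if it differs.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical output mirror: suppresses writes that would not change the device.
class OutputMirror {
public:
   using Sender = std::function<void(const std::string &, double)>;

   explicit OutputMirror(Sender send) : m_send(std::move(send)) { assert(m_send); }

   // Forward `value` only if it differs from the last value sent for `name`.
   // Returns true if the value was actually sent to the device.
   bool update(const std::string &name, double value) {
      auto it = m_last.find(name);
      if (it != m_last.end() && it->second == value)
         return false;
      m_send(name, value);
      m_last[name] = value;
      return true;
   }

private:
   Sender m_send;
   std::map<std::string, double> m_last;
};
```

The first update for a variable is always sent, since the mirror has no previous value to compare against.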
The midas/drivers/mdev directory also contains an example frontend in the mfe.cxx framework, but this is not a requirement.
The mdev framework can also be used in the tmfe framework and others as well. Please note how compact the frontend
code now looks.
User interface
Since the beginning, MIDAS has allowed access to the slow control devices through the “equipment” page (on the main status
page, click on one equipment). A few more options can control now the behavior of this page, allowing quite some flexibility
without having to write a dedicated custom page (which of course can still be done). Attached is an example from Mu3e where
the details of the equipment display are controlled through some options in the setting subdirectory as described in
https://daq00.triumf.ca/MidasWiki/index.php//Equipment_ODB_tree (especially the “grid display”, “Editable” and “Format”
flags).
Conclusions
The new “mdev” framework offers a compact and effective way to communicate from MIDAS to slow control devices. Since
all interface code is now not “hidden” any more in system class and device drivers, the user has much higher flexibility in
controlling different devices. If a device has a new parameter, the user can add a single line of code to connect this
parameter to an ODB entry.
The framework is very simple and misses some features of the old system. Ramping of HV voltages and current trips are not
available in the framework (like with the old HV class driver), but modern devices usually implement this in hardware which
is much better. The new framework is not multi-threaded, but modern devices are these days much faster than in the ‘90s.
Since the ODB is thread-safe, nothing prevents us from putting a device readout into its own thread in the frontend.
We will use the new system for all devices in Mu3e, with probably some new features being added soon, so stay tuned.
/Stefan |
10 Dec 2024, Stefan Ritt, Suggestion, Comma-separated indices in alarm conditions
|
This kind of alarm condition has been implemented and committed. The documentation at
https://daq00.triumf.ca/MidasWiki/index.php/Alarm_System
has been updated.
/Stefan |
12 Dec 2024, Stefan Ritt, Suggestion, New alarm sound flag to be tested
|
We had the case in MEG that some alarms were actually just warnings, not very severe. This happens for example if we calibrate our detector
once every other day and modify the hardware which actually triggers the alarm for about an hour or so.
The problem with this is now that the alarm sounds every few minutes, and people get annoyed during that hour. They turn down the volume
of their speakers, or even disable the alarm sound. If the detector gets back into the default mode again, they might forget to re-enable the
alarm, which causes some risk.
Turning down the volume is also not good, since during that hour we could have a "real" alarm on which people have to react quickly in order
not to destroy the detector.
The art is now to configure the alarm system in a way that "normal" changes do not annoy people or cover up really severe alarms. After long
discussions we came to the following conclusion: We need a special class of alarm (let's call it 'warning') which does not annoy people. The
warning should be visible on the screen, but not ring the alarm bell.
While we have different alarm classes in midas, which let us customize the frequency of alarms and the screen colors, all alarms or warnings
ring the alarm sound right now. This can be changed in the browser under "Config/Alarm sound" but that switch affects ALL alarms, which is
not what we want.
The idea we came up with was to add a flag "Alarm sound" to the alarm classes. For the 'warning' we can then turn off the alarm sound, so
only the banner is shown on top of the screen, and the spoken message is generated every 15 mins to remind people, but not to annoy them.
I added this "Alarm sound" flag in the branch feature/alarm_sound so everybody can test it. The downside is that all "/Alarms/Classes/xxx" need
to be modified to add this flag. While the new code will add this flag automatically (with a default value of 'true'), the size of the alarm class
record changes by four bytes (one bool). Therefore, all running midas programs will complain about the changed size, until they get
recompiled.
Therefore, to test the new feature, you have to check out the branch and re-compile all midas programs you use, otherwise you will get errors
like
Fixing ODB "/Alarms/Classes/Alarm" struct size mismatch (expected 352, odb size 348)
I will keep the branch for a few days for people to try it out and report any issue, and later merge it to develop.
Stefan |
|