Back Midas Rome Roody Rootana
  Midas DAQ System, Page 133 of 146  Not logged in ELOG logo
Entry  23 Sep 2013, Stefan Ritt, Info, Custom page header implemented Screen_Shot_2013-09-23_at_15.17.40_.png
Due to popular request, I implemented a custom header for mhttpd. This allows to inject some HTML code 
to be shown on top of the menu bar on all mhttpd pages. One possible application is to bring back the old 
status line with the name of the current experiment, the actual time and the refresh interval. 

To use this feature, one can put a new entry into the ODB under

/Custom/Header

which can be either a string (to show some short HTML code directly) or the name of a file containing some 
HTML code. If /Custom/Path is present, that path is used to locate the header file. A simple header file to 
recreate the GOT look (good-old-times) is here:

<div id="footerDiv" class="footerDiv">
<div style="display:inline; float:left;">MIDAS experiment "Test"</div>
<div id="refr" style="display:inline; float:right;"></div>
</div>
<script type="text/javascript">
var r = document.getElementById('refr');
var now	= new Date();
var c =	document.cookie.split('midas_refr=');
r.innerHTML = now.toString() + '&nbsp;&nbsp;&nbsp;' + 'Refr:' + c.pop().split(';').shift();
</script>

The JavaScript code is used to retrieve the midas_refr cookie which stores the refresh interval and displays 
it together with the current time.

Another application of this feature might be to check certain values in the ODB (via the ODBGet function) 
and some some important status or error condition.

/Stefan
    Reply  12 Feb 2014, Stefan Ritt, Info, Custom page header implemented 
As reported in the bug tracker, the proposed header does not work if no specific (= different from the default 60 sec.) update period is specified, 
since then no cookie is present. Here is the updated code which works for all cases:



<div id="footerDiv" class="footerDiv">
<div style="display:inline; float:left;">MIDAS experiment "Test"</div>
<div id="refr" style="display:inline; float:right;"></div>
</div>
<script type="text/javascript">
var r = document.getElementById('refr');
var now = new Date();
var refr;
if (document.cookie.search('midas_refr') == -1)
   refr = 60;
else {
   var c = document.cookie.split('midas_refr=');
   refr = c.pop().split(';').shift();
}
r.innerHTML = now.toString() + '&nbsp;&nbsp;&nbsp;' + 'Refr:' + refr;
</script>



/Stefan
    Reply  18 Feb 2014, Konstantin Olchanski, Info, Custom page header implemented 
I am not sure what to do with the javascript snippet - I understand it should be somehow connected to /Custom/Header, but if I create the /Custom/Header string, I cannot put this snippet 
into this string using odbedit - if I try to cut&paste it into odbedit, it is truncated to the first line - nor using the mhttpd odb editor - when I cut&paste it into the odb editor text entry box, it 
is truncated to the first 519 bytes (must be a hard limit somewhere). K.O.

> As reported in the bug tracker, the proposed header does not work if no specific (= different from the default 60 sec.) update period is specified, 
> since then no cookie is present. Here is the updated code which works for all cases:
> 
> 
> 
> <div id="footerDiv" class="footerDiv">
> <div style="display:inline; float:left;">MIDAS experiment "Test"</div>
> <div id="refr" style="display:inline; float:right;"></div>
> </div>
> <script type="text/javascript">
> var r = document.getElementById('refr');
> var now = new Date();
> var refr;
> if (document.cookie.search('midas_refr') == -1)
>    refr = 60;
> else {
>    var c = document.cookie.split('midas_refr=');
>    refr = c.pop().split(';').shift();
> }
> r.innerHTML = now.toString() + '&nbsp;&nbsp;&nbsp;' + 'Refr:' + refr;
> </script>
> 
> 
> 
> /Stefan
    Reply  19 Feb 2014, Stefan Ritt, Info, Custom page header implemented 
> I am not sure what to do with the javascript snippet 

Just read elog:908, it tells you to put this into a file, name it header.html for example, and put into the ODB:

/Custom/Header [string32] = header.html

make sure that you put the file into the directory indicated by /Custom/Path.

Cheers,
Stefan
Entry  25 Jul 2017, Stefan Ritt, Info, Current git repository "develop" branch broken 
Dear all,

we are currently undergoing major modifications in the way mhttpd is working. I realized that 
we are now at a state where mhttpd is currently broken, and it will take a few weeks in order to 
get everything converted to the new scheme we plan to use. Therefore I moved the git branch 
"master" to the last known stable version of midas. So for any practical purpose, please do 
NOT update your "develop" branch until further notice. To get the last stable version, you can 
do a 

$ git checkout master

which moves you right before we started to make major modifications. Once we are finished, 
we will announce this here in the forum.

Best regards,
Stefan
Entry  10 Mar 2004, Jan Wouters, , Creation of secondary Midas output file. dance193.tar
Dear Midas Team,

I have run into a problem with Midas and was wondering if you could explain what I 
am doing wrong.  I have included a simple demo to illustrate what I am doing and 
can send a small input data file if needed.

WHAT I AM TRYING TO DO:
Every midas event for the DANCE experiment consists of many physics events.  I am 
trying to create a secondary mid file where the event boundaries are now the 
physics events rather than the midas events.  This secondary mid file will be 
analyzed using a second stage midas analyzer.

For the demo, I use the data from EV02 (one of our 15 frontends), which consists of a 
variable number of fixed length structures where each structure contains the data for 
one crystal from the DANCE detector. 
 I treat each crystal as a separate physics event and write it out in the TREK bank, 
which is a demo calculated output bank, as a separate event.   

(The only difference between this demo and our real system is that we would include 
all the crystals from the other frontends that have approximately the same time stamp 
in the output bank.  Thus the output bank would consist of a varing number of 
crystals in one event rather than the fixed one crystal per event used in this demo.)

THE CHANGES TO analyzer.c AND adccalib.c
I loop through the EV02 bank examining each crystal structure in turn.  I calculate 
"calibrated" parameters and put them into an output bank called TREK.  The unusual 
part of this example is that the TREK bank is no longer part of the main list of input 
banks, ana_trigger_bank_list[].   Instead it is now part of a new bank list called 
ana_physics_bank_list[].  See the analyzer.c file for this definition.

In adccalib.c I  create the space for this new bank as follows. 

	EVENT_HEADER 	gPhysicsEventHeaders[ MAX_EVENT_SIZE / sizeof( 
EVENT_HEADER ) ];  
	WORD* 		gPhysicsEventData = ( WORD * )( gPhysicsEventHeaders + 1 );		

In the adc_calib routine I create the bank header as follows.  Note that the serial 
numbers will restart at 0 at the beginning of each midas event.  Should I let the serial 
number increment monotonically until the end of the run?:

	gPhysicsEventHeaders->serial_number = (DWORD) - 1;
	gPhysicsEventHeaders->event_id = 2;
	gPhysicsEventHeaders->trigger_mask = 0;
	gPhysicsEventHeaders->time_stamp = pheader->time_stamp;

In a loop that loops through all the crystals contained in EV02,  I extract each crystal, 
calibrate it, and store it in a TREK structure.  In creating the TREK bank I assume that 
each one will be a separate physics event thus I update the event serial number and 
use bk_init32 to initialize the memory.   

   	for ( short i = 0; i < nItems; i++ )
  	{	++(gPhysicsEventHeaders->serial_number);  	// Update serial number.
  		bk_init32( gPhysicsEventData );		// Initialize storage.
  		bk_create( gPhysicsEventData, "TREK", TID_STRUCT, &trek );
  	
  		trek->one = (double) pev->areahg * 1.0;
  		trek->two = (float) pev->timelo * 1.0;

  		bk_close( gPhysicsEventData, trek+1 );
  		
  		pev++; 					// Loop to next crystal's data. 
	}	

The output bank should consist of multiple events for each individual EV02 midas 
input event. 

 As far as I can tell the code compiles and runs fine, but I get no data in the .mid 
output file except for the ODB. I have a print statement at the beginning of each 
midas event stating how many crystals were found in the EV02 bank.  I also print out 
the calibrated value for each crystal as it is being placed in its own TREK output 
bank.  The data appears correct.

 I cannot place TREK in the input bank the way it normally is done in the examples 
because there is not a one-to-one correspondence between a midas event and a 
true physics event.  Instead one midas event has many physics events.  Thus the 
output bank needs to be in a new memory area so that I can create a custom header 
and increment the serial number properly for each event.  Our follow-on analysis 
using a second Midas analyzer only needs to analyze one physics event at a time 
rather than one Midas event at a time, which is why we are going to all the trouble to 
get this paradigm working.

I include all the code for this very simple example. 

RUNNING THE CODE:
To run the example just use the run01220.mid file I will send:

./analyzer -i run01220.mid.gz -o run01220out.mid -c settings.odb_cfg -n 50

The only thing done by the settings.odb_cfg file is to turn on the TREK output bank.  I 
have verified that the bank is on.

SUMMARY:
I believe that I must not be creating the new TREK output bank correctly so that 
midas understands that the event-by-event calculated physics data should be written 
out event-by-event.  I have pointed out several places in the above discussion where 
I might be making a mistake.

I would like to get both this example running and a similar which create Root trees, 
though the Root trees are of secondary importance.  With this example I can finish 
writing the second stage analyzer and get the DANCE collaboration moving forward 
with their analysis.  Currently, we cannot use this paradigm because I cannot create 
a secondary mid file in our stage one analysis.  I would be very grateful if you could 
take a look at this example and tell me what I am doing incorrectly.

Jan
    Reply  10 Mar 2004, Stefan Ritt, , Creation of secondary Midas output file. adccalib.c
Dear Jan,

I had a look at your code. You create a gPhysicsEventHeader array, fill it, and expect the 
framework to write it to disk. But how can the framework "guess" that you want your private 
global array being written? Unfortunately it cannot do magic!

Do do what you want, you have to write a "secondary" midas file yourself. I modified your 
code to do that. First, I define the event storage like

BYTE           gSecEvent[ MAX_EVENT_SIZE ];
EVENT_HEADER   *gPhysicsEventHeader = (EVENT_HEADER *) gSecEvent;
WORD* 	       gPhysicsEventData = ( WORD * )( gPhysicsEventHeader + 1 );		

I use gSecEvent as a BYTE array, since it only contains one avent at a time, so this is more 
appropriate. Then, in the BOR routine, I open a file:

  sprintf(str, "sec%05d.mid", run_number);
  sec_fh = open(str, O_CREAT | O_RDWR | O_BINARY, 0644);

and close it in the EOR routine

  close(sec_fh);

The event routine now manually fills events into the secondary file:

      /* write event to secondary .mid file */
      gPhysicsEventHeader->data_size = bk_size(gPhysicsEventData);
      write(sec_fh, gPhysicsEventHeader, sizeof(EVENT_HEADER)+bk_size(gPhysicsEventData));

Note that this code is placed *inside* the for() loop over nItems, so for each detector you 
create and event and write it.

That's all you need, the full file adccalib.c is attached. I tried to produce a sec01220.mid 
file and was able to read it back with the mdump utility.

Best regards,

  Stefan
    Reply  11 Mar 2004, Renee Poutissou, , Creation of secondary Midas output file. 
Jan , 

Do you need to log this stage 1 output?  If not, you would use the 
eventbuilder mechanism to create your stage 2 events.  
I use the eventbuilder mechanism with success for my TWIST experiment.

Renee
Entry  16 Sep 2024, Marius Köppel, Bug Report, Crash using ODB watch test_fe.cpp
Hi all,

last week I was running MIDAS with the commit 3ad98c5. Today I updated MIDAS and now all my watch functions are crashing. Attached I have a minimal example frontend of the problem.

In our software we have two functions one which sets up the ODB values of the frontend and another one which sets up all watch functions. So overall we connect two time to the ODB during fronend_init one time to create the values and one time to create the watch. In the example code a simple version of this setup is shown:

INT frontend_init() {

  cm_msg(MINFO, "frontend_init() setup", "Test FE");

  odb settings = {
    {"Test", 123},
    {"sub", {}}
  };
  settings.connect_and_fix_structure("/Equipment/Test FE/Settings");
  // settings.watch(watch); <-- this works without segmentation fault

  odb new_settings("/Equipment/Test FE/Settings");
  new_settings.watch(watch); // <-- here I am getting a segmentation fault

  return CM_SUCCESS;
}

When I directly set the watch everything runs fine however, when I create a new ODB object and use this one to set a watch I am getting the following segmentation fault:

Process 18474 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x34)
    frame #0: 0x000000010004fa38 test_fe`midas::odb::watch_callback(hDB=<unavailable>, hKey=<unavailable>, index=0, info=0x00006000002001c0) at odbxx.cxx:96:25 [opt]
   93  	      if (po->m_data == nullptr)
   94  	         mthrow("Callback received for a midas::odb object which went out of scope");
   95  	      midas::odb *poh = search_hkey(po, hKey);
-> 96  	      poh->m_last_index = index;
   97  	      po->m_watch_callback(*poh);
   98  	      poh->m_last_index = -1;
   99  	   }

Best,
Marius
    Reply  16 Sep 2024, Stefan Ritt, Bug Report, Crash using ODB watch 
The answer is in the error message: „Object went out of scope“. When your frontent_init() exits, the odb objects are destroyed. When you get a callback, it‘s linked to the
destroyed object. This is like if you have a local string and pass a reference to that string in the return of the function.

Use a global object (bad) or use „new“ (potential memory leak). I would use a global structure which holds all odb objects.

Stefan
 
> 
> last week I was running MIDAS with the commit 3ad98c5. Today I updated MIDAS and now all my watch functions are crashing. Attached I have a minimal example frontend of the problem.
> 
> In our software we have two functions one which sets up the ODB values of the frontend and another one which sets up all watch functions. So overall we connect two time to the ODB during fronend_init one time to create the values and one time to create the watch. In the example code a simple version of this setup is shown:
> 
> INT frontend_init() {
> 
>   cm_msg(MINFO, "frontend_init() setup", "Test FE");
> 
>   odb settings = {
>     {"Test", 123},
>     {"sub", {}}
>   };
>   settings.connect_and_fix_structure("/Equipment/Test FE/Settings");
>   // settings.watch(watch); <-- this works without segmentation fault
> 
>   odb new_settings("/Equipment/Test FE/Settings");
>   new_settings.watch(watch); // <-- here I am getting a segmentation fault
> 
>   return CM_SUCCESS;
> }
> 
> When I directly set the watch everything runs fine however, when I create a new ODB object and use this one to set a watch I am getting the following segmentation fault:
> 
> Process 18474 stopped
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x34)
>     frame #0: 0x000000010004fa38 test_fe`midas::odb::watch_callback(hDB=<unavailable>, hKey=<unavailable>, index=0, info=0x00006000002001c0) at odbxx.cxx:96:25 [opt]
>    93  	      if (po->m_data == nullptr)
>    94  	         mthrow("Callback received for a midas::odb object which went out of scope");
>    95  	      midas::odb *poh = search_hkey(po, hKey);
> -> 96  	      poh->m_last_index = index;
>    97  	      po->m_watch_callback(*poh);
>    98  	      poh->m_last_index = -1;
>    99  	   }
> 
> Best,
> Marius
    Reply  16 Sep 2024, Marius Koeppel, Bug Report, Crash using ODB watch 
This is not the case here. Note that the error message: "Callback received for a midas::odb object which went out of scope" is not called! The segmentation fault happens later line 96.

> The answer is in the error message: „Object went out of scope“. When your frontent_init() exits, the odb objects are destroyed. When you get a callback, it‘s linked to the
> destroyed object. This is like if you have a local string and pass a reference to that string in the return of the function.
> 
> Use a global object (bad) or use „new“ (potential memory leak). I would use a global structure which holds all odb objects.
> 
> Stefan
>  
> > 
> > last week I was running MIDAS with the commit 3ad98c5. Today I updated MIDAS and now all my watch functions are crashing. Attached I have a minimal example frontend of the problem.
> > 
> > In our software we have two functions one which sets up the ODB values of the frontend and another one which sets up all watch functions. So overall we connect two time to the ODB during fronend_init one time to create the values and one time to create the watch. In the example code a simple version of this setup is shown:
> > 
> > INT frontend_init() {
> > 
> >   cm_msg(MINFO, "frontend_init() setup", "Test FE");
> > 
> >   odb settings = {
> >     {"Test", 123},
> >     {"sub", {}}
> >   };
> >   settings.connect_and_fix_structure("/Equipment/Test FE/Settings");
> >   // settings.watch(watch); <-- this works without segmentation fault
> > 
> >   odb new_settings("/Equipment/Test FE/Settings");
> >   new_settings.watch(watch); // <-- here I am getting a segmentation fault
> > 
> >   return CM_SUCCESS;
> > }
> > 
> > When I directly set the watch everything runs fine however, when I create a new ODB object and use this one to set a watch I am getting the following segmentation fault:
> > 
> > Process 18474 stopped
> > * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x34)
> >     frame #0: 0x000000010004fa38 test_fe`midas::odb::watch_callback(hDB=<unavailable>, hKey=<unavailable>, index=0, info=0x00006000002001c0) at odbxx.cxx:96:25 [opt]
> >    93  	      if (po->m_data == nullptr)
> >    94  	         mthrow("Callback received for a midas::odb object which went out of scope");
> >    95  	      midas::odb *poh = search_hkey(po, hKey);
> > -> 96  	      poh->m_last_index = index;
> >    97  	      po->m_watch_callback(*poh);
> >    98  	      poh->m_last_index = -1;
> >    99  	   }
> > 
> > Best,
> > Marius
    Reply  16 Sep 2024, Stefan Ritt, Bug Report, Crash using ODB watch 
Well, the object *went* out of scope. For my code it‘s hard to realize this, so the error reporting is poor. Also the first object should have the same
problem. Just by accident that it does not crash.

Stefan 

> This is not the case here. Note that the error message: "Callback received for a midas::odb object which went out of scope" is not called! The segmentation fault happens later line 96.
> 
> > The answer is in the error message: „Object went out of scope“. When your frontent_init() exits, the odb objects are destroyed. When you get a callback, it‘s linked to the
> > destroyed object. This is like if you have a local string and pass a reference to that string in the return of the function.
> > 
> > Use a global object (bad) or use „new“ (potential memory leak). I would use a global structure which holds all odb objects.
> > 
> > Stefan
> >  
> > > 
> > > last week I was running MIDAS with the commit 3ad98c5. Today I updated MIDAS and now all my watch functions are crashing. Attached I have a minimal example frontend of the problem.
> > > 
> > > In our software we have two functions one which sets up the ODB values of the frontend and another one which sets up all watch functions. So overall we connect two time to the ODB during fronend_init one time to create the values and one time to create the watch. In the example code a simple version of this setup is shown:
> > > 
> > > INT frontend_init() {
> > > 
> > >   cm_msg(MINFO, "frontend_init() setup", "Test FE");
> > > 
> > >   odb settings = {
> > >     {"Test", 123},
> > >     {"sub", {}}
> > >   };
> > >   settings.connect_and_fix_structure("/Equipment/Test FE/Settings");
> > >   // settings.watch(watch); <-- this works without segmentation fault
> > > 
> > >   odb new_settings("/Equipment/Test FE/Settings");
> > >   new_settings.watch(watch); // <-- here I am getting a segmentation fault
> > > 
> > >   return CM_SUCCESS;
> > > }
> > > 
> > > When I directly set the watch everything runs fine however, when I create a new ODB object and use this one to set a watch I am getting the following segmentation fault:
> > > 
> > > Process 18474 stopped
> > > * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x34)
> > >     frame #0: 0x000000010004fa38 test_fe`midas::odb::watch_callback(hDB=<unavailable>, hKey=<unavailable>, index=0, info=0x00006000002001c0) at odbxx.cxx:96:25 [opt]
> > >    93  	      if (po->m_data == nullptr)
> > >    94  	         mthrow("Callback received for a midas::odb object which went out of scope");
> > >    95  	      midas::odb *poh = search_hkey(po, hKey);
> > > -> 96  	      poh->m_last_index = index;
> > >    97  	      po->m_watch_callback(*poh);
> > >    98  	      poh->m_last_index = -1;
> > >    99  	   }
> > > 
> > > Best,
> > > Marius
    Reply  16 Sep 2024, Marius Koeppel, Bug Report, Crash using ODB watch 
Okay, but this is then a big issue IMO. For Mu3e we do this in every frontend and I also checked again all of these watches are broken at the moment (with commit 3ad98c5 they worked).
 
In the old style we did for example (see https://bitbucket.org/tmidas/midas/src/develop/examples/crfe/crfe.cxx):

INT frontend_init()
{
   HNDLE hKey;

   // create Settings structure in ODB
   db_create_record(hDB, 0, "Equipment/Clock Reset/Settings", strcomb1(cr_settings_str).c_str());
   db_find_key(hDB, 0, "/Equipment/Clock Reset", &hKey);
   assert(hKey);

   db_watch(hDB, hKey, cr_settings_changed, NULL);

   /*
    * Set our transition sequence. The default is 500. Setting it
    * to 600 means we are called AFTER most other clients.
    */
   cm_set_transition_sequence(TR_START, 600);

   return CM_SUCCESS;
}

I thought this will be the same (under the hood) in the current odbxx way via:

odb settings("Equipment/Clock Reset/Settings");
settings.watch(cr_settings_changed);

Best,
Marius


> Well, the object *went* out of scope. For my code it‘s hard to realize this, so the error reporting is poor. Also the first object should have the same
> problem. Just by accident that it does not crash.
> 
> Stefan 
> 
> > This is not the case here. Note that the error message: "Callback received for a midas::odb object which went out of scope" is not called! The segmentation fault happens later line 96.
> > 
> > > The answer is in the error message: „Object went out of scope“. When your frontent_init() exits, the odb objects are destroyed. When you get a callback, it‘s linked to the
> > > destroyed object. This is like if you have a local string and pass a reference to that string in the return of the function.
> > > 
> > > Use a global object (bad) or use „new“ (potential memory leak). I would use a global structure which holds all odb objects.
> > > 
> > > Stefan
> > >  
> > > > 
> > > > last week I was running MIDAS with the commit 3ad98c5. Today I updated MIDAS and now all my watch functions are crashing. Attached I have a minimal example frontend of the problem.
> > > > 
> > > > In our software we have two functions one which sets up the ODB values of the frontend and another one which sets up all watch functions. So overall we connect two time to the ODB during fronend_init one time to create the values and one time to create the watch. In the example code a simple version of this setup is shown:
> > > > 
> > > > INT frontend_init() {
> > > > 
> > > >   cm_msg(MINFO, "frontend_init() setup", "Test FE");
> > > > 
> > > >   odb settings = {
> > > >     {"Test", 123},
> > > >     {"sub", {}}
> > > >   };
> > > >   settings.connect_and_fix_structure("/Equipment/Test FE/Settings");
> > > >   // settings.watch(watch); <-- this works without segmentation fault
> > > > 
> > > >   odb new_settings("/Equipment/Test FE/Settings");
> > > >   new_settings.watch(watch); // <-- here I am getting a segmentation fault
> > > > 
> > > >   return CM_SUCCESS;
> > > > }
> > > > 
> > > > When I directly set the watch everything runs fine however, when I create a new ODB object and use this one to set a watch I am getting the following segmentation fault:
> > > > 
> > > > Process 18474 stopped
> > > > * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x34)
> > > >     frame #0: 0x000000010004fa38 test_fe`midas::odb::watch_callback(hDB=<unavailable>, hKey=<unavailable>, index=0, info=0x00006000002001c0) at odbxx.cxx:96:25 [opt]
> > > >    93  	      if (po->m_data == nullptr)
> > > >    94  	         mthrow("Callback received for a midas::odb object which went out of scope");
> > > >    95  	      midas::odb *poh = search_hkey(po, hKey);
> > > > -> 96  	      poh->m_last_index = index;
> > > >    97  	      po->m_watch_callback(*poh);
> > > >    98  	      poh->m_last_index = -1;
> > > >    99  	   }
> > > > 
> > > > Best,
> > > > Marius
    Reply  16 Sep 2024, Mark Grimes, Bug Report, Crash using ODB watch 
Hi,
Maybe I've misunderstood the code, but odb::watch() creates a deep copy of itself to set the watch to.  The comment where this happens specifies that this is in case the current one goes out of scope.  See https://bitbucket.org/tmidas/midas/src/2878647fb73648474b35223ce53a125180f751b3/src/odbxx.cxx#lines-1393:1395
So as far as I can tell allowing the current odb instance to go out of scope is supported.

Thanks,

Mark.


> Okay, but this is then a big issue IMO. For Mu3e we do this in every frontend and I also checked again all of these watches are broken at the moment (with commit 3ad98c5 they worked).
>  
> In the old style we did for example (see https://bitbucket.org/tmidas/midas/src/develop/examples/crfe/crfe.cxx):
> 
> INT frontend_init()
> {
>    HNDLE hKey;
> 
>    // create Settings structure in ODB
>    db_create_record(hDB, 0, "Equipment/Clock Reset/Settings", strcomb1(cr_settings_str).c_str());
>    db_find_key(hDB, 0, "/Equipment/Clock Reset", &hKey);
>    assert(hKey);
> 
>    db_watch(hDB, hKey, cr_settings_changed, NULL);
> 
>    /*
>     * Set our transition sequence. The default is 500. Setting it
>     * to 600 means we are called AFTER most other clients.
>     */
>    cm_set_transition_sequence(TR_START, 600);
> 
>    return CM_SUCCESS;
> }
> 
> I thought this will be the same (under the hood) in the current odbxx way via:
> 
> odb settings("Equipment/Clock Reset/Settings");
> settings.watch(cr_settings_changed);
> 
> Best,
> Marius
> 
> 
> > Well, the object *went* out of scope. For my code it‘s hard to realize this, so the error reporting is poor. Also the first object should have the same
> > problem. Just by accident that it does not crash.
> > 
> > Stefan 
> > 
> > > This is not the case here. Note that the error message: "Callback received for a midas::odb object which went out of scope" is not called! The segmentation fault happens later line 96.
> > > 
> > > > The answer is in the error message: „Object went out of scope“. When your frontent_init() exits, the odb objects are destroyed. When you get a callback, it‘s linked to the
> > > > destroyed object. This is like if you have a local string and pass a reference to that string in the return of the function.
> > > > 
> > > > Use a global object (bad) or use „new“ (potential memory leak). I would use a global structure which holds all odb objects.
> > > > 
> > > > Stefan
> > > >  
> > > > > 
> > > > > last week I was running MIDAS with the commit 3ad98c5. Today I updated MIDAS and now all my watch functions are crashing. Attached I have a minimal example frontend of the problem.
> > > > > 
> > > > > In our software we have two functions one which sets up the ODB values of the frontend and another one which sets up all watch functions. So overall we connect two time to the ODB during fronend_init one time to create the values and one time to create the watch. In the example code a simple version of this setup is shown:
> > > > > 
> > > > > INT frontend_init() {
> > > > > 
> > > > >   cm_msg(MINFO, "frontend_init() setup", "Test FE");
> > > > > 
> > > > >   odb settings = {
> > > > >     {"Test", 123},
> > > > >     {"sub", {}}
> > > > >   };
> > > > >   settings.connect_and_fix_structure("/Equipment/Test FE/Settings");
> > > > >   // settings.watch(watch); <-- this works without segmentation fault
> > > > > 
> > > > >   odb new_settings("/Equipment/Test FE/Settings");
> > > > >   new_settings.watch(watch); // <-- here I am getting a segmentation fault
> > > > > 
> > > > >   return CM_SUCCESS;
> > > > > }
> > > > > 
> > > > > When I directly set the watch everything runs fine however, when I create a new ODB object and use this one to set a watch I am getting the following segmentation fault:
> > > > > 
> > > > > Process 18474 stopped
> > > > > * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x34)
> > > > >     frame #0: 0x000000010004fa38 test_fe`midas::odb::watch_callback(hDB=<unavailable>, hKey=<unavailable>, index=0, info=0x00006000002001c0) at odbxx.cxx:96:25 [opt]
> > > > >    93  	      if (po->m_data == nullptr)
> > > > >    94  	         mthrow("Callback received for a midas::odb object which went out of scope");
> > > > >    95  	      midas::odb *poh = search_hkey(po, hKey);
> > > > > -> 96  	      poh->m_last_index = index;
> > > > >    97  	      po->m_watch_callback(*poh);
> > > > >    98  	      poh->m_last_index = -1;
> > > > >    99  	   }
> > > > > 
> > > > > Best,
> > > > > Marius
    Reply  17 Sep 2024, Konstantin Olchanski, Bug Report, Crash using ODB watch 
> {
> odb new_settings("/Equipment/Test FE/Settings");
> new_settings.watch(watch); // <-- here I am getting a segmentation fault
> }

this code has a bug. "watch" is attached to object "new_settings" that is deleted
after the closing curly bracket.

I would say Stefan's odb API should not allow you to write code like this. an API defect.

K.O.
    Reply  18 Sep 2024, Marius Koeppel, Bug Report, Crash using ODB watch 
I created a PR to fix this issue https://bitbucket.org/tmidas/midas/pull-requests/42.
The crash happened since the change in commit 3ad98c5 always got the ODB via XML.
However, the creation from XML should only be used when a user wants to read fast (and when we are on a remote machine) so I added the flag use_from_xml to explicitly specify this.


> > {
> > odb new_settings("/Equipment/Test FE/Settings");
> > new_settings.watch(watch); // <-- here I am getting a segmentation fault
> > }
> 
> this code has a bug. "watch" is attached to object "new_settings" that is deleted
> after the closing curly bracket.

> I would say Stefan's odb API should not allow you to write code like this. an API defect.

As pointed out in the thread this feature is explicitly supported by odbxx.cxx:

void odb::watch(std::function<void(midas::odb &)> f) {
      if (m_hKey == 0 || m_hKey == -1)
         mthrow("watch() called for ODB key \"" + m_name +
                "\" which is not connected to ODB");

      // create a deep copy of current object in case it
      // goes out of scope
      midas::odb* ow = new midas::odb(*this);

      ow->m_watch_callback = f;
      db_watch(s_hDB, m_hKey, midas::odb::watch_callback, ow);

      // put object into watchlist
      g_watchlist.push_back(ow);
}

Also in the old way (see for example https://bitbucket.org/tmidas/midas/src/191d13f98626fae533cbca17b00df7ee361edf16/examples/crfe/crfe.cxx#lines-126) it was possible to create a watch in a scope without the user taking care that the "object" does not go out of scope.
I think this feature should be supported by the framework.

Best,
Marius
    Reply  20 Sep 2024, Stefan Ritt, Bug Report, Crash using ODB watch 
The problem has been fixed in the current version. Here is my analysis:

- the midas::odb object *can* go out of scope in the function, since the odb::watch() function creates a deep copy of the object. 
This does not cause a memory leak if one call odb::unwatch_all() at the end of a program.

- The creation from XML had a flaw where the ODB key handle ("hKey") is not initialized since it is not passed by the db_copy_xml() function.
I added code to db_copy_xml() to also fetch the key handle in the XML file, which now fixes the issue. Please note that you have to
update both the server and client side of midas to get this functionality if you are using it by a remote client.

- I saw the flag MK added on his pull request to the constructor of odb::odb(). This is a way to fight the symptoms (by creating an
object the "old" way if not otherwise needed, but how we have the cause cured. Nevertheless I added that parameter, but set to to true by default:

   odb::odb(const std::string &str, bool init_via_xml = true);

since this should be fully working now and should always be faster than the old method. I only keep it for debugging should we observe
another flaw in odb_from_xml(). 

Best regards,
Stefan
Entry  04 Jul 2012, Konstantin Olchanski, Bug Report, Crash after recursive use of rpc_execute() 
I am looking at a MIDAS kaboom when running out of space on the data disk - everything was freezing 
up, even the VME frontend crashed sometimes.

The freeze was traced to ROOT use in mlogger - it turns out that ROOT intercepts many signal handlers, 
including SIGSEGV - but instead of crashing the program as God intended, ROOT SEGV handler just hangs, 
and the rest of MIDAS hangs with it. One solution is to always build mlogger without ROOT support - 
does anybody use this feature anymore? Or reset the signal handlers back to the default setting somehow.

Freeze fixed, now I see a crash (seg fault) inside mlogger, in the newly introduced memmove() function 
inside the MIDAS RPC code rpc_execute(). memmove() replaced memcpy() in the same place and I am 
surprised we did not see this crash with memcpy().

The crash is caused by crazy arguments passed to memmove() - looks like corrupted RPC arguments 
data.

Then I realized that I see a recursive call to rpc_execute(): rpc_execute() calls tr_stop() calls cm_yield() calls 
ss_suspend() calls rpc_execute(). The second rpc_execute successfully completes, but leave corrupted 
data for the original rpc_execute(), which happily crashes. At the moment of the crash, recursive call to 
rpc_execute() is no longer visible.

Note that rpc_execute() cannot be called recursively - it is not re-entrant as it uses a global buffer for RPC 
argument processing. (global tls_buffer structure).

Here is the mlogger stack trace:

#0  0x00000032a8032885 in raise () from /lib64/libc.so.6
#1  0x00000032a8034065 in abort () from /lib64/libc.so.6
#2  0x00000032a802b9fe in __assert_fail_base () from /lib64/libc.so.6
#3  0x00000032a802bac0 in __assert_fail () from /lib64/libc.so.6
#4  0x000000000041d3e6 in rpc_execute (sock=14, buffer=0x7ffff73fc010 "\340.", convert_flags=0) at 
src/midas.c:11478
#5  0x0000000000429e41 in rpc_server_receive (idx=1, sock=<value optimized out>, check=<value 
optimized out>) at src/midas.c:12955
#6  0x0000000000433fcd in ss_suspend (millisec=0, msg=0) at src/system.c:3927
#7  0x0000000000429b12 in cm_yield (millisec=100) at src/midas.c:4268
#8  0x00000000004137c0 in close_channels (run_number=118, p_tape_flag=0x7fffffffcd34) at 
src/mlogger.cxx:3705
#9  0x000000000041390e in tr_stop (run_number=118, error=<value optimized out>) at 
src/mlogger.cxx:4148
#10 0x000000000041cd42 in rpc_execute (sock=12, buffer=0x7ffff73fc010 "\340.", convert_flags=0) at 
src/midas.c:11626
#11 0x0000000000429e41 in rpc_server_receive (idx=0, sock=<value optimized out>, check=<value 
optimized out>) at src/midas.c:12955
#12 0x0000000000433fcd in ss_suspend (millisec=0, msg=0) at src/system.c:3927
#13 0x0000000000429b12 in cm_yield (millisec=1000) at src/midas.c:4268
#14 0x0000000000416c50 in main (argc=<value optimized out>, argv=<value optimized out>) at 
src/mlogger.cxx:4431


K.O.
    Reply  04 Jul 2012, Konstantin Olchanski, Bug Report, Crash after recursive use of rpc_execute() 
>  ... I see a recursive call to rpc_execute(): rpc_execute() calls tr_stop() calls cm_yield() calls 
> ss_suspend() calls rpc_execute()
> ... rpc_execute() cannot be called recursively - it is not re-entrant as it uses a global buffer

It turns out that rpc_server_receive() also need protection against recursive calls - it also uses
a global buffer to receive network data.

My solution is to protect rpc_server_receive() against recursive calls by detecting recursion and returning SS_SUCCESS (to ss_suspend()).

I was worried that this would cause a tight loop inside ss_suspend() but in practice, it looks like ss_suspend() tries to call
us about once per second. I am happy with this solution. Here is the diff:


@@ -12813,7 +12815,7 @@
 
 
 /********************************************************************/
-INT rpc_server_receive(INT idx, int sock, BOOL check)
+INT rpc_server_receive1(INT idx, int sock, BOOL check)
 /********************************************************************\
 
   Routine: rpc_server_receive
@@ -13047,7 +13049,28 @@
    return status;
 }
 
+/********************************************************************/
+INT rpc_server_receive(INT idx, int sock, BOOL check)
+{
+  static int level = 0;
+  int status;
 
+  // Provide protection against recursive calls to rpc_server_receive() and rpc_execute()
+  // via rpc_execute() calls tr_stop() calls cm_yield() calls ss_suspend() calls rpc_execute()
+
+  if (level != 0) {
+    //printf("*** enter rpc_server_receive level %d, idx %d sock %d %d -- protection against recursive use!\n", level, idx, sock, check);
+    return SS_SUCCESS;
+  }
+
+  level++;
+  //printf(">>> enter rpc_server_receive level %d, idx %d sock %d %d\n", level, idx, sock, check);
+  status = rpc_server_receive1(idx, sock, check);
+  //printf("<<< exit rpc_server_receive level %d, idx %d sock %d %d, status %d\n", level, idx, sock, check, status);
+  level--;
+  return status;
+}
+
 /********************************************************************/
 INT rpc_server_shutdown(void)
 /********************************************************************\


ladd02:trinat~/packages/midas>svn info src/midas.c
Path: src/midas.c
Name: midas.c
URL: svn+ssh://svn@savannah.psi.ch/repos/meg/midas/trunk/src/midas.c
Repository Root: svn+ssh://svn@savannah.psi.ch/repos/meg/midas
Repository UUID: 050218f5-8902-0410-8d0e-8a15d521e4f2
Revision: 5297
Node Kind: file
Schedule: normal
Last Changed Author: olchanski
Last Changed Rev: 5294
Last Changed Date: 2012-06-15 10:45:35 -0700 (Fri, 15 Jun 2012)
Text Last Updated: 2012-06-29 17:05:14 -0700 (Fri, 29 Jun 2012)
Checksum: 8d7907bd60723e401a3fceba7cd2ba29

K.O.
    Reply  13 Jul 2012, Stefan Ritt, Bug Report, Crash after recursive use of rpc_execute() 
> Then I realized that I see a recursive call to rpc_execute(): rpc_execute() calls tr_stop() calls cm_yield() calls 
> ss_suspend() calls rpc_execute(). The second rpc_execute successfully completes, but leave corrupted 
> data for the original rpc_execute(), which happily crashes. At the moment of the crash, recursive call to 
> rpc_execute() is no longer visible.

This is really strange. I did not protect rpc_execute against recursive calls since this should not happen. rpc_server_receive() is linked to rpc_call() on the client side. So there cannot be 
several rpc_call() since there I do the recursive checking (also multi-thread checking) via a mutex. See line 10142 in midas.c. So there CANNOT be recursive calls to rpc_execute() because 
there cannot be recursive calls to rpc_server_receive(). But apparently there are, according to your stack trace.

So even if your patch works fine, I would like to know where the recursive calls to rpc_server_receive() come from. Since we have one subproces of mserver for each client, there should only 
be one client connected to each mserver process, and the client is protected via the mutex in rpc_call(). Can you please debug this? I would like to understand what is going on there. Maybe 
there is a deeper underlying problem, which we better solve, otherwise it might fall back on use in the future.

For debugging, you have to see what commands rpc_call() send and what rpc_server_receive() gets, maybe by writing this into a common file together with a time stamp.

SR
ELOG V3.1.4-2e1708b5