Midas DAQ System
    Reply  18 Feb 2021, Pintaudi Giorgio, Bug Report, Unexpected end-of-file Screenshot_from_2021-02-19_15-41-23.png
It appears that the issue is triggered by a nonexistent Event and Variable, as shown 
in the attached picture. This issue can arise when restoring the ODB from a 
previous version or importing ODB values from other MIDAS instances.
It would be useful if the error message were clearer about the source of the 
problem.

> Hello!
> Sometimes when I mess around with the history plots I get the following error:
> 
> [mhttpd,ERROR] [history.cxx:97:xread,ERROR] Error: Unexpected end-of-file when 
> reading file "/home/wagasci-ana/Data/online/210219.hst"
> 
> I have tried the following without success:
> 
> - Remove the MIDAS history files
> - Restart mhttpd and mlogger
> 
> I do not know what triggers the error, but when it does, the above message is 
> printed hundreds of times a second, completely spamming the message log.
> 
> It happened again today after I set a frontend label that was too long, which made 
> mlogger crash. After fixing the label length, the above message appeared and it 
> does not seem to go away.
    Reply  25 Feb 2021, Konstantin Olchanski, Bug Report, Unexpected end-of-file 
> > [mhttpd,ERROR] [history.cxx:97:xread,ERROR] Error: Unexpected end-of-file when 
> > reading file "/home/wagasci-ana/Data/online/210219.hst"

I am puzzled. We can try three things:

a) look inside the "bad" hst file, maybe we can see something. Run "mhdump -L 
/home/wagasci-ana/Data/online/210219.hst". If there is anything wrong with the file, it 
will probably be at the end. You can also try to run it without "-L".

b) switch from "midas" history (.hst files) to "FILE" history (mh*.dat files), the 
"FILE" history code is newer and the file format is more robust, with luck it may 
survive whatever trouble is happening in your experiment. This is controlled in ODB 
/Logger/History/XXX/Active (set to "y/n").

c) the output of "mlogger -v" may give us some clue, it usually complains if something 
is not right with definitions of history data.

K.O.
Entry  25 Feb 2021, Isaac Labrie Boulay, Bug Report, Undefined client causing issues in transition. error_message.PNG undefined_client.PNG
Hi all,

I'm currently experiencing an issue during run transitions. It comes in the form 
of an alert saying "TypeError: Cannot read property 'length' of undefined" 
whenever I'm in the "transition" window on mhttpd. I have attached an image of 
what the transition window looks like when this happens. 

By the looks of it and by peering at the lines in transition.html where the 
error occurs, it's pretty obvious that there is some strange undefined client 
that the web page tries to access.

I don't know how to find what this client is. Is there a way to see it in the 
ODB? 

The issue happens in show_client() of transition.html (called by callback()). 
Here's the trace:

Uncaught (in promise) TypeError: Cannot read property 'length' of undefined
    at show_client (?cmd=Transition:227)
    at callback (?cmd=Transition:420)
    at ?cmd=Transition:430

Any help would be very appreciated!

Thanks so much.

Isaac
    Reply  25 Feb 2021, Konstantin Olchanski, Bug Report, Undefined client causing issues in transition. 
Clearly something goes wrong with the STARTABORT transition. Actually, from your 
screenshot, it is not clear why the STARTABORT transition was initiated.

Usually it is called after some client fails the "start run" transition to inform 
other clients that the run did not start after all. (mlogger uses this to close the 
output file, etc).

But in the screenshot, we do not see any client fail the transition (only rootana1 
was called, and it returned "green").

So, a puzzle. One possibility is that the transition code gets so confused
that it does not record correct transition data to ODB, then the web page
gets even more confused.

One way to see what happens is to run the odbedit command "start now -v".

Can you try that? And attach all its output here?

K.O.
    Reply  26 Feb 2021, Isaac Labrie Boulay, Bug Report, Undefined client causing issues in transition. start_now_-v_(1).PNG stop.PNG
> Clearly something goes wrong with the STARTABORT transition. Actually, from your 
> screenshot, it is not clear why the STARTABORT transition was initiated.
> 
> Usually it is called after some client fails the "start run" transition to inform 
> other clients that the run did not start after all. (mlogger uses this to close the 
> output file, etc).
> 
> But in the screenshot, we do not see any client fail the transition (only rootana1 
> was called, and it returned "green").
> 
> So, a puzzle. One possibility is that the transition code gets so confused
> that it does not record correct transition data to ODB, then the web page
> gets even more confused.
> 
> One way to see what happens, is to run the odbedit command "start now -v".
> 
> Can you try that? And attach all its output here?
> 
> K.O.

Thanks for getting back to me right away. I've attached two screenshots. The first one 
is the output after running "start now -v" (everything seemed to work nicely there), the 
second output is after using odbedit to stop the run with "stop". Notice that the DAQ 
never stops because it gets stuck in between transitions (You can see the run status 
being "stopping run" with the cancel transition button).

Thanks.

Isaac
    Reply  26 Feb 2021, Konstantin Olchanski, Bug Report, Undefined client causing issues in transition. 
So there is no error on run start anymore? To debug the stuck run stop, please use "stop -v" 
to see where it got stuck. You can also play with the RPC timeouts (the connect timeout and 
the response timeout), to make it get "unstuck" quicker. It should definitely not be stuck 
forever; it should time out after at most "rpc timeout * number of clients". K.O.
    Reply  26 Feb 2021, Isaac Labrie Boulay, Bug Report, Undefined client causing issues in transition. 
> So there is no error on run start anymore? To debug the stuck run stop, please use "stop -v" 
> to see where it got stuck. You can also play with the RPC timeouts (the connect timeout and 
> the response timeout), to make it get "unstuck" quicker. Definitely it should not be stuck 
> forever, it should timeout at maximum of "rpc timeout * number of clients". K.O.

You're right, it does not stay stuck forever; it eventually gets unstuck. I forgot to mention this. 
I will try to play with these timeout parameters. It does not get stuck if I run my DAQ using the 
odbedit commands (start/stop). I don't know if this is relevant information that could help us 
identify the problem.

Thanks for all your help as always!

Isaac
    Reply  03 Mar 2021, Konstantin Olchanski, Bug Report, Undefined client causing issues in transition. 
> It does not get stuck if I run my DAQ using the odbedit commands (start/stop).

Interesting. Run start/stop from odbedit works but from mhttpd gets stuck.

I think they do not run the transition quite the same way. mhttpd uses the multithreaded transition.

So we can debug this using the "mtransition" program. Try:

- mtransition -v -d 1 START/STOP -- this should be same as odbedit
- mtransition -m -v -d 1 START/STOP -- this should be same as mhttpd

The "-v" and "-d 1" flags should produce lots of output. For failed transitions,
cut-and-paste it all into this elog; that should give us plenty of meat to debug.

K.O.
Entry  11 Jul 2006, Razvan Stefan Gornea, Forum, Tundra Universe CA91C042 
I am not using Midas but I need some help from somebody experienced with VME access using the Tundra Universe, so I thought here I have a chance ...

I have a GE Fanuc 7700 and use the vme_universe driver (ver. 3.3). In the past I programmed for a DAQ board using A24/D16. Now I have a new board using A24/MB and I am really lost!

So the board has some 64-bit registers and some 32-bit registers (all aligned on 64-bit) and a FIFO to read the main data. After reading the user manual for universe chip and the docs for the driver I am still confused about how things are supposed to work.

First, my understanding is that for reading 64-bit registers I need the multiplex block mode anyway. But I could not find anywhere whether the multiplex mode supports 32-bit transfers. Should I map two windows on the same VME address range, one for A24/D32 and one for A24/MB? Or read everything with an unsigned long long and cast all 32-bit registers to unsigned int?

Second, I don't know how to handle the FIFO, which is in the middle of the address range. When the board has a trigger, I have to read this FIFO more than 100000 times. If I simply read at the FIFO address 100000 times, do I get the VME multiplex block mode (if the window has been mapped with the A24/MB address modifier)? How does the chip/driver know not to send the address and just do the data cycle after the first read?

I also had the naive idea of mapping a master window onto the board address range to access all the registers except the FIFO, and creating a DMA buffer for the FIFO (FIFO readout is where most of the work is anyway, so I guess an advantage is that it would free the CPU). But it seems to me that the dma_transfer function in the kernel module increments the address. I don't dare change this, since I don't even understand the exact relationship between accesses to the mapped window and what's happening on the VME bus.

Thanks for any help!
Entry  29 Mar 2022, Hunter Lowe, Forum, Triggering without LAM signal - mcstd_libgpmc_camac driver 
Hello,

I have a question for anyone experienced with simple CAMAC systems.
My understanding is that for a single ADC system you can use a gate to generate a
LAM signal for triggering on the ADC.
The driver that I have, "mcstd_libgpmc_camac", has LAM "not implemented" though,
so I'm not sure how I should trigger the DAQ. The frontend code that I have seems to use a TDC
as the trigger for the ADC via the "EQ_POLLED" equipment type setting. Should I simply plug a TDC into my
system and use this as the trigger? Is it as simple as the TDC generating a signal via the gate and the ADC doing its job?

Sorry if question is super basic, just confused how to trigger without LAM signal.

Thank you :)

Hunter Lowe
UNBC Grad Physics
Entry  02 Sep 2020, Ruslan Podviianiuk, Forum, Transition status message issue.png
Hello,

I got an error after starting a run and it would be good to show this error (or 
errors) in the UI that I am developing. I see this error in the Transition 
directory (please see the attached file). Is it possible to read the status 
message and error messages from the Transition directory using jsonrpc? If yes, 
could you please explain how to do this?

Thank you.
Ruslan  
    Reply  02 Sep 2020, Ben Smith, Forum, Transition status message 
The information you want is in the ODB:
* "/System/Transition/status" is the overall integer status code.
* "/System/Transition/error" is the overall error message string.

There is also per-client status information in the ODB:
* "/System/Transition/Clients/<client_name>/status"
* "/System/Transition/Clients/<client_name>/error"
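
Since the question was about jsonrpc, here is a minimal sketch of how the request body for these paths could be assembled. It assumes the mhttpd JSON-RPC method for reading ODB values is "db_get_values" and that it takes a "paths" array (please check this against the JSON-RPC schema of your mhttpd version). The helper names make_db_get_values_request(), client_path() and json_escape() are made up for this illustration and are not part of MIDAS.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Escape only double quotes and backslashes, which is enough for ODB paths
// in this sketch (control characters are not handled).
std::string json_escape(const std::string& in)
{
   std::string out;
   for (char c : in) {
      if (c == '"' || c == '\\')
         out += '\\';
      out += c;
   }
   return out;
}

// Build a per-client ODB path, e.g. client_path("mlogger", "status").
std::string client_path(const std::string& client_name, const std::string& leaf)
{
   return "/System/Transition/Clients/" + client_name + "/" + leaf;
}

// Assemble a JSON-RPC 2.0 request body. The method name "db_get_values"
// and the "paths" parameter are assumptions to be verified against mhttpd.
std::string make_db_get_values_request(const std::vector<std::string>& paths, int id)
{
   std::string s = "{\"jsonrpc\":\"2.0\",\"id\":" + std::to_string(id) +
                   ",\"method\":\"db_get_values\",\"params\":{\"paths\":[";
   for (std::size_t i = 0; i < paths.size(); i++) {
      if (i > 0)
         s += ",";
      s += "\"" + json_escape(paths[i]) + "\"";
   }
   s += "]}}";
   return s;
}
```

The resulting string would be sent to mhttpd as the body of an HTTP POST; the reply should carry the requested values in the same order as the paths.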
    Reply  02 Sep 2020, Ruslan Podviianiuk, Forum, Transition status message 
> The information you want is in the ODB:
> * "/System/Transition/status" is the overall integer status code.
> * "/System/Transition/error" is the overall error message string.
> 
> There is also per-client status information in the ODB:
> * "/System/Transition/Clients/<client_name>/status"
> * "/System/Transition/Clients/<client_name>/error"


Thank you so much, Ben!
    Reply  08 Sep 2020, Konstantin Olchanski, Forum, Transition status message 
> > The information you want is in the ODB:
> > * "/System/Transition/status" is the overall integer status code.
> > * "/System/Transition/error" is the overall error message string.
> > 
> > There is also per-client status information in the ODB:
> > * "/System/Transition/Clients/<client_name>/status"
> > * "/System/Transition/Clients/<client_name>/error"

You can also use web page .../resources/transition.html as an example of how
to read transition (and other) data from ODB into your own web page. example.html
may also be helpful.

K.O.
    Reply  08 Sep 2020, Ruslan Podviianiuk, Forum, Transition status message 
> > > The information you want is in the ODB:
> > > * "/System/Transition/status" is the overall integer status code.
> > > * "/System/Transition/error" is the overall error message string.
> > > 
> > > There is also per-client status information in the ODB:
> > > * "/System/Transition/Clients/<client_name>/status"
> > > * "/System/Transition/Clients/<client_name>/error"
> 
> You can also use web page .../resources/transition.html as an example of how
> to read transition (and other) data from ODB into your own web page. example.html
> may also be helpful.
> 
> K.O.

Thank you Konstantin!

Ruslan
Entry  05 Feb 2025, Andreas Suter, Forum, Transition from mana -> manalyzer 
Hi,

we are planning to migrate from mana to manalyzer. I started to have a look into it and realized that I have some loose ends.
Is there a clear migration docu somewhere?

Currently I understand it the following way (which might be wrong):
The class TARunObject is used to write analyzer modules which are registered by TAFactory. I hope this is right?

However, in mana there is an analyzer implemented by the user which binds the modules and has additional routines:
analyzer_init(), analyzer_exit(), analyzer_loop()
ana_begin_of_run(), ana_end_of_run(), ana_pause_run(), ana_resume_run()
which we are using.

This part I somehow miss in manalyzer, most probably due to lack of understanding, and missing documentation.

Could somebody please give me a boost?
    Reply  05 Feb 2025, Andrea Capra, Forum, Transition from mana -> manalyzer dsadc_analyzer_midas_elog.pdf
Hi Andreas,

please find in elog:2938/1 a short introduction that I wrote sometime ago.
I'm glad to offer additional support, if needed.

> Hi,
> 
> we are planning to migrate from mana to manalyzer. I started to have a look into it and realized that I have some loose ends.
> Is there a clear migration docu somewhere?
> 
> Currently I understand it the following way (which might be wrong):
> The class TARunObject is used to write analyzer modules which are registered by TAFactory. I hope this is right?
> 
> However, in mana there is an analyzer implemented by the user which binds the modules and has additional routines:
> analyzer_init(), analyzer_exit(), analyzer_loop()
> ana_begin_of_run(), ana_end_of_run(), ana_pause_run(), ana_resume_run()
> which we are using.
> 
> This part I somehow miss in manalyzer, most probably due to lack of understanding, and missing documentation.
> 
> Could somebody please give me a boost?
    Reply  06 Feb 2025, Konstantin Olchanski, Forum, Transition from mana -> manalyzer 
> Could somebody please give me a boost?

no need to shout into the void, it is pretty easy to identify the author of manalyzer and ask me directly.

> we are planning to migrate from mana to manalyzer. I started to have a look into it and realized that I have some loose ends.
> Is there a clear migration docu somewhere?

README.md and examples in the manalyzer git repository.

If something is missing, or unclear, please ask.

> Currently I understand it the following way (which might be wrong):
> The class TARunObject is used to write analyzer modules which are registered by TAFactory. I hope this is right?

Please read the README file. It explains what is going on there.

Design of manalyzer had 2 main goals:
a) lifetime of all c++ objects (and ROOT objects) is well defined (to fix a design defect in rootana)
b) event flow and data flow are documentable (problem in mfe.c frontends, etc)

> However, in mana there is an analyzer implemented by the user which binds the modules and has additional routines:
> analyzer_init(), analyzer_exit(), analyzer_loop()
> ana_begin_of_run(), ana_end_of_run(), ana_pause_run(), ana_resume_run()
> which we are using.

I have never used the mana analyzer, I wrote the c++ rootana analyzer very early on (first commit in 2006).

But the basic steps should all be there for you:
- initialization (create histograms, open files) can be done in the module constructor or in BeginRun()
- finalization (fit histograms, close files) should be done in EndRun()
- event processing (obviously) in Analyze()
- pause run, resume run and switch to next subrun file have corresponding methods
- all the "flow" and multithreading stuff you can ignore to first order.
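
The list above can be sketched as a per-run object whose methods replace the mana hook functions. This is only a self-contained illustration: RunModuleSketch, EventSketch, CountingModule and process_run() are hypothetical stand-ins and are not the manalyzer classes, whose real signatures are in the README.md and the examples.

```cpp
#include <vector>

// Hypothetical stand-in for an event; NOT a manalyzer or midasio type.
struct EventSketch {
   int serial;
};

// Hypothetical stand-in for a per-run module. One object lives for one run,
// so its lifetime is well defined.
class RunModuleSketch {
public:
   virtual ~RunModuleSketch() {}
   virtual void BeginRun() {}                  // takes over from ana_begin_of_run()
   virtual void Analyze(const EventSketch&) {} // takes over per-event processing
   virtual void PauseRun() {}                  // takes over from ana_pause_run()
   virtual void ResumeRun() {}                 // takes over from ana_resume_run()
   virtual void EndRun() {}                    // takes over from ana_end_of_run()
};

// Example user module: "opens" its output at run start, counts events,
// and "closes" the output at run end.
class CountingModule : public RunModuleSketch {
public:
   int fEvents = 0;
   bool fOpen = false;
   void BeginRun() override { fOpen = true; fEvents = 0; }
   void Analyze(const EventSketch&) override { fEvents++; }
   void EndRun() override { fOpen = false; }
};

// Stand-in for the framework's event loop for a single run.
void process_run(RunModuleSketch& module, const std::vector<EventSketch>& events)
{
   module.BeginRun();
   for (const EventSketch& e : events)
      module.Analyze(e);
   module.EndRun();
}
```

In this picture, code from analyzer_init() and ana_begin_of_run() would move into the constructor or BeginRun(), and code from ana_end_of_run() into EndRun().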

To start the migration, I recommend you take manalyzer_example_root.cxx and start stuffing it with your code.

If you run into any problems, I am happy to help with them. Ask here or contact me directly.

> This part I somehow miss in manalyzer, most probably due to lack of understanding, and missing documentation.

True, I wrote a migration guide for the frontend mfe.c to c++ tmfe, because we do this migration
quite often. But I never wrote a migration guide from mana.c analyzer, because we never did such a
migration. Most experiments at TRIUMF are post-2006 and use rootana in its different incarnations.

P.S. I designed the C++ TMFe frontend after manalyzer and I think it came out quite better, I especially
value the design input from Stefan, Thomas, Pierre, Joseph and Ben.

P.P.S. Feel free to ignore all this manalyzer business and write your own analyzer based
on the midasio library:

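#include "midasio.h"

int main()
{
   TMReaderInterface* f = TMNewReader("file.mid.gz");
   while (1) {
      TMEvent* e = TMReadEvent(f);
      if (!e)
         break; // no more events can be read (end of file or read error)
      dwim(e); // placeholder: your own event processing goes here
      delete e;
   }
   delete f;
}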

For online processing I use the TMFe class, it has enough bits to be a frontend and an analyzer,
or you can use the older TMidasOnline from rootana.

Access to ODB is via the mvodb library, which is new in midas, but has been part of rootana
and my frontend toolkit since at least 2011 or earlier, inspired by Peter Green's even
older "myload" ODB access library.

K.O.
    Reply  06 Feb 2025, Konstantin Olchanski, Forum, Transition from mana -> manalyzer 
> > Is there a clear migration docu somewhere?

I can also give you links to the alpha-g analyzer (very complex) and the DarkLight trigger TDC analyzer (very simple);
there are also analyzer examples of in-between complexity.

K.O.
Entry  20 Nov 2013, Konstantin Olchanski, Bug Report, Too many bm_flush_cache() in mfe.c 
I was looking at something in the mserver and noticed that for remote frontends, for every periodic event, 
there are about 3 RPC calls to bm_flush_cache().

Sure enough, in mfe.c::send_event(), for every event sent, there are 2 calls to bm_flush_cache() (once for 
the buffer we used, second for all buffers). Then, for good measure, the mfe idle loop calls 
bm_flush_cache() for all buffers about once per second (even if no events were generated).

So what is going on here? To allow good performance when processing many small events,
the MIDAS event buffer code (bm_send_event()) buffers small events internally, and only after this internal
buffer is full, the accumulated events are flushed into the shared memory event buffer,
where they become visible to the mlogger, mdump and other consumers.

Because of this internal buffering, infrequent small size periodic events can become
stuck for quite a long time, confusing the user: "my frontend is sending events, how come I do not
see them in mdump?"

To avoid this, mfe.c manually flushes these internal event buffers by calling bm_flush_cache().

And I think that works just fine for frontends directly connected to the shared memory, one call to 
bm_flush_cache() should be sufficient.

But for remote frontends connected through the mserver, it turns out there is a race condition between 
sending the event data on one tcp connection and sending the bm_flush_cache() rpc request on another 
tcp connection.

I see that the mserver always reads the rpc connection before the event connection, so bm_flush_cache() 
is done *before* the event is written into the buffer by bm_send_event(). So the newly
sent event is stuck in the buffer until bm_flush_cache() for the *next* event shows up:

mfe.c: send_event1 -> flush -> ... wait until next event ... -> send_event2 -> flush
mserver: flush -> receive_event1 -> ... wait ... -> flush -> receive_event2 -> ... wait ...
mdump: ... nothing ... -> ... nothing ... -> event1 -> ... nothing ...

Enter the 2nd call to bm_flush_cache in mfe.c (flush all buffers) - now because mserver seems to be 
alternating between reading the rpc connection and the event connection, the race condition looks like 
this:

mfe.c: send_event -> flush -> flush
mserver: flush -> receive_event -> flush
mdump: ... -> event -> ...

So in this configuration, everything works correctly, the data is not stuck anywhere - but by accident, and 
at the price of an extra rpc call.
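
The two timelines above can be reproduced with a toy model. This is not MIDAS code: it only assumes, as described above, that for each burst from the frontend the mserver handles one flush request first, then the event, then any remaining flush requests. The names Msg, mserver_order() and visible_after_each_burst() are made up for this sketch.

```cpp
#include <vector>

// Messages as seen by the (modelled) mserver.
enum class Msg { Flush, Event };

// Order in which the mserver is assumed to handle one frontend burst,
// i.e. one event followed by `flushes` flush requests: the rpc connection
// is read first, then the event connection, then the rpc connection again.
std::vector<Msg> mserver_order(int flushes)
{
   std::vector<Msg> order;
   if (flushes > 0)
      order.push_back(Msg::Flush); // first flush overtakes the event
   order.push_back(Msg::Event);
   for (int i = 1; i < flushes; i++)
      order.push_back(Msg::Flush); // later flushes are handled after the event
   return order;
}

// For each burst, returns how many events a consumer such as mdump can see
// once the mserver has handled that burst.
std::vector<int> visible_after_each_burst(int bursts, int flushes)
{
   int cached = 0;  // events waiting in the internal write cache
   int visible = 0; // events already in the shared memory event buffer
   std::vector<int> result;
   for (int b = 0; b < bursts; b++) {
      for (Msg m : mserver_order(flushes)) {
         if (m == Msg::Event) {
            cached++;
         } else {
            visible += cached; // a flush moves all cached events out
            cached = 0;
         }
      }
      result.push_back(visible);
   }
   return result;
}
```

With one flush per event, visible_after_each_burst(3, 1) gives 0, 1, 2 (each event only shows up with the next one), and with two flushes per event it gives 1, 2, 3 (each event shows up right away). The model ignores the cache filling up on its own.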

But what about the periodic 1/second bm_flush_cache() on all buffers? I think it does not quite work
either because the race condition is still there: we send an event, and the first flush may race it and only 
the 2nd flush gets the job done, so the delay between sending the event and seeing it in mdump would be 
around 1-2 seconds. (no more than 2 seconds, I think). Since users expect their events to show up "right
away", a 2 second delay is probably not very good.

Because periodic events are usually not high rate, the current situation (4 network transactions to send 1 
event - 1x send event, 3x flush buffer) is probably acceptable. But this definitely limits the 
maximum event rate to one event per 3x (2x?) the mserver rpc latency - without the rpc calls to bm_flush_cache() there
would be no limit - the events themselves are sent through a pipelined tcp connection without 
handshaking.

One solution to this would be to implement periodic bm_flush_cache() in the mserver, making all calls to 
bm_flush_cache() in mfe.c unnecessary (unless it's a direct connection to shared memory).

Another solution could be to send events with a special flag telling the mserver to "flush the buffer right 
away".

P.S. Look ma!!! A race condition with no threads!!!

K.O.