10 Aug 2020, Pierre-Andre Amaudruz, Bug Report, data missing in runXXXXXX.mid
|
> > > Dear all
> > >
> > > We just started our beam time at ILL and just found yesterday that for certain
> > > settings of our detector the data is not saved into the .mid files. Running "mdump
> > > -l 10" online we see the data coming in as they should. Nevertheless, if we run
> > > "mdump -x runXXXXXX.mid" offline, the data file has no events and the banks are
> > > missing. Any ideas where the data could go lost?
> > >
> > > Thanks in advance,
> > > Ivo
> >
> > Have you checked
> >
> > /Logger/Channels/0/Settings/Event ID = -1
> > /Logger/Channels/0/Settings/Trigger mask = -1
> >
> > If these settings are not -1, they filter the data stream for certain events and trigger
> > masks.
> >
> > Stefan
>
> Good morning Stefan
>
> Both set to -1. We only have one logging channel. If we run a sequence with a few runs and the
> same settings, sometimes data is in the .mid file and sometimes it is not.
>
> Best,
> Ivo
Hi,
If the online mdump is correct (by default using the -1, -1 filter), the data are in the main SYSTEM buffer.
Similar to the dump -
The fact that the analyzer doesn't see the banks would indicate a buffer handling issue as mentioned by Stefan.
To confirm, I would check at the end of a run that the sum of the equipment "events sent" matches the logger "Events written". |
10 Aug 2020, Pierre-Andre Amaudruz, Bug Report, data missing in runXXXXXX.mid
|
> Have you tried longer files? Maybe a few 100 MB or so. Maybe a buffer is not flushed correctly at the end of a run.
Hi,
If the online mdump is correct (by default using the -1, -1 filter), the data are in the main SYSTEM buffer.
by the way, similar to the dump - mdump -x file.mid -m raw -d x will show you the events.
The fact that the analyzer doesn't see the banks would indicate a buffer handling issue as mentioned by Stefan.
To confirm, I would check at the end of a run that the sum of the equipment "events sent" matches the logger "Events written". |
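The end-of-run check suggested above can be sketched as a small bookkeeping helper. This is only a minimal sketch, assuming the per-equipment "events sent" counters and the logger "Events written" counter have been copied by hand from the status page or the ODB. The function name, the input shape and the equipment names are hypothetical and not part of MIDAS.

```javascript
// Hypothetical helper: compares the sum of per-equipment "events sent"
// counters with the logger "Events written" counter for one run.
// Both inputs are plain numbers read off by hand; nothing here talks to MIDAS.
function checkEventCounts(equipmentEventsSent, loggerEventsWritten) {
  // Sum the "events sent" counters over all equipments.
  const sent = Object.values(equipmentEventsSent).reduce((a, b) => a + b, 0);
  return {
    sent: sent,
    written: loggerEventsWritten,
    missing: sent - loggerEventsWritten, // > 0 means events never reached the file
  };
}

// Example with made-up numbers and made-up equipment names.
const result = checkEventCounts({ Trigger: 1200, Scaler: 12 }, 1200);
console.log(result.missing === 0 ? "counts match" : "events missing: " + result.missing);
```

A non-zero "missing" value only says that the counters disagree. It does not say at which stage the events were lost.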
10 Aug 2020, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid
|
I have to reproduce the problem to fix it. Why don't you go and modify midas/examples/experiment/frontend.cxx in such a way that
it creates exactly the banks you have, just with random data. If you see the same problem, send me your frontend file so that I
can reproduce it. |
11 Aug 2020, Konstantin Olchanski, Bug Report, data missing in runXXXXXX.mid
|
> I have to reproduce the problem to fix it. Why don't you go and modify midas/examples/experiment/frontend.cxx in such a way that
> it creates exactly the banks you have, just with random data. If you see the same problem, send me your frontend file so that I
> can reproduce it.
It would be good to pinpoint where the data is lost. This is the sequence:
frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk
To see if correct data arrives to the SYSTEM buffer, run:
mdump -z SYSTEM
To see if mlogger is receiving events from the SYSTEM buffer, run:
mlogger -v ### mlogger should report all events, history and data
To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).
I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
mlogger is misconfigured (but you already checked that).
If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).
K.O. |
11 Aug 2020, Ivo Schulthess, Bug Report, data missing in runXXXXXX.mid
|
> It would be good to pinpoint where the data is lost. This is the sequence:
>
> frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk
>
> To see if correct data arrives to the SYSTEM buffer, run:
> mdump -z SYSTEM
>
> To see if mlogger is receiving events from the SYSTEM buffer, run:
> mlogger -v ### mlogger should report all events, history and data
>
> To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).
>
> I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
> if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
> mlogger is misconfigured (but you already checked that).
>
> If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
> to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).
>
> K.O.
Good evening
I tried to reproduce the behavior in a very simple FE but it did not work out. The next thing for me would be to take the FE that is producing this behavior, replace all the device communication and data with dummies. If the problem is still there I would start to simplify as much as possible.
Following the inputs of KO, I pin-pointed the data loss. The system buffer still gets the data but the mlogger does not write the data event. Then of course the data is also not anymore present in the data file. Therefore, I checked the logger settings again, Event ID and Trigger Mask still -1. Nothing else, at least from my point of view, that is misconfigured. Nevertheless, if it helps I can send my ODB settings.
When doing the tests just before I found something else that probably can give a hint to the problem. The data is only lost if the time between two runs is long (a few seconds). As an example: If I run a sequence with a loop and after the FE stops the run the loop ends and the next run is started automatically, then only the first run has no data, which is the one after a longer time of no data taking. When I add a "WAIT Seconds 5" after the run before starting the next, no data is written to the disk for any run. I also found this once when adding a sleep(1) at the end of the FE readout function but back then did not think about it any further.
Best, Ivo |
24 Mar 2022, Konstantin Olchanski, Bug Report, data missing in runXXXXXX.mid
|
> > It would be good to pinpoint where the data is lost. This is the sequence:
> >
> > frontend user code -> mfe.c code -> SYSTEM buffer -> mlogger -> disk
> >
> > To see if correct data arrives to the SYSTEM buffer, run:
> > mdump -z SYSTEM
> >
> > To see if mlogger is receiving events from the SYSTEM buffer, run:
> > mlogger -v ### mlogger should report all events, history and data
> >
> > To see if mlogger writes events to disk, examine the disk file (in this case, you already did, data is not there).
> >
> > I would guess that your data does not make it out from the frontend (mdump shows "nothing"),
> > if data were to arrive into the SYSTEM buffer, it would make it to disk, unless
> > mlogger is misconfigured (but you already checked that).
> >
> > If you have trouble with the frontend framework code, you can try to switch from the mfe.c frontend
> > to the newer c++ tmfe frontend (see progs/fetest_tmfe.cxx and progs/fetest_tmfe_thread.cxx).
> >
> > K.O.
>
> Good evening
>
> I tried to reproduce the behavior in a very simple FE but it did not work out.
> The next thing for me would be to take the FE that is producing this behavior,
> replace all the device communication and data with dummies. If the problem is still
> there I would start to simplify as much as possible.
>
> Following the inputs of KO, I pin-pointed the data loss. The system buffer still
> gets the data but the mlogger does not write the data event. Then of course the data
> is also not anymore present in the data file. Therefore, I checked the logger
> settings again, Event ID and Trigger Mask still -1. Nothing else, at least from my point of view,
> that is misconfigured. Nevertheless, if it helps I can send my ODB settings.
>
> When doing the tests just before I found something else that probably
> can give a hint to the problem. The data is only lost if the time between
> two runs is long (a few seconds). As an example: If I run a sequence with a loop
> and after the FE stops the run the loop ends and the next run is started automatically,
> then only the first run has no data, which is the one after a longer time of
> no data taking. When I add a "WAIT Seconds 5" after the run before starting
> the next, no data is written to the disk for any run. I also found this
> once when adding a sleep(1) at the end of the FE readout function
> but back then did not think about it any further.
>
Looks like this problem fell into the covid crack.
As far as I know, MIDAS does not lose any events between bm_send_event() and the shared memory
buffer. It does not lose any events in the mlogger (unless the "event request" is misconfigured).
(there is lots of opportunity to lose events in complicated frontends).
If you have some evidence otherwise, I would very much like to hear about
it and I want to fix all problems that cause it.
In your previous report I was under the impression that you lose random events here and there,
but your latest report is about mlogger not writing anything at all.
Which case is it?
If you can definitely say that all your events make it to the SYSTEM buffer
but mlogger sometimes does not see some of them and sometimes does not see all of them,
we should look very closely at bm_receive_event() and mlogger itself.
In the case where mlogger is not seeing any events at all (output file is empty), as this is
happening, I would like to see the output of mdump (to confirm events are written to SYSTEM
buffer with correct event_id and trigger_mask) and the output of (say)
"manalyzer_test.exe --dump run01161.mid.lz4" on your output file.
If the output is very long, you can email it to me directly instead of posting it here.
K.O. |
24 Mar 2022, Stefan Ritt, Bug Report, data missing in runXXXXXX.mid
|
One idea: we should have a look at mlogger::close_channels(). There the SYSTEM buffer is emptied through the cm_yield() call. Instrumenting this with some debugging code will enlighten us.
Another possible problem: If the frontend requested to be notified for a run stop AFTER the logger, then the problem might happen: Logger closes file, and THEN the frontend flushes events ending up in the SYSTEM buffer and being logged at the beginning of the next run. The mfe.cxx framework takes care of this by calling
cm_register_transition(TR_STOP, 500);
while the mlogger does
cm_register_transition(TR_STOP, tr_stop, 800);
and since 800 > 500 the logger will be called AFTER the frontend. If one uses a framework different from mfe.cxx, this could however be different.
Stefan |
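To illustrate the ordering argument above, here is a minimal sketch. It assumes that at run stop the clients are called in increasing order of their registered sequence number, which is what "800 > 500" implies. The function names and the client list are hypothetical, and only the two sequence numbers come from the post.

```javascript
// Hypothetical model of the run-stop ordering: lower sequence number = called earlier.
function stopOrder(clients) {
  // Copy before sorting so the caller's array is left untouched.
  return [...clients].sort((a, b) => a.sequence - b.sequence).map((c) => c.name);
}

// True if the logger is the last client to process the stop transition,
// i.e. every frontend has flushed its events before the file is closed.
function loggerStopsLast(clients, loggerName) {
  const order = stopOrder(clients);
  return order.length > 0 && order[order.length - 1] === loggerName;
}

const clients = [
  { name: "mlogger", sequence: 800 },
  { name: "frontend", sequence: 500 },
];
console.log(stopOrder(clients).join(" -> "));
console.log("logger last:", loggerStopsLast(clients, "mlogger"));
```

A frontend that registered with a sequence number above 800 would make loggerStopsLast() return false, which is the situation described above where events end up in the next run.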
24 Mar 2022, Konstantin Olchanski, Bug Report, data missing in runXXXXXX.mid
|
> One idea: we should have a look at mlogger::close_channels().
> There the SYSTEM buffer is emptied through the cm_yield() call.
> Instrumenting this with some debugging code will enlighten us.
right. this will cause "last few events are lost at the end of run".
but that code in the mlogger was not touched in years, if there is a problem there,
we would have seen it by now, most experiments check that the number
of events in the data file is same as number of triggers generated, both
numbers are shown on the midas status page.
> Another possible problem: If the frontend requested to be notified for a run stop AFTER the logger, then the problem might happen: Logger closes file, and THEN the frontend flushes events ending up in the SYSTEM buffer and being logged at the beginning of the next run. The mfe.cxx framework takes care of this by calling
> cm_register_transition(TR_STOP, 500);
default sequence, both mfe.c frontend and c++ tmfe frontend:
start of run:
- mlogger first (configure history, open data file)
- frontends last
- (if any frontend fails, TR_STARTABORT is sent to mlogger to close the output file and "undo" the run start)
end of run:
- frontends first (must not send any events after processing the TR_STOP RPC call, inside the TR_STOP handler, bm_flush_cache() takes care of the write cache)
- mlogger last
- (if any frontend fails, failure is ignored, run stops regardless)
wrong order will be only if they manually change it, and whatever order
they set, you see it on the midas transition page (and mtransition -v and odbedit stop now -v, etc).
K.O. |
19 Jan 2004, Konstantin Olchanski, , darwin aka macosx changes
|
I committed the final bits to make Midas build on Darwin aka macosx.
Here is the summary:
1) I treat Darwin as a funny linux, so OS_LINUX is always defined
2) OS_DARWIN is defined for places where the two differ
3) system dependent directory is "midas/darwin/{bin,lib}"
4) a few header files had to be moved around to dodge namespace pollution by Apple system
header files (i.e. one of the PowerPC header files #defines PVM- collision with PVM in mana.c,
another #defines Free(x)- collision with ROOT header files)
5) ss_thread_create() and ss_thread_kill() now use midas_thread_t. On Darwin pthread_t is not
an "int".
6) the Makefile has no support for building the midas shared library on macosx.
7) on my Mac OS 10.2.8 machine, "make all" works, "odbedit" and "mhttpd" run. This is the
full extent of my testing. Status on Mac OS 10.3.x is unknown.
K.O. |
20 Oct 2008, Suzannah Daviel, Bug Report, custom web pages: customscript buttons and start/stop buttons generate errors
|
I am using an external Custom web page via a link in the ODB in /Custom, and
Javascript to add customscript button(s) and run start/stop buttons.
After executing these buttons, instead of returning to the custom page, or
to the Midas main status page, there is an error page generated:
Invalid custom page: NULL path
and the URL is
http://lxfred:8082/CS/
The behaviour is the same whether the custom page replaces the main status page
or not.
I am using
MIDAS version 2.0.0
mhttpd.c SVN Rev 4282
In an older version of mhttpd.c, buttons of this type used to return to the
Midas main status page regardless of whether the custom page replaced the status
page. I found this behaviour annoying, and I made a custom mhttpd.c that
returned to the custom page.
Would it be possible to fix this problem, and to return to the custom page after
pressing the buttons?
Here is the Javascript to add the buttons:
<script type="text/javascript">
var rstate = '<odb src="/runinfo/run state">'
if (rstate == 1) // stopped
document.write('<input name="cmd" value="Start" type="submit">')
else if (rstate == 2) // paused
document.write('<input name="cmd" value="Resume" type="submit">')
else // running
{
document.write('<input name="cmd" value="Stop" type="submit">')
document.write('<input name="cmd" value="Pause" type="submit">')
}
if (rstate == 1) // stopped
document.write('<input name="customscript" value="tri_config" type="submit">');
</script> |
29 Oct 2008, Stefan Ritt, Bug Report, custom web pages: customscript buttons and start/stop buttons generate errors
|
To fix this problem, do the following:
- Update to the current SVN revision 4368 of mhttpd.c
- Add following tag into your custom page:
<input type=hidden name="redir" value="name">
where "name" is the name of your custom page which follows the CS/ in the URL. Like
if you have a custom page which you access through http://localhost/CS/junk then the
tag would be
<input type=hidden name="redir" value="junk">
The "redir" parameter is now evaluated inside mhttpd and brings you back to the proper
custom page. You can also define another custom page as the target, if that makes
sense in your application.
Pierre: Would be nice to document this somewhere more officially. |
04 Nov 2008, Suzannah Daviel, Bug Report, custom web pages: customscript buttons and start/stop buttons generate errors
|
Thanks Stefan.
Your fix works nicely with the start/stop buttons now returning to the same or to a
different web page.
However, it does not seem to have fixed the problem with the Customscript button. It does
not seem to pick up the redirect, nor do the Pause/Resume buttons (which are programmed to
appear when the run starts).
> To fix this problem, do the following:
>
> - Update to the current SVN revision 4368 of mhttpd.c
> - Add following tag into your custom page:
>
> <input type=hidden name="redir" value="name">
>
> where "name" is the name of your custom page which follows the CS/ in the URL. Like
> if you have a custom page which you access through http://localhost/CS/junk then the
> tag would be
>
> <input type=hidden name="redir" value="junk">
>
> The "redir" parameter is now evaluated inside mhttpd and brings you back to the proper
> custom page. You can also define another custom page as the target, if that makes
> sense in your application.
>
> Pierre: Would be nice to document this somewhere more officially. |
09 Nov 2008, Stefan Ritt, Bug Report, custom web pages: customscript buttons and start/stop buttons generate errors
|
> Thanks Stefan.
> Your fix works nicely with the start/stop buttons now returning to the same or to a
> different web page.
>
> However, it does not seem to have fixed the problem with the Customscript button. It does
> not seem to pick up the redirect, nor do the Pause/Resume buttons (which are programmed to
> appear when the run starts).
That has been fixed in rev. 4377 |
22 Jun 2018, Frederik Wauters, Forum, custom script on custom page
|
I am implementing buttons to launch scripts from a custom page.
The simple way works, i.e.
<input type=submit name=customscript value="run_script">
But I want to stay on the page. Copying "Customscript button without a page
reload" from https://midas.triumf.ca/MidasWiki/index.php/Custom_Page_Features
yields the following error:
Uncaught ReferenceError: XMLHttpRequestGeneric is not defined
at cs_button (Trend:165)
at HTMLInputElement.onclick (Trend:90)
I included <script src="mhttpd.js"></script> and call mhttpd_init on page load.
So why can't it run this ajax request?
Or is there a better way to launch a script without messing up the page? |
22 Jun 2018, Stefan Ritt, Forum, custom script on custom page
|
> Uncaught ReferenceError: XMLHttpRequestGeneric is not defined
> at cs_button (Trend:165)
> at HTMLInputElement.onclick (Trend:90)
That code was not written by me, so I'm just guessing here.
Probably the XMLHttpRequestGeneric() is some function hiding browser specialities to create
AJAX requests. These days most browsers understand the standard request
XMLHttpRequest()
so why don't you try to just remove the "Generic"
Stefan |
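A minimal sketch of what the suggestion above could look like with the standard XMLHttpRequest. It assumes that a GET request carrying the same "customscript" parameter as the form button is enough to trigger the script. That is an assumption, not something the thread confirms, and the function names here are hypothetical.

```javascript
// Build the query string that a form submit of
// <input name=customscript value="..."> would send.
function customScriptQuery(scriptName) {
  return "?customscript=" + encodeURIComponent(scriptName);
}

// Send the request without leaving the page. XhrCtor is injectable so the
// function can be exercised outside a browser; in a page it defaults to the
// standard XMLHttpRequest.
function runCustomScript(scriptName, onDone, XhrCtor = globalThis.XMLHttpRequest) {
  const xhr = new XhrCtor();
  xhr.open("GET", customScriptQuery(scriptName), true); // true = asynchronous
  xhr.onreadystatechange = function () {
    // readyState 4 means the request has completed.
    if (xhr.readyState === 4) onDone(xhr.status);
  };
  xhr.send(null);
  return xhr;
}

// In a custom page one might wire it to a button of type=button:
// <input type=button value="run_script"
//        onclick="runCustomScript('run_script', function (s) { console.log(s); })">
```

The relative URL resolves against the current page, so the request goes to the same address that the submit button would use.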
25 Jun 2018, Frederik Wauters, Forum, custom script on custom page
|
> > Uncaught ReferenceError: XMLHttpRequestGeneric is not defined
> > at cs_button (Trend:165)
> > at HTMLInputElement.onclick (Trend:90)
>
> That code was not written by me, so I'm just guessing here.
>
> Probably the XMLHttpRequestGeneric() is some function hiding browser specialities to create
> AJAX requests. These days most browsers understand the standard request
>
> XMLHttpRequest()
>
> so why don't you try to just remove the "Generic"
>
> Stefan
That removes the error, but the script doesn't get called. It goes to the javascript function and
callback, but nothing happens.
When I change type=button to type=submit , the script gets called again, but with page refresh. |
27 Jan 2010, Suzannah Daviel, Forum, custom page - flashing filled area
|
Hi,
On a custom web page, can a "filled" area be made to flash (i.e. cycle between
two colours)? This area would have to update faster than the whole page update.
I have a custom page representing a gas system, and the users
want the heaters to flash when they are on, as is done in their EPICS page.
Thanks,
Suzannah |
09 Feb 2010, Stefan Ritt, Forum, custom page - flashing filled area
|
One possibility is to use small GIF images for each valve, which have several frames (called 'animated GIF'). Depending on the state you can use a static GIF or the flashing GIF. An alternate approach is to use a static background image, and display a valve with different color on top of the background in regular intervals using JavaScript. I tried that with the attached page. Just create a custom page
/Custom/Valve = valve.html
and put all three attachments into your mhttpd directory. The JavaScript displays the red valve on top of the background with a 3 Hz frequency. The only trick is to position the overlay image exactly on top of the background image. This is done using the 'absolute' position in the style sheet. It needs a bit of playing to find the proper position, but then it works fine. |
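The attachments are not reproduced here, so the following is only a minimal sketch of the overlay approach described above, not the attached page. It assumes the overlay image is already positioned with 'absolute' in the style sheet, and it reads "3 Hz" as three full on/off cycles per second. The element id and the function names are hypothetical.

```javascript
// Time between two visibility toggles for a given flash frequency:
// one full on/off cycle takes 1/hz seconds, so a toggle happens every half cycle.
function halfPeriodMs(hz) {
  return 1000 / (2 * hz);
}

// Flip the overlay between visible and hidden; returns the new state.
function toggleVisibility(el) {
  el.style.visibility = el.style.visibility === "hidden" ? "visible" : "hidden";
  return el.style.visibility;
}

// Start flashing; the timer function is injectable so the logic can be
// exercised without a browser. Returns whatever the timer returns
// (in a page, the id to pass to clearInterval).
function startFlashing(el, hz, timer = setInterval) {
  return timer(function () { toggleVisibility(el); }, halfPeriodMs(hz));
}

// Only runs inside a page; "valve_overlay" is a hypothetical element id.
if (typeof document !== "undefined") {
  const overlay = document.getElementById("valve_overlay");
  if (overlay) startFlashing(overlay, 3);
}
```

Because the toggling runs on its own timer in the browser, the flash rate is independent of how often the whole page is refreshed.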
07 Jun 2007, Randolf Pohl, Forum, crash when analyzing multiple runs offline
|
Hello,
I am having a problem with the root-based analyzer. It crashes when I try to
analyze multiple runs OFFLINE using the "-i run%05d.mid -o result%05d.root -r
1 2" feature.
I can reproduce the problem with the example experiment which comes with the
MIDAS distribution:
Running the analyzer ONLINE works fine: One can start and stop runs one after
the other, roody shows the histograms being reset and then filled again and
such.
But OFFLINE, the analyzer crashes when trying to analyze the SECOND run in a
sequence. So
./analyzer -i run%05d.mid -o result%05d.root -r 1 1 works (only run 1)
./analyzer -i run%05d.mid -o result%05d.root -r 1 3 dies on run 2
Output attached (I added printf's to the "init"-modules, but that's irrelevant
here)
My own analyzer shows the same effect. There I got the impression the segfault
happens on the first attempt to Fill/Reset/SetName etc. a histogram in the 2nd
run. But with the midas example it looks like the analyzer finishes filling
histos even for run 2, but then dies in eor.
Can you reproduce the problem?
I run MIDAS on an Intel Quadcore, 64 bit SuSE Linux 10.2.
pohl@lamb2:~/midas/examples/root> gcc --version
gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)
(maybe 4.1.2 "PRERELEASE" is the problem? See message ID 344)
I am using midas rev. 3674 (April 19, 2007), but I got the impression there
has since not been a change relevant to this problem. Please correct me if I
am wrong, then I would try it with Rev HEAD.
(My version includes already the fix to the x86_64 segfault problem of message
ID 337)
Best regards,
Randolf |
08 Jun 2007, Stefan Ritt, Forum, crash when analyzing multiple runs offline
|
Unfortunately I don't have time right now to debug the problem, but I could see
roughly what it could be. The analyzer crashes inside CloseRootOutputFile:
#5 <signal handler called>
#6 0x00002b5f52ad5ee5 in free () from /lib64/libc.so.6
#7 0x000000000040c89b in CloseRootOutputFile () at src/mana.c:1489
in the line
free(tree_struct.event_tree[i].branch);
If a "free" crashes, it might indicate that the memory beyond the allocated space
got corrupted. The branch gets allocated in book_ttree(), once for each
analyze_request[i]. The branch gets filled in write_event_ttree():
/* fill tree both online and offline */
if (!exclude_all)
et->tree->Fill();
Maybe one should put printf debugging statements in these places to see what's
going on. |
|