ELOG Midas

Midas DAQ System, Page 35 of 138


ID	Date	Author	Topic	Subject
2087	10 Feb 2021	Isaac Labrie Boulay	Forum	Javascript error during run transitions.
Hi all,

I am encountering a Javascript error (TypeError: client.error is undefined) when I transition between run states. Does anybody have an idea of what my problem might be? I have pasted an example of what MIDAS logs during such sequences.

Thanks for all the help!

Isaac

09:24:08.611 2021/02/10 [mhttpd,INFO] Executing script "~/ANIS_20210106/scripts/start_daq.sh" from ODB "/Script/Start DAQ"
09:24:13.833 2021/02/10 [Logger,LOG] Program Logger on host localhost started
09:24:28.598 2021/02/10 [fevme,LOG] Program fevme on host localhost started
09:24:33.951 2021/02/10 [mhttpd,INFO] Run #234 started
09:26:30.970 2021/02/10 [mhttpd,ERROR] [midas.cxx:4260:cm_transition_call,ERROR] Client "Logger" transition 2 aborted while waiting for client "fevme": "/Runinfo/Transition in progress" was cleared
09:26:31.015 2021/02/10 [mhttpd,ERROR] [midas.cxx:5120:cm_transition,ERROR] transition STOP aborted: "/Runinfo/Transition in progress" was cleared
09:27:27.270 2021/02/10 [mhttpd,ERROR] [system.cxx:4937:ss_recv_net_command,ERROR] timeout receiving network command header
09:27:27.270 2021/02/10 [mhttpd,ERROR] [midas.cxx:12262:rpc_client_call,ERROR] call to "fevme" on "localhost" RPC "rc_transition": timeout waiting for reply
2086	08 Feb 2021	Stefan Ritt	Suggestion	mhttpd browser caching
> It seems that the only reliable way to bypass the browser cache is to add
> a tag with a random number to the URL ("&ts=currenttime").

Indeed, that's the only reliable way to avoid caching across browsers. An alternative is to add a random number ("&r=" + Math.random()).

> BTW, things like midas.js are also cached, and it is common to see problems
> after updating midas, where status.html is newly loaded, but midas.js is an old
> stale version from cache.

Reloading a JavaScript file NOT from the cache is really tricky these days. I added a special Google Chrome extension to clear my browser cache, which works reliably:

https://chrome.google.com/webstore/detail/clear-cache/cppjkneekbjaeellbfkmgnhonkkjfpdn

Stefan
2085	08 Feb 2021	Konstantin Olchanski	Suggestion	mhttpd browser caching
> r->rsprintf("Expires: %s\r\n", str);

As best I can tell, none of this works in current browsers. With google-chrome, I see it cache pretty much everything regardless of "expires", "no cache", etc. and anything else I tried. Things like shift-<reload>, etc. used to work to refresh the cache, but not any more.

So I, too, see confusing side effects of caching, where I change something in ODB, but "nothing happens". Then I scratch my head for 30 minutes until I remember to open the javascript debugger, where shift-<reload> (or is it ctrl-<reload>) actually works.

It seems that the only reliable way to bypass the browser cache is to add a tag with a random number to the URL ("&ts=currenttime"). This is for HTTP GET requests. HTTP POST does not seem to be cached, so I do not worry about this nonsense for json-rpc requests.

Perhaps we should do this random number trick for all user actions. The user can press buttons only so fast, so we should be able to sustain the rate. Anything loaded automatically or from a timer, we should allow to be cached.

BTW, things like midas.js are also cached, and it is common to see problems after updating midas, where status.html is newly loaded, but midas.js is an old stale version from cache. Messy.

K.O.
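The "&ts=currenttime" trick can be sketched as a small URL helper. This is only an illustration: add_cache_buster() is a hypothetical name, not a function from mhttpd or midas.js, and the sketch assumes the URL has no "#fragment" part.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of mhttpd or midas.js): append a "ts"
// query parameter so that every GET request has a unique URL and the
// browser cannot answer it from its cache.
// Assumption: the URL contains no "#fragment" part.
std::string add_cache_buster(const std::string& url, long long timestamp)
{
   assert(!url.empty());
   // The first query parameter is introduced by '?', later ones by '&'.
   const char sep = (url.find('?') == std::string::npos) ? '?' : '&';
   return url + sep + "ts=" + std::to_string(timestamp);
}
```

For example, add_cache_buster("https://server.triumf.ca/?cmd=Elog", 1612800000) yields "https://server.triumf.ca/?cmd=Elog&ts=1612800000". Any value that changes between requests (current time, random number) serves the same purpose.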
2084	08 Feb 2021	Konstantin Olchanski	Forum	poll_event() is very slow.
> I should mention that I was using midas/examples/Triumf/c++/fevme.cxx

This is correct, the fevme frontend is written to do 100% CPU-busy polling. There are several reasons for this:

- on our VME processors, we have 2-core CPUs: the 1st core can poll the VME bus, the 2nd core can run mfe.c and the ethernet transmitter.
- interrupts are expensive to use (in latency and in CPU use) because the kernel handler has to call the user handler, return back, etc.
- sub-millisecond sleep used to be expensive and unreliable (on 1-2GHz "core 1" and "core 2" CPUs running SL6 and SL7 era linux).

As I understand, current linux and current 3+GHz CPUs can do reliable microsecond sleep.

K.O.
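The difference between busy polling and sleep-based polling can be sketched as below. This is a hypothetical illustration, not the mfe.c or fevme.cxx code: poll_for_lam() and its parameters are made-up names. With sleep_between set to zero the loop keeps one core 100% busy; with a positive value it yields the CPU between checks at the cost of extra latency.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical sketch, not the mfe.c implementation: call lam_ready()
// until it returns true or until `timeout` has elapsed.
bool poll_for_lam(const std::function<bool()>& lam_ready,
                  std::chrono::microseconds timeout,
                  std::chrono::microseconds sleep_between)
{
   assert(timeout.count() >= 0 && sleep_between.count() >= 0);
   const auto deadline = std::chrono::steady_clock::now() + timeout;
   do {
      if (lam_ready())
         return true;   // data available: the caller reads the event out now
      if (sleep_between.count() > 0)
         std::this_thread::sleep_for(sleep_between);   // give up the CPU
   } while (std::chrono::steady_clock::now() < deadline);
   return false;        // timed out: the caller can service other work
}
```

The do/while form guarantees at least one check even with a zero timeout.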
2081	25 Jan 2021	Thomas Lindner	Suggestion	mhttpd browser caching
I tried reloading the pages. If I reloaded the actual elog page

https://server.triumf.ca/?cmd=Elog

then it bypassed the cache and got the correct updated page from mhttpd. However, when I reloaded the status page

https://server.triumf.ca/?cmd=Status

and then clicked the Elog button, I just got the cached (old) page.

Admittedly reloading the status page doesn't make so much sense (once I thought about it), but it is what I tried first (I'm good at modelling unexpected user behaviour); so there is some risk that the user will try reloading the wrong page and will be stuck not getting the external elog page (until the 24 hours run out).

Anyway, I will update the documentation to note that you need to reload the elog page after changing this variable. That's probably an adequate solution.

I certainly don't suggest getting rid of caching entirely. I was trying to think whether there was a set of pages where it would make sense to disable the cache (like the elog page). But maybe that will just cause more problems.

> Let me first explain a bit why caching is there. Once we had the case that someone from
> TRIUMF opened a midas custom page at T2K. It took about one minute (!) to load the page.
>
> When we looked at it, we found that the custom page pulled about 100 items with individual
> HTTP requests from Japan, each taking about one second for the roundtrip. Then we redesigned
> the custom page communication so that many ODB entries could be retrieved in one operation,
> which improved the loading time from 100s to about 2s.
>
> With the buttons we will have to make the same compromise. If we do not cache anything,
> loading the midas status page over the Pacific takes many seconds. If we cache all, any
> change on the midas side will not be reflected on the web page. So there is a compromise
> to be made. I thought I designed it such that the side menu is cached locally, but when
> the user presses "reload", then the full menu is fetched from the server. Of course one
> has to remember this, so changing the ELOG URL or other things on the menu require a
> reload (or wait a certain time for the cache to expire). So try again if that's working
> for you. If not, I can visit it again and check if there is any bug.
>
> If we go the route to disable the cache, better try this to T2K and see what you get before
> we commit ourselves to that. Last time TRIUMF people were complaining a lot about long
> load times.
>
> Best,
> Stefan
2080	25 Jan 2021	Stefan Ritt	Suggestion	mhttpd browser caching
Let me first explain a bit why caching is there. Once we had the case that someone from TRIUMF opened a midas custom page at T2K. It took about one minute (!) to load the page.

When we looked at it, we found that the custom page pulled about 100 items with individual HTTP requests from Japan, each taking about one second for the roundtrip. Then we redesigned the custom page communication so that many ODB entries could be retrieved in one operation, which improved the loading time from 100s to about 2s.

With the buttons we will have to make the same compromise. If we do not cache anything, loading the midas status page over the Pacific takes many seconds. If we cache all, any change on the midas side will not be reflected on the web page. So there is a compromise to be made. I thought I designed it such that the side menu is cached locally, but when the user presses "reload", then the full menu is fetched from the server. Of course one has to remember this, so changing the ELOG URL or other things on the menu require a reload (or wait a certain time for the cache to expire). So try again if that's working for you. If not, I can visit it again and check if there is any bug.

If we go the route to disable the cache, better try this to T2K and see what you get before we commit ourselves to that. Last time TRIUMF people were complaining a lot about long load times.

Best,
Stefan
2079	25 Jan 2021	Thomas Lindner	Suggestion	mhttpd browser caching
I have a more subtle point about the new ODB key for using an external elog I mentioned in [1].

I was very confused after changing the ODB "External Elog" because mhttpd still wasn't using my external elog URL. I started trying to debug mhttpd.cxx, but found a lot of bits of mhttpd didn't seem to be getting called. I eventually realized that my browser had been caching the responses for some (though not all) of the MIDAS navigation buttons. Clearing my browser cache fixed the problem and allowed me to use the MIDAS button to the external ELOG.

This caching happens on my macbook for both Firefox 84.0.2 and Safari 13.1.

Many of the requests to mhttpd end up going to send_fp(), where we explicitly set the cache time to 24 hours.

   // send HTTP cache control headers
   time_t now = time(NULL);
   now += (int) (3600 * 24);
   struct tm* gmt = gmtime(&now);
   const char* format = "%A, %d-%b-%y %H:%M:%S GMT";
   char str[256];
   strftime(str, sizeof(str), format, gmt);
   r->rsprintf("Expires: %s\r\n", str);

Some other MIDAS buttons don't seem to be cached by the browser; for instance the response for the 'OldHistory' button doesn't get cached.

Should we remove the cache instruction for at least some of the buttons? At least for the elog button, where we want the link direction to get switched by an ODB key, the caching seems a bad idea.

[1] https://midas.triumf.ca/elog/Midas/2078
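One way to picture a per-page cache lifetime is to make the 24-hour offset a parameter. The sketch below is an assumption for illustration only: make_expires_header() is a hypothetical helper, not the actual send_fp() code, and it returns the header as a string instead of writing it through r->rsprintf(). It reuses the date format from the snippet above.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Hypothetical helper, not the actual send_fp() code: build an "Expires"
// header that lies lifetime_seconds after `now`. A lifetime of 0 marks
// the response as already expired.
std::string make_expires_header(std::time_t now, int lifetime_seconds)
{
   assert(lifetime_seconds >= 0);
   const std::time_t expires = now + lifetime_seconds;
   const std::tm* gmt = std::gmtime(&expires);   // not thread-safe; fine for a sketch
   if (gmt == nullptr)
      return std::string();                      // time could not be converted
   char str[256];
   std::strftime(str, sizeof(str), "%A, %d-%b-%y %H:%M:%S GMT", gmt);
   return std::string("Expires: ") + str + "\r\n";
}
```

With this shape, a static file could be sent with make_expires_header(time(NULL), 3600 * 24) and the Elog redirect with a lifetime of 0. Whether a given browser honours the header is a separate question.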
2078	21 Jan 2021	Thomas Lindner	Info	Using external ELOG with newer mhttpd
A warning, in case others have the same problem I had.

In the past you could configure mhttpd so that the 'Elog' button would redirect to an external ELOG server; to do this you only needed to create and set the ODB variable '/Elog/URL' to the URL of your external ELOG server.

But with the newer MIDAS you need to set two ODB variables:

* "/Elog/URL" needs to be set to the URL of the external ELOG.
* "/Elog/External Elog" needs to be set to 'y'

I hadn't noticed this and was confused why my Elog button wasn't working after upgrading MIDAS.

MIDAS documentation was updated to reflect this change:

https://midas.triumf.ca/MidasWiki/index.php/Electronic_Logbook_(ELOG)
2077	15 Jan 2021	Isaac Labrie Boulay	Forum	poll_event() is very slow.
> > I'm currently trying to see if I can speed up polling in a frontend I'm testing.
> > Currently it seems like I can't get 'lam's to happen faster than 120 times/second.
> > There must be a way to make this faster. From what I understand, changing the poll
> > time (500ms by default) won't affect the frequency of polling just the 'lam'
> > period.
> >
> > Any suggestions?
> >
>
> You could switch from the traditional midas mfe.c frontend to the C++ TMFE frontend,
> where all this "lam" and "poll" business is removed.
>
> At the moment, there are two example programs using the C++ TMFE frontend,
> single threaded (progs/fetest_tmfe.cxx) and multithreaded (progs/fetest_tmfe_thread.cxx).
>
> K.O.

Ok. I did not know that there was a C++ OOD frontend example in MIDAS. I'll take a look at it. Is there any documentation on how it works?

Thanks for the support!

Isaac
2076	14 Jan 2021	Isaac Labrie Boulay	Forum	poll_event() is very slow.
> Something must be wrong on your side. If you take the example frontend under
>
> midas/examples/experiment/frontend.cxx
>
> and let it run to produce dummy events, you get about 90 Hz. This is because we have a
>
> ss_sleep(10);
>
> in the read_trigger_event() routine to throttle things down. If you remove that sleep,
> you get an event rate of about 500'000 Hz. So the framework is really quick.
>
> Probably your routine which looks for a 'lam' takes really long and should be fixed.
>
> Stefan

Hi Stefan,

I should mention that I was using midas/examples/Triumf/c++/fevme.cxx. I was trying to see the max speed so I had the 'lam' always = 1 with nothing else to add overhead in the poll_event(). I was getting <200 Hz. I am assuming that this is a bug. There is no ss_sleep() in that function.

Thanks for your quick response!

Isaac
2075	14 Jan 2021	Pintaudi Giorgio	Forum	poll_event() is very slow.
> Something must be wrong on your side. If you take the example frontend under
>
> midas/examples/experiment/frontend.cxx
>
> and let it run to produce dummy events, you get about 90 Hz. This is because we have a
>
> ss_sleep(10);
>
> in the read_trigger_event() routine to throttle things down. If you remove that sleep,
> you get an event rate of about 500'000 Hz. So the framework is really quick.
>
> Probably your routine which looks for a 'lam' takes really long and should be fixed.
>
> Stefan

Sorry if I am going off-topic but, because the ss_sleep function was mentioned here, I would like to take the chance and report an issue that I am having.

In all my slow control frontends, the CPU usage for each frontend is close to 100%. This means that each frontend is monopolizing a single core. When I did some profiling, I noticed that 99% of the time is spent inside the ss_sleep function. Now, I would expect that the ss_sleep function should require very little CPU time, if any.

So my two questions are:

Is this a bug or a feature?
Would you be able to check/reproduce this behavior, or do you need additional info from my side?
2074	13 Jan 2021	Stefan Ritt	Forum	poll_event() is very slow.
Something must be wrong on your side. If you take the example frontend under

midas/examples/experiment/frontend.cxx

and let it run to produce dummy events, you get about 90 Hz. This is because we have a

ss_sleep(10);

in the read_trigger_event() routine to throttle things down. If you remove that sleep, you get an event rate of about 500'000 Hz. So the framework is really quick.

Probably your routine which looks for a 'lam' takes really long and should be fixed.

Stefan
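The relation between time spent per event and event rate can be written as a small back-of-the-envelope sketch. The function names are hypothetical, and lumping all non-sleep time (readout, sleep overshoot, framework overhead) into a single other_ms value is a simplifying assumption, not a description of mfe.c internals.

```cpp
#include <cassert>

// Upper bound on the event rate when every event costs sleep_ms
// (e.g. the ss_sleep() argument) plus other_ms (an assumed lump sum
// for readout, sleep overshoot and framework overhead).
double max_event_rate_hz(double sleep_ms, double other_ms)
{
   assert(sleep_ms >= 0.0 && other_ms >= 0.0 && sleep_ms + other_ms > 0.0);
   return 1000.0 / (sleep_ms + other_ms);
}

// Inverse view: average time per event implied by an observed rate.
double time_per_event_ms(double rate_hz)
{
   assert(rate_hz > 0.0);
   return 1000.0 / rate_hz;
}
```

A 10 ms sleep alone caps the rate at 100 Hz, so an observed rate of about 90 Hz corresponds to roughly 11 ms per event. The 120 events/second reported at the start of this thread corresponds to about 8.3 ms per event, and 500'000 Hz to about 2 microseconds, which suggests where to look: something in the slow case takes milliseconds per event.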
Draft	13 Jan 2021	Pierre-Andre Amaudruz	Forum	poll_event() is very slow.
> Hi all,
>
> I'm currently trying to see if I can speed up polling in a frontend I'm testing.
> Currently it seems like I can't get 'lam's to happen faster than 120 times/second.
> There must be a way to make this faster. From what I understand, changing the poll
> time (500ms by default) won't affect the frequency of polling just the 'lam'
> period.
>
> Any suggestions?
>
> Thanks for your help!
>
> Isaac

Hi,

How many equipment do you have and of what type? What is the measured readout time of your equipment?

As you mentioned, the polling time defines the maximum time you spend in the polling call before checking other equipment and system activities. But as soon as you get a LAM during the polling loop, the event is read out. The readout time of this equipment obviously has to be considered as well. In case you have multiple equipment, the readout time of the other equipment has to be taken into account, as you won't return to your polling before their readout has completed.
2072	13 Jan 2021	Konstantin Olchanski	Forum	poll_event() is very slow.
> I'm currently trying to see if I can speed up polling in a frontend I'm testing.
> Currently it seems like I can't get 'lam's to happen faster than 120 times/second.
> There must be a way to make this faster. From what I understand, changing the poll
> time (500ms by default) won't affect the frequency of polling just the 'lam'
> period.
>
> Any suggestions?
>

You could switch from the traditional midas mfe.c frontend to the C++ TMFE frontend, where all this "lam" and "poll" business is removed.

At the moment, there are two example programs using the C++ TMFE frontend, single threaded (progs/fetest_tmfe.cxx) and multithreaded (progs/fetest_tmfe_thread.cxx).

K.O.
2071	13 Jan 2021	Isaac Labrie Boulay	Forum	poll_event() is very slow.
Hi all,

I'm currently trying to see if I can speed up polling in a frontend I'm testing. Currently it seems like I can't get 'lam's to happen faster than 120 times/second. There must be a way to make this faster. From what I understand, changing the poll time (500ms by default) won't affect the frequency of polling just the 'lam' period.

Any suggestions?

Thanks for your help!

Isaac

Hi,

What is the actual readout time, event size? Do you have multiple equipment and of what type if any?

PAA
2070	08 Jan 2021	Stefan Ritt	Forum	history and variables confusion
We kind of agreed to rewrite the slow control system in C++. Each device will have its own driver derived from a common base class implementing the general communication. The reason we need a "system" and not only a "hand-written" driver is that we want to:

- glue many device drivers together for a single equipment
- have a dedicated readout thread for every device, in order not to block other devices
- have a common error reporting scheme working with several threads
- be able to disable/enable individual devices without changing the history system each time
- have a common naming scheme for all devices (like "enforce" /Equipment/<name>/Settings/Names xxx) which is needed by the history system
- ...

Will see when we have time for that.

Stefan
2069	06 Jan 2021	Isaac Labrie Boulay	Info	Recovering a corrupted ODB using odbinit.
Hi all,

I am currently trying to recover my corrupted ODB using odbinit and I am still getting issues after doing 'odbinit --cleanup' and trying to reload the saved ODB (last.json). Here is the output:

************************************************ (odbinit cleanup) Note* the ERROR in system.cxx **********************************************

[caendaq@cu332 ANIS]$ odbinit --cleanup
Checking environment... experiment name is "ANIS", remote hostname is ""
Checking command line... experiment "ANIS", cleanup 1, dry_run 0, create_exptab 0, create_env 0
Checking MIDASSYS....../home/caendaq/packages/midas
Checking exptab... experiments defined in exptab file "/home/caendaq/ANIS/exptab":
0: "ANIS" <-- selected experiment
Checking exptab... selected experiment "ANIS", experiment directory "/home/caendaq/ANIS/"
Checking experiment directory "/home/caendaq/ANIS/"
Found existing ODB save file: "/home/caendaq/ANIS/.ODB.SHM"
Checking shared memory...
Deleting old ODB shared memory...
[system.cxx:1052:ss_shm_delete,ERROR] shm_unlink(/1001_ANIS_ODB__home_caendaq_ANIS_) errno 2 (No such file or directory)
Good: no ODB shared memory
Deleting old ODB semaphore...
Deleting old ODB semaphore... create status 1, delete status 1
Preserving old ODB save file /home/caendaq/ANIS/.ODB.SHM" to "/home/caendaq/ANIS/.ODB.SHM.1609951022"
Checking ODB size...
Requested ODB size is 0 bytes (0.00B)
ODB size file is "/home/caendaq/ANIS//.ODB_SIZE.TXT"
Saved ODB size from "/home/caendaq/ANIS//.ODB_SIZE.TXT" is 1048576 bytes (1.05MB)
We will initialize ODB for experiment "ANIS" on host "" with size 1048576 bytes (1.05MB)
Creating ODB...
Creating ODB... db_open_database() status 302
Saving ODB...
Saving ODB... db_close_database() status 1
Connecting to experiment...
Connected to ODB for experiment "ANIS" on host "" with size 1048576 bytes (1.05MB)
Checking experiment name... status 1, found "ANIS"
Disconnecting from experiment...
Done

************************************ (Loading the last copy of my ODB) *********************************

[caendaq@cu332 data]$ odbedit
[local:ANIS:S]/>load last.json
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
[ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback
11:38:12 [ODBEdit,INFO] Reloading RPC hosts access control list via hotlink callback

****************************************** (Now trying to run my frontend and analyzer) *******************************************

[caendaq@cu332 ANIS]$ ./start_daq.sh
mlogger: no process found
fevme: no process found
manalyzer.exe: no process found
manalyzer_example_cxx.exe: no process found
roody: no process found
[ODBEdit,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is corrupted, mismatch of buffer name in shared memory ""
11:38:30 [ODBEdit,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is corrupted, mismatch of buffer name in shared memory ""
Becoming a daemon...
Becoming a daemon...
Please point your web browser to http://localhost:8081
To look at live histograms, run: roody -Hlocalhost
Or run: mozilla http://localhost:8081
[caendaq@cu332 ANIS]$ Frontend name : fevme
Event buffer size : 1048576
User max event size : 204800
User max frag. size : 1048576
# of events per buffer : 5
Connect to experiment ANIS... OK
[fevme,ERROR] [midas.cxx:6616:bm_open_buffer,ERROR] Buffer "SYSTEM" is corrupted, mismatch of buffer name in shared memory ""
[fevme,ERROR] [mfe.cxx:596:register_equipment,ERROR] Cannot open event buffer "SYSTEM" size 33554432, bm_open_buffer() status 219

Has anyone ever encountered these issues?

Thanks for your time.

Isaac
2068	06 Jan 2021	Isaac Labrie Boulay	Bug Report	Logger: Disk nearly full.
> The logger simple requests the disk free space level from the operating system in the same
> way as the "df" command does. Can you do a "df" on your system? I have seen that some file
> systems free up space not immediately if you delete files, but some times later (like 24h).
>
> Stefan

Thanks Stefan. Yes the files were still held open by some processes. It's solved now.

Cheers.

Isaac
2067	06 Jan 2021	Stefan Ritt	Suggestion	Improving variable functionality in Sequencer?
I guess you use a wrong pattern here. There is no need to copy ODB values to local variables, then change them, then write them back. You can rather directly write values to the ODB. We run all our experiments in that way and we can do what we want. So most of our scripts have sections like

ODBSUBDIR "/Equipment/Laser/Variables"
   ODBSET "Setting[*]", 0, 0
   ODBSET "Output[1]", 0, 0
   ODBSET "Output[2]", 1, 0
   ODBSET "Output[3]", 0, 0
   ODBSET "Output[4]", 1, 1
ENDODBSUBDIR

Note that both the path and the indices can contain wild cards, making this pattern more flexible. Wildcards are however not (yet) supported for local variables, that's why we use directly the ODBSET directive.

I attach a larger example from the MEG experiment here for your reference.

Stefan
2066	06 Jan 2021	Stefan Ritt	Bug Report	Logger: Disk nearly full.
The logger simply requests the disk free space level from the operating system in the same way as the "df" command does. Can you do a "df" on your system? I have seen that some file systems do not free up space immediately when you delete files, but some time later (like 24h).

Stefan

ELOG V3.1.4-2e1708b5