Midas DAQ System
ID   Date   Author   Topic   Subject
  593   05 Jun 2009   Konstantin Olchanski   Bug Report   mhttpd command line experiment specifying
> I guess you have to do some debugging there. Note that "detached" transitions have 
> been implemented recently by Konstantin, so maybe your problem is related to that. 
> In this case Konstantin should check what's wrong.

Yes, I think there is a problem - cm_transition() starts the mtransition helper without the "-e expt" switch, so 
mtransition can only connect to the "default" experiment. Will fix. K.O.
  596   18 Jun 2009   Konstantin Olchanski   Bug Report   mhttpd command line experiment specifying
> > I guess you have to do some debugging there. Note that "detached" transitions have 
> > been implemented recently by Konstantin, so maybe your problem is related to that. 
> > In this case Konstantin should check what's wrong.
> 
> Yes, I think there is a problem - cm_transition() starts the mtransition helper without the "-e expt" switch, so 
> mtransition can only connect to the "default" experiment. Will fix. K.O.

Fixed in midas.c svn rev 4506: cm_transition() now always passes "-e expt" to mtransition and, if connected remotely,
also passes "-h host:port".

svn rev 4506
K.O.
  376   21 May 2007   Konstantin Olchanski   Info   mhttpd changes to use /History/Tags data
I am slowly committing the changes to the history code. This installment adds
code to mhttpd to use the /History/Tags data (to be) generated by the mlogger.

In a nutshell, the logger fills /History/Tags to "remember" what events,
variables and tags exist in the history files.

This replaces the old code that attempts to guess the contents of the history files
by looking at the /Equipment tree.

To ease the transition to the new system, I am leaving all the old code alive
and active in the absence of "/History/Tags" entries.

As soon as one starts using the new mlogger (to be committed), the new tags-based
mhttpd code will activate itself.

K.O.
  2364   23 Mar 2022   Konstantin Olchanski   Bug Fix   mhttpd bug fixed
the mhttpd bug should be fixed now (branch feature/buffer_mutex).

simplest way to reproduce:

1. wget http://localhost:8080/
2. quickly ctrl-C it
3. wget http://localhost:8080/
4. inside mhttpd (by hook or crook) observe that the second wget got the data meant for the first wget.

if you cannot ctrl-C the first wget quickly enough, put a sleep somewhere in the worker thread (in 
mongoose_write(), I think).

this is what happens.

1st wget stops (by ctrl-C), socket is closed, mongoose frees its mg_connection object
(corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)

2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
but malloc() gives it back the same memory we just freed, and the 1st wget's worker thread
nc pointer is no longer dangling, but now points to the 2nd wget's connection.

so we think we are clever and check the socket file descriptors. but the same thing
happens there, too. if the 1st wget was file descriptor 7, it is closed (the 1st wget worker now has
a stale file handle), then a socket is reopened for the 2nd wget and, per POSIX, we get back the same
file descriptor 7. the 1st wget worker now holds the file handle for the 2nd wget's tcp socket, and
the famous test/crash for "sending data to the wrong socket" is defeated.
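
A tiny stand-alone demo of the descriptor reuse described above (an assumption-laden sketch: plain open()/close() on /dev/null stands in for the accept()/close() of the wget sockets; POSIX hands out the lowest-numbered free descriptor, so a just-closed number comes right back):

   // demo (not mhttpd code): POSIX reuses the lowest-numbered free file descriptor
   #include <fcntl.h>
   #include <unistd.h>
   #include <cstdio>

   int main()
   {
      int a = open("/dev/null", O_RDONLY);
      printf("first  open(): fd=%d\n", a);
      close(a);                              // like the 1st wget socket being closed
      int b = open("/dev/null", O_RDONLY);   // like the 2nd wget socket being accepted
      printf("second open(): fd=%d\n", b);   // prints the same number as 'a'
      return 0;
   }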

now, the worker thread for the 1st wget wants to send a reply: it has a valid nc pointer (pointing to the 2nd wget's
mg_connection object) and a valid file descriptor (pointing to the 2nd wget's tcp socket), so the
reply meant for the 1st wget is successfully sent to the 2nd wget. the 2nd wget finishes, its socket
is closed, its mg_connection object is freed. now the worker thread for the 2nd wget has stale
connection info, but this is okay: mongoose does not find a matching connection, the 2nd wget
worker thread's reply goes nowhere, and the thread finishes silently (no memory leaks here, I checked).

so, the connection for the 2nd wget completely impersonates the closed connection of the 1st wget (I guess I could
check the full socket address info - remote ip address, remote port number, etc - but...)

in practice, this bug does not happen often because modern browsers tend to keep tcp sockets open
for a very long time. (not sure about sundry web proxies, etc).

the solution of course is very simple: match worker thread data to mongoose mg_connection objects
using our own connection sequential numbers, which are unique and very easy to keep track
of through the mongoose event handler. all this mess runs in the main thread,
so no locking trouble here, small blessing.
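
A minimal sketch of this bookkeeping (hypothetical names, not the actual mhttpd code; it assumes, as stated above, that all of it runs in the single mongoose event thread, so no locking is shown):

   // match worker replies to connections by our own serial number,
   // never by the mg_connection pointer (which can be reused after free)
   #include <cstdint>
   #include <map>

   struct Connection {                        // stand-in for mongoose's mg_connection
      uint32_t seqno = 0;                     // our serial number, assigned at accept time
      // socket, buffers, etc.
   };

   static uint32_t gNextSeqno = 0;                   // only touched by the event thread
   static std::map<uint32_t, Connection*> gConns;    // seqno -> live connection

   void on_accept(Connection* c) {            // from the mongoose event handler
      c->seqno = ++gNextSeqno;
      gConns[c->seqno] = c;
   }

   void on_close(Connection* c) {             // from the mongoose event handler
      gConns.erase(c->seqno);
   }

   // a worker thread remembers only the seqno of the request it is serving;
   // when its reply comes back to the event thread, look the connection up:
   Connection* find_connection(uint32_t seqno) {
      auto it = gConns.find(seqno);
      return (it == gConns.end()) ? nullptr : it->second;   // nullptr: client gone, drop the reply
   }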

K.O.
  2366   24 Mar 2022   Stefan Ritt   Bug Fix   mhttpd bug fixed
> 1st wget stops (by ctrl-C), socket is closed, mongoose frees its mg_connection object
> (corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)
> 
> 2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
> but malloc() gives it back the same memory we just freed, and the 1st wget's worker thread
> nc pointer is no longer dangling, but now points to the 2nd wget's connection.

Why don't we CLEAR the memory (memset(object, 0, sizeof(object))) before the free()? This way it cannot be 
mistakenly re-used by the next thread.

Stefan
  2368   24 Mar 2022   Konstantin Olchanski   Bug Fix   mhttpd bug fixed
> > 1st wget stops (by ctrl-C), socket is closed, mongoose frees its mg_connection object
> > (corresponding worker is still labouring, hmm... actually sleeping, and now has a stale nc pointer)
> > 
> > 2nd wget starts, new socket is opened, mongoose allocates a new mg_connection object,
> > but malloc() gives it back the same memory we just freed, and the 1st wget's worker thread
> > nc pointer is no longer dangling, but now points to the 2nd wget's connection.
> 
> Why don't we CLEAR the memory (memset(object, 0, sizeof(object))) before the free()? This way it cannot be 
> mistakenly re-used by the next thread.
> 

My description was unclear. I will try to explain it better now.

When http replies are generated by worker threads, matching a reply to its mg_connection is done
by checking the address of the mg_connection object. (mongoose itself unhelpfully offers
to send the reply to every mg_connection; see the responder to mg_broadcast() messages).

This works for open/active connections: the addresses of all open mg_connections are unique.

But if a connection is closed and a new connection is opened, the address can be reused (whether by malloc()/free()
reusing memory blocks or by mongoose using a pool of mg_connection objects does not matter).

So matching an http reply to an mg_connection using only the address of the mg_connection can match the wrong connection.

(The contents of the mg_connection object do not matter; only the address is used for matching, so zeroing the
mg_connection object does not help).

I saw this during my testing - wrong data was sent to the wrong browser often enough - but did
not understand that the above problem was happening.

Because I was unable to reliably reproduce the problem, I could not debug it. I tried adding
a check for the tcp socket file descriptor number, in case there was a straight bug, a multithread race
or simple memory corruption. This replaced "we sent wrong data to the wrong browser, poisoned the browser
cache, confused the user" with a crash. This "fix" seemed effective at the time.

Maybe I should mention browser cache poisoning again. What happened is that html pages and rpc replies
were returned as responses to requests for things like CSS files. These bad responses are cached by the browser
pretty much forever, so all subsequent midas pages look wrong (bad css!) until the
user manually clears the browser cache. Reloading the page did not help; restarting the browser did not help (I think).

So a very bad bug.

Unfortunately, the check for the file descriptor was not effective because file descriptors are also
reused. I did still see wrong data returned by mhttpd, though even more rarely. And everybody (myself
included) complained about mhttpd crashes.

Now, matching of responses to connections is done by a connection sequential/serial number,
which is a unique 32-bit counter. Mismatching a reply to a connection should not happen again.

P.S. The latest version of the mongoose web server library does not help with this problem;
the example code for matching a reply to a connection in their multithreaded example looks bogus:
https://github.com/cesanta/mongoose/blob/master/examples/multi-threaded/main.c

K.O.
  2371   24 Mar 2022   Stefan Ritt   Bug Fix   mhttpd bug fixed
I see, now I understand.

As for the browser cache problem: This Chrome extension is your friend: 

https://chrome.google.com/webstore/detail/clear-cache/cppjkneekbjaeellbfkmgnhonkkjfpdn?hl=en

I use it every time I change the CSS or a JS file. Having the "Developer Tools" open in Chrome helps as well 
(the cache is then turned off). Firefox has similar extensions.

Stefan
  2374   24 Mar 2022   Konstantin Olchanski   Bug Fix   mhttpd bug fixed
> As for the browser cache problem: This Chrome extension is your friend ...

for google chrome, it is easy: open the javascript debugger (left-click "inspect"),
and the reload button becomes a left-click menu; one of the options is "clear cache and reload".
(there is no button for "clear cookies and reload", re the recent elog cookie problem).

but this does not help me personally. if midas web pages get confused, I will get confused too,
and I will spend hours debugging mhttpd before thinking "hmm... maybe I should clear the browser cache!"

not sure about firefox, safari, microsoft edge and opera; if I ever need it, I will google it.

K.O.
  2079   25 Jan 2021   Thomas Lindner   Suggestion   mhttpd browser caching
I have a more subtle point about the new ODB key for using an external elog that I mentioned in [1].  I was very confused after changing the ODB "External Elog" key because mhttpd still wasn't using my external elog URL.  I started trying to debug mhttpd.cxx, but found that a lot of mhttpd didn't seem to be getting called.  I eventually realized that my browser had been caching the responses for some (though not all) of the MIDAS navigation buttons.  Clearing my browser cache fixed the problem and allowed me to use the MIDAS button for the external ELOG.  This caching happens on my macbook with both Firefox 84.0.2 and Safari 13.1.

Many of the requests to mhttpd end up going to send_fp(), where we explicitly set the cache time to 24 hours.

   // send HTTP cache control headers                                                                                               
   time_t now = time(NULL);
   now += (int) (3600 * 24);
   struct tm* gmt = gmtime(&now);
   const char* format = "%A, %d-%b-%y %H:%M:%S GMT";
   char str[256];
   strftime(str, sizeof(str), format, gmt);
   r->rsprintf("Expires: %s\r\n", str);

Some other MIDAS buttons don't seem to be cached by the browser; for instance the response for the 'OldHistory' button doesn't get cached.

Should we remove the cache instruction for at least some of the buttons?  At least for the elog button, where we want the link destination to be switched by an ODB key, the caching seems like a bad idea.

[1] https://midas.triumf.ca/elog/Midas/2078
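
For illustration only, one possible shape of such a per-page opt-out (a rough sketch; the helper and the response type are hypothetical, assuming the same rsprintf-style interface as the snippet above): pages whose content depends on ODB settings would send no-cache headers instead of the 24-hour Expires header.

   // hypothetical helper: ask the browser not to cache this response at all
   template <typename Response>              // stands in for mhttpd's response object
   void send_no_cache_headers(Response* r)
   {
      r->rsprintf("Cache-Control: no-store, no-cache, must-revalidate\r\n");
      r->rsprintf("Pragma: no-cache\r\n");   // for old HTTP/1.0 caches
      r->rsprintf("Expires: 0\r\n");
   }

Static assets such as midas.js could keep the long expiry, while pages like the Elog redirect would call something like this instead of the Expires code above.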
  2080   25 Jan 2021   Stefan Ritt   Suggestion   mhttpd browser caching
Let me first explain a bit why caching is there. Once we had the case that someone from 
TRIUMF opened a midas custom page at T2K. It took about one minute (!) to load the page. 

When we looked at it, we found that the custom page pulled about 100 items with individual
HTTP requests from Japan, each taking about one second for the roundtrip. Then we redesigned
the custom page communication so that many ODB entries could be retrieved in one operation,
which improved the loading time from 100s to about 2s.

With the buttons we will have to make the same compromise. If we do not cache anything,
loading the midas status page over the Pacific takes many seconds. If we cache all, any
change on the midas side will not be reflected on the web page. So there is a compromise
to be made. I thought I designed it such that the side menu is cached locally, but when
the user presses "reload", then the full menu is fetched from the server. Of course one
has to remember this, so changing the ELOG URL or other things on the menu requires a
reload (or waiting a certain time for the cache to expire). So try again and see if that works
for you. If not, I can look at it again and check if there is any bug.

If we go the route of disabling the cache, we had better try it against T2K and see what you get before
we commit ourselves to that. Last time, TRIUMF people complained a lot about long
load times.

Best,
Stefan
  2081   25 Jan 2021   Thomas Lindner   Suggestion   mhttpd browser caching
I tried reloading the pages.  If I reloaded the actual elog page 

https://server.triumf.ca/?cmd=Elog

then it bypassed the cache and got the correct updated page from mhttpd.

However, if I reloaded the status page

https://server.triumf.ca/?cmd=Status

and then clicked the Elog button, I just got the cached (old) page.  Admittedly, reloading the status page doesn't make much sense (once I thought about it), but it is what I tried first (I'm good at modelling unexpected user behaviour); so there is some risk that a user will reload the wrong page and be stuck not getting the external elog page (until the 24 hours run out).

Anyway, I will update the documentation to note that you need to reload the elog page after changing this variable.  That's probably an adequate solution.

I certainly don't suggest getting rid of caching entirely.  I was trying to think whether there was a set of pages where it would make sense to disable the cache (like the elog page).  But maybe that will just cause more problems.


> Let me first explain a bit why caching is there. Once we had the case that someone from 
> TRIUMF opened a midas custom page at T2K. It took about one minute (!) to load the page. 
> 
> When we looked at it, we found that the custom page pulled about 100 items with individual
> HTTP requests from Japan, each taking about one second for the roundtrip. Then we redesigned
> the custom page communication so that many ODB entries could be retrieved in one operation,
> which improved the loading time from 100s to about 2s.
> 
> With the buttons we will have to make the same compromise. If we do not cache anything,
> loading the midas status page over the Pacific takes many seconds. If we cache all, any
> change on the midas side will not be reflected on the web page. So there is a compromise
> to be made. I thought I designed it such that the side menu is cached locally, but when
> the user presses "reload", then the full menu is fetched from the server. Of course one
> has to remember this, so changing the ELOG URL or other things on the menu requires a
> reload (or waiting a certain time for the cache to expire). So try again and see if that works
> for you. If not, I can look at it again and check if there is any bug.
> 
> If we go the route of disabling the cache, we had better try it against T2K and see what you get before
> we commit ourselves to that. Last time, TRIUMF people complained a lot about long
> load times.
> 
> Best,
> Stefan
  2085   08 Feb 2021   Konstantin Olchanski   Suggestion   mhttpd browser caching
>    r->rsprintf("Expires: %s\r\n", str);

As best I can tell, none of this works in current browsers. With google-chrome,
I see it cache pretty much everything regardless of "expires", "no cache"
and anything else I tried.

Things like shift-<reload>, etc used to work to refresh the cache, but not any more.

So I, too, see confusing side-effects of caching, where I change something in ODB,
but "nothing happens". Then I scratch my head for 30 minutes until I remember
to open the javascript debugger, where shift-<reload> (or is it ctrl-<reload>) actually works.

It seems that the only reliable way to bypass the browser cache is to add
a tag with a random number to the URL ("&ts=currenttime").

This is for HTTP GET requests. HTTP POST does not seem to be cached, so I do not worry
about this nonsense for json-rpc requests.

Perhaps we should do this random-number trick for all user actions. A user can
press buttons only so fast; we should be able to sustain that rate. For anything
loaded automatically or from a timer, we should allow caching.

BTW, things like midas.js are also cached, and it is common to see problems
after updating midas, where status.html is newly loaded, but midas.js is an old
stale version from cache.

Messy.

K.O.
  2086   08 Feb 2021   Stefan Ritt   Suggestion   mhttpd browser caching
> It seems that the only reliable way to bypass the browser cache is to add
> a tag with a random number to the URL ("&ts=currenttime").

Indeed that's the only reliable way to avoid caching across browsers. An alternative is

("&r=" + Math.random())

to add a random number.


> BTW, things like midas.js are also cached, and it is common to see problems
> after updating midas, where status.html is newly loaded, but midas.js is an old
> stale version from cache.

Reloading a JavaScript file NOT from the cache is really tricky these days. I added a
special Google Chrome extension to clear my browser cache, which works reliably:

https://chrome.google.com/webstore/detail/clear-cache/cppjkneekbjaeellbfkmgnhonkkjfpdn

Stefan
  2163   12 May 2021   Mathieu Guigue   Bug Report   mhttpd WebServer ODBTree initialization
Hi,

Using midas version 12-2020,  I am trying to run mhttpd from within a docker container using docker-compose.
Starting from an empty ODB, I simply run `mhttpd` and this is the output I have:
midas_hatfe_1  | <Warning> Starting mhttpd...
midas_hatfe_1  | [mhttpd,INFO] ODB subtree /Runinfo corrected successfully
midas_hatfe_1  | MVOdb::SetMidasStatus: Error: MIDAS db_find_key() at ODB path "/WebServer/Host list" returned status 312
midas_hatfe_1  | Mongoose web server will not use password protection
midas_hatfe_1  | Mongoose web server will not use the hostlist, connections from anywhere will be accepted
midas_hatfe_1  | Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
midas_hatfe_1  | [mhttpd,ERROR] [mhttpd.cxx:19160:mongoose_listen,ERROR] Cannot mg_bind address "[::1]:8080"

According to the documentation, the WebServer tree should be created automatically when starting mhttpd; but it seems it is not, since mhttpd doesn't find the entry "/WebServer/Host list".
If I create it by hand (using "create STRING /WebServer/Host list"), I still get the error message that mhttpd didn't bind properly to the local port 8080.
I am not sure what is wrong, as mhttpd works perfectly well in this exact container with midas 03-2020.

Any idea what difference makes it no longer possible to run in this container?

Thanks very much for your help.
Cheers
Mathieu
  2164   12 May 2021   Ben Smith   Bug Report   mhttpd WebServer ODBTree initialization
> midas_hatfe_1  | Mongoose web server listening on http address "localhost:8080", passwords OFF, hostlist OFF
> midas_hatfe_1  | [mhttpd,ERROR] [mhttpd.cxx:19160:mongoose_listen,ERROR] Cannot mg_bind address "[::1]:8080"

It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.
  2165   12 May 2021   Stefan Ritt   Bug Report   mhttpd WebServer ODBTree initialization
> It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.

We had this issue already several times. This info should be put into the documentation at a prominent location.

Stefan
  2167   13 May 2021   Mathieu Guigue   Bug Report   mhttpd WebServer ODBTree initialization
> > It looks like mhttpd managed to bind to the IPv4 address (localhost), but not the IPv6 address (::1). If you don't need it, try setting "/Webserver/Enable IPv6" to false.
> 
> We had this issue already several times. This info should be put into the documentation at a prominent location.
> 
> Stefan

Thanks a lot, this solved my issue!
  2168   14 May 2021   Stefan Ritt   Bug Report   mhttpd WebServer ODBTree initialization
> Thanks a lot, this solved my issue!

... or we should turn IPv6 off by default, since not many people use this right now.
  2200   02 Jun 2021   Konstantin Olchanski   Bug Report   mhttpd WebServer ODBTree initialization
> > Thanks a lot, this solved my issue!
> 
> ... or we should turn IPv6 off by default, since not many people use this right now.

IPv6 certainly works and is used at CERN.

But I am not sure why people see this message. I do not see it on any machines at 
TRIUMF, even those with IPv6 turned off.

K.O.
  2269   05 Aug 2021   Stefan Ritt   Bug Report   mhttpd WebServer ODBTree initialization
Well, we all see it here at PSI, so this is enough reason to turn this off by default. Shall 
I do it?