18 Jan 2024, Andreas Suter, Forum, mhttpd eqtable
|
I have two more questions related to Units, Format for Equipment/Settings:
1) It looks as if I can have units per channel only for the Input/Output channels but not for the Demand/Measured channels.
For instance, we have HV frontends which combine devices with kV and V demand settings. It looks like per-channel units are not possible in this case (see attachment).
Is this right, or am I missing something here?
2) This new functionality needs entries under /Equipment/<eq-name>/Settings. The class driver generates the necessary structures at the startup
of the scfe if they are missing. It would be nice if the new, additional entries were generated as well: Editable, Unit Input, Unit Format, etc. Perhaps optionally, if a DD provides them?
Best,
Andreas |
18 Jan 2024, Stefan Ritt, Forum, mhttpd eqtable
|
I fixed both in the current version, so please give it a try.
Stefan |
23 Sep 2006, Konstantin Olchanski, Bug Report, mhttpd elog corruption via double-edit
|
Apparently the mhttpd elog will corrupt the elog files if two (or more?) elog entries are being edited at the
same time. K.O. |
24 Sep 2006, Stefan Ritt, Bug Report, mhttpd elog corruption via double-edit
|
K.O. wrote: | Apparently the mhttpd elog will corrupt the elog files if two (or more?) elog entries are being edited at the same time. K.O. |
That's strange. Since mhttpd is single threaded, there should not be any multi-thread/process conflict there, since the elog files cannot be written simultaneously from two different browser sessions. If entries are edited at the same time, they then get submitted one after the other. Of course it is possible to edit the same entry, in which case the second submission "wins", overwriting the first one without notification. Within the standalone elog server there is the option to lock entries ("use lock = 1") to prevent this, but this feature is not present in the mhttpd elog. |
27 Sep 2006, Konstantin Olchanski, Bug Report, mhttpd elog corruption via double-edit
|
> Apparently the mhttpd elog will corrupt the elog files if two (or more?) elog
> entries are being edited at the same time. K.O.
The corruption is very simple. mhttpd elog indexes the elog entries by the elog
file and offset inside the file, i.e. "http://ladd00:8088/EL/060927.318",
"060927" corresponds to log file "060927.log", "318" is the offset inside the
file where the message is located.
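The tag-to-file mapping described above can be sketched as follows. This is only an illustration of the "file.offset" scheme; parse_elog_tag() is a hypothetical helper and not a function from midas.c or mhttpd.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of midas): split an elog tag such as
   "060927.318" into the log file name "060927.log" and the byte offset 318.
   Returns 0 on success, -1 if the tag does not have the "YYMMDD.offset" form. */
int parse_elog_tag(const char *tag, char *file_name, size_t file_name_size, long *offset)
{
   assert(tag != NULL && file_name != NULL && offset != NULL);

   char date[7];      /* six digits "YYMMDD" plus the terminating zero */
   long off = 0;
   int consumed = 0;

   /* up to six digits, a literal '.', the offset, then record how much was read */
   if (sscanf(tag, "%6[0-9].%ld%n", date, &off, &consumed) != 2)
      return -1;

   /* reject short dates, trailing garbage and negative offsets */
   if (strlen(date) != 6 || tag[consumed] != '\0' || off < 0)
      return -1;

   snprintf(file_name, file_name_size, "%s.log", date);
   *offset = off;
   return 0;
}
```

Because the offset is part of the tag, any edit that changes the size of an earlier message silently changes which bytes a previously issued tag points to.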
During "edit", the code "remembers" the offset of the original message and in
el_submit() blindly writes the edited message into the file at the remembered
offset.
If another message was edited before the edit of the first message is submitted,
the remembered offset becomes invalid (messages have shifted inside the file)
and el_submit() writes the edited text into the wrong place in the file,
corrupting it.
I have now added a check for this and we crash instead of corrupting the elog
file (midas.c rev 3340).
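One way such a stale-offset check could look is sketched below. This is not the code from midas.c rev 3340; it assumes the edit session remembers a checksum of the original message in addition to its offset, and all names (edit_token, begin_edit, edit_still_valid) are hypothetical. The log file is modelled as an in-memory buffer to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* What a hypothetical edit session remembers about the message it opened. */
typedef struct {
   size_t offset;            /* byte offset of the message inside the log */
   size_t length;            /* length of the message when editing began */
   unsigned long checksum;   /* checksum of the original message bytes */
} edit_token;

/* Simple multiplicative string hash; good enough for a sketch. */
static unsigned long bytes_checksum(const char *p, size_t n)
{
   unsigned long h = 5381;
   for (size_t i = 0; i < n; i++)
      h = h * 33u + (unsigned char)p[i];
   return h;
}

/* Called when "edit" is pressed: remember where the message is and what it looks like. */
edit_token begin_edit(const char *log, size_t log_size, size_t offset, size_t length)
{
   assert(log != NULL && offset <= log_size && length <= log_size - offset);
   edit_token tok = { offset, length, bytes_checksum(log + offset, length) };
   return tok;
}

/* Called before the edited text is written back: returns 1 if the bytes at the
   remembered offset are still the original message, 0 if the file has shifted. */
int edit_still_valid(const char *log, size_t log_size, const edit_token *tok)
{
   assert(log != NULL && tok != NULL);
   if (tok->offset > log_size || tok->length > log_size - tok->offset)
      return 0;
   return bytes_checksum(log + tok->offset, tok->length) == tok->checksum;
}
```

A caller would refuse the submission (or re-locate the message) when edit_still_valid() returns 0, instead of writing at the remembered offset.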
I do not know how to "properly" fix this bug without changing the indexing
scheme to something similar to what is used by elogd- message numbers instead of
file indices. In the existing scheme, message editing also breaks URLs shown in
the email notifications (they contain file indices that point to the wrong
places after messages are moved around by editing) and "reply threading" links.
Here is how I reproduce this bug:
1) start with an empty elog
2) create two messages
3) "edit" the second message, but do not submit it yet.
4) "edit" the first message, change the text to make sure the message size
becomes different; submit this change.
5) submit the "edit" of the second message. !!BOOM!!
K.O. |
28 Sep 2006, Stefan Ritt, Bug Report, mhttpd elog corruption via double-edit
|
> I do not know how to "properly" fix this bug without changing the indexing
> scheme to something similar to what is used by elogd- message numbers instead of
> file indices. In the existing scheme, message editing also breaks URLs shown in
> the email notifications (they contain file indices that point to the wrong
> places after messages are moved around by editing) and "reply threading" links.
Well, the development of elogd with its message numbers was actually stimulated by
the problem you mentioned. After that all those problems went away. Another
incarnation of that problem is if you edit an mhttpd log file manually. Afterwards
the file offsets are different and the system gets corrupted. To fix this properly,
one would have to backport the el_xxx functions from elogd to mhttpd, or, even
simpler, remove the elog functionality in mhttpd and "force" everybody to use elogd
(after doing elconv to convert the files into the new format). |
03 Jul 2019, Lukas Gerritzen, Bug Report, mhttpd crashes when including nonexistent script in msequencer
|
Hi,
the subject line describes the problem already
Suppose you have a file foo.msl. Somewhere in the file, you have the line
INCLUDE bar.msl
Once you click save in the sequencer page, mhttpd crashes:
$ mhttpd
free(): double free detected in tcache 2
[1] 27590 abort (core dumped) mhttpd
GDB helps shed some light on the problem:
#0 0x00007ffff76b057f in raise () from /lib64/libc.so.6
#1 0x00007ffff769a895 in abort () from /lib64/libc.so.6
#2 0x00007ffff76f39d7 in __libc_message () from /lib64/libc.so.6
#3 0x00007ffff76fa2ec in malloc_printerr () from /lib64/libc.so.6
#4 0x00007ffff76fbdf5 in _int_free () from /lib64/libc.so.6
#5 0x00000000004b8b41 in mxml_parse_entity (buf=buf@entry=0x7fffffffc2c8,
file_name=file_name@entry=0x7fffffffc710
"/home/luk/packages/mutrig_daq/online/foo.xml",
error=error@entry=0x7fffffffcd24 "XML read error in file
\"/home/luk/packages/mutrig_daq/online/foo.xml\", line 2: bar.msl.xml is
missing", error_size=error_size@entry=256,
error_line=error_line@entry=0x7fffffffce24) at ../mxml/mxml.c:1996
#6 0x00000000004b966d in mxml_parse_file
(file_name=file_name@entry=0x7fffffffc710
"/home/luk/packages/mutrig_daq/online/foo.xml",
error=error@entry=0x7fffffffcd24 "XML read error in file
\"/home/luk/packages/mutrig_daq/online/foo.xml\", line 2: bar.msl.xml is
missing", error_size=error_size@entry=256,
error_line=error_line@entry=0x7fffffffce24) at ../mxml/mxml.c:2041
#7 0x000000000041d9c2 in init_sequencer () at src/mhttpd.cxx:14321
#8 0x000000000040c2b6 in main (argc=<optimized out>, argv=<optimized out>) at
src/mhttpd.cxx:18028
Cheers
Lukas
P. S. This problem reminds me of the old joke: A man goes to his doctor and says
"Doc, it hurts when I do this" to which the doctor replies "Then don't do that".
However, I think, mhttpd should not crash even if you're not supposed to include
non-existent scripts in msequencer. |
10 Jul 2019, Stefan Ritt, Bug Report, mhttpd crashes when including nonexistent script in msequencer
|
The bug has been fixed. It was actually in the mxml library. So you have to go to the midas/mxml
subdirectory and update that one via "git pull origin master".
Stefan
> Hi,
> the subject line describes the problem already
> Suppose you have a file foo.msl. Somewhere in the file, you have the line
> INCLUDE bar.msl
>
> Once you click save in the sequencer page, mhttpd crashes:
> $ mhttpd
> free(): double free detected in tcache 2
> [1] 27590 abort (core dumped) mhttpd
>
>
> GDB helps shed some light on the problem:
>
> #0 0x00007ffff76b057f in raise () from /lib64/libc.so.6
> #1 0x00007ffff769a895 in abort () from /lib64/libc.so.6
> #2 0x00007ffff76f39d7 in __libc_message () from /lib64/libc.so.6
> #3 0x00007ffff76fa2ec in malloc_printerr () from /lib64/libc.so.6
> #4 0x00007ffff76fbdf5 in _int_free () from /lib64/libc.so.6
> #5 0x00000000004b8b41 in mxml_parse_entity (buf=buf@entry=0x7fffffffc2c8,
> file_name=file_name@entry=0x7fffffffc710
> "/home/luk/packages/mutrig_daq/online/foo.xml",
> error=error@entry=0x7fffffffcd24 "XML read error in file
> \"/home/luk/packages/mutrig_daq/online/foo.xml\", line 2: bar.msl.xml is
> missing", error_size=error_size@entry=256,
> error_line=error_line@entry=0x7fffffffce24) at ../mxml/mxml.c:1996
> #6 0x00000000004b966d in mxml_parse_file
> (file_name=file_name@entry=0x7fffffffc710
> "/home/luk/packages/mutrig_daq/online/foo.xml",
> error=error@entry=0x7fffffffcd24 "XML read error in file
> \"/home/luk/packages/mutrig_daq/online/foo.xml\", line 2: bar.msl.xml is
> missing", error_size=error_size@entry=256,
> error_line=error_line@entry=0x7fffffffce24) at ../mxml/mxml.c:2041
> #7 0x000000000041d9c2 in init_sequencer () at src/mhttpd.cxx:14321
> #8 0x000000000040c2b6 in main (argc=<optimized out>, argv=<optimized out>) at
> src/mhttpd.cxx:18028
>
> Cheers
> Lukas
>
> P. S. This problem reminds me of the old joke: A man goes to his doctor and says
> "Doc, it hurts when I do this" to which the doctor replies "Then don't do that".
> However, I think, mhttpd should not crash even if you're not supposed to include
> non-existent scripts in msequencer. |
11 Aug 2003, Konstantin Olchanski, , mhttpd crash on corrupted ODB /RunInfo
|
Invalid values of ODB /RunInfo/State cause mhttpd to crash in
show_status_page() because of an out-of-bounds access to the array of state
names. Suggested fix: remove the array of state names, use the existing ladder of
if/else statements to explicitly set the state name. Verified that the fix works for
TWIST. Will commit this into MIDAS CVS unless I get feedback.
src/mhttpd.c:show_status_page() {
   ...
   rsprintf("<tr align=center><td>Run #%d", runinfo.run_number);
   if (runinfo.state == STATE_STOPPED)
      rsprintf("<td colspan=1 bgcolor=#FF0000>Stopped");
   else if (runinfo.state == STATE_PAUSED)
      rsprintf("<td colspan=1 bgcolor=#FFFF00>Paused");
   else if (runinfo.state == STATE_RUNNING)
      rsprintf("<td colspan=1 bgcolor=#00FF00>Running");
   else
      rsprintf("<td colspan=1 bgcolor=#FFFFFF>Unknown");
   if (runinfo.requested_transition)
   ...
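For comparison, a bounds-checked variant that keeps a name table could look like the sketch below. It is an alternative illustration, not the committed fix (which uses the if/else ladder). The STATE_* values are local stand-ins assumed to mirror midas.h, and run_state_name() is a hypothetical helper.

```c
#include <assert.h>

/* Local stand-ins for the run-state constants; the real ones come from midas.h
   (the values used here are an assumption of this sketch). */
#define STATE_STOPPED 1
#define STATE_PAUSED  2
#define STATE_RUNNING 3

static const char *const state_names[] = { "Unknown", "Stopped", "Paused", "Running" };

/* Map a run state read from the ODB to a printable name.
   Any value outside the known range yields "Unknown" instead of
   indexing past the end of the table. */
const char *run_state_name(int state)
{
   const int n = (int)(sizeof(state_names) / sizeof(state_names[0]));
   assert(n == STATE_RUNNING + 1);   /* table must cover every known state */

   if (state < STATE_STOPPED || state >= n)
      return state_names[0];
   return state_names[state];
}
```

The important part is the range check before the array access; a corrupted /RunInfo/State then shows up as "Unknown" on the status page rather than crashing mhttpd.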
K.O. |
10 Oct 2003, Konstantin Olchanski, , mhttpd crash on corrupted ODB /RunInfo
|
There was no feedback. This code has been committed. K.O.
> Invalid values of ODB /RunInfo/State cause mhttpd crash in
> show_status_page() because of an out of bounds access to the array of state
> names. Suggest this fix: remove array of state names, use existing ladder of
> if/else statements to explicitly set state name. Verified the fix works for
> TWIST. Will commit this into MIDAS CVS unless get feedback.
>
> src/mhttpd.c:show_status_page() {
> ...
> rsprintf("<tr align=center><td>Run #%d", runinfo.run_number);
>
> if (runinfo.state == STATE_STOPPED)
> rsprintf("<td colspan=1 bgcolor=#FF0000>Stopped");
> else if (runinfo.state == STATE_PAUSED)
> rsprintf("<td colspan=1 bgcolor=#FFFF00>Paused");
> else if (runinfo.state == STATE_RUNNING)
> rsprintf("<td colspan=1 bgcolor=#00FF00>Running");
> else
> rsprintf("<td colspan=1 bgcolor=#FFFFFF>Unknown");
>
> if (runinfo.requested_transition)
> ...
>
> K.O. |
06 Sep 2021, Andreas Suter, Forum, mhttpd crash
|
midas version used: midas-2019-05-cxx-1461-g906be8b
I find in the systemd log every couple of days/weeks the following error message related to the mhttpd:
[mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!
with various socket numbers of course.
Can anybody give me a hint about what is going wrong here?
The bad thing about the crash is that it sometimes leads to a "chain reaction" killing multiple midas frontends, which essentially stops the experiment.
Help would be very much appreciated!
Andreas |
06 Sep 2021, Konstantin Olchanski, Forum, mhttpd crash
|
> [mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!
> Can anybody hint me what is going wrong here?
> The bad thing on the crash is, that sometimes it is leading to a "chain-reaction" killing multiple midas frontends, which essentially stop the experiment.
This is my code. I am the culprit. I had a bit of discussion about this with Stefan.
Bottom line is something is rotten in the multithreading code inside mhttpd and under conditions unknown,
it sends the wrong data into the wrong socket. This causes midas web pages to be really confused (RPC replies
processed as CSS files, HTML code processed as RPC replies, a mess), this wrong data is cached by the browser,
so restarting mhttpd does not fix the web pages. So a mess.
I find this is impossible to replicate, and so cannot debug it, cannot fix it. Best I was able to do
is to add a check for socket numbers, and thankfully it catches the condition before web browser caches
become poisoned. So, broken web pages replaced by mhttpd crash.
This situation reinforces my opinion that multi-threading and C++ classes "do not mix" (like H2 and O2 do not mix).
If you write a multithreaded C++ program and it works, good for you, if there is a malfunction, good luck with it,
C++ just does not have any built-in support for debugging typical multithreading problems. I think others have come
to the same conclusion and invented all these new "safe" programming languages, like Rust and Go.
Back to your troubles.
1) If you see a way to replicate this crash, or some way to reliably cause
the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
and I wish to fix this problem very much.
2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
are useless for debugging the present problem. There is just no way to examine the state of each
thread and of each http request using gdb by hand.
3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
other MIDAS programs to stop, time out and die. You can try to fix this by disabling core dumps (set "enable core dumps"
to "false" in ODB and set the core dump size limit to 0), or by changing abort() to exit(). (You can also disable
the "wrong socket" check, but most likely you will not like the result).
4) run mhttpd inside a script: "while (1) { start mhttpd; sleep 1 sec; rinse, repeat; }" (run mhttpd without "-D", yes?)
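The shape of such a "wrong socket" check, with the choice between abort() and exit() from item 3, can be sketched as below. This is not the code from mhttpd.cxx; the function names and the want_core_dump parameter are hypothetical, and only the error text follows the log message quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if the reply belongs to the connection it is about to be
   written to, 0 if the socket numbers disagree. */
int reply_socket_matches(int request_socket, int connection_socket)
{
   assert(request_socket >= 0 && connection_socket >= 0);
   return request_socket == connection_socket;
}

/* Hypothetical guard: stop the program before wrong data reaches a browser.
   want_core_dump != 0 uses abort() (writes a core dump, which can be slow),
   want_core_dump == 0 uses exit() (no core dump, quick shutdown). */
void check_reply_socket(int request_socket, int connection_socket, int want_core_dump)
{
   if (reply_socket_matches(request_socket, connection_socket))
      return;

   fprintf(stderr, "Should not send response to request from socket %d to socket %d\n",
           request_socket, connection_socket);

   if (want_core_dump)
      abort();
   else
      exit(1);
}
```

With exit() the process disappears quickly, so a restart loop as in item 4 (or an external watchdog) can bring mhttpd back without other clients waiting for a core dump to be written.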
In other news, the mongoose web server library has a new version available, they again changed their
multithreading scheme (I think it is an improvement). If I update mhttpd to this new version, it is very
likely the code with the "wrong socket" bug will be deleted. (with new bugs added to replace old bugs, of course).
K.O. |
07 Sep 2021, Andreas Suter, Forum, mhttpd crash
|
Dear Konstantin,
thanks for the prompt response, this helps a lot!
> 1) If you see a way to replicate this crash, or some way to reliably cause
> the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
> and I wish to fix this problem very much.
I wish I could! This happens only 3-4 times per year, so it is close to impossible to trigger.
> 2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
> are useless for debugging the present problem. There is just no way to examine the state of each
> thread and of each http request using gdb by hand.
>
> 3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
> other MIDAS program to stop, timeout and die. You can try to fix this by disabling core dumps (set "enable core dumps"
> to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
> the "wrong socket" check, but most likely you will not like the result).
>
I have now changed abort() to exit() on the production machine. Perhaps this should be the default?
Andreas |
17 Sep 2021, Stefan Ritt, Forum, mhttpd crash
|
To limit the impact of the numerous crashes of mhttpd, I installed the monit tool at MEG at PSI
(https://en.wikipedia.org/wiki/Monit). It monitors mhttpd, and if it cannot connect to it for a certain
time, it kills the process and restarts it. This covers endless loops, simple crashes (caused by the
known multi-threading issue in mongoose), and also cases where mhttpd develops a memory leak and becomes
unresponsive.
To configure monit for mhttpd, first install the package, make sure the daemon gets started automatically
after reboot (typically "systemctl enable monit"), and put the attached file into
/etc/monit.d/mhttpd
You have to adjust the <path-to-midas> according to your midas installation, and probably also the port
under which mhttpd is listening (8082 in my case). Put
set daemon 10
into /etc/monitrc if you want monit to check mhttpd every 10 seconds (default is 30 seconds). Then, every
10 seconds monit requests "midas.css" from mhttpd, and if it cannot obtain it after 30 seconds, it kills
mhttpd and restarts it.
Loading long history plots taking more than 30 seconds should probably not be an issue since mhttpd is
multi-threaded, but I haven't tested this in detail.
Attached below is a typical status page produced by monit, which has its own built-in web server (normally
listening at port 2812, accessible only from localhost by default).
I hope this helps some of you.
Stefan |
04 Jun 2009, bazinski, Bug Report, mhttpd command line experiment specifying
|
Hi
Not sure how the rest of you specify mhttpd to work with multiple experiments on
one machine, but it would seem not the same as me ;-)
when executing mhttpd with
mhttpd -e "experimentname" -p "experimentport" -D
that experiment name is not transferred to transitions, as cm_transition never
specifies the experiment in the call to "transition STOP" etc.
The only flag it sends is a -d for debug, if selected.
The result is that the stop and start button of the webinterface does not work,
and transitions sit endlessly doing nothing but consuming all the processor,
odbedit works fine though.
Does everyone else use an apache reverse proxy and or explicit experiment choice
in the url ?
As an aside in mhttpd.c in the reply to -? it states 2 -h options the second
should be a -e. line 13378.
Thanks
Sean |
05 Jun 2009, Stefan Ritt, Bug Report, mhttpd command line experiment specifying
|
> Not sure how the rest of you specify mhttpd to work with multiple experiments on
> one machine, but it would seem not the same as me ;-)
Please note that there has been a change concerning multiple experiments inside
mhttpd. From revision 4346 on, mhttpd can only connect to one single experiment,
and the experiment name in the URL (aka ?exp=name) is not supported any more. So if
you have several experiments, you start several instances of mhttpd now on
different ports.
> that experiment name is not transfered to transitions as cm_transition never
> specifies the experiment in the call to "transition STOP" etc.
> the only flag it sends is a -d for debug if selected.
When connecting to an experiment, any midas client uses the ODB from that
experiment and so lives in that "namespace". So one client can never call any client
from another experiment. So your problem must be something else. Of course there is
no parameter "experiment" passed to cm_transition() since the experiment is
implicitly defined by the ODB mhttpd is attached to.
> The result is that the stop and start button of the webinterface does not work,
> and transitions sit endlessly doing nothing but consuming all the processor,
> odbedit works fine though.
I guess you have to do some debugging there. Note that "detached" transitions have
been implemented recently by Konstantin, so maybe your problem is related to that.
In this case Konstantin should check what's wrong.
> Does everyone else use an apache reverse proxy and or explicit experiment choice
> in the url ?
I use a
ProxyPass /megon/ http://megon.psi.ch/
on our public web server to make an online machine accessible from outside the
firewall, but just with a single experiment.
> As an aside in mhttpd.c in the reply to -? it states 2 -h options the second
> should be a -e. line 13378.
Fixed in revision 4504. |
05 Jun 2009, bazinski, Bug Report, mhttpd command line experiment specifying
|
Hi
> > Not sure how the rest of you specify mhttpd to work with multiple experiments on
> > one machine, but it would seem not the same as me ;-)
>
> Please note that there has been a change concerning multiple experiments inside
> mhttpd. From revision 4346 on, mhttpd can only connect to one single experiment,
> and the experiment name in the URL (aka ?exp=name) is not supported any more. So if
> you have several experiments, you start several instances of mhttpd now on
> different ports.
That i do with :
mhttpd -p xx -e experiment_name -D
>
> > that experiment name is not transfered to transitions as cm_transition never
> > specifies the experiment in the call to "transition STOP" etc.
> > the only flag it sends is a -d for debug if selected.
>
> When connecting to an experiment, any midas client uses the ODB from that
> experiment so lives in that "namespace". So one client can never call any client
> from another experiment. So your problem must be something else. Of course there is
> not parameter "experiment" passed to cm_transition() since the experiment is
> implicitly defined by the ODB mhttpd is attached to.
Will have to look elsewhere.
>
> > The result is that the stop and start button of the webinterface does not work,
> > and transitions sit endlessly doing nothing but consuming all the processor,
> > odbedit works fine though.
>
> I guess you have to do some debugging there. Note that "detached" transitions have
> been implemented recently by Konstantin, so maybe your problem is related to that.
> In this case Konstantin should check what's wrong.
cm_transition does a "system(str)" on line 3243 inside the "if(async_flag == DETACH)" of
line 3219, how does an external program know about the state of the originating mhttpd
process ? Surely that str which executes "mtransition ......." should get a -e
specifying the experiment explicitly ? probably a -h as well to be thorough.
The only other way that mtransition.cxx will be able to pull in the experimentname is
from the environment variable in its call to cm_get_environment(....) on its startup.
Ok after some testing ....
If I start mhttpd with the environment variable MIDAS_EXPT_NAME set, then it is happy,
as mtransition inherits the environment of mhttpd, so cm_get_environment(...) of
mtransition picks up the experiment. Similarly, if I insert "-e experimentname" into the
string "str" that is passed to system(str) on line 3243, then the start and stop buttons work.
Konstantin, any comments?
I suppose I can live with starting mhttpd with the environment set before running, but
that kind of negates the command line argument to mhttpd.
Thanks for the help
Sean |
05 Jun 2009, Konstantin Olchanski, Bug Report, mhttpd command line experiment specifying
|
> I guess you have to do some debugging there. Note that "detached" transitions have
> been implemented recently by Konstantin, so maybe your problem is related to that.
> In this case Konstantin should check what's wrong.
Yes, I think there is a problem - cm_transition() starts the mtransition helper without the "-h expt" switch, so
mtransition can only connect to the "default" experiment. Will fix. K.O. |
18 Jun 2009, Konstantin Olchanski, Bug Report, mhttpd command line experiment specifying
|
> > I guess you have to do some debugging there. Note that "detached" transitions have
> > been implemented recently by Konstantin, so maybe your problem is related to that.
> > In this case Konstantin should check what's wrong.
>
> Yes, I think there is a problem - cm_transition() starts the mtransition helper without the "-h expt" switch, so
> mtransition can only connect to the "default" experiment. Will fix. K.O.
Fixed in midas.c svn rev 4506: cm_transition() now always passes "-e expt" to mtransition and,
if connected remotely, also passes "-h host:port".
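The idea of the fix can be sketched as follows. This is not the code from midas.c; build_mtransition_command() is a hypothetical helper, and the argument order of the resulting command line is an assumption. Only the "-d", "-e expt" and "-h host:port" switches are taken from this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: assemble the command line for the detached
   mtransition helper so that it connects to the same experiment as the caller.
   host_port may be NULL or "" for a local connection.
   Note: names are inserted as-is, without any shell quoting.
   Returns 0 on success, -1 if the buffer is too small. */
int build_mtransition_command(char *buf, size_t size, const char *verb,
                              const char *expt, const char *host_port, int debug)
{
   assert(buf != NULL && verb != NULL && expt != NULL);

   int n;
   if (host_port != NULL && host_port[0] != '\0')
      n = snprintf(buf, size, "mtransition %s-e %s -h %s %s",
                   debug ? "-d " : "", expt, host_port, verb);
   else
      n = snprintf(buf, size, "mtransition %s-e %s %s",
                   debug ? "-d " : "", expt, verb);

   /* snprintf returns the length it wanted to write; detect truncation */
   return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The resulting string would then be handed to system(), as cm_transition() does for detached transitions; passing the experiment explicitly means the helper no longer depends on MIDAS_EXPT_NAME being set in the environment.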
K.O. |
21 May 2007, Konstantin Olchanski, Info, mhttpd changes to use /History/Tags data
|
I am slowly committing the changes to the history code. This installment adds
code to mhttpd to use the /History/Tags data (to be) generated by the mlogger.
In a nutshell, the logger fills /History/Tags to "remember" what events,
variables and tags exist in the history files.
This replaces the old code that attempts to guess the contents of history files
by looking at the /Equipment tree.
To ease the transition to the new system, I am leaving all the old code alive
and active in the absence of "/History/Tags" entries.
As soon as one starts using the new mlogger (to be committed), the new tags-based
mhttpd code will activate itself.
K.O. |
|