15 Mar 2021, Frederik Wauters, Forum, INT INT32 in experim.h
|
works!
> Ok, I added
>
> /* define integer types with explicit widths */
> #ifndef NO_INT_TYPES_DEFINE
> typedef unsigned char UINT8;
> typedef char INT8;
> typedef unsigned short UINT16;
> typedef short INT16;
> typedef unsigned int UINT32;
> typedef int INT32;
> typedef unsigned long long UINT64;
> typedef long long INT64;
> #endif
>
> to cover all new types. If there is a collision with user defined types, compile your program with -DNO_INT_TYPES_DEFINE and you remove the
> above definition. I hope there are no other conflicts.
>
> Stefan |
30 Mar 2021, Konstantin Olchanski, Forum, INT INT32 in experim.h
|
> >
> > /* define integer types with explicit widths */
> > #ifndef NO_INT_TYPES_DEFINE
> > typedef unsigned char UINT8;
> > typedef char INT8;
> > typedef unsigned short UINT16;
> > typedef short INT16;
> > typedef unsigned int UINT32;
> > typedef int INT32;
> > typedef unsigned long long UINT64;
> > typedef long long INT64;
> > #endif
> >
NIH at work. In C and C++ the standard fixed bit length data types are available
in #include <stdint.h> as uint8_t, uint16_t, uint32_t, uint64_t & co.
BTW, the definition of UINT32 as "unsigned int" is technically incorrect: on 16-bit machines
"int" is 16 bits wide, and on some 64-bit machines "int" is 64 bits wide.
K.O. |
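As a sketch of the alternative suggested above, the MIDAS-style names could be aliased to the standard fixed-width types from <cstdint> (the C++ spelling of <stdint.h>) instead of to char, short, int and long long. This is an illustration only, not the code that went into midas.h. The static_assert checks are an addition for the example. Note also that plain "char" may be signed or unsigned depending on the platform, so "typedef char INT8" has the same kind of portability problem as "typedef int INT32".

```cpp
#include <climits>   // CHAR_BIT
#include <cstdint>   // std::int8_t ... std::uint64_t

/* define integer types with explicit widths (sketch: aliases of <cstdint> types) */
#ifndef NO_INT_TYPES_DEFINE
typedef std::uint8_t  UINT8;
typedef std::int8_t   INT8;
typedef std::uint16_t UINT16;
typedef std::int16_t  INT16;
typedef std::uint32_t UINT32;
typedef std::int32_t  INT32;
typedef std::uint64_t UINT64;
typedef std::int64_t  INT64;

// Compile-time checks: the build fails if a width is not what the name says.
static_assert(sizeof(UINT8)  * CHAR_BIT == 8,  "UINT8 must be 8 bits");
static_assert(sizeof(INT8)   * CHAR_BIT == 8,  "INT8 must be 8 bits");
static_assert(sizeof(UINT16) * CHAR_BIT == 16, "UINT16 must be 16 bits");
static_assert(sizeof(INT16)  * CHAR_BIT == 16, "INT16 must be 16 bits");
static_assert(sizeof(UINT32) * CHAR_BIT == 32, "UINT32 must be 32 bits");
static_assert(sizeof(INT32)  * CHAR_BIT == 32, "INT32 must be 32 bits");
static_assert(sizeof(UINT64) * CHAR_BIT == 64, "UINT64 must be 64 bits");
static_assert(sizeof(INT64)  * CHAR_BIT == 64, "INT64 must be 64 bits");
#endif
```

Compiling with -DNO_INT_TYPES_DEFINE still removes the whole block, as in the original definition.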
12 Apr 2021, Isaac Labrie Boulay, Forum, Client gets immediately removed when using a script button.
|
Hi all,
I'm running into a curious problem when I try to run a program using my custom
script button. I have been using a script button to start my DAQ, this button
has always worked. It starts by exporting an absolute path to scripts and then
runs scripts, my frontend, my analyzer, and mlogger relative to this path.
I recently added a line of code to run a new script "logic_controller". If I run
the script_daq from my terminal (./start_daq), mhttpd accepts the client and the
program works as intended. But, if I use the script button, the logic_controller
program is immediately deleted by MIDAS. It can be seen appearing in the status
page clients list and then immediately gets deleted. This is a client that runs
on the local experiment host.
What might be the issue? What is the difference between running the script
through the terminal as opposed to running it through the mhttpd button?
I have added a picture of my simple script and the logic_controller code.
Any help would be greatly appreciated.
Cheers.
Isaac |
12 Apr 2021, Ben Smith, Forum, Client gets immediately removed when using a script button.
|
> if I use the script button, the logic_controller program is immediately deleted by MIDAS.
This is indeed very curious, and I can't reproduce it on my test experiment. Can you redirect stdout and stderr from the logic_controller program into a file, to see how far the program gets? If it gets to the while loop at the end, then it would be useful to add some debug statements to see what condition causes it to exit the loop.
Are there any relevant messages in the midas message log about the program being killed? What's the value of "/Programs/logic_controller/Watchdog timeout"? |
12 Apr 2021, Isaac Labrie Boulay, Forum, Client gets immediately removed when using a script button.
|
> > if I use the script button, the logic_controller program is immediately deleted by MIDAS.
>
> This is indeed very curious, and I can't reproduce it on my test experiment. Can you redirect stdout and stderr from the logic_controller program into a file, to see how far the program gets? If it gets to the while loop at the end, then it would be useful to add some debug statements to see what condition causes it to exit the loop.
I have redirected stdout and stderr into a text file and I have attached it to this entry. From what the stdout says, it seems that the lambda
function gets called 4 times before the program disconnects from the experiment. Somehow the status must become SS_ABORT or RPC_SHUTDOWN.
> Are there any relevant messages in the midas message log about the program being killed? What's the value of "/Programs/logic_controller/Watchdog timeout"?
There are no interesting messages in the midas.log and "/Programs/logic_controller/Watchdog timeout" is 10000 when I run the command from the terminal window.
What happens when you run it on your test experiment?
I'll try some more debugging.
Thanks for helping me out! Cheers.
Isaac |
12 Apr 2021, Ben Smith, Forum, Client gets immediately removed when using a script button.
|
I think it would be useful to find the minimal example that exhibits this behaviour.
What happens if your logic controller code is simply the 17 lines below? What happens if you create another script button that only starts the logic controller, not any of the other programs? etc. Gradually re-add features until you hit the problem (or scream in horror if it breaks with 17 lines of C++ and a 1 line shell script).
#include "midas.h"
#include "stdio.h"

int main() {
   cm_connect_experiment("", "", "logic_controller", NULL);

   do {
      int status = cm_yield(100);
      printf("cm_yield returned %d\n", status);
      if (status == SS_ABORT || status == RPC_SHUTDOWN)
         break;
   } while (!ss_kbhit());

   cm_disconnect_experiment();

   return 0;
} |
13 Apr 2021, Isaac Labrie Boulay, Forum, Client gets immediately removed when using a script button.
|
> I think it would be useful to find the minimal example that exhibits this behaviour.
>
> What happens if your logic controller code is simply the 17 lines below? What happens if you create another script button that only starts the logic controller, not any of the other programs? etc. Gradually re-add features until you hit the problem (or scream in horror if it breaks with 17 lines of C++ and a 1 line shell script).
>
Hi Ben,
I have followed your suggestions and the program still stops immediately. My status as returned from "cm_yield(100)" is always 412 (SS_TIMEOUT) which is fine.
The issue is that, when run with the script button, the do-while loop stops immediately because !ss_kbhit() always evaluates to FALSE.
My temporary solution has been to let the loop run forever :)
Let me know what you think. Thanks again!
Isaac
>
>
> #include "midas.h"
> #include "stdio.h"
>
> int main() {
>    cm_connect_experiment("", "", "logic_controller", NULL);
>
>    do {
>       int status = cm_yield(100);
>       printf("cm_yield returned %d\n", status);
>       if (status == SS_ABORT || status == RPC_SHUTDOWN)
>          break;
>    } while (!ss_kbhit());
>
>    cm_disconnect_experiment();
>
>    return 0;
> } |
13 Apr 2021, Stefan Ritt, Forum, Client gets immediately removed when using a script button.
|
> I have followed your suggestions and the program still stops immediately. My status as returned from "cm_yield(100)" is always 412 (SS_TIMEOUT) which is fine.
> The issue is that, when run with the script button, the do-while loop stops immediately because !ss_kbhit() always evaluates to FALSE.
>
> My temporary solution has been to let the loop run forever :)
Ahh, could be that ss_kbhit() misbehaves if there is no keyboard, meaning that it is started in the background as a script.
We never had the issue before, since all "standard" midas programs like mlogger, mhttpd etc. also use ss_kbhit() and they
can be started in the background via the "-D" flag, but maybe the stdin is then handled differently.
So just remove the ss_kbhit(), but keep the break, so that you can stop your program via the web page, like
#include "midas.h"
#include "stdio.h"

int main() {
   cm_connect_experiment("", "", "logic_controller", NULL);

   do {
      int status = cm_yield(100);
      printf("cm_yield returned %d\n", status);
      if (status == SS_ABORT || status == RPC_SHUTDOWN)
         break;
   } while (TRUE);

   cm_disconnect_experiment();

   return 0;
} |
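If the keypress exit is still wanted for interactive runs, one option is to poll the keyboard only when stdin is actually a terminal. The sketch below uses the POSIX isatty() call. The helper names fd_is_terminal() and stdin_is_terminal() are made up for this example and are not part of MIDAS, and it has not been verified that a missing terminal is the real cause of the behaviour reported in this thread.

```cpp
#include <cassert>
#include <unistd.h>   // isatty(), pipe(), close(), STDIN_FILENO (POSIX)

// Hypothetical helper: true if the file descriptor refers to a terminal.
// isatty() returns 0 for pipes, files and invalid or closed descriptors.
inline bool fd_is_terminal(int fd) {
   return isatty(fd) == 1;
}

// Hypothetical helper: true if the program has a terminal on stdin.
inline bool stdin_is_terminal() {
   return fd_is_terminal(STDIN_FILENO);
}

// Self-check: a pipe is never a terminal, so a program whose stdin is a pipe
// would skip the keyboard poll.
inline void check_pipe_is_not_terminal() {
   int fds[2];
   int rc = pipe(fds);
   assert(rc == 0);
   (void)rc;   // silence "unused" warning when built with NDEBUG
   assert(!fd_is_terminal(fds[0]));
   close(fds[0]);
   close(fds[1]);
}
```

With such a guard the loop condition in the example program could read "while (!(stdin_is_terminal() && ss_kbhit()))", so that ss_kbhit() is never consulted when the program is started without a terminal.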
05 May 2021, Zaher Salman, Forum, m is not defined error
|
We had the same issue here, which comes from mhttpd.js line 2395 on the current git version. This seems to happen mostly when there is an alarm triggered or when there is an error message.
Anyway, the easiest solution for us was to define m at the beginning of mhttpd_message function
let m;
and replace line 2395 with
if (m !== undefined) {
> > I see this mhttpd error starting MSL-script:
> > Uncaught (in promise) ReferenceError: m is not defined
> > at mhttpd_message (VM2848 mhttpd.js:2304)
> > at VM2848 mhttpd.js:2122
>
> your line numbers do not line up with my copy of mhttpd.js. what version of midas
> do you run?
>
> please give me the output of odbedit "ver" command (GIT revision, looks like this:
> GIT revision: Wed Feb 3 11:47:02 2021 -0800 - midas-2020-08-a-84-g78d18b1c on
> branch feature/midas-2020-12).
>
> same info is in the midas "help" page (GIT revision).
>
> to decipher the git revision string:
>
> midas-2020-08-a-84-g78d18b1c means:
> is commit 78d18b1c
> which is 84 commits after git tag midas-2020-08-a
>
> "on branch feature/midas-2020-12" confirms that I have the midas-2020-12 pre-
> release version without having to do all the decoding above.
>
> if you also have "-dirty" it means you changed something in the source code
> and warranty is voided. (just joking! we can debug even modified midas source
> code)
>
> K.O. |
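The decoding rule described above (tag, number of commits after the tag, "g" plus commit hash, optional "-dirty") can be sketched as a small parser. The names GitRevision and parse_git_revision() are invented for this example and do not exist in MIDAS. The parser only assumes the "<tag>-<N>-g<hash>[-dirty]" layout shown in the post.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical result type for a "<tag>-<N>-g<hash>[-dirty]" revision string.
struct GitRevision {
   std::string tag;              // e.g. "midas-2020-08-a"
   int commits_after_tag = 0;    // e.g. 84
   std::string commit;           // e.g. "78d18b1c"
   bool dirty = false;           // true if the string ends in "-dirty"
   bool valid = false;           // false if the string does not match the layout
};

// Hypothetical helper: split a revision string into its parts.
inline GitRevision parse_git_revision(std::string s) {
   GitRevision r;
   const std::string suffix = "-dirty";
   if (s.size() > suffix.size() &&
       s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0) {
      r.dirty = true;
      s.erase(s.size() - suffix.size());
   }
   // The commit hash is the last field and is prefixed with "-g".
   const std::size_t hash_pos = s.rfind("-g");
   if (hash_pos == std::string::npos || hash_pos == 0)
      return r;
   // The commit count is the field just before the hash.
   const std::size_t count_pos = s.rfind('-', hash_pos - 1);
   if (count_pos == std::string::npos)
      return r;
   const std::string count = s.substr(count_pos + 1, hash_pos - count_pos - 1);
   if (count.empty() || count.find_first_not_of("0123456789") != std::string::npos)
      return r;
   r.tag = s.substr(0, count_pos);
   r.commits_after_tag = std::atoi(count.c_str());
   r.commit = s.substr(hash_pos + 2);
   r.valid = !r.tag.empty() && !r.commit.empty();
   return r;
}

// Usage example with the revision string quoted in the post.
inline void check_example_revision() {
   const GitRevision r = parse_git_revision("midas-2020-08-a-84-g78d18b1c");
   assert(r.valid && !r.dirty);
   assert(r.tag == "midas-2020-08-a");
   assert(r.commits_after_tag == 84);
   assert(r.commit == "78d18b1c");
}
```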
06 May 2021, Stefan Ritt, Forum, m is not defined error
|
Thanks for reporting and pointing to the right location.
I fixed and committed it.
Best,
Stefan |
08 Jul 2021, Francesco Renga, Forum, Problem with python file reader
|
Dear experts,
While trying to read out a MIDAS file from a Python script, I get the error below at the very first event. Any hints?
Thank you very much,
Francesco
  File "/home/cygno/DAQ/offline/file_reader.py", line 9, in <module>
    for event in mfile:
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 159, in __next__
    ev = self.read_next_event()
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 264, in read_next_event
    return self.read_this_event_body()
  File "/home/cygno/DAQ/python/midas/file_reader.py", line 307, in read_this_event_body
    self.event.unpack_body(body_data, 0, self.use_numpy)
  File "/home/cygno/DAQ/python/midas/event.py", line 648, in unpack_body
    bank.fill_header_from_bytes(bank_header_data, self.is_bank_32(), self.is_bank_data_64bit_aligned())
  File "/home/cygno/DAQ/python/midas/event.py", line 298, in fill_header_from_bytes
    self.name = "".join(x.decode('ascii') for x in unpacked[:4])
  File "/home/cygno/DAQ/python/midas/event.py", line 298, in <genexpr>
    self.name = "".join(x.decode('ascii') for x in unpacked[:4])
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc8 in position 0: ordinal not in range(128) |
09 Jul 2021, Ben Smith, Forum, Problem with python file reader
|
Hi Francesco,
Can you send me an example file to look at please? Either attached to the elog or sent directly to bsmith@triumf.ca
Thanks,
Ben |
06 Sep 2021, Andreas Suter, Forum, mhttpd crash
|
midas version used: midas-2019-05-cxx-1461-g906be8b
I find in the systemd log every couple of days/weeks the following error message related to the mhttpd:
[mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!
with various socket numbers of course.
Can anybody give me a hint as to what is going wrong here?
The bad thing about the crash is that it sometimes leads to a "chain reaction" that kills multiple midas frontends, which essentially stops the experiment.
Help would be very much appreciated!
Andreas |
06 Sep 2021, Konstantin Olchanski, Forum, mhttpd crash
|
> [mhttpd,ERROR] [mhttpd.cxx:18886:on_work_complete,ERROR] Should not send response to request from socket 28 to socket 26, abort!
> Can anybody give me a hint as to what is going wrong here?
> The bad thing about the crash is that it sometimes leads to a "chain reaction" that kills multiple midas frontends, which essentially stops the experiment.
This is my code. I am the culprit. I had a bit of discussion about this with Stefan.
Bottom line is something is rotten in the multithreading code inside mhttpd and under conditions unknown,
it sends the wrong data into the wrong socket. This causes midas web pages to be really confused (RPC replies
processed as CSS files, HTML code processed as RPC replies, a mess), this wrong data is cached by the browser,
so restarting mhttpd does not fix the web pages. So a mess.
I find this is impossible to replicate, and so cannot debug it, cannot fix it. Best I was able to do
is to add a check for socket numbers, and thankfully it catches the condition before web browser caches
become poisoned. So, broken web pages replaced by mhttpd crash.
This situation reinforces my opinion that multi-threading and C++ classes "do not mix" (like H2 and O2 do not mix).
If you write a multithreaded C++ program and it works, good for you, if there is a malfunction, good luck with it,
C++ just does not have any built-in support for debugging typical multithreading problems. I think others have come
to the same conclusion and invented all these new "safe" programming languages, like Rust and Go.
Back to your troubles.
1) If you see a way to replicate this crash, or some way to reliably cause
the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
and I wish to fix this problem very much.
2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
are useless for debugging the present problem. There is just no way to examine the state of each
thread and of each http request using gdb by hand.
3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
other MIDAS programs to stop, time out and die. You can try to fix this by disabling core dumps (set "enable core dumps"
to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
the "wrong socket" check, but most likely you will not like the result).
4) run mhttpd inside a script: "while (1) { start mhttpd; sleep 1 sec; rinse, repeat; }" (run mhttpd without "-D", yes?)
In other news, the mongoose web server library has a new version available, they again changed their
multithreading scheme (I think it is an improvement). If I update mhttpd to this new version, it is very
likely the code with the "wrong socket" bug will be deleted. (with new bugs added to replace old bugs, of course).
K.O. |
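Item 4 above amounts to a restart loop around mhttpd. A minimal POSIX sketch of such a loop in C++ is given here as an illustration. The function names run_and_wait() and supervise() are made up for this example, and the command line to pass in is whatever starts mhttpd in the foreground (without "-D") on your system.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>   // waitpid(), WIFEXITED(), WEXITSTATUS()
#include <unistd.h>     // fork(), execvp(), _exit(), sleep()

// Hypothetical helper: start the program given by argv (NULL-terminated),
// wait until it terminates and return the raw wait status, or -1 on error.
inline int run_and_wait(char* const argv[]) {
   assert(argv != nullptr && argv[0] != nullptr);
   const pid_t pid = fork();
   if (pid < 0)
      return -1;
   if (pid == 0) {
      execvp(argv[0], argv);   // only returns if the program could not be started
      _exit(127);
   }
   int status = 0;
   if (waitpid(pid, &status, 0) < 0)
      return -1;
   return status;
}

// Hypothetical helper: restart the program every time it exits or crashes,
// pausing one second between restarts. A negative max_starts means "forever".
// Returns the number of times the program was started.
inline int supervise(char* const argv[], int max_starts) {
   int starts = 0;
   while (max_starts < 0 || starts < max_starts) {
      run_and_wait(argv);
      starts++;
      if (max_starts < 0 || starts < max_starts)
         sleep(1);
   }
   return starts;
}
```

Calling supervise(argv, -1) from main() gives the "rinse, repeat" behaviour: whenever the child terminates, for whatever reason, it is started again one second later.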
07 Sep 2021, Andreas Suter, Forum, mhttpd crash
|
Dear Konstantin,
thanks for the prompt response, this helps a lot!
> 1) If you see a way to replicate this crash, or some way to reliably cause
> the crash within 5-10 minutes after starting mhttpd, please let me know. I can work with that
> and I wish to fix this problem very much.
I wish I could! This happens only 3-4 times per year, so it is close to impossible to trigger.
> 2) My "wrong socket" check calls abort() to produce a core dump. In my experience these core dumps
> are useless for debugging the present problem. There is just no way to examine the state of each
> thread and of each http request using gdb by hand.
>
> 3) this abort() causes linux to write a core dump, this takes a long time and I think it causes
> other MIDAS programs to stop, time out and die. You can try to fix this by disabling core dumps (set "enable core dumps"
> to "false" in ODB and set core dump size limit to 0), or change abort() to exit(). (You can also disable
> the "wrong socket" check, but most likely you will not like the result).
>
I have now changed abort() to exit() on the production machine. Perhaps this should be the default?
Andreas |
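Besides replacing abort() with exit(), item 3 quoted above mentions setting the core dump size limit to 0. From inside a program this can be done with the POSIX setrlimit() call. The sketch below is an illustration and not code taken from mhttpd. The function names are made up for this example. With the soft limit at 0, abort() still terminates the process, but normally no core file is written; whether that is enough to avoid the timeouts described in this thread has not been tested here.

```cpp
#include <cassert>
#include <sys/resource.h>   // getrlimit(), setrlimit(), RLIMIT_CORE (POSIX)

// Hypothetical helper: set the soft core-file size limit of this process to 0.
// The hard limit is left untouched. Returns true on success.
inline bool disable_core_dumps() {
   struct rlimit lim;
   if (getrlimit(RLIMIT_CORE, &lim) != 0)
      return false;
   lim.rlim_cur = 0;
   return setrlimit(RLIMIT_CORE, &lim) == 0;
}

// Hypothetical helper: true if the current soft core-file size limit is 0.
inline bool core_dumps_disabled() {
   struct rlimit lim;
   return getrlimit(RLIMIT_CORE, &lim) == 0 && lim.rlim_cur == 0;
}

// Usage example: call once at program start, before anything can call abort().
inline void disable_core_dumps_checked() {
   const bool ok = disable_core_dumps();
   assert(ok);
   (void)ok;   // silence "unused" warning when built with NDEBUG
   assert(core_dumps_disabled());
}
```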
17 Sep 2021, Stefan Ritt, Forum, mhttpd crash
|
To limit the impact of the numerous crashes of mhttpd, I installed the monit tool at MEG at PSI
(https://en.wikipedia.org/wiki/Monit). It monitors mhttpd, and if it cannot connect to it for a certain
time, it kills the process and restarts it. This covers endless loops, simple crashes (caused by the
known multi-threading issue in mongoose), and also cases where mhttpd develops a memory leak and becomes
unresponsive.
To configure monit for mhttpd, first install the package, make sure the daemon gets started automatically
after reboot (typically "systemctl enable monit"), and put the attached file into
/etc/monit.d/mhttpd
You have to adjust the <path-to-midas> according to your midas installation, and probably also the port
under which mhttpd is listening (8082 in my case). Put
set daemon 10
into /etc/monitrc if you want monit to check mhttpd every 10 seconds (default is 30 seconds). Then, every
10 seconds monit requests "midas.css" from mhttpd, and if it cannot obtain it after 30 seconds, it kills
mhttpd and restarts it.
Loading long history plots taking more than 30 seconds should probably not be an issue since mhttpd is
multi-threaded, but I haven't tested this in detail.
Attached below is a typical status page produced by monit, which has its own built-in web server (normally
listening at port 2812, accessible only from localhost by default).
I hope this helps some of you.
Stefan |
30 Sep 2021, Francesco Renga, Forum, OPC client within MIDAS
|
Dear all,
I need to integrate in my MIDAS project the communication with an OPC UA
server. My plan is to develop an OPC UA client as a "device" in
midas/drivers/device.
Two questions:
1) Is anybody aware of some similar effort for some other project, so that I can
get some example?
2) Which driver class would be the most appropriate to use? generic.cxx?
multi.cxx?
Thank you for your help,
Francesco |
11 Oct 2021, Konstantin Olchanski, Forum, test
|
test, no email. K.O. |
11 Oct 2021, Konstantin Olchanski, Forum, test
|
> test, no email. K.O.
test reply, no email. K.O. |
11 Oct 2021, Konstantin Olchanski, Forum, test
|
> > test, no email. K.O.
>
> test reply, no email. K.O.
test attachment, no email. K.O. |
|