30 Jun 2021, Stefan Ritt, Suggestion, ODB Load in Sequencer
|
I quickly checked the pull request and could not find any obvious problem, so I merged it. |
13 Jul 2021, Stefan Ritt, Info, MidasConfig.cmake usage
|
Thanks for the contribution of MidasConfig.cmake. May I kindly ask for one extension:
Many of our frontends require inclusion of some midas-supplied drivers and libraries
residing under
$MIDASSYS/drivers/class/
$MIDASSYS/drivers/device
$MIDASSYS/mscb/src/
$MIDASSYS/src/mfe.cxx
I guess this can be easily added by defining a MIDAS_SOURCES in MidasConfig.cmake, so
that I can do things like:
add_executable(my_fe
myfe.cxx
${MIDAS_SOURCES}/src/mfe.cxx
${MIDAS_SOURCES}/drivers/class/hv.cxx
...)
Does this make sense or is there a more elegant way for that?
Stefan |
05 Aug 2021, Stefan Ritt, Bug Report, mhttpd WebServer ODBTree initialization
|
Well, we all see it here at PSI, so this is enough reason to turn this off by default. Shall
I do it? |
20 Aug 2021, Stefan Ritt, Bug Report, select() FD_SETSIZE overrun
|
> I am looking at the mlogger in the ALPHA anti-hydrogen experiment at CERN. It is
> mysteriously misbehaving during run start and stop.
>
> The problem turns out to be with the select() system call.
>
> The corresponding FD_SET(), FD_ISSET() & co operate on an array of fixed size
> FD_SETSIZE, value 1024, in my case. But the socket number is 1409, so we overrun
> the FD_SET() array. Ouch.
>
> I see that all uses of select() in midas have no protection against this.
>
> (we should probably move away from select() to newer poll() or whatever it is)
>
> Why does mlogger open so many file descriptors? The usual, scaling problems in the
> history. The old midas history does not reuse file descriptors, so opens the same
> 3 history files (.hst, .idx, etc) for each history event. The new FILE history
> opens just one file per history event. But if the number of events is bigger than
> 1024, we run into same trouble.
>
> (BTW, the system limit on file descriptors is 4096 on the affected machine, 1024
> on some other machines, see "limit" or "ulimit -a").
>
> K.O.
I cannot imagine that you have more than 1024 different events in ALPHA. That wouldn't
fit on your status page.
I have some other suspicion: The logger opens a history file on access, then closes it
again after writing to it. In the old days we had a case where we had a return from the
write function BEFORE the file has been closed. This is kind of a memory leak, but with
file descriptors. After some time of course you run out of file descriptors and crash.
Now that bug has been fixed many years ago, but it sounds to me like there is another
"fd leak" somewhere. You should add some debugging in the history code to print the
file descriptors when you open a file and when you leave that routine. The leak could
however also be somewhere else, like writing to the message file, ODB dump, ...
The right thing of course would be to rewrite everything with std::ofstream, which
automatically closes the file when the object goes out of scope.
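A minimal sketch of that idea, assuming a hypothetical helper append_line() that is not part of midas. It shows that a std::ofstream is closed on every return path, including an early return, so the kind of "fd leak" described above cannot occur in this function:

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper, not part of midas: append one line to a file.
// Returns false on any error. The std::ofstream destructor closes the
// file on every return path, so an early return cannot leak a descriptor.
bool append_line(const std::string& path, const std::string& line)
{
   assert(!path.empty());            // an empty path is a programming error

   std::ofstream out(path, std::ios::app);
   if (!out.is_open())
      return false;                  // nothing was opened, nothing to close

   if (line.empty())
      return false;                  // early return: 'out' is closed automatically

   out << line << '\n';
   out.flush();
   return out.good();                // normal return: 'out' is closed here as well
}
```

With raw open()/close() calls, the early return in the middle would need its own close(); here the destructor takes care of it.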
Stefan |
24 Aug 2021, Stefan Ritt, Bug Fix, changes in history plots
|
One addition I would be in favour of is to remove the "Order" and replace it with drag&drop handles, because this is what people are more
used to today. Only the old guys like us remember the /etc/init.d/xx_yy scheme where one uses an integer number in the file name to
determine an order.
See for example: https://jsbin.com/hijetos/edit?js,output
But instead of relying on a foreign library, I would rather implement that myself, since I need the same thing later for the to-be-
implemented ODB editor (next year? next lockdown?)
Stefan |
17 Sep 2021, Stefan Ritt, Forum, mhttpd crash
|
To limit the impact of the numerous crashes of mhttpd, I installed the monit tool at MEG at PSI
(https://en.wikipedia.org/wiki/Monit). It monitors mhttpd, and if it cannot connect to it for a certain
time, it kills the process and restarts it. This covers endless loops, simple crashes (caused by the
known multi-threading issue in mongoose), and also cases where mhttpd develops a memory leak and becomes
unresponsive.
To configure monit for mhttpd, first install the package, make sure the daemon gets started automatically
after reboot (typically "systemctl enable monit"), and put the attached file into
/etc/monit.d/mhttpd
You have to adjust the <path-to-midas> according to your midas installation, and probably also the port
under which mhttpd is listening (8082 in my case). Put
set daemon 10
into /etc/monitrc if you want monit to check mhttpd every 10 seconds (default is 30 seconds). Then, every
10 seconds monit requests "midas.css" from mhttpd, and if it cannot obtain it after 30 seconds, it kills
mhttpd and restarts it.
Loading long history plots taking more than 30 seconds should probably not be an issue since mhttpd is
multi-threaded, but I haven't tested this in detail.
Attached below is a typical status page produced by monit, which has its own built-in web server (normally
listening at port 2812, accessible only from localhost by default).
I hope this helps some of you.
Stefan |
19 Sep 2021, Stefan Ritt, Bug Fix, Chat working again
|
Not sure how many people are using it, but the Chat facility in midas was broken
for some time now and got fixed today again.
Just for your information: Chat can be used like WhatsApp & Co, and connects all
people who access a midas experiment through their browser. It's good to
communicate between shift crew members located at different places. One advantage
is that the chat messages can get 'spoken' by the text-to-speech engine of your
browser, so it can be used to "wake up" shifters. Can be configured through the
"Config" page.
Stefan |
28 Sep 2021, Stefan Ritt, Bug Report, Install clash between MIDAS 2020-08 and mscb
|
> 1) git clone https://bitbucket.org/tmidas/midas --recursive
> 2) cd midas
> 3) git checkout release/midas-2020-08
> 4) mkdir build
> 5) cd build
> 6) cmake ..
> 7) make
When you do step 3), you get
~/tmp/midas$ git checkout release/midas-2020-08
warning: unable to rmdir 'manalyzer': Directory not empty
warning: unable to rmdir 'midasio': Directory not empty
M mjson
M mscb
M mvodb
M mxml
The 'M' in front of the submodules like mscb tells you that you
have an older version of midas (namely midas-2020-08), but the
*current* submodules, which won't match. So you have to roll back
also the submodules with:
3.5) git submodule update --recursive
This fetches those versions of the submodules which match the
midas version 2020-08. See here for details:
https://git-scm.com/book/en/v2/Git-Tools-Submodules
From where did you get the command
git checkout release/xxxx ???
If you tell me the location of that documentation, I will take
care that it will be amended with the command
git submodule update --recursive
Best,
Stefan |
29 Sep 2021, Stefan Ritt, Bug Report, Install clash between MIDAS 2020-08 and mscb
|
> Thank you, Stefan.
>
> I found these instructions under
> 1) The changelog: https://midas.triumf.ca/MidasWiki/index.php/Changelog#2020-12
> 2) Konstantin's elog announcements (e.g. https://midas.triumf.ca/elog/Midas/2089)
>
> I do see reference to updating the submodules under the TRIUMF install
> instructions
> (https://midas.triumf.ca/MidasWiki/index.php/Setup_MIDAS_experiment_at_TRIUMF#Install_MIDAS)
> although perhaps it can be clarified.
>
> Cheers,
> Richard
Hi Richard,
I updated the documentation at
https://midas.triumf.ca/MidasWiki/index.php/Changelog#Updating_midas
by putting the submodule update command everywhere.
Best,
Stefan |
11 Oct 2021, Stefan Ritt, Info, Modification in the history logging system
|
A requested change in the history logging system has been made today. Previously, history values were
logged with a maximum frequency (usually once per second) but also with a minimum frequency, meaning
that values were logged for example every 60 seconds, even if they did not change. This causes a problem:
if a frontend which produces variables to be logged is inactive or has crashed, one cannot distinguish between
a crashed or inactive frontend program and a history value which simply did not change much over time.
The history system was designed from the beginning in a way that values are only logged when they actually
change. This design pattern was broken since about spring 2021, see for example this issue:
https://bitbucket.org/tmidas/midas/issues/305/log_history_periodic-doesnt-account-for
Today I modified the history code to fix this issue. History logging is now controlled by the value of
common/Log history in the following way:
* Common/Log history = 0 means no history logging
* Common/Log history = 1 means log whenever the value changes in the ODB
* Common/Log history = N means log whenever the value changes in the ODB and
the previous write was more than N seconds ago
So most experiments should be happy with 0 or 1. Only experiments which have fluctuating values due to noisy
sensors might benefit from a value larger than 1 to limit the history logging. Anyhow this is not the preferred
way to limit history logging. This should be done by the front-end limiting the updates to the ODB. Most of the
midas slow control drivers have a “threshold” value. Only if the input changes by more than the threshold is it
written to the ODB. This allows a per-channel “dead band” and not a per-event limit on history logging
as ‘log history’ would do. In addition, the threshold reduces the write accesses to the ODB, although that is
only important for very large experiments.
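The rules above can be sketched as a small decision function. This is only an illustration of the described behaviour, not the midas implementation; the names should_log() and exceeds_threshold() and their parameters are hypothetical:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch of the "Common/Log history" rule, not midas code.
//   log_history         : value of Common/Log history (0, 1 or N)
//   value_changed       : true if the value changed in the ODB
//   seconds_since_write : time since the previous history write
bool should_log(int log_history, bool value_changed, double seconds_since_write)
{
   assert(log_history >= 0);

   if (log_history == 0)
      return false;                  // 0: no history logging at all
   if (!value_changed)
      return false;                  // unchanged values are never logged
   if (log_history == 1)
      return true;                   // 1: log on every change
   return seconds_since_write > log_history;   // N: change AND more than N s ago
}

// Per-channel dead band as a frontend could apply it before updating the ODB.
bool exceeds_threshold(double new_value, double last_written, double threshold)
{
   return std::fabs(new_value - last_written) > threshold;
}
```

For example, should_log(10, true, 5.0) is false because the previous write was only 5 seconds ago, while should_log(10, true, 11.0) is true.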
Stefan |
15 Oct 2021, Stefan Ritt, Suggestion, Adding (or improving discoverability) of TID for odbset
|
> Creating an ODB key requires users to know the Type ID that are defined in
> https://bitbucket.org/tmidas/midas/src/develop/include/midas.h starting at line 320.
>
> I can't find any information on the Midas Wiki about these values or how to find
> them.
>
> Am I missing something obvious? Is there a way to improve how to find these values?
> Or is this not the best way to interact with the ODB?
Well, you found them in midas.h, so where is the problem?
If you want a more detailed description, just look in the midas documentation (RTFM):
https://midas.triumf.ca/MidasWiki/index.php/Midas_Data_Types
If you want a more modern interface to the ODB without these data types, look here:
https://midas.triumf.ca/MidasWiki/index.php/Odbxx
Best regards,
Stefan |
22 Oct 2021, Stefan Ritt, Forum, mhttpd error
|
> Enable IPv6 y
Probably the IPv6 problem, see here elog:2269
I asked to turn off IPv6 by default, or at least mention this in the documentation,
but unfortunately nothing happened.
Stefan |
25 Oct 2021, Stefan Ritt, Forum, Logger crash
|
The short term solution would be to increase the logger timeout in the ODB under
/Programs/Logger/Watchdog timeout
and set it to 60000 (one minute, the value being in milliseconds). But that is curing just the symptoms. It would be
interesting to understand the cause of this error. Probably the logger takes more than 10
seconds to start or stop the run. The reason could be that the history has grown too big (what
we have right now in MEG II), or some disk problems. But that needs detailed debugging on
the logger side.
Stefan |
10 Nov 2021, Stefan Ritt, Forum, Issue in data writing speed
|
Midas uses various buffers (in the frontend, at the server side before the SYSTEM buffer, the SYSTEM buffer itself, on the
logger before writing to disk). All these buffers are in RAM and have fast access, so you can fill them pretty quickly. When
they are full, the logger writes to disk, which is slower. So I believe at 2 Hz your disk can keep up with your writing
speed, but at 4 Hz (4 MP x 2 bytes = 8 MB per event, x 4 Hz = 32 MB/sec) your disk starts slowing down the writing process. Now 32MB/s is pretty slow for
a disk, so I presume you have turned compression on which takes quite some time.
To verify this, disable logging. Then disable compression and keep logging. Then report back here again.
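The data rate estimate can be checked with a few lines of arithmetic. This is a sketch under the assumptions that "4MP" means 4,000,000 pixels and that the 16-bit output occupies 2 bytes per pixel; the function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Size of one camera event in bytes (assumption: no padding, no bank overhead).
std::uint64_t bytes_per_event(std::uint64_t pixels, std::uint64_t bytes_per_pixel)
{
   return pixels * bytes_per_pixel;
}

// Sustained data rate in MB/s (1 MB = 1e6 bytes) for a given trigger rate in Hz.
double rate_mb_per_s(std::uint64_t event_bytes, double trigger_hz)
{
   assert(trigger_hz >= 0.0);
   return static_cast<double>(event_bytes) * trigger_hz / 1.0e6;
}
```

With these assumptions, bytes_per_event(4000000, 2) gives 8,000,000 bytes, and rate_mb_per_s() of that value yields 16 MB/s at 2 Hz and 32 MB/s at 4 Hz.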
> Dear all,
> I've a frontend writing a quite big bunch of data into a MIDAS bank (16bit output from a 4MP photo camera).
> I'm experiencing a writing speed problem that I don't understand. When the photo camera is triggered at a low rate (< 2 Hz)
> writing into the bank takes a very short time for each event (indeed, what I measure is the time to write and go back
> into the polling function). If I increase the rate to 4 Hz, I see that writing the first two events takes a short time,
> but the third event takes a very long time (hundreds of ms), then again the fourth and fifth events are very fast, and
> the sixth is very slow. If I further increase the rate, every other event is very slow. The problem is not in the readout
> of the camera, because if I just remove the bank writing and keep the camera readout, the problem disappears. Can you
> explain this behavior? Is there any way to improve it?
>
> Below you can also find the code I use to copy the data from the camera buffer into the bank. If you have any suggestion
> to improve it, it would be really appreciated.
>
> Thank you very much,
> Francesco
>
>
>
> const char* pSrc = (const char*)bufframe.buf;
>
> for(int y = 0; y < bufframe.height; y++ ){
>
> //Copy one row
> const unsigned short* pDst = (const unsigned short*)pSrc;
>
> //go through the row
> for(int x = 0; x < bufframe.width; x++ ){
>
> WORD tmpData = *pDst++;
>
> *pdata++ = tmpData;
>
> }
>
> pSrc += bufframe.rowbytes;
>
> }
> |
02 Dec 2021, Stefan Ritt, Bug Report, Off-by-one in sequencer documentation
|
> The documentation for the sequencer loop says:
>
> <quote>
> LOOP [name ,] n ... ENDLOOP To execute a loop n times. For infinite loops, "infinite"
> can be specified as n. Optionally, the loop variable running from 0...(n-1) can be accessed
> inside the loop via $name.
> </quote>
>
> In fact the loop variable runs from 1...n, as can be seen by running this exciting
> sequencer code:
>
> 1 COMMENT "Figuring out MSL"
> 2
> 3 LOOP n,4
> 4 MESSAGE $n,1
> 5 ENDLOOP
Indeed you're right. The loop variable runs from 1...n. I fixed that in the documentation.
Stefan |
02 Dec 2021, Stefan Ritt, Forum, Sequencer error with ODB Inc
|
Thanks for reporting that bug. Indeed there was a problem in the sequencer code which I fixed now. Please try the updated develop branch.
Stefan |
26 Jan 2022, Stefan Ritt, Bug Report, Off-by-one in sequencer documentation
|
> Shades/ghosts of FORTRAN. c/c++/perl/python loops loop from 0 to n-1.
for (i=1 ; i<=10 ; i++); ;-) |
28 Jan 2022, Stefan Ritt, Bug Report, Writing MIDAS Events via FPGAs
|
I finally got the dummy program working. There were several issues:
- event_buffer_size was defined as 10000 * 32 MB = 320 GB, exceeding the RAM of the computer
- SERIAL number starting with 1. Actually in midas, event serial numbers always started with zero, but this was wrong in the documentation at
https://midas.triumf.ca/MidasWiki/index.php/Event_Structure, so I also fixed the documentation
- the event header time stamp must be seconds since 1.1.1970, and thus the function ss_time() should be used to set it
- calling set_equipment_status() for each event slows down the event collection considerably, since this function accesses the ODB each time
- dma_buf_dummy is defined inside the event loop, so it gets allocated and de-allocated on the stack for each event. Of course this might vanish
when the real FPGA buffer will be used.
- The line pdata+=sizeof(dma_buf_dummy); is wrong. pdata is a pointer to uint32_t, but the sizeof() operation returns the size of the
dma_buf_dummy in bytes. Therefore, pdata gets incremented by four times the size of dma_buf_dummy
- Instead of the call to std::this_thread::sleep_for(std::chrono::milliseconds(2000)); one can call the standard midas call ss_sleep(2000); which
is a bit shorter
- Finally, sending many events to the ring buffer triggered a bug in the midas ring buffer functions which had been lingering there since 2007. I'm
glad that this happened and now could be fixed. Not sure if other experiments were affected in the last decade by that. This could have
manifested itself in lost events or crashing front-ends. Anyhow, now it's fixed. You need to update midas to get the fix.
I attached a working version of the dummy program for your reference. Banks are different but the principle should become clear.
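The pointer increment issue from the list above can be illustrated with a minimal sketch. It is not taken from the attached program; the helper copy_words() and the buffer size kWords are hypothetical, only the names pdata and dma_buf_dummy follow the discussion:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Arbitrary buffer length chosen for illustration only.
constexpr std::size_t kWords = 8;

// Copy the buffer to the location pdata points to and return the advanced pointer.
// pdata is a uint32_t*, so it must advance by the number of ELEMENTS.
// sizeof(dma_buf_dummy) is the size in bytes, i.e. four times too large.
std::uint32_t* copy_words(std::uint32_t* pdata,
                          const std::uint32_t (&dma_buf_dummy)[kWords])
{
   assert(pdata != nullptr);

   // memcpy() takes a byte count, so sizeof() is correct here
   std::memcpy(pdata, dma_buf_dummy, sizeof(dma_buf_dummy));

   // pointer arithmetic counts elements, so divide by the element size
   return pdata + sizeof(dma_buf_dummy) / sizeof(dma_buf_dummy[0]);
}
```

Calling copy_words() twice in a row on the same destination fills 2 * kWords consecutive words without a gap, whereas pdata += sizeof(dma_buf_dummy) would skip ahead by 4 * kWords words each time.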
Stefan |
10 Feb 2022, Stefan Ritt, Bug Fix, ODBINC/Sequencer Issue
|
I tried the following script:
ODBSET /Equipment/ArduinoTestStation/Variables/_S_, 10
LOOP 10
WAIT seconds, 3
ODBINC /Equipment/ArduinoTestStation/Variables/_S_
ENDLOOP
and it worked as expected. So I conclude the problem must be in your script. Probably a typo in
the ODB path pointing to a 32-byte string instead of a 4-byte float.
Stefan |
10 Feb 2022, Stefan Ritt, Bug Report, History plots deceiving users into thinking data is still logging
|
The problem was fixed in commit 825935dc in Oct. 2021 and has been running fine since then at PSI. If TRIUMF people
agree, we can close that issue and proceed.
Stefan |
|