ID |
Date |
Author |
Topic |
Subject |
1855
|
16 Mar 2020 |
Konstantin Olchanski | Info | mhttpd mongoose 6.16 update | the update of mhttpd to mongoose version 6.16 was committed to the develop branch of midas. If you do not want to use this
updated code or if it causes problems, please use the mhttpd6 executable or midas from the midas-2020-03 release branch.
new features:
- IPv6 support
- built-in http proxy
- fine grain locking - serving "resource" files (html, css, etc) and serving json-rpc requests no longer takes the global lock
- reduced number of DNS queries when checking host list access (DNS replies are cached)
- (I decided to not implement caching of password requests and dynamic reload of password file - it is too hard).
internal changes:
Recent versions of the mongoose web server library have removed all their internal multithreading,
leaving the library fully single-threaded. This resulted in major simplification of many things. An improvement.
(The civetweb fork of mongoose retains the old multithreading code; that model seems to work better
when used inside ROOT.) As implemented in mhttpd, all network connections are handled by the main thread,
and all midas http requests are handled by worker threads that are started on an as-needed basis.
The old mongoose 6.4 based mhttpd code survived almost without changes - as a compile-time
option - so now I build 2 mhttpd executables: mhttpd with the new code and mhttpd6 with the old code
so people have something to run in case the new code bombs.
http proxy:
Experiments that use private networks usually configure the apache httpd as a web proxy to allow
access from the outside to the web-controlled devices on the private network. Making changes
to this proxy requires root access, requires restarting httpd, etc. To make things simpler, mhttpd now
includes a web proxy (almost the complete implementation is provided by the mongoose library). Configuration
is done from ODB, restarting mhttpd is not needed.
improved multithreading:
Since most of the MIDAS library is now thread-safe, mhttpd no longer needs to take the "big midas lock"
to service most web requests. Access to files, access to ODB, etc is now fully threaded. Some parts
of MIDAS are not thread-safe, e.g. access to history and log files, so a flag was added to the mjsonrpc library
to mark which RPC methods are not thread-safe.
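A minimal sketch of how such a per-method flag could be used by a dispatcher. This is an illustration only, not the actual mjsonrpc code: the names RpcMethod, add_method, dispatch and gBigLock are hypothetical, and the sketch assumes that all methods are registered before the worker threads start.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical registry entry, not the real mjsonrpc data structure.
struct RpcMethod {
   std::function<std::string(const std::string&)> handler;
   bool thread_safe; // false: handler must run under the global lock
};

static std::mutex gBigLock;                     // stand-in for the "big midas lock"
static std::map<std::string, RpcMethod> gMethods; // filled before threads start (assumption)

void add_method(const std::string& name,
                std::function<std::string(const std::string&)> handler,
                bool thread_safe)
{
   gMethods[name] = RpcMethod{std::move(handler), thread_safe};
}

std::string dispatch(const std::string& name, const std::string& params)
{
   auto it = gMethods.find(name);
   if (it == gMethods.end())
      return "error: method not found";
   if (it->second.thread_safe)
      return it->second.handler(params); // may run in parallel with other requests
   std::lock_guard<std::mutex> guard(gBigLock); // serialize non-thread-safe methods
   return it->second.handler(params);
}
```

With this layout, only the methods registered with thread_safe=false contend for the lock; everything else runs concurrently in the worker threads.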
Note that despite these improvements, mhttpd still suffers from "http head-of-queue blocking"
https://en.wikipedia.org/wiki/Head-of-line_blocking
because web browsers (e.g. google chrome) tend to use just 1 TCP connection for all JSONRPC requests:
after a request for a history read (which can take a long time), all subsequent requests for web page updates, etc
will have to wait until it completes, causing an unresponsive user experience (it looks as if mhttpd is single-threaded!).
A solution for this problem is HTTP/2, which is not yet implemented by mongoose and is not quite yet available
for apache httpd.
More later...
K.O. |
1854
|
16 Mar 2020 |
Konstantin Olchanski | Release | midas-2020-03-a | midas-2020-03-a is here.
Accumulated changes and bug fixes since last tag midas-2019-09-i.
After this release, expect some instability on the develop branch as I commit the update of mhttpd to mongoose web server library
version 6.16. More on that later.
To obtain this release, either checkout the top of branch release/midas-2020-03 (recommended)
or checkout the tag midas-2020-03-a.
K.O. |
1853
|
16 Mar 2020 |
Pintaudi Giorgio | Info | MIDAS will use C++11 | About the boost library, that is exactly what I did for a project of mine (the calibration software for the WAGASCI experiment). It turned out to be not so easy to maintain because different Linux distros package different versions of boost.
The reason I went down the "c++11 plus boost" road is that the official T2K OS is CentOS7 as well.
Looking back, I think that using c++17 and requiring a more recent version of the compiler is much easier to maintain than the combo c++11 + boost. In CentOS it is just a matter of installing a recent devtool package ...
Another solution might be to repackage boost into MIDAS so you have full control of the environment.
> > After much discussion, and following the MIDAS workshop at TRIUMF, we made the decision to use C++11 in MIDAS.
> >
> > There are many benefits, and only one drawback - no c++11 compilers in the default OS install on older computers (i.e.
> > RHEL/SL/CentOS before el7). (the same applies to our use of cmake).
> >
>
> It turns out that support for the c++11 "regex" feature is missing on el7 (CentOS-7, our most common platform at TRIUMF).
>
> According to https://stackoverflow.com/questions/12530406/is-gcc-4-8-or-earlier-buggy-about-regular-expressions
> gcc 4.9.0 is the first one to implement c++11 regular expressions. el7 comes with gcc-4.8.5 and I confirm
> that examples of using std::regex_replace() do not compile. I was looking to use std::regex_replace to implement URL rewriting
> in the reverse proxy code in mhttpd.
>
> I do not need this feature immediately, but I am surprised that such a thing can happen, thought others should know.
>
> K.O. |
1852
|
16 Mar 2020 |
Konstantin Olchanski | Info | MIDAS will use C++11 | > After much discussion, and following the MIDAS workshop at TRIUMF, we made the decision to use C++11 in MIDAS.
>
> There are many benefits, and only one drawback - no c++11 compilers in the default OS install on older computers (i.e.
> RHEL/SL/CentOS before el7). (the same applies to our use of cmake).
>
It turns out that support for the c++11 "regex" feature is missing on el7 (CentOS-7, our most common platform at TRIUMF).
According to https://stackoverflow.com/questions/12530406/is-gcc-4-8-or-earlier-buggy-about-regular-expressions
gcc 4.9.0 is the first one to implement c++11 regular expressions. el7 comes with gcc-4.8.5 and I confirm
that examples of using std::regex_replace() do not compile. I was looking to use std::regex_replace to implement URL rewriting
in the reverse proxy code in mhttpd.
I do not need this feature immediately, but I am surprised that such a thing can happen, thought others should know.
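For reference, a minimal sketch of the kind of URL rewriting std::regex_replace would allow. The function name rewrite_url, the "/proxy/<host>/" prefix convention and the rewrite target are made up for illustration, they are not the mhttpd proxy code. It needs a compiler with a complete std::regex, i.e. gcc 4.9 or newer.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: turn "/proxy/<host>/<path>" into "http://<host>/<path>".
// URLs that do not match the prefix are returned unchanged.
std::string rewrite_url(const std::string& url)
{
   static const std::regex prefix("^/proxy/([^/]+)/");
   return std::regex_replace(url, prefix, "http://$1/");
}
```

For example, rewrite_url("/proxy/camera1/index.html") gives "http://camera1/index.html".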
K.O. |
1851
|
10 Mar 2020 |
Konstantin Olchanski | Info | MIDAS vs JSROOT web pages | Just FYI, I am looking at the ROOT web programming component JSROOT and I notice that the RPC mechanism is quite different from the JSON-RPC I implemented for MIDAS.
https://github.com/root-project/jsroot/blob/master/docs/HttpServer.md (explanation of JSROOT RPC and server side machinery)
https://github.com/root-project/jsroot/blob/master/docs/JSROOT.md (explanation of JSROOT javascript library)
Then I looked at the dates:
MIDAS mjsonrpc was done at the end of 2013
JSROOT main development started at the end of 2014.
The web server component in both projects is (almost) the same - vanilla mongoose in mhttpd
and civetweb, a fork of an older version of mongoose, in ROOT/JSROOT.
The web server in both projects is partially multithreaded:
- ROOT THttpServer/TCivetWeb uses multiple threads to handle the network connections and some file access,
but interaction with ROOT is done in the main thread of ROOT. (The main thread must periodically call ProcessRequests()).
- mhttpd uses a single thread to multiplex the network connections (it is a change from old mongoose/civetweb to current mongoose 6.16),
but all requests are farmed to a pool of threads and execute in parallel (unless not thread-safe, e.g. accessing history files).
Both implementations suffer from "head of queue" blocking: a "slow" request, e.g. a slow file read, will
delay subsequent quick requests, see https://en.wikipedia.org/wiki/Head-of-line_blocking#In_HTTP
The solution for this problem is to use HTTP/2 when it becomes supported in mongoose/civetweb/apache httpd (in el7).
It will be interesting to see which one of the two systems works better for building "user facing" web pages... especially
hybrid pages that have to pull data both from midas (using mjsonrpc) and from online ROOT analyzers (using jsroot).
K.O. |
1850
|
08 Mar 2020 |
Konstantin Olchanski | Forum | RPC error | I do not see this error, but there was one more report (they did not clearly say what http errors
they see) https://bitbucket.org/tmidas/midas/issues/209/get-rid-of-mjsonrpc-dialogs-put-it-to-the
To debug this, I need to know: what version of MIDAS, what version of what web browser, what
computer is mhttpd running on? (so I can go look at the log files).
Also can you say more when you see these errors? Every time from every midas web page, or only
some pages or only when you do something specific (push some button, etc?).
> I ported a bunch of frontends to C++ and now I'm occasionally getting this RPC
> error message:
>
> http error: readyState: 4, HTTP status: 502 (Proxy Error), batch request: method:
> "db_get_values", params: [object Object], id: 1583456958869 method: "get_alarms",
> params: null, id: 1583456958869 method: "cm_msg_retrieve", params: [object
> Object], id: 1583456958869 method: "cm_msg_retrieve", params: [object Object],
> id: 1583456958869
>
> I'm assuming I'm doing wrong something somewhere, but does this message contain
> information where to look? Does the ID mean something?
It is unlikely that this error has anything to do with the frontends: usually web page interaction
goes through: web browser - network - apache httpd - localhost - mhttpd - midas odb.
http error 502 is very generic, does not tell us much about what happened, there may be more
information in the httpd log files.
the json-rpc request "id" is generated by midas code in the web browser and it currently is a
timestamp. it is not used for anything. but it is required by the json-rpc standard.
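To illustrate the "id is a timestamp" remark: a small sketch that decodes such an id, assuming it is the number of milliseconds since the Unix epoch (an assumption, the post only says "timestamp"). The helper name id_to_utc is made up.

```cpp
#include <cstdint>
#include <ctime>
#include <string>

// Interpret a json-rpc request id as milliseconds since the Unix epoch
// (assumption) and format it as a UTC date and time.
std::string id_to_utc(std::int64_t id_ms)
{
   std::time_t t = static_cast<std::time_t>(id_ms / 1000);
   std::tm tm_utc = *std::gmtime(&t); // note: std::gmtime is not thread-safe
   char buf[32];
   std::strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S", &tm_utc);
   return buf;
}
```

Under that assumption, the id 1583456958869 from the error message decodes to 2020-03-06 01:09:18 UTC, so it only tells when the browser generated the request.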
K.O. |
1849
|
06 Mar 2020 |
Lars Martin | Forum | RPC error | I ported a bunch of frontends to C++ and now I'm occasionally getting this RPC
error message:
http error: readyState: 4, HTTP status: 502 (Proxy Error), batch request: method:
"db_get_values", params: [object Object], id: 1583456958869 method: "get_alarms",
params: null, id: 1583456958869 method: "cm_msg_retrieve", params: [object
Object], id: 1583456958869 method: "cm_msg_retrieve", params: [object Object],
id: 1583456958869
I'm assuming I'm doing something wrong somewhere, but does this message contain
information on where to look? Does the ID mean something? |
1848
|
03 Mar 2020 |
Berta Beltran | Bug Report | Compiling Midas in OS 10.15 Catalina | Thanks Konstantin,
I will keep an eye out for the next release so that I can update my Midas to include the ssl libraries.
Thanks
> > > [midas/build] $ cmake -D NO_SSL=1 ..
> > If I run the compilation with the flag NO_SSL it works just fine. ...
>
> FYI, the mongoose616 branch now has support for using the mbedtls https library,
> this library seems to build easily from sources and removes our dependency
> on where/how/which openssl library is installed. I hope to have this included
> in the next release of midas.
>
> K.O. |
1847
|
28 Feb 2020 |
Konstantin Olchanski | Bug Report | Compiling Midas in OS 10.15 Catalina | > > [midas/build] $ cmake -D NO_SSL=1 ..
> If I run the compilation with the flag NO_SSL it works just fine. ...
FYI, the mongoose616 branch now has support for using the mbedtls https library,
this library seems to build easily from sources and removes our dependency
on where/how/which openssl library is installed. I hope to have this included
in the next release of midas.
K.O. |
Draft
|
23 Feb 2020 |
Marius Koeppel | Forum | Writting Midas Events via FPGAs | > > We also agree and found the problem now.
>
> Good. what was wrong?
>
> > - Own DMA engine since we are doing burst writing DMA with PCIe 3.0.
> > - Own device driver
>
> Scary stuff.
>
> > - no interrupts
>
> Right. Best I can tell, interrupts no longer useful in Linux - interrupt handler cannot do any real work, has to hand off to a kernel thread, resulting
> in so much latency and overhead that one might as well poll for the data... And for DMA data transfers, the data rate is well known,
> so easy to predict how long the DMA will run for and sleep for that amount of time instead of waiting for an interrupt.
>
> K.O.
So the problem was that we assumed that the bank (with the header) needs to be 64-bit aligned. Moreover, we aligned the whole Midas event to 256 bits in the FPGA, since we have a 250 MHz x 256-bit interface for PCIe. But then we saw that you align the bank data to 64 bits -> crash of mdump etc. For now we generate the data on the FPGA in the "old" Midas format. So having a flag for changing to a different alignment would actually be really nice.
Cheers,
Marius |
1845
|
21 Feb 2020 |
Stefan Ritt | Forum | Writting Midas Events via FPGAs | > Hi, Stefan - is this our famous 64-bit misalignment? Where we have each alternating bank aligned and misaligned at 64 bits? Without changing the data
> format, one can always store data in 64-bit aligned banks by inserting dummy banks between real banks:
>
> event header
> bank header
> bank1 --- 64-bit aligned --- with data
> bank2 --- misaligned, no data
> bank3 --- 64-bit aligned --- with data
> bank4 --- misaligned, no data
> ...
>
> for sure, wastes space for bank2, bank4, etc, but at 12 bytes per bank, maybe this is negligible overhead compared to total event size.
>
> BTW, aligned-to-64-bit is old news. In the PWB FPGA, I have 128-bit data paths to DDR RAM, the data has to be aligned to 128 bits, or else!
Ok, so what about the following: When we do a bk_init32, we add a parameter "alignment", which might be 1,4,8,16 and "old". We store this alignment in the bank header, so the
decoding works correctly. Now "old" means the current encoding, which is screwed up and produces the results you mention above, but we have to keep it (actually make it the
default!) for backward compatibility. But then we can ask for 64-bit alignment or even 128-bit alignment if that helps the DAQ speed.
The only problem I see is if one writes data with the new library using 128-bit alignment for example, and wants to read it back with old code. Then it would explode. So if we
make this modification, we have to announce it carefully and also adjust all ROOTANA & Co libraries to read back any midas data.
Stefan |
1844
|
21 Feb 2020 |
Konstantin Olchanski | Forum | Writting Midas Events via FPGAs | > We also agree and found the problem now.
Good. what was wrong?
> - Own DMA engine since we are doing burst writing DMA with PCIe 3.0.
> - Own device driver
Scary stuff.
> - no interrupts
Right. Best I can tell, interrupts no longer useful in Linux - interrupt handler cannot do any real work, has to hand off to a kernel thread, resulting
in so much latency and overhead that one might as well poll for the data... And for DMA data transfers, the data rate is well known,
so easy to predict how long the DMA will run for and sleep for that amount of time instead of waiting for an interrupt.
K.O. |
1843
|
21 Feb 2020 |
Konstantin Olchanski | Forum | Writting Midas Events via FPGAs | Hi, Stefan - is this our famous 64-bit misalignment? Where we have each alternating bank aligned and misaligned at 64 bits? Without changing the data
format, one can always store data in 64-bit aligned banks by inserting dummy banks between real banks:
event header
bank header
bank1 --- 64-bit aligned --- with data
bank2 --- misaligned, no data
bank3 --- 64-bit aligned --- with data
bank4 --- misaligned, no data
...
for sure, wastes space for bank2, bank4, etc, but at 12 bytes per bank, maybe this is negligible overhead compared to total event size.
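The arithmetic behind the dummy-bank trick can be sketched as follows, assuming a 12-byte BANK32 header and the current size rule "header + data rounded up to 8 bytes" (both taken from the quoted message below). The helper names are made up for illustration.

```cpp
#include <cstddef>

// Round up to the next multiple of 8 (what an ALIGN8 macro is assumed to do).
constexpr std::size_t align8(std::size_t x) { return (x + 7) & ~std::size_t(7); }

constexpr std::size_t kBank32HeaderSize = 12; // sizeof(BANK32), per the quoted message

// Bytes used by one bank under the current size rule.
constexpr std::size_t bank_size(std::size_t data_size)
{
   return kBank32HeaderSize + align8(data_size);
}

// A data bank followed by an empty dummy bank.
constexpr std::size_t pair_size(std::size_t data_size)
{
   return bank_size(data_size) + bank_size(0);
}
```

A single bank uses 12 + ALIGN8(n) bytes, which is 4 more than a multiple of 8. An empty dummy bank adds another 12 bytes, so each data+dummy pair uses 24 + ALIGN8(n) bytes, a multiple of 8, and every data bank starts with the same alignment as the first one.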
BTW, aligned-to-64-bit is old news. In the PWB FPGA, I have 128-bit data paths to DDR RAM, the data has to be aligned to 128 bits, or else!
K.O.
> Actually the cause of all of this is a real bug in the midas functions. We want each bank 8-byte aligned, so there is code in bk_close like:
>
> midas.cxx:14788:
> ((BANK_HEADER *) event)->data_size += sizeof(BANK32) + ALIGN8(pbk32->data_size);
>
> While the old sizeof(BANK)=8, the extended sizeof(BANK32)=12, so not 8-byte aligned. This code should rather be:
>
> ((BANK_HEADER *) event)->data_size += ALIGN8(sizeof(BANK32) +pbk32->data_size);
>
> But if we change that, it would break every midas data file on this planet!
>
> The only chance I see is to use the "flags" in the BANK_HEADER to distinguish a current bank from a "correct" bank.
> So we could introduce a flag BANK_FORMAT_ALIGNED which distinguishes between the two pieces of code above.
> Then bk_iterate32 would look at that flag and do the right thing.
>
> Any thoughts?
>
> Best,
> Stefan |
1842
|
20 Feb 2020 |
Stefan Ritt | Forum | Writting Midas Events via FPGAs | Actually the cause of all of this is a real bug in the midas functions. We want each bank 8-byte aligned, so there is code in bk_close like:
midas.cxx:14788:
((BANK_HEADER *) event)->data_size += sizeof(BANK32) + ALIGN8(pbk32->data_size);
While the old sizeof(BANK)=8, the extended sizeof(BANK32)=12, so not 8-byte aligned. This code should rather be:
((BANK_HEADER *) event)->data_size += ALIGN8(sizeof(BANK32) +pbk32->data_size);
But if we change that, it would break every midas data file on this planet!
The only chance I see is to use the "flags" in the BANK_HEADER to distinguish a current bank from a "correct" bank.
So we could introduce a flag BANK_FORMAT_ALIGNED which distinguishes between the two pieces of code above.
Then bk_iterate32 would look at that flag and do the right thing.
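A small worked example of the two size rules above, as a sketch only: it assumes ALIGN8 rounds up to the next multiple of 8 and that the first bank starts at an 8-byte-aligned offset. The function bank_offsets and its aligned_format parameter are made up for illustration, they do not exist in midas.

```cpp
#include <cstddef>
#include <vector>

// Round up to the next multiple of 8 (assumed behaviour of ALIGN8).
constexpr std::size_t align8(std::size_t x) { return (x + 7) & ~std::size_t(7); }

constexpr std::size_t kBank32HeaderSize = 12; // sizeof(BANK32)

// Offsets at which successive bank headers start, relative to the first bank.
std::vector<std::size_t> bank_offsets(const std::vector<std::size_t>& data_sizes,
                                      bool aligned_format)
{
   std::vector<std::size_t> offsets;
   std::size_t pos = 0;
   for (std::size_t n : data_sizes) {
      offsets.push_back(pos);
      pos += aligned_format ? align8(kBank32HeaderSize + n)  // proposed rule
                            : kBank32HeaderSize + align8(n); // current rule
   }
   return offsets;
}
```

For four banks of 16 data bytes each, the current rule puts the bank headers at offsets 0, 28, 56, 84 (every second one is off by 4 bytes), while the proposed rule gives 0, 32, 64, 96.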
Any thoughts?
Best,
Stefan |
1841
|
20 Feb 2020 |
Marius Koeppel | Forum | Writting Midas Events via FPGAs |
We also agree and found the problem now. Since we build everything (MIDAS Event Header, Bank Header, Banks etc.) in the FPGA, we struggled a bit with the MIDAS data format (http://lmu.web.psi.ch/docu/manuals/bulk_manuals/software/midas195/html/AppendixA.html). We thought that only the MIDAS event needs to be aligned to 64 bits, but as it turned out the bank data also needs to be aligned (Stefan updated the wiki page already). Since we are using BANK32, this was a bit unclear to us because the bank header is not 64-bit aligned. We managed this now by adding empty data, and the system is running.
Our setup looks like this:
Software:
- mfe.cxx multithread equipment
- mfe readout thread grabs pointer from dma ring buffer
- since the dma buffer is volatile we do copy_n for transforming the data to MIDAS
- the data is already in the MIDAS format so done from our side :)
- mfe readout thread increments the ring buffer
- mfe main thread grabs events from ring buffer, sends them to the mserver
Firmware:
- Arria 10 development board
- Altera PCIe block
- Own DMA engine since we are doing burst writing DMA with PCIe 3.0.
- Own device driver
- no interrupts
If you have more questions, feel free to ask. |
1839
|
20 Feb 2020 |
Konstantin Olchanski | Forum | Difference between "Event Data Size" and "All Bank Size" | > Thanks for pointing out this error. The "All Bank Size" contains the size of all banks including their
> bank headers, but NOT the global bank header itself. I modified the documentation accordingly.
>
> If you want to study the C code which tells you how to fill these headers, look at midas.cxx line
> 14788.
Also take a look at the midas event parser in ROOTANA midasio.cxx, the code is pretty clean c++
https://bitbucket.org/tmidas/rootana/src/master/libMidasInterface/midasio.cxx
But Stefan's code in midas.cxx and in the documentation is the authoritative information.
K.O. |
1838
|
20 Feb 2020 |
Konstantin Olchanski | Forum | Writting Midas Events via FPGAs | > rb_xxx functions are midas event agnostic. The receiving side in mfe.cxx (lines 1418 in receive_trigger_event) however pulls one event at a time. If you
> have some inconsistency I would put some debugging code there.
I agree with Stefan, I do not think there are any bugs in the ring buffer code.
But. I do not think we ever did DMA the data directly into the ring buffer. Hmm...
I just checked, this is what we do (and this worked in the ALPHA Si-strip DAQ system for 10 years now):
- mfe.cxx multithread equipment
- mfe readout thread grabs pointer from ring buffer
- mfe creates event headers, etc
- calls our read_event() function
- creates data bank
- DMA data into the data bank (this is the DMA from VME block reads, using DMA controller inside the UniverseII and tsi148 VME-to-PCI bridges)
- close data bank
- return to mfe
- mfe readout thread increments the ring buffer
- mfe main thread grabs events from ring buffer, sends them to the mserver
So there could be trouble:
a) the ring buffer code does not have the required "volatile" (ahem, "atomic") annotations, so DMA may have a bad interaction with compiler optimizations (values stored in registers
instead of in memory, etc)
b) the DMA driver must doctor the memory settings to (1) mark the DMA target memory uncachable or (1b) invalidate the cache after DMA completes, (2) mark the DMA target
memory unswappable.
So I see possibilities for the ring buffer to malfunction.
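To illustrate point (a), here is a minimal single-producer, single-consumer ring buffer with the "atomic" annotations. It is a sketch only, not the midas "rb" code; the class name SpscRing is made up. Note that the atomics only order memory accesses between CPU threads; they do nothing for point (b), the cache and swap handling of DMA target memory, which remains the job of the DMA driver.

```cpp
#include <atomic>
#include <cstddef>

// Minimal single-producer, single-consumer ring buffer (illustrative sketch).
// One slot is kept free to tell "full" from "empty", so capacity is N-1 items.
template <typename T, std::size_t N>
class SpscRing {
public:
   bool push(const T& item) // call from the producer thread only
   {
      const std::size_t w = fWrite.load(std::memory_order_relaxed);
      const std::size_t next = (w + 1) % N;
      if (next == fRead.load(std::memory_order_acquire))
         return false; // full
      fData[w] = item;
      fWrite.store(next, std::memory_order_release); // publish after the data is written
      return true;
   }

   bool pop(T& item) // call from the consumer thread only
   {
      const std::size_t r = fRead.load(std::memory_order_relaxed);
      if (r == fWrite.load(std::memory_order_acquire))
         return false; // empty
      item = fData[r];
      fRead.store((r + 1) % N, std::memory_order_release); // free the slot after reading
      return true;
   }

private:
   T fData[N];
   std::atomic<std::size_t> fWrite{0};
   std::atomic<std::size_t> fRead{0};
};
```

The release store on the write index paired with the acquire load in pop() is what prevents the compiler and the CPU from making the index visible before the data it refers to.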
But now I am curious, which DMA controller you use? The Altera or Xilinx PCIe block with the vendor supplied DMA driver? Or you do DMA on an ARM SoC FPGA? (no PCI/PCIe,
different DMA controller, different DMA driver).
I am curious because we will be implementing pretty much what you do on ARM SoC FPGAs pretty soon, so good to know
if there is trouble to expect.
But I will probably use the tmfe.h c++ frontend and a "pure c++" ring buffer instead of mfe.cxx and the midas "rb" ring buffer.
(I did not look at your code at all, there could be a bug right there, this ring buffer stuff is tricky. With luck there is no bug
in your dma driver. The dma drivers for our vme bridges did have bugs).
K.O. |
1837
|
20 Feb 2020 |
Konstantin Olchanski | Bug Report | RPC Error: ACK or other control chars from "db_get_values" | > The unexpected token is \0x6
> RPC Error json parser exception: SyntaxError: JSON.parse: bad control character in string literal at line 80 column 30 of the JSON data, method: "db_get_valus", params: [object Object], id: 1582020074098.
Yes, there is a problem.
Traditionally, midas strings in ODB have no restriction on the content (I think even the '\0x0' char is permitted).
But web browser javascript strings are supposed to be valid unicode (UTF-16, if I read this right: https://tc39.es/ecma262/#sec-ecmascript-language-types-string-type).
The collision between the two happens when ODB values are json-encoded by midas, then json-decoded by the web browser.
The midas json encoder (mjson.h, mjson.cxx) encodes ODB strings according to JSON rules, but does not ensure that the result is valid UTF-8. (valid UTF-8 is not required, if I read the specs correctly:
http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf and https://www.json.org/json-en.html)
The web browser json decoder requires valid UTF-8 and throws exceptions if it does not like something. Different browsers do it slightly differently, so we have an error handler for this in the mjsonrpc results processor.
What does this mean in practice?
Now that MIDAS is very web oriented, MIDAS strings must be web browser friendly, too:
a) all ODB key names (subdirectory names, link names, etc) must be UTF-8 unicode, and this has been enforced by ODB for some time now.
b) all ODB string values must be valid UTF-8 unicode. This is not enforced right now.
Historically, it was okay to use ODB TID_STRING to store arbitrary binary data, but now, I think, we must deprecate this,
at least for any ODB entries that could be returned to a web browser (which means all of them, after we implement a fully
html+javascript odb editor). For storing binary data, arrays of TID_CHAR, TID_DWORD & co are probably a better match.
The MIDAS and ROOTANA json decoders (the same mjson.h, mjson.cxx) do not care about UTF-8, so ODB dumps
in JSON format are not affected by any of this. (But I am not sure about the JSON decoder in ROOT).
Bottom line:
I think db_validate() should check for invalid UTF-8 in ODB key names and in TID_STRING values
and at least warn the user. (I am not sure if invalid UTF-8 can be fixed automatically). db_create()
should reject key names that are not valid UTF-8 (it already does this, I think). db_set_value(TID_STRING) should
probably reject invalid UTF-8 strings, this needs to be discussed some more.
https://bitbucket.org/tmidas/midas/issues/215/everything-in-odb-must-be-valid-utf-8
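A minimal sketch of the kind of UTF-8 check that db_validate() or db_set_value() could apply to a string. This is an illustration, not existing midas code; the function name is_valid_utf8 is made up. It rejects truncated sequences, overlong encodings, UTF-16 surrogates and code points above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Returns true if s is well-formed UTF-8.
bool is_valid_utf8(const std::string& s)
{
   const std::size_t n = s.size();
   std::size_t i = 0;
   while (i < n) {
      const unsigned char c = static_cast<unsigned char>(s[i]);
      std::size_t len;
      unsigned long cp;
      if (c < 0x80) { len = 1; cp = c; }
      else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
      else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
      else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
      else return false; // stray continuation byte or invalid lead byte
      if (i + len > n)
         return false; // truncated sequence
      for (std::size_t k = 1; k < len; k++) {
         const unsigned char cc = static_cast<unsigned char>(s[i + k]);
         if ((cc & 0xC0) != 0x80)
            return false; // not a continuation byte
         cp = (cp << 6) | (cc & 0x3F);
      }
      static const unsigned long min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
      if (cp < min_cp[len])
         return false; // overlong encoding
      if (cp >= 0xD800 && cp <= 0xDFFF)
         return false; // UTF-16 surrogate
      if (cp > 0x10FFFF)
         return false; // beyond the Unicode range
      i += len;
   }
   return true;
}
```

Note that an ASCII control character such as 0x06 is well-formed UTF-8, so a check like this would not by itself flag the string from the original report; control characters would need a separate test or escaping.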
K.O. |
1836
|
18 Feb 2020 |
Stefan Ritt | Bug Report | RPC Error: ACK or other control chars from "db_get_values" | You are the first one reporting this error, so it must be due to your values in the ODB. Can you track it down to specific ODB contents? If so, can you post it so that I can reproduce your error?
Stefan |
|