ELOG Midas

Midas DAQ System, Page 112 of 152

ID	Date	Author	Topic	Subject
2195	02 Jun 2021	Konstantin Olchanski	Suggestion	Have a list of 'users responsible' in Alarms and Programs odb entries
> This list of responsible being attached to alarm message strings ...

This is a great idea. But I think we do not need to artificially limit ourselves to string and array lengths. The code in alarm.c should be changed to use std::string and std::vector<std::string> (STRING_LIST #define), and db_get_record() should be replaced with individual ODB reads (that is what it does behind the scenes, but in a way that is neither type-safe nor size-safe). I think the web page code will work correctly, it does not care about string lengths.

K.O.
2196	02 Jun 2021	Konstantin Olchanski	Bug Report	Wrong location for mysql.h on our Linux systems
> % mariadb_config --cflags
> -I/usr/include/mariadb -I/usr/include/mariadb/mysql

I get similar, both .../include and .../include/mysql are in my include path, so both #include "mysql/mysql.h" and #include "mysql.h" work.

I added a message to cmake to report the MySQL CFLAGS and libraries, so next time this is a problem, we can see what happened from the cmake output:

4ed0:midas olchansk$ make cmake | grep MySQL
...
-- MIDAS: Found MySQL version 10.4.16
-- MIDAS: MySQL CFLAGS: -I/opt/local/include/mariadb-10.4/mysql;-I/opt/local/include/mariadb-10.4/mysql/mysql and libs: -L/opt/local/lib/mariadb-10.4/mysql/ -lmariadb

K.O.
2197	02 Jun 2021	Konstantin Olchanski	Info	label ordering in history plot
> is there any way to order the labels in the history plot legend? In the old
> system there was the “order” column in the config panel, but I can not find it
> in the new system. Thanks in advance for the support.

Correct, for reasons unknown, the function to reorder and to delete individual entries was removed from the history panel editor.

K.O.
2198	02 Jun 2021	Konstantin Olchanski	Info	label ordering in history plot
> > is there any way to order the labels in the history plot legend? In the old
> > system there was the “order” column in the config panel, but I can not find it
> > in the new system. Thanks in advance for the support.
>
> correct, for reasons unknown, the function to reorder and to delete individual
> entries was removed from the history panel editor.
>
> K.O.

https://bitbucket.org/tmidas/midas/issues/284/history-panel-editor-reordering-of

K.O.
2199	02 Jun 2021	Konstantin Olchanski	Bug Report	Bug "is of type"
> Hi,
>
> I am running a simple FE executable that is supposed to define a PRAW DWORD bank.
> The issue is that, right after the start of the run, the logger crashes without messages.
> Then the FE reports this error, which is rather confusing.
> ```
> 12:59:29.140 2021/05/24 [feTestDatastruct,ERROR] [odb.cxx:6986:db_set_data1,ERROR] "/Equipment/Trigger/Variables/PRAW" is of type UINT32, not UINT32
> ```

I think this is fixed in the latest midas. There was a typo in this message: the same tid was printed twice, which gives the "mismatch UINT32 and UINT32" you report instead of "mismatch of UINT32 vs what is actually there". The fix only corrects the message; after that you still have to manually fix the data type mismatch in ODB (delete the old entry, I guess).

K.O.
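The message bug described in entry 2199 can be illustrated with a minimal C++ sketch. Everything below is hypothetical: the type ids, the names and both functions are made up for illustration and are not the actual odb.cxx code, and which of the two types was printed twice in the real code is an assumption.

```cpp
#include <string>

// Hypothetical type ids, for illustration only (not the MIDAS TID_xxx table).
enum TypeId { kUInt32, kInt32, kFloat };

inline const char* TypeName(TypeId tid) {
    switch (tid) {
    case kUInt32: return "UINT32";
    case kInt32:  return "INT32";
    case kFloat:  return "FLOAT";
    }
    return "UNKNOWN";
}

// Buggy variant: the requested type is printed in both places, so a real
// mismatch reads as "is of type UINT32, not UINT32".
inline std::string MismatchMessageBuggy(const std::string& path, TypeId in_odb, TypeId requested) {
    (void)in_odb; // the type actually stored in ODB is never used: that is the bug
    return "\"" + path + "\" is of type " + TypeName(requested) + ", not " + TypeName(requested);
}

// Fixed variant: print the type found in ODB and the type the caller asked for.
inline std::string MismatchMessage(const std::string& path, TypeId in_odb, TypeId requested) {
    return "\"" + path + "\" is of type " + TypeName(in_odb) + ", not " + TypeName(requested);
}
```

With the fixed variant, a key stored as INT32 and written as UINT32 would be reported as "is of type INT32, not UINT32", which points directly at the stale ODB entry.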
2200	02 Jun 2021	Konstantin Olchanski	Bug Report	mhttpd WebServer ODBTree initialization
> > Thanks a lot, this solved my issue!
>
> ... or we should turn IPv6 off by default, since not many people use this right now.

IPv6 certainly works and is used at CERN. But I am not sure why people see this message. I do not see it on any machines at TRIUMF, even those with IPv6 turned off.

K.O.
2201	02 Jun 2021	Konstantin Olchanski	Bug Report	History formula not correctly managed
> OS: OSX 10.14.6 Mojave
> MIDAS: Downloaded from repo on April 2021.
>
> I have a slow control frontend doing the command/readout of a MPOD HV/LV. Since I am reading out the current that are in nA (after updating snmp), I wanted to multiply the number by 1e9.
>
> I noticed the new "Formula" field (introduced in 2019 it seems) instead of the "Factor/Offset" I was used to. None of my entries seems to be accepted (after hitting save, when coming back the field is empty).
>
> Looking in ODB in "/History/Display/MPOD/HV (Current)/", the field "Formula" is a string of size 32 (even if I have multiple plots in that display). I noticed that the fields "Factor" and "Offset" are still existing and they are arrays with the correct size. However, changing the values does not seem to do anything.
>
> Deleting "Formula" by hand and creating a new field as an array of string (of correct length) seems to do the trick: the formula is displayed in the History display config, and correctly used.

I see this, too. The problem is that the history plot code must be compatible with both the old scheme (factor/offset) and the new scheme (formula). But something goes wrong somewhere.

https://bitbucket.org/tmidas/midas/issues/307/history-plot-config-incorrect-in-odb

Why?

- new code cannot do "3 year" plots, old code has no problem with it
- old experiments (alpha1, etc) have only the old-style history plot definitions, and both old and new plotting code should be able to show them (there is nobody to convert this old stuff to the "new way", but we still desire to be able to look at it!)

K.O.
2202	02 Jun 2021	Konstantin Olchanski	Info	bitbucket build truncated
I truncated the bitbucket build to only build on ubuntu LTS 20.04. Somehow all the other build targets - centos-7, centos-8, ubuntu-18 - have an obsolete version of cmake. I do not know where the bitbucket os images get these obsolete versions of cmake - my centos-7 and centos-8 have much more recent versions of cmake. If somebody has time to figure it out, please go at it, I would like very much to have centos-7 and centos-8 builds restored (with ROOT), also to have a ubuntu LTS 20.04 build with ROOT. (For me, debugging bitbucket builds is extremely time consuming). Right now many midas cmake files require cmake 3.12 (released in late 2018). I do not know why that particular version of cmake (I took the number from the tutorials I used). I do not know what is the actual version of cmake that MIDAS (and ROOTANA) require/depend on. I wish there were a tool that would look at a cmake file, examine all the features it uses and report the lowest version of cmake that supports them. K.O.
2204	04 Jun 2021	Konstantin Olchanski	Bug Report	cmake with CMAKE_INSTALL_PREFIX fails
> cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas

Good timing, I am working on cmake for manalyzer and rootana and I have not tested the install prefix business. Now I know to test it for all 3 packages.

I will also change find_package(Midas) slightly (see my other message here), I hope you can confirm that I do not break it for you.

K.O.
2205	04 Jun 2021	Konstantin Olchanski	Info	MidasConfig.cmake usage
> find_package(Midas)

I am testing find_package(Midas). There are a number of problems:

1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe". This seems to be an incomplete list of all libraries built by midas (rmana is missing).

This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):

- we discourage use of the midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
- midas-c-compat is for building python interfaces, not for linking midas programs
- mfe contains a main() function, it will collide with the user main() function

So I think this should be changed to just "midas", and the midas linking dependency libraries (-lutil, -lrt, -lpthread) should also be added to this list.

Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)

2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories

Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.

Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.

I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)). The second method seems to be much simpler: everything is exported automatically into one file, and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).

So how much time should I spend in fixing find_package(Midas) to make it generally usable?

- include path is incomplete
- library list is nonsense
- compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
- dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)

K.O.
2206	04 Jun 2021	Konstantin Olchanski	Bug Report	cmake with CMAKE_INSTALL_PREFIX fails
> cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> Is the cmake setup not relocatable? This is new and was working until recently:

Indeed. Not relocatable. This is because we do not install the header files.

When you use the CMAKE_INSTALL_PREFIX, you get MIDAS "installed" in:

prefix/lib
prefix/bin
$MIDASSYS/include <-- this is the source tree and so not "relocatable"!

Before, this was kludged and cmake did not complain about it. Now I changed cmake to handle the include path "the cmake way", and now it knows to complain about it.

I am not sure how to fix this: we have a conflict between:

- our normal way of using midas (include $MIDASSYS/include, link $MIDASSYS/lib, run $MIDASSYS/bin)
- the cmake way (packages must be installed or else! but I do like install(EXPORT)!)
- and your way (midas include files are in $MIDASSYS/include, everything else is in your special location)

I think your case is strange. I am curious why you want midas libraries to be in prefix/lib instead of in $MIDASSYS/lib (in the source tree), but are happy with header files remaining in the source tree.

K.O.
2207	04 Jun 2021	Konstantin Olchanski	Info	MidasConfig.cmake usage
> > find_package(Midas)
>
> So how much time should I spend in fixing find_package(Midas) to make it generally usable?
>
> - include path is incomplete
> - library list is nonsense
> - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)

I think I give up on find_package(Midas). It seems like a lot of work to straighten all this out, when install(EXPORT) does it all automatically and is easier to use for building user frontends and analyzers.

K.O.
2210	08 Jun 2021	Konstantin Olchanski	Bug Report	cmake with CMAKE_INSTALL_PREFIX fails
> > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > Is the cmake setup not relocatable? This is new and was working until recently:
> > Not relocatable. This is because we do not install the header files.
>
> We do it this way, since the lib and bin needs to be in a place where standard users have no access to.

hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you mean by this? you install midas in a secret location to prevent somebody from linking to it?

> If I think of all other packages I am working with, e.g. ROOT, the includes are also installed under CMAKE_INSTALL_PREFIX.

cmake and other frameworks tend to be like procrustean beds (https://en.wikipedia.org/wiki/Procrustes): pre-cmake packages never quite fit perfectly, and either the legs or the heads get cut off. post-cmake packages are constructed to fit the bed, whether it makes sense or not.

given that this situation has been known since antiquity, I doubt we will solve it here today.

(I exercise my freedom of speech rights to state that I object to being put into such situations. And I would like to have it clear that I hate cmake (ask me why)).

> Up until recently there was no issue to work with CMAKE_INSTALL_PREFIX, accepting that the includes stay under
> $MIDASSYS/include, even though this is not quite the standard way, but no problem here.
> I think a solution would be to add install rules for include files.

There will be a bit of trouble: the normal include path is $MIDASSYS/include,$MIDASSYS/mxml,$MIDASSYS/mjson,etc, and after installing it will be $CMAKE_INSTALL_PREFIX/include (all header files from different git submodules all dumped into one directory). I do not know what problems will show up from that.
I think if midas is used as a subproject of a bigger project, this is pretty much required (and I have seen big experiments, like STAR and ND280, do this type of stuff with CMT, another horror and the historical precursor of cmake).

The problem is that we do not have any super-project like this here, so I cannot ever be sure that I have done everything correctly. cmake itself can be helpful, like in the current situation where it told us about a problem. but I will never trust cmake completely, I see cmake do crazy and unreasonable things way too often.

One solution would be for you or somebody else to contribute such a cmake super-project, that would build midas as a subproject, install it with a CMAKE_INSTALL_PREFIX and try to link some trivial frontend or analyzer to check that everything is installed correctly. It would become an example for "how to use midas as a subproject". Ideally, it should be usable in a bitbucket automatic build (assuming bitbucket has correct versions of cmake, which it does not half the time).

P.S. I already spent half-a-week tinkering with cmake rules, only to discover that I broke a kludge that allows you to do something strange (if I have it right, the CMAKE_INSTALL_PREFIX code is your contribution). This does not encourage me to tinker with cmake even more. who knows what other kludge I will bump into. (oh, yes, I know, I already bumped into the nonsense find_package(Midas) implementation).

K.O.
2213	10 Jun 2021	Konstantin Olchanski	Bug Report	cmake with CMAKE_INSTALL_PREFIX fails
> > > > > cmake ../ -DCMAKE_INSTALL_PREFIX=/usr/local/midas
> > > > > Is the cmake setup not relocatable? This is new and was working until recently:
> > > > Not relocatable. This is because we do not install the header files.
> > >
> > > We do it this way, since the lib and bin needs to be in a place where standard users have no access to.
> >
> > hmm... i did not get this. "needs to be in a place where standard users have no access to". what do you
> > mean by this? you install midas in a secret location to prevent somebody from linking to it?
>
> This was a wrong wording from my side. We do not want the users to have write access to the midas installation libs and bins.
> I have submitted the pull request which should resolve this without interfering with your usage.
> Hope this will resolve the issue.

Excellent. I think it is good to have midas "install" in a sane manner.

But I still struggle to understand what you do. Presumably you can "install" midas in the "midas account", which is not writable by the experiment and user accounts. Then it does not matter if you "install" it in its build directory (like we do) or in some other location (like you do now).

This does not work of course if you only have one account, so do you build midas as root? or install it as root?

I do ask because in the current computing world, doing things as root requires a certain amount of trust, which may not be there anymore, see the recent "supply side" attacks against python packages, solar winds hack, linux kernel malicious patches from umn, etc.

Personally, I do not want to answer questions "is midas safe to run as root?", "can I trust the midas install scripts to run as root?" and certainly I do not want to hear about "I installed midas and 100 other packages as root and got hacked 7 days later". (and running midas as root was never safe. neither mhttpd nor mserver will pass a security audit).

Anyhow, looks like I will look at cmake again next week.
Right now I have a major breakthrough in the ALPHA-g experiment: my big 96-port Juniper switch suddenly has working ethernet flow control and I can record data at 600 Mbytes/sec without any UDP packet loss. Above that, my event builder explodes. I want to fix it and get it up to 1000 Mbytes/sec, the limit of my 10gige network link. (In this system I do not have the disk subsystem to record data at this rate, but I have built 8-disk ZFS arrays that would sink it, no problem).

And the day has come when I ran out of CPU cores. The UDP packet receivers are multithreaded, the event builder is multithreaded and I am using all 4 of the available cores (intel cpu). As soon as I can get a rackmounted AMD Ryzen or Threadripper machine, we will likely upgrade. (need at least one more CPU core to run the online analyzer!).

Exciting.

K.O.
2215	15 Jun 2021	Konstantin Olchanski	Info	blog - convert tmfe_rev0 event builder to develop-branch tmfe c++ framework
Now we are converting the alpha-g event builder from rev0 tmfe (midas-2020-xx) to the new tmfe c++ framework in midas-develop.

Earlier, I followed the steps outlined in this blog to convert this event builder from mfe.c framework to rev0 tmfe.

- get latest midas-develop
- examine progs/tmfe_example_everything.cxx
- open feevb.cxx
- comment-out existing main() function
- from tmfe_example_everything.cxx, copy class FeEverything and main() to the bottom of feevb.cxx
- comment-out old main()
- make sure we include the correct #include "tmfe.h"
- rename example frontend class FeEverything to FeEvb
- rename feevb's "rpc handler" and "periodic handler" class EvbEq to EqEvb
- update class declaration and constructor of EqEvb from EqEverything in example_everything: EqEvb extends TMFeEquipment, EqEvb constructor calls constructor of base class (c++ bogosity), keep the bits of the example that initialize the equipment "common"
- in EqEvb, remove data members fMfe and fEq: fMfe is now inherited from the base class, fEq is now "this"
- in FeEvb constructor, wire-in the EqEvb constructor: FeSetName("feevb") and FeAddEquipment(new EqEvb("EVB",__FILE__))
- migrate function names (replace the old call with the new one):
  - fEq->SendEvent() with EqSendEvent()
  - fEq->SetStatus() with EqSetStatus()
  - fEq->ZeroStatistics() with EqZeroStatistics() -- can be removed, taken care of in the framework
  - fEq->WriteStatistics() with EqWriteStatistics() -- can be removed, taken care of in the framework
- (my feevb.o now compiles, but will not work, yet, keep going:)
- EqEvb: update prototypes of all HandleFoo() methods per example_everything.cxx or per tmfe.h, otherwise the framework will not call them. c++ compiler will not warn about this!
- migrate old main():
  - restore initialization of "common" and other things done in the old main():
    - TMFeCommon was merged into TMFeEquipment, move common->Foo = ... to the EqEvb constructor, consult tmfe.h and tmfe.md for current variable names.
    - consider adding "fEqConfReadConfigFromOdb = false;" (see tmfe.md)
  - if EqEvb has a method Init() called from old main(), change its name to HandleInit() with correct arguments.
  - split EqEvb constructor: leave initialization of "common" in the constructor, move all functions, etc into HandleInit()
  - move fMfe->SetTransitionSequenceFoo() calls to HandleFrontendInit()
  - move fMfe->DeregisterTransition{Pause,Resume}() to HandleFrontendInit()
  - old main should be empty now
- remove linking tmfe_rev0.o from feevb Makefile, now it builds!
- try to run it!
- it works!
- done.

K.O.
2216	15 Jun 2021	Konstantin Olchanski	Info	1000 Mbytes/sec through midas achieved!
I am sure everybody else has 10gige and 40gige networks and is sending terabytes of data before breakfast.

Myself, I only have one computer with a 10gige network link and a sufficient number of daq boards to fill it with data. Here is my success story of getting all this data through MIDAS.

This is the anti-matter experiment ALPHA-g now under final assembly at CERN. The main particle detector is a long but thin cylindrical TPC. It surrounds the magnetic bottle (particle trap) where we make and study anti-hydrogen. There are 64 daq boards to read the TPC cathode pads and 8 daq boards to read the anode wires and to form the trigger. Each daq board can produce data at 80-90 Mbytes/sec (1gige links). Data is sent as UDP packets (no jumbo frames). Altera FPGA firmware was done here at TRIUMF by Bryerton Shaw, Chris Pearson, Yair Lynn and myself.

Network interconnect is a 96-port Juniper switch with a 10gige uplink to the main daq computer (quad core Intel(R) Xeon(R) CPU E3-1245 v6 @ 3.70GHz, 64 GBytes of DDR4 memory).

MIDAS data path is: UDP packet receiver frontend -> event builder -> mlogger -> disk -> lazylogger -> CERN EOS cloud storage.

First chore was to get all the UDP packets into the main computer. "U" in UDP stands for "unreliable", and at first, UDP packets were disappearing pretty much anywhere they could. To fix this, in order:

- reading from the udp socket must be done in a dedicated thread (in the midas context, pauses to write statistics or check alarms result in lost udp packets)
- udp socket buffer has to be very big
- maximum queue sizes must be enabled in the 10gige NIC
- ethernet flow control must be enabled on the 10gige link
- ethernet flow control must be enabled in the switch (to my surprise many switches do not have working end-to-end ethernet flow control and lose UDP packets, ask me about this. our big juniper switch balked at first, but I got it working eventually).
- ethernet flow control must be enabled on the 1gige links to each daq module
- ethernet flow control must be enabled in the FPGA firmware (it's a checkbox in qsys)
- FPGA firmware internally must have working back pressure and flow control (avalon and axi buses)
- ideally, this back-pressure should feed back to the trigger. ALPHA-g does not have this (it does not need it).

Next chore was to multithread the UDP receiver frontend and to multithread the event builder. Stock single-threaded programs quickly max out with 100% CPU use and reach nowhere near 10gige data speeds.

Naive multithreading, with two threads, reader (read UDP packet, lock a mutex, put it into a deque, unlock, repeat) and sender (lock a mutex, get a packet from deque, unlock, bm_send_event(), repeat) spends all its time locking and unlocking the mutex and goes nowhere fast (with 1500 byte packets, about 600 kHz of lock/unlock at 10gige speed).

So one has to do everything in batches:

- reader thread: accumulate 1000 udp packets in an std::vector, lock the mutex, dump this batch into a deque, unlock, repeat
- sender thread: lock mutex, get 1000 packets from the deque, unlock, stuff the 1000 packets into 1 midas event, bm_send_event(), repeat

It takes me 5 of these multithreaded udp reader frontends to keep up with a 10gige link without dropping any UDP packets. My first implementation chewed up 500% CPU, that's all of it, there are only 4 CPU cores available, leaving nothing for the event builder (and mlogger, and ...)

I had to:

a) switch from plain socket read() to socket recvmmsg() - 100000 udp packets per syscall vs 1 packet per syscall, and
b) switch from plain bm_send_event() to bm_send_event_sg() - using a scatter-gather list to avoid a memcpy() of each udp packet into one big midas event.

Next is the event builder.
The event builder needs to read data from the 5 midas event buffers (one buffer per udp reader frontend, each midas event contains 1000 udp packets as individual data banks), examine trigger timestamps inside each udp packet, collect udp packets with matching timestamps into a physics event, bm_send_event() it to the SYSTEM buffer. rinse and repeat.

Initial single threaded implementation maxed out at about 100-200 Mbytes/sec with 100% busy CPU.

After trying several threading schemes, the final implementation has these threads:

- 5 threads to read the 5 event buffers, these threads also examine the udp packets, extract timestamps, etc
- 1 thread to sort udp packets by timestamp and to collect them into physics events
- 1 thread to bm_send_event() physics events to the SYSTEM buffer
- main thread and rpc handler thread (tmfe frontend)

(Again, to reduce lock contention, all data is passed between threads in large batches)

This got me up to about 800 Mbytes/sec. To get more, I had to switch the event builder from old plain bm_send_event() to the scatter-gather bm_send_event_sg(), and I had to reduce CPU use by other programs, see steps (a) and (b) above.

So, at the end, success, full 10gige data rate from daq boards to the MIDAS SYSTEM buffer.

(But wait, what about the mlogger? In this experiment, we do not have a disk storage array to sink this much data. But it is an already-solved problem. On the data storage machines I built for GRIFFIN - 8 SATA NAS HDDs using raidz2 ZFS - the stock MIDAS mlogger can easily sink 1000 Mbytes/sec from SYSTEM buffer to disk).

Lessons learned:

- do not use UDP. dealing with packet loss will cost you a fortune in headache medicines and hair restorations.
- use jumbo frames. difference in per-packet overhead between 1500 byte and 9000 byte packets is almost a factor of 10.
- everything has to be done in bulk to reduce per-packet overheads. recvmmsg(), batched queue push/pop, etc
- avoid memory allocations (I had a per-packet std::string, replaced it with char[5])
- avoid memcpy(), use writev(), bm_send_event_sg() & co

K.O.

P.S. Let's count the number of data copies in this system:

x udp reader frontend:
- ethernet NIC DMA into linux network buffers
- recvmmsg() memcpy() from linux network buffer to my memory
- bm_send_event_sg() memcpy() from my memory to the MIDAS shared memory event buffer

x event builder:
- bm_receive_event() memcpy() from MIDAS shared memory event buffer to my event buffer
- my memcpy() from my event buffer to my per-udp-packet buffers
- bm_send_event_sg() memcpy() from my per-udp-packet buffers to the MIDAS shared memory event buffer (SYSTEM)

x mlogger:
- bm_receive_event() memcpy() from MIDAS SYSTEM buffer
- memcpy() in the LZ4 data compressor
- write() syscall memcpy() to linux system disk buffer
- SATA interface DMA from linux system disk buffer to disk.

Would a monolithic massively multithreaded daq application be more efficient? ("udp receiver + event builder + logger"). Yes, about 4 memcpy() out of about 10 will go away.

Would I be able to write such a monolithic daq application? I think not. Already, at 10gige data rates, for all practical purposes, it is impossible to debug most problems, especially subtle trouble in multithreading (race conditions) and in memory allocations. At best, I can sprinkle assert()s and look at core dumps.

So the good old divide-and-conquer approach is still required, MIDAS still rules.

K.O.
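The batched hand-off between the reader and sender threads described in entry 2216 can be sketched roughly as follows. This is a minimal illustration and not the ALPHA-g code: the `Packet` type and the `BatchQueue` class are hypothetical names, and the real frontend additionally uses recvmmsg() on the receive side and bm_send_event_sg() on the send side, which are left out here.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for one received UDP packet.
using Packet = std::vector<char>;

// Queue shared by a reader thread and a sender thread. Both sides move a
// whole batch of packets per lock/unlock, so the mutex is taken once per
// batch instead of once per packet.
class BatchQueue {
public:
    // Reader side: hand over the accumulated batch under a single lock.
    // The caller's vector is left empty and can be reused for the next batch.
    void PushBatch(std::vector<Packet>& batch) {
        std::lock_guard<std::mutex> lock(fMutex);
        for (auto& p : batch)
            fQueue.push_back(std::move(p));
        batch.clear();
    }

    // Sender side: take up to max_packets under a single lock.
    // Returns an empty vector if nothing is queued.
    std::vector<Packet> PopBatch(std::size_t max_packets) {
        std::vector<Packet> out;
        std::lock_guard<std::mutex> lock(fMutex);
        while (!fQueue.empty() && out.size() < max_packets) {
            out.push_back(std::move(fQueue.front()));
            fQueue.pop_front();
        }
        return out;
    }

private:
    std::mutex fMutex;
    std::deque<Packet> fQueue;
};
```

In this sketch the reader thread would fill a local std::vector with about 1000 packets and then call PushBatch(), while the sender thread would loop on PopBatch(1000) and pack each returned batch into one midas event. With a batch size of 1000, the lock/unlock rate drops by the same factor of 1000 compared with the per-packet scheme.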
2221	18 Jun 2021	Konstantin Olchanski	Bug Report	my html modbvalue thing is not working?
I have a web page and I try to use modbvalue, but nothing happens. As best I can tell, I follow the documentation (https://midas.triumf.ca/MidasWiki/index.php/Custom_Page#modbvalue).

<td id=setv0><div class="modbvalue" data-odb-path="/Equipment/CAEN_hvps01/Settings/VSET[0]" data-odb-editable="1">(ch0)</div></td>

I suppose I could add debug logging to the javascript framework for modbvalue to find out why it is not seeing or how it is not liking my web page.

But how would a non-expert user (or an expert user in a hurry) debug this?

Should the modbvalue framework log more error messages to the javascript console ("I am ignoring your modbvalue entry because...")?

Should it have a debug mode where it reports to the javascript console all the tags it scanned, all the tags it found, etc to give me some clue why it does not find my modbvalue tag?

Right now I am not even sure if this framework is activated, perhaps I did something wrong in how I load the page and the modbvalue framework is not loaded. The documentation gives some magic incantations but does not explain where and how this framework is loaded and activated. (But I do not see any differences between my page and the example in the documentation. Except that I do not load control.js, I do not need all the thermometer bars, etc. If I do load it, still my modbvalue does not work).

K.O.
2222	18 Jun 2021	Konstantin Olchanski	Info	1000 Mbytes/sec through midas achieved!
> In MEG II we also kind of achieved this rate.
>
> Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade switch.

Right. We built this DAQ system about 3 years ago and the cheap Chinese switches arrived on the market about 1 year after we purchased the big 96 port juniper switch. Bad timing/good timing.

Actually I have a very nice 24-port 1gige switch ($2000 about 3 years ago), I could have used 4 of them in parallel, but they were discontinued and replaced with a $5000 switch (+$3000 for a 10gige uplink). I think I got the very last cheap one.

But not all Chinese switches are equal. We have a Ubiquiti 10gige switch, and it does not have working end-to-end ethernet flow control. (yikes!).

BTW, for this project we could not use just any cheap switch, we must have 64 fiber SFP ports for connecting on-TPC electronics. This narrows the market significantly and it does not match the industry standard port counts 8-16-24-48-96.

> MikroTik CRS354-48G-4S+2Q+RM 54 port
> MikroTik CRS326-24S-2Q+RM 26 Port

We have a hard time buying this stuff in Vancouver BC, Canada. Most of our regular suppliers are US based and there is a technology trade war still going on between the US and China. I guess we could buy direct on alibaba, but for the risk of scammers, scalpers and iffy shipping.

> both cost in the order of 500 US$

tells one how much we overpay for US based stuff. not surprising, with how Cisco & co can afford to buy sports arenas, etc.

> We were astonished that they don't lose UDP packets when all inputs send a packet at the
> same time, and they have to pipe them to the single output one after the other,
> but apparently the switch have enough buffers.

You probably see ethernet flow control in action. Look at the counters for ethernet pause frames in your daq boards and in your main computer.

> (which is usually NOT written in the data sheets).
True, when I looked into this, I found a paper by somebody in Berkley for special technique to measure the size of such buffers. (The big Juniper switch has only 8 Mbytes of buffer. The current wisdom for backbone networks is to have as little buffering as possible). > To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is > completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the > details. We do not do this. (very bad!). When each trigger arrives, all 64+8 DAQ boards send a train of UDP packets at maximum line speed (64+8 at 1 gige) all funneled into one 10 gige ((64+8)/10 oversubscription). Before we got ethernet flow control to work properly, we had to throttle all the 1gige links by about 60% to get any complete events at all. This would not have been acceptable for physics data taking. > Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your > bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend > attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk. Dummy frontend is not very representative, because limitation is the memory bandwidth and CPU load, and a real ethernet receiver has quite a bit of both (interrupt processing, DMA into memory, implicit memcpy() inside the socket read()). For example, typical memcpy() speeds are between 22 and 10 Gbytes/sec for current generation CPUs and DRAM. This translates for a total budget of 22 and 10 memcpy() at 10gige speeds. Subtract from this 1 memcpy() to DMA data from ethernet into memory and 1 memcpy() to DMA data from memory to storage. Subtract from this 2 implicit memcpy() for read() in the frontend and write() in mlogger. (the Linux sendfile() syscall was invented to cut them out). 
Subtract from this 1 memcpy() for instruction and incidental data fetch (no interesting program fits into cache). Subtract from this the memory bandwidth needed to run the rest of Linux (systemd, ssh, cron jobs, NFS, etc). Hardly anything left when all is said and done. (Found it, the alphagdaq memcpy() runs at 14 Gbytes/sec, so a total budget of 14 memcpy() at 10gige speeds).
And the event builder eats up 2 CPU cores to process the UDP packets at 10gige rate. There is quite a bit of CPU-expensive data unpacking, inspection and processing going on that cannot be cut out. (alphagdaq has 4 cores, "8 threads").
K.O.
P.S. Waiting for rack-mounted machines with AMD "X" series processors... K.O.
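The arithmetic in this reply can be sketched in a few lines of C++. This is only an illustration: the helper names (oversubscription, memcpy_budget, remaining_budget, FixedCosts) are hypothetical and not part of midas, and the data stream is assumed to run at a round 1 Gbyte/sec, matching the round numbers used in the post rather than the exact 10gige line rate.

```cpp
#include <cassert>

// Hypothetical helper: ratio of offered input bandwidth to uplink bandwidth,
// e.g. 64+8 links at 1 gige funneled into one 10 gige link.
inline double oversubscription(int n_links, double link_gbps, double uplink_gbps)
{
   assert(uplink_gbps > 0);
   return n_links * link_gbps / uplink_gbps;
}

// Fixed costs listed in the post, each counted as one full pass
// over the data stream (one "memcpy()" worth of memory bandwidth).
struct FixedCosts {
   int dma_in = 1;          // DMA from ethernet into memory
   int dma_out = 1;         // DMA from memory to storage
   int syscall_copies = 2;  // implicit copies in read() and write()
   int fetch = 1;           // instruction and incidental data fetch
   int total() const { return dma_in + dma_out + syscall_copies + fetch; }
};

// How many passes over the data stream the memory bandwidth allows.
inline double memcpy_budget(double memcpy_gbytes_per_sec, double stream_gbytes_per_sec)
{
   assert(stream_gbytes_per_sec > 0);
   return memcpy_gbytes_per_sec / stream_gbytes_per_sec;
}

// Passes left after the fixed costs. The rest of Linux and the event
// builder's own data processing are NOT subtracted here.
inline double remaining_budget(double memcpy_gbytes_per_sec,
                               double stream_gbytes_per_sec,
                               const FixedCosts& costs = FixedCosts())
{
   return memcpy_budget(memcpy_gbytes_per_sec, stream_gbytes_per_sec) - costs.total();
}
```

With the 14 Gbytes/sec quoted for alphagdaq, remaining_budget(14.0, 1.0) leaves 9 passes, and at the low end of the quoted range remaining_budget(10.0, 1.0) leaves 5. Those still have to cover the rest of Linux and every pass the event builder makes over the data. oversubscription(64 + 8, 1.0, 10.0) gives 7.2.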
2223	18 Jun 2021	Konstantin Olchanski	Info	1000 Mbytes/sec through midas achieved!
> ... MEG II ... 34 crates each with 32 DRS4 digitiser chips and a single 1 Gbps readout link through a Xilinx Zynq SoC.
>
> Zynq ... embedded ethernet MAC does not support jumbo frames (always read the fine print in the manuals!)
> and the embedded Linux ethernet stack seems to struggle when we go beyond 250 Mbps of UDP traffic.
that's an ouch. we use the altera ethernet mac, and jumbo frames are supported, but the firmware data path was originally written assuming 1500-byte packets and it is too much work to rewrite it for jumbo frames. we send the data directly from the FPGA fabric to the ethernet, there is an avalon/axi bus multiplexer to split the ethernet packets to the NIOS slow control CPU. not sure if such a scheme is possible for SoC FPGAs with embedded ARM CPUs.
and yes, a 1 GHz ARM CPU will not do 10gige. You can see it yourself, measure your memcpy() speed. Where a typical PC will have dual-channel 128-bit wide memory (and the Intel memory controller, famous for its low latency), an ARM SoC will have at best 64-bit wide memory (some boards are only 32-bit wide!), with DDR3 (not DDR4) severely under-clocked (i.e. DDR3-900, etc). This is why the new Apple ARM chips are so interesting - can the Apple ARM memory controller beat the Intel x86 memory controller?
> On the receiver side, we have the DAQ server with an Intel E5-2630 v4 CPU
that's the right gear for the job. quad-channel memory with nominal "Max Memory Bandwidth 68.3 GB/s", 10 CPU cores. My benchmark of memcpy() for the much older quad-channel memory i7-4820 with DDR3-1600 DIMMs is 20 Gbytes/sec. waiting for an ARM CPU with similar specs.
> and a 10 Gbit connection to the network using an Intel X710 Network card.
> In the past, we used also a "cheap" 10 Gbit card from Tehuti but the driver performance was so bad that it could not digest more than 5 Gbps of data.
yup, same here. use Intel ethernet exclusively, even for 1gige links.
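The memcpy() speed measurement suggested above can be sketched roughly as follows. This is not the benchmark used for the numbers quoted in this thread. The function name measure_memcpy_gbytes_per_sec is hypothetical, and the result depends strongly on buffer size (it should be much larger than the CPU caches), compiler flags and whatever else is running on the machine.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: time repeated memcpy() of an nbytes buffer and
// return the throughput in Gbytes/sec (1 Gbyte = 1e9 bytes).
double measure_memcpy_gbytes_per_sec(std::size_t nbytes, int repeats)
{
   assert(nbytes > 0 && repeats > 0);
   std::vector<char> src(nbytes, 1);
   std::vector<char> dst(nbytes, 0);
   volatile char sink = 0; // keeps the copies from being optimized away

   // Warm-up copy so that all pages are touched before timing starts.
   std::memcpy(dst.data(), src.data(), nbytes);

   auto t0 = std::chrono::steady_clock::now();
   for (int i = 0; i < repeats; i++) {
      src[0] = static_cast<char>(i); // change the source between copies
      std::memcpy(dst.data(), src.data(), nbytes);
      sink = dst[nbytes - 1];        // read back one byte of the result
   }
   auto t1 = std::chrono::steady_clock::now();
   (void)sink;

   double sec = std::chrono::duration<double>(t1 - t0).count();
   return static_cast<double>(nbytes) * repeats / sec / 1e9;
}
```

For example, measure_memcpy_gbytes_per_sec(256 * 1024 * 1024, 20) copies a 256 Mbyte buffer 20 times. Comparing the result on the DAQ server and on the ARM SoC gives a feel for how many passes over the data each machine can afford.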
> A major modification to Konstantin's scheme is that we need to calibrate all WFMs online so that a software zero suppression
I implemented hardware zero suppression in the FPGA code. I think a 1 GHz ARM CPU does not have the oomph for this.
> rb_get_wp() returns almost always DB_TIMEOUT
replace rb_xxx() with std::deque<std::vector<char>> (protected by a mutex, of course). lots of stuff in the mfe.c frontend is obsolete in the same way. check out the newer tmfe frontends (tmfe.md, tmfe.h and tmfe examples).
> It is difficult to report three years of development in a single Elog
but you are quite successful at it. big thanks for your write-up. I think our info is quite useful for the next people.
K.O.
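The suggestion to replace rb_xxx() with a mutex-protected std::deque<std::vector<char>> can be sketched as below. This is a minimal illustration, not code from midas or tmfe: the class name PacketQueue and its methods are hypothetical, and the non-blocking try_pop() is an assumption about how the consumer thread would poll.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical queue of data packets shared between a receiver thread
// (producer) and an event-building thread (consumer).
class PacketQueue {
public:
   // Move a packet into the queue. The caller's vector is left empty.
   void push(std::vector<char>&& packet)
   {
      std::lock_guard<std::mutex> lock(fMutex);
      fQueue.push_back(std::move(packet));
   }

   // Move the oldest packet out of the queue.
   // Returns false without waiting if the queue is empty.
   bool try_pop(std::vector<char>* packet)
   {
      assert(packet != nullptr);
      std::lock_guard<std::mutex> lock(fMutex);
      if (fQueue.empty())
         return false;
      *packet = std::move(fQueue.front());
      fQueue.pop_front();
      return true;
   }

   // Number of queued packets at the time of the call.
   std::size_t size()
   {
      std::lock_guard<std::mutex> lock(fMutex);
      return fQueue.size();
   }

private:
   std::mutex fMutex;
   std::deque<std::vector<char>> fQueue;
};
```

Because packets are moved rather than copied, pushing and popping costs no extra pass over the data, which matters given the memory bandwidth budget discussed in this thread. A consumer that finds the queue empty can sleep briefly and retry. Adding a std::condition_variable for blocking waits would be a natural extension, but that is not something the post itself asks for.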
2224	18 Jun 2021	Konstantin Olchanski	Info	Add support for rtsp camera streams in mlogger (history_image.cxx)
> mlogger (history_image) now supports rtsp cameras
my goodness, we will drive the video surveillance industry out of business.
> My suggestion / request would be to move the camera management out of
> mlogger and into a new program (mcamera?), so that users can choose to offload
> the CPU load to another system (I understand the OpenCV will use GPU
> decoders if available also, which can also lighten the CPU load).
every 2 years I itch to separate mlogger into two parts - data logger and history logger. but then I remember that the "I" in MIDAS stands for "integrated", and "M" stands for "maximum" and I say, "nah..." (I guess we are not maximum integrated enough for mhttpd, mserver and mlogger to be one monolithic executable).
There is also a line of thinking that mlogger should remain single-threaded for maximum reliability and ease of debugging. So if we keep adding multithreaded stuff to it, perhaps it should be split apart after all. (anything that makes the size of mlogger.cxx smaller is a good thing, imo).
K.O.
