15 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved!
|
I am sure everybody else has 10gige and 40gige networks and are sending terabytes of data before breakfast.
Myself, I only have one computer with a 10gige network link and sufficient number of daq boards to fill
it with data. Here is my success story of getting all this data through MIDAS.
This is the anti-matter experiment ALPHA-g now under final assembly at CERN. The main particle detector is a long but
thin cylindrical TPC. It surrounds the magnetic bottle (particle trap) where we make and study anti-hydrogen. There are
64 daq boards to read the TPC cathode pads and 8 daq boards to read the anode wires and to form the trigger. Each daq
board can produce data at 80-90 Mbytes/sec (1gige links). Data is sent as UDP packets (no jumbo frames). Altera FPGA
firmware was done here at TRIUMF by Bryerton Shaw, Chris Pearson, Yair Lynn and myself.
Network interconnect is a 96-port Juniper switch with a 10gige uplink to the main daq computer (quad core Intel(R)
Xeon(R) CPU E3-1245 v6 @ 3.70GHz, 64 GBytes of DDR4 memory).
MIDAS data path is: UDP packet receiver frontend -> event builder -> mlogger -> disk -> lazylogger -> CERN EOS cloud
storage.
First chore was to get all the UDP packets into the main computer. "U" in UDP stands for "unreliable", and at first, UDP
packets have been disappearing pretty much anywhere they could. To fix this, in order:
- reading from the udp socket must be done in a dedicated thread (in the midas context, pauses to write statistics or
check alarms result in lost udp packets)
- udp socket buffer has to be very big
- maximum queue sizes must be enabled in the 10gige NIC
- ethernet flow control must be enabled on the 10gige link
- ethernet flow control must be enabled in the switch (to my surprise many switches do not have working end-to-end
ethernet flow control and lose UDP packets, ask me about this. our big juniper switch balked at first, but I got it
working eventually).
- ethernet flow control must be enabled on the 1gige links to each daq module
- ethernet flow control must be enabled in the FPGA firmware (it's a checkbox in qsys)
- FPGA firmware internally must have working back pressure and flow control (avalon and axi buses)
- ideally, this back-pressure should feed back to the trigger. ALPHA-g does not have this (it does not need it).
Next chore was to multithread the UDP receiver frontend and to multithread the event builder. Stock single-threaded
programs quickly max out with 100% CPU use and reach nowhere near 10gige data speeds.
Naive multithreading, with two threads, reader (read UDP packet, lock a mutex, put it into a deque, unlock, repeat) and
sender (lock a mutex, get a packet from deque, unlock, bm_send_event(), repeat) spends all it's time locking and
unlocking the mutex and goes nowhere fast (with 1500 byte packets, about 600 kHz of lock/unlock at 10gige speed).
So one has to do everything in batches: reader thread: accumulate 1000 udp packets in an std::vector, lock the mutex,
dump this batch into a deque, unlock, repeat; sender thread: lock mutex, get 1000 packets from the deque, unlock, stuff
the 1000 packets into 1 midas event, bm_send_event(), repeat.
It takes me 5 of these multithreaded udp reader frontends to keep up with a 10gige link without dropping any UDP packets.
My first implementation chewed up 500% CPU, that's all of it, there is only 4 CPU cores available, leaving nothing
for the event builder (and mlogger, and ...)
I had to:
a) switch from plain socket read() to socket recvmmsg() - 100000 udp packets per syscall vs 1 packet per syscall, and
b) switch from plain bm_send_event() to bm_send_event_sg() - using a scatter-gather list to avoid a memcpy() of each udp
packet into one big midas event.
Next is the event builder.
The event builder needs to read data from the 5 midas event buffers (one buffer per udp reader frontend, each midas event
contains 1000 udp packets as indovidual data banks), examine trigger timestamps inside each udp packet, collect udp
packets with matching timestamps into a physics event, bm_send_event() it to the SYSTEM buffer. rinse and repeat.
Initial single threaded implementation maxed out at about 100-200 Mbytes/sec with 100% busy CPU.
After trying several threading schemes, the final implementation has these threads:
- 5 threads to read the 5 event buffers, these threads also examine the udp packets, extract timestamps, etc
- 1 thread to sort udp packets by timestamp and to collect them into physics events
- 1 thread to bm_send_event() physics events to the SYSTEM buffer
- main thread and rpc handler thread (tmfe frontend)
(Again, to reduce lock contention, all data is passed between threads in large batches)
This got me up to about 800 Mbytes/sec. To get more, I had to switch the event builder from old plain bm_send_event() to
the scatter-gather bm_send_event_sg(), *and* I had to reduce CPU use by other programs, see steps (a) and (b) above.
So, at the end, success, full 10gige data rate from daq boards to the MIDAS SYSTEM buffer.
(But wait, what about the mlogger? In this experiment, we do not have a disk storage array to sink this
much data. But it is an already-solved problem. On the data storage machines I built for GRIFFIN - 8 SATA NAS HDDs using
raidz2 ZFS - the stock MIDAS mlogger can easily sink 1000 Mbytes/sec from SYSTEM buffer to disk).
Lessons learned:
- do not use UDP. dealing with packet loss will cost you a fortune in headache medicines and hair restorations.
- use jumbo frames. difference in per-packet overhead between 1500 byte and 9000 byte packets is almost a factor of 10.
- everything has to be done in bulk to reduce per-packet overheads. recvmmsg(), batched queue push/pop, etc
- avoid memory allocations (I has a per-packet std::string, replaced it with char[5])
- avoid memcpy(), use writev(), bm_send_event_sg() & co
K.O.
P.S. Let's counting the number of data copies in this system:
x udp reader frontend:
- ethernet NIC DMA into linux network buffers
- recvmmsg() memcpy() from linux network buffer to my memory
- bm_send_event_sg() memcpy() from my memory to the MIDAS shared memory event buffer
x event builder:
- bm_receive_event() memcpy() from MIDAS shared memory event buffer to my event buffer
- my memcpy() from my event buffer to my per-udp-packet buffers
- bm_send_event_sg() memcpy() from my per-udp-packet buffers to the MIDAS shared memory event buffer (SYSTEM)
x mlogger:
- bm_receive_event() memcpy() from MIDAS SYSTEM buffer
- memcpy() in the LZ4 data compressor
- write() syscall memcpy() to linux system disk buffer
- SATA interface DMA from linux system disk buffer to disk.
Would a monolithic massively multithreaded daq application be more efficient?
("udp receiver + event builder + logger"). Yes, about 4 memcpy() out of about 10 will go away.
Would I be able to write such a monolithic daq application?
I think not. Already, at 10gige data rates, for all practical purposes, it is impossible
to debug most problems, especially subtle trouble in multithreading (race conditions)
and in memory allocations. At best, I can sprinkle assert()s and look at core dumps.
So the good old divide-and-conquer approach is still required, MIDAS still rules.
K.O. |
18 Jun 2021, Konstantin Olchanski, Bug Report, my html modbvalue thing is not working?
|
I have a web page and I try to use modbvalue, but nothing happens. The best I can tell, I follow the documentation
(https://midas.triumf.ca/MidasWiki/index.php/Custom_Page#modbvalue).
<td id=setv0><div class="modbvalue" data-odb-path="/Equipment/CAEN_hvps01/Settings/VSET[0]" data-odb-editable="1">(ch0)</div></td>
I suppose I could add debug logging to the javascript framework for modbvalue to find out why it is not seeing
or how it is not liking my web page.
But how would a non-expert user (or an expert user in a hurry) would debug this?
Should the modbvalue framework log more error messages to the javascrpt console ("I am ignoring your modbvalue entry because...")?
Should it have a debug mode where it reports to the javascript console all the tags it scanned, all the tags it found, etc
to give me some clue why it does not find my modbvalue tag?
Right now I am not even sure if this framework is activated, perhaps I did something wrong in how I load the page
and the modbvalue framework is not loaded. The documentation gives some magic incantations but does not explain
where and how this framework is loaded and activated. (But I do not see any differences between my page and
the example in the documentation. Except that I do not load control.js, I do not need all the thermometer bars, etc.
If I do load it, still my modbvalue does not work).
K.O. |
18 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved!
|
> In MEG II we also kind of achieved this rate.
>
> Instead of an expensive high-grade switch, we chose a cheap "Chinese" high-grade switch.
Right. We built this DAQ system about 3 years ago and the cheep Chineese switches arrived
on the market about 1 year after we purchased the big 96 port juniper switch. Bad timing/good timing.
Actually I have a very nice 24-port 1gige switch ($2000 about 3 years ago), I could have
used 4 of them in parallel, but they were discontinued and replaced with a $5000 switch
(+$3000 for a 10gige uplink. I think I got the last very last one cheap switch).
But not all Chineese switches are equal. We have an Ubiquity 10gige switch, and it does
not have working end-to-end ethernet flow control. (yikes!).
BTW, for this project we could not use just any cheap switch, we must have 64 fiber SFP ports
for connecting on-TPC electronics. This narrows the market significantly and it does
not match the industry standard port counts 8-16-24-48-96.
> MikroTik CRS354-48G-4S+2Q+RM 54 port
> MikroTik CRS326-24S-2Q+RM 26 Port
We have a hard time buying this stuff in Vancouver BC, Canada. Most of our regular suppliers
are US based and there is a technology trade war still going on between the US and China.
I guess we could buy direct on alibaba, but for the risk of scammers, scalpers and iffy shipping.
> both cost in the order of 500 US$
tell one how much we overpay for US based stuff. not surprising, with how Cisco & co can afford
to buy sports arenas, etc.
> We were astonished that they don't loose UDP packets when all inputs send a packet at the
> same time, and they have to pipe them to the single output one after the other,
> but apparently the switch have enough buffers.
You probably see ethernet flow control in action. Look at the counters for ethernet pause frames
in your daq boards and in your main computer.
> (which is usually NOT written in the data sheets).
True, when I looked into this, I found a paper by somebody in Berkley for special
technique to measure the size of such buffers.
(The big Juniper switch has only 8 Mbytes of buffer. The current wisdom for backbone networks
is to have as little buffering as possible).
> To avoid UDP packet loss for several events, we do traffic shaping by arming the trigger only when the previous event is
> completely received by the frontend. This eliminates all flow control and other complicated methods. Marco can tell you the
> details.
We do not do this. (very bad!). When each trigger arrives, all 64+8 DAQ boards send a train of UDP packets
at maximum line speed (64+8 at 1 gige) all funneled into one 10 gige ((64+8)/10 oversubscription).
Before we got ethernet flow control to work properly, we had to throttle all the 1gige links by about 60%
to get any complete events at all. This would not have been acceptable for physics data taking.
> Another interesting aspect: While we get the data into the frontend, we have problems in getting it through midas. Your
> bm_send_event_sg() is maybe a good approach which we should try. To benchmark the out-of-the-box midas, I run the dummy frontend
> attached on my MacBook Pro 2.4 GHz, 4 cores, 16 GB RAM, 1 TB SSD disk.
Dummy frontend is not very representative, because limitation is the memory bandwidth
and CPU load, and a real ethernet receiver has quite a bit of both (interrupt processing,
DMA into memory, implicit memcpy() inside the socket read()).
For example, typical memcpy() speeds are between 22 and 10 Gbytes/sec for current
generation CPUs and DRAM. This translates for a total budget of 22 and 10 memcpy()
at 10gige speeds. Subtract from this 1 memcpy() to DMA data from ethernet into memory
and 1 memcpy() to DMA data from memory to storage. Subtract from this 2 implicit
memcpy() for read() in the frontend and write() in mlogger. (the Linux sendfile() syscall
was invented to cut them out). Subtract from this 1 memcpy() for instruction and incidental
data fetch (no interesting program fits into cache). Subtract from this memory bandwidth
for running the rest of linux (systemd, ssh, cron jobs, NFS, etc). Hardly anything
left when all is said and done. (Found it, the alphagdaq memcpy() runs at 14 Gbytes/sec,
so total budget of 14 memcpy() at 10gige speeds).
And the event builder eats up 2 CPU cores to process the UDP packets at 10gige rate,
there is quite a bit of CPU-expensive data unpacking, inspection and processing
going on that cannot be cut out. (alphagdaq has 4 cores, "8 threads").
K.O.
P.S. Waiting for rack-mounted machines with AMD "X" series processors... K.O. |
18 Jun 2021, Konstantin Olchanski, Info, 1000 Mbytes/sec through midas achieved!
|
> ... MEG II ... 34 crates each with 32 DRS4 digitiser chips and a single 1 Gbps readout link through a Xilinx Zynq SoC.
>
> Zynq ... embedded ethernet MAC does not support jumbo frames (always read the fine prints in the manuals!)
> and the embedded Linux ethernet stack seems to struggle when we go beyond 250 Mbps of UDP traffic.
that's an ouch. we use the altera ethernet mac, and jumbo frames are supported, but the firmware data path
was originally written assuming 1500-byte packets and it is too much work to rewrite it for jumbo frames.
we send the data directly from the FPGA fabric to the ethernet, there is an avalon/axi bus multiplexer
to split the ethernet packets to the NIOS slow control CPU. not sure if such scheme is possible
for SoC FPGAs with embedded ARM CPUs.
and yes, a 1 GHz ARM CPU will not do 10gige. You see it yourself, measure your memcpy() speed. Where
typical PC will have dual-channel 128-bit wide memory (and the famous for it's low latency
Intel memory controller), ARM SoC will have at best 64-bit wide memory (some boards are only 32-bit wide!),
with DDR3 (not DDR4) severely under-clocked (i.e. DDR3-900, etc). This is why the new Apple ARM chips
are so interesting - can Apple ARM memory controller beat the Intel x86 memory controller?
> On the receiver side, we have the DAQ server with an Intel E5-2630 v4 CPU
that's the right gear for the job. quad-channel memory with nominal "Max Memory Bandwidth 68.3 GB/s",
10 CPU cores. My benchmark of memcpy() for the much older duad-channel memory i7-4820 with DDR3-1600 DIMMs
is 20 Gbytes/sec. waiting for ARM CPU with similar specs.
> and a 10 Gbit connection to the network using an Intel X710 Network card.
> In the past, we used also a "cheap" 10 Gbit card from Tehuti but the driver performance was so bad that it could not digest more than 5 Gbps of data.
yup, same here. use Intel ethernet exclusively, even for 1gige links.
> A major modification to Konstantin scheme is that we need to calibrate all WFMs online so that a software zero suppression
I implemented hardware zero suppression in the FPGA code. I think 1 GHz ARM CPU does not have the oomph for this.
> rb_get_wp() returns almost always DB_TIMEOUT
replace rb_xxx() with std::deque<std::vector<char>> (protected by a mutex, of course). lots of stuff in the mfe.c frontend
is obsolete in the same way. check out the newer tmfe frontends (tmfe.md, tmfe.h and tmfe examples).
> It is difficult to report three years of development in a single Elog
but quite successful at it. big thanks for your write-up. I think our info is quite useful for the next people.
K.O. |
18 Jun 2021, Konstantin Olchanski, Info, Add support for rtsp camera streams in mlogger (history_image.cxx)
|
> mlogger (history_image) now supports rtsp cameras
my goodness, we will drive the video surveillance industry out of business.
> My suggestion / request would be to move the camera management out of
> mlogger and into a new program (mcamera?), so that users can choose to off
> load the CPU load to another system (I understand the OpenCV will use GPU
> decoders if available also, which can also lighten the CPU load).
every 2 years I itch to separate mlogger into two parts - data logger
and history logger.
but then I remember that the "I" in MIDAS stands for "integrated",
and "M" stands for "maximum" and I say, "nah..."
(I guess we are not maximum integrated enough to have mhttpd, mserver
and mlogger to be one monolithic executable).
There is also a line of thinking that mlogger should remain single-threaded
for maximum reliability and ease of debugging. So if we keep adding multithreaded
stuff to it, perhaps it should be split-apart after all. (anything that makes
the size of mlogger.cxx smaller is a good thing, imo).
K.O. |
20 Jun 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage
|
> I agree that those two things are problems, but I don't see why it is preferable to leave the MidasConfig.cmake in this "broken" state. For us
> problem 1 is less of an issue, becaues we run "link_directories(${MIDAS_LIBRARY_DIRS})" in the top CMakeLists.txt and then just link against "midas",
> not "${MIDAS_LIBRARIES}". However, number 2 would be nice, to not manually hack in target_include_directories(target ${MIDASSYS}/mscb/include),
> especially because ${MIDASSYS} is not set in cmake.
So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
Problem still remains with required auxiliary libraries for linking "-lmidas". Sometimes you
need "-lutil" and "-lrt" and "-lpthread", sometimes not. Some way to pass this information
automatically would be nice.
Problem still remains that I cannot do these changes because I have no test harness
for any of this. Would be great if you could contribute this and post the documentation
blurb that we can paste into the midas wiki documentation.
And I still do not understand why we have to do all this work when cmake "import(EXPORT)"
already does all of this automatically. What am I missing?
K.O.
>
> I see two solutions for problem 2: Treat mscb as a submodule and compile and install it together with midas, or add the include directory to
> ${MIDAS_INCLUDE_DIRS} (same applies to the other submodules, mscb is the one that made me open this elog just now)
>
> Cheers
> Lukas
>
> > > find_package(Midas)
> >
> > I am testing find_package(Midas). There is a number of problems:
> >
> > 1) ${MIDAS_LIBRARIES} is set to "midas;midas-shared;midas-c-compat;mfe".
> >
> > This seem to be an incomplete list of all libraries build by midas (rmana is missing).
> >
> > This means ${MIDAS_LIBRARIES} should not be used for linking midas programs (unlike ${ROOT_LIBRARIES}, etc):
> >
> > - we discourage use of midas shared library because it always leads to problems with shared library version mismatch (static linking is preferred)
> > - midas-c-compat is for building python interfaces, not for linking midas programs
> > - mfe contains a main() function, it will collide with the user main() function
> >
> > So I think this should be changed to just "midas" and midas linking dependancy
> > libraries (-lutil, -lrt, -lpthread) should also be added to this list.
> >
> > Of course the "install(EXPORT)" method does all this automatically. (so my fixing find_package(Midas) is a waste of time)
> >
> > 2) ${MIDAS_INCLUDE_DIRS} is missing the mxml, mjson, mvodb, midasio submodule directories
> >
> > Again, install(EXPORT) handles all this automatically, in find_package(Midas) it has to be done by hand.
> >
> > Anyhow, this is easy to add, but it does me no good in the rootana cmake if I want to build against old versions
> > of midas. So in the rootana cmake, I still have to add $MIDASSYS/mvodb & co by hand. Messy.
> >
> > I do not know the history of cmake and why they have two ways of doing things (find_package and install(EXPORT)),
> > this second method seems to be much simpler, everything is exported automatically into one file,
> > and it is much easier to use (include the export file and say target_link_libraries(rootana PUBLIC midas)).
> >
> > So how much time should I spend in fixing find_package(Midas) to make it generally usable?
> >
> > - include path is incomplete
> > - library list is nonsense
> > - compiler flags are not exported (we do not need -DOS_LINUX, but we do need -DHAVE_ZLIB, etc)
> > - dependency libraries are not exported (-lz, -lutil, -lrt, -lpthread, etc)
> >
> > K.O. |
24 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
I am updating the history plots. Main changes:
- the old history display code should again be easily usable (use the "open in old history display" checkbox)
- the history plot editor has an "edit in ODB" button that takes as to the plot definition in ODB (sometimes it is
easier to editing things in the ODB editor)
- error in history plot editor that created "formula" entry of incorrect size should be fixed
- "reorder" (and "delete entry") functions in the history plot editor should work again (plus added explanation text)
- "factor" and "offset" restored in the history plot editor
- added the long desired "voffset" to simplify plot scaling and positioning
- (factor, offset and voffset do not yet work in the new history plots, TBI ASAP)
- history plot editor and generate_hist_graph() now use the same code to read plot definitions from ODB. There should
be no more confusion about content of history plot entries in ODB and what each entry is supposed to do.
These changes have been precipitated by our inability to plot high voltage voltage and current on the same plot,
see bug https://bitbucket.org/tmidas/midas/issues/308/history-plot-formula-cannot-be-used-to
Voltage is in the range 0..1000 (volts) and current is in the range 0..50 and 0..0.100, autoscaling on voltage
makes the currents invisible at the zero line. In the past, we used the "factor" setting to scale
the graphs so we can see both voltage and currents at the same time (currents scaled up by factor 25 and 600,
as example).
The new "formula" feature was supposed to replace (and improve upon) the "factor" and "offset". But if I use
the formula "x*25", suddenly the plot is telling us that current values are not 50 uA, but 1250 uA (50*25),
and this is just wrong. We do not want to scale the micro-amps, we want to better position the plot on the graph,
like the old "factor" and "offset" allowed us to do.
So the idea is to use this computation:
y_position_on_plot = offset + factor*(formula(history_value) - voffset)
- "formula" is to transform history values into physical values (i.e. pressure meter reports bars, but we want atm, or
voltmeter is reading in discrete units of 0.125V, we want to see volts)
- "factor" and "offset" is to position the graphs on the plot for best visual presentation of data
- I also added is the much desired "voffset", you only know it is needed if you have a non-zero "offset" and you need
to change the "factor", surprise, "offset" has ot be changed, too, and good luck recalculating it correctly in one
try.
The way to use this stuff:
- adjust "voffset" to bring the graph to around y=0
- increase the "factor" to zoom-in on features and stuff
- adjust "offset" to move the graph up and down relative to all the other graphs on the plot
- now one can zoom in and out as needed by changing the "factor" and the plot will stay roughly in the right place
without having to readjust the offsets.
K.O. |
24 Jun 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage
|
> > So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> A more moderate option ...
For the record, I did not disappear. I have a very short time window
to complete commissioning the alpha-g daq (now that the network
and the event builder are cooperating). To add to the fun, our high voltage
power supply turned into a pumpkin, so plotting voltages and currents
on the same history plot at the same time (like we used to be able to do)
went up in priority.
K.O. |
25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
> A general warning: With the recent history changes implemented in the develop branch, starting from a fresh ODB and editing
> any history panel, on gets tons of errors and debug output from mhttpd: ...
This is the reason most projects have separate development and production branches.
I recommend everybody to use the released tagged versions of midas for production.
> I strongly recommend to make such modifications on a separate branch not to
> break running experiments.
Is there something that does not work anymore? Did I break something? The debug messages I am still
tuning.
K.O.
>
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Minimum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Maximum" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Log axis" returned status 312
> MVOdb: Error: MIDAS db_get_value() at ODB path "/History/Display/Default/Trigger rate/Zero ylow" returned status 312
> Load from ODB History/Display/Default/Trigger rate: hist plot: 2 variables
> timescale: 1h, minimum: 0.000000, maximum: 0.000000, zero_ylow: 0, log_axis: 0, show_run_markers: 1, show_values: 1,
> show_fill: 1
> var[0] event [System][Trigger per sec.] formula [], colour [#00AAFF] label [] factor 1.000000 offset 0.000000 voffset
> 0.000000 order 10
> var[1] event [System][Trigger kB per sec.] formula [], colour [#FF9000] label [] factor 1.000000 offset 0.000000 voffset
> 0.000000 order 20
>
>
>
> This has to be fixed by the original author. I strongly recommend to make such modifications on a separate branch not to
> break running experiments.
>
> Stefan |
25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
> I disagree ...
I am happy with disagreement and differences of opinions. Zest of life, driver of progress and improvements, etc.
I am even more happy with solutions to problems. The current problem is that the offset and factor feature
of history plots has been removed without much discussion.
I stress, we have been using this feature to run experiments for the last 20 years.
I do not understand objections to it being restored. If you do not want to use it, do not use it.
K.O.
> with the proposed change to scale the HV current for a "nice" display. If values are scaled, the axis should be
> scaled in the same way. Otherwise people might read the current from the plot, look at the axis, and again get the wrong
> value (the factor of 25x you mention). Sure you can hover with the cursor over the graph, and see the right value, but think
> of taking a screen shot, putting this into a publication, and get complaints from the reviewer.
>
> The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a
> new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For
> the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.
>
> Stefan |
25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
> > The only "correct" way in my opinion is to implement two vertical axis, as can be seen in some papers. One for the HV, and a
> > new TBD right axis for the current values, then indicating for each graph if the left or right vertical axis applies. For
> > the secondary axis we can have autoscaling or fixed scaling, as we have for the primary axis.
In the past, we have done some useful plots with maybe 10 variables plotted
at the same time with different scaling and positioning on the graph.
Having 2 vertical axis is maybe useful for the specific case of plotting high voltages,
but not in the general case.
Actually, just 2 vertical axis will not work to plot high voltages in ALPHA-g, because
we have anode currents on the scale 0..0.1 uA and cathode currents on the scale 50..60 uA.
K.O. |
25 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
I will have to post an example of a scaled plot. I figure everybody forgot how they look like.
K.O.
> We are using the new history formula as a quick way to convert signals from sensors to actual physical values (for example Voltage->Temperature, Voltage->relative humidity
> ...), so it is great that the shown voltage is the calculated one.
>
> I would like to add a point to this discussion.
> In our collaboration people attach images of history plots to elogs, meeting presentation and/or physical logbooks.
> The proposed scaling formula may work fine online using the cursors, but, once an image is created, I do not understand how it is possible to extract the value for a scaled
> variables.
> Suppose you see a graph in a presentation with a current increase by some PSU and the current was scaled to be in the same plot of the voltage.
> Looking at the delta in the image, how can you judge the current increase without any axis/grid to refer to?
>
> So I support Stefan proposal for a secondary axis, as long as it is clear which value belong to which axis.
> Maybe marking the channels in the description or using different line styles/thickness?
>
> Best,
> Marco |
28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer
|
> Hi all,
> for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> Let me know if you think this is a good approach.
>
Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
should be named "ODBLoadJSON" to be clear about this.
(JSON is preferred over .odb and .xml for many reasons (ask me))
K.O. |
28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer
|
> > > Hi all,
> > > for my experiment we ended up with the need of changing lot of parameters (~9000 values) in the ODB at once by the sequencer.
> > > The very first solution was to use a sequencer function with a ton of ODBSET calls, however a more elegant solution may be to provide an "ODBLoad" command which mimics the "load" command of odbedit.
> > > I already have a working modification to the sequencer for this, if you agree I will commit it to a dedicated brach.
> > > Let me know if you think this is a good approach.
> > >
> >
> > Sounds like a good idea. I trust you are using the data in json format? Perhaps the command
> > should be named "ODBLoadJSON" to be clear about this.
> >
> > (JSON is preferred over .odb and .xml for many reasons (ask me))
>
> What if some experiment keep some files in .xml format (ask me!). The routine should check for the extension and support all three formats.
>
Yes, hard to tell without seeing his full proposal, including the code. If it is load from file,
sure we look at the file extension, I think the existing code already would do this and support all 3 formats.
But if he wants to load ODB data from a text literal or from a string,
we might as well stick to json. I guess we could support the other formats, but I do not see anybody
using anything other than json for new code like this.
ODBPasteJSON("/foo/bar/baz", '{"var1":1, "var2":"somestr"}');
K.O. |
28 Jun 2021, Konstantin Olchanski, Suggestion, ODB Load in Sequencer
|
> ... at MEG, we keep hundreds of XML files for configuration. Mostly historical, but that's how it is.
same here, lots of historical .odb and .xml files.
I think the .odb and .xml support is here to stay. Best I remember, latest things I fixed in both
was support for unlimited string length (and removal of associated buffer overflows). Right now,
I am not sure if both are UTF-8 clean and if they properly escape all control characters,
something to fix as we go or as we bump into problems.
K.O. |
30 Jun 2021, Konstantin Olchanski, Bug Fix, changes in history plots
|
> I am updating the history plots.
> So the idea is to use this computation:
> y_position_on_plot = offset + factor*(formula(history_value) - voffset)
Stefan and myself did some brain storming on zoom. Writing it down the way I remember it.
- we distilled the gist of the problem - the numerical values we show in the plot labels and in hover-over-the-graph
are before formula is applied or after the formula is applied?
- I suggested a universal solution using a double formula: use formula1 for one case;
use formula2 for the other case;
use formula1 for "physics calibration", use formula2 for factor and offset for composite plots:
numeric_value = formula1(history_value)
plotted_value = formula2(numeric_value)
- we agree that this is way too complicated, difficult to explain and difficult to coherently present in the history editor
- Stefan suggested a simple solution, a checkbox labeled "show raw value" next to each history variable. by default, the
value after the formula is plotted and displayed. if checked, the raw value (before the formula) is displayed, and the
value after the formula is plotted. (so this works the same as the factor and offset on the old history plots).
- if "show raw value" is enabled, the numerical values shown will be inconsistent against the labels on the vertical axis.
Our solution it to turn the axis labels off. (for composite plots, like oscillator frequency in Hz vs oscillator
temperature in degC, both scaled to see their correlation, the vertical axis is unit-less "arbitrary units", of course)
- to simplify migration of old history plots that use custom factor and offset settings, we think in the direction of
automatically moving them to the "formula". (factor=2, offset=10 automatically populates formula with "2*x+10", "show raw
value" checked/enabled). Thus we can avoid implementing factor and offset in the new history code (an unwelcome
complication).
- I think this covers all the use cases I have seen in the past, so we will move in this direction.
K.O. |
09 Jul 2021, Konstantin Olchanski, Bug Report, cmake question
|
cmake check and mate in 1 move. please help.
the midas cmake file has a typo in the ROOT_CXX_FLAGS, I fixed it and now I am dead in the
water, need help from cmake experts and pushers.
On Ubuntu:
ROOT_CXX_FLAGS has -std=c++14
midas cmake defines -std=gnu++11 (never mind that I asked for c++11, not "c++11 with GNU
extensions")
the two compiler flags collide and the build explodes, the best I can tell c++11 prevails
and ROOT header files blow up because they expect c++14.
if I remove the midas cmake request for c++11, -std=gnu++11 is gone, there is no conflict
with ROOT C++14 request and the build works just fine.
but now it explodes on CentOS-7 because by default, c++11 is not enabled. (include <mutex>
blows up).
what a mess.
K.O. |
09 Jul 2021, Konstantin Olchanski, Info, cannot push to bitbucket
|
the day has arrived when I cannot git push to bitbucket. cloud computing rules!
I have never seen this error before and I do not think we have any hooks installed,
so it must be some bitbucket stuff. their status page says some kind of maintenance
is happening, but the promised error message is "repository is read only" or something
similar.
I hope this clears out automatically. I am updating all the cmake crud and I have no idea
which changes I already pushed and which I did not, so no idea if anything will work for
people who pull from midas until this problem is cleared out.
daq00:mvodb$ git push
X11 forwarding request failed on channel 0
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 12 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 247 bytes | 247.00 KiB/s, done.
Total 2 (delta 1), reused 0 (delta 0)
remote: null value in column "attempts" violates not-null constraint
remote: DETAIL: Failing row contains (13586899, 2021-07-10 01:13:28.812076+00, 1970-01-01
00:00:00+00, 1970-01-01 00:00:00+00, 65975727, null).
To bitbucket.org:tmidas/mvodb.git
! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'git@bitbucket.org:tmidas/mvodb.git'
daq00:mvodb$
K.O. |
11 Jul 2021, Konstantin Olchanski, Info, midas cmake update
|
I reworked the midas cmake files:
- install via CMAKE_INSTALL_PREFIX should work correctly now:
- installed are bin, lib and include - everything needed to build against the midas library
- if built without CMAKE_INSTALL_PREFIX, a special mode "MIDAS_NO_INSTALL_INCLUDE_FILES" is activated, and the include path
contains all the subdirectories need for compilation
- -I$MIDASSYS/include and -L$MIDASSYS/lib -lmidas work in both cases
- to "use" midas, I recommend: include($ENV{MIDASSYS}/lib/midas-targets.cmake)
- config files generated for find_package(midas) now have correct information (a manually constructed subset of information
automatically exported by cmake's install(export))
- people who want to use "find_package(midas)" will have to contribute documentation on how to use it (explain the magic used to
find the "right midas" in /usr/local/midas or in /midas or in ~/packages/midas or in ~/pacjages/new-midas) and contribute an
example superproject that shows how to use it and that can be run from the bitpucket automatic build. (features that are not part
of the automatic build we cannot insure against breakage).
On my side, here is an example of using include($ENV{MIDASSYS}/lib/midas-targets.cmake). I posted this before, it is used in
midas/examples/experiment and I will ask ben to include it into the midas wiki documentation.
Below is the complete cmake file for building the alpha-g event bnuilder and main control frontend. When presented like this, I
have to agree that cmake does provide positive value to the user. (the jury is still out whether it balances out against the
negative value in the extra work to "just support find_package(midas) already!").
#
# CMakeLists.txt for alpha-g frontends
#
cmake_minimum_required(VERSION 3.12)
project(agdaq_frontends)
include($ENV{MIDASSYS}/lib/midas-targets.cmake)
add_compile_options("-O2")
add_compile_options("-g")
#add_compile_options("-std=c++11")
add_compile_options(-Wall -Wformat=2 -Wno-format-nonliteral -Wno-strict-aliasing -Wuninitialized -Wno-unused-function)
add_compile_options("-DTMFE_REV0")
add_compile_options("-DOS_LINUX")
add_executable(feevb feevb.cxx TsSync.cxx)
target_link_libraries(feevb midas)
add_executable(fectrl fectrl.cxx GrifComm.cxx EsperComm.cxx JsonTo.cxx KOtcp.cxx $ENV{MIDASSYS}/src/tmfe_rev0.cxx)
target_link_libraries(fectrl midas)
#end |
11 Jul 2021, Konstantin Olchanski, Suggestion, MidasConfig.cmake usage
|
> > > So you say "nuke ${MIDAS_LIBRARIES}" and "fix ${MIDAS_INCLUDE}". Ok.
> > A more moderate option ...
>
> For the record, I did not disappear. I have a very short time window
> to complete commissioning the alpha-g daq (now that the network
> and the event builder are cooperating). To add to the fun, our high voltage
> power supply turned into a pumpkin, so plotting voltages and currents
> on the same history plot at the same time (like we used to be able to do)
> went up in priority.
>
in the latest update, find_package(midas) should work correctly, the include path is right,
the library list is right.
please test.
I find that the cmake install(export) method is simpler on the user side (just one line of
code) and is easier to support on the midas side (config file is auto-generated).
I request that proponents of the find_package(midas) method contribute the documentation and
example on how to use it. (see my other message).
K.O. |
|