BNMR: Rewrite 2020

From DaqWiki
Revision as of 11:46, 2 May 2022 by Bsmith

BNMR / BNQR DAQ 2020 rewrite

The BNMR/BNQR DAQ systems are complex, and can be run in many different modes with very different behaviours. In 2020, we decided to rewrite the DAQ code to make it more modular and structured, so that future upgrades and maintenance are easier and less likely to cause unintended side effects.

The frontend code (previously febnmr_vmic.exe), logger code (previously mdarc), configuration code (previously rf_config), helper scripts (previously written in perl), PPG compiler and webpages have all been rewritten.

Running old and new DAQs

The old DAQ is currently still the default.

To use the new DAQ, change to the bash shell as the bnmr/bnqr user on isdaq01/lxbnmr/lxbnqr. In bash, the start-all, kill-all and autoruns scripts (and similar helpers) are all configured to use the new DAQ.

The webpages for the new DAQ are currently served on ports 8447 (bnmr) and 8448 (bnqr). The old DAQ remains on ports 8443/8444.

You can run both DAQs in parallel, EXCEPT for the frontend on the VMIC machines! You must ensure that only feBNMR_VMIC (for the old DAQ) OR febnmr (for the new DAQ) is running, not both.

Currently data for the new DAQs are written to /home/bnmr/cycling_dev/online and /home/bnqr/cycling_dev_bnqr/online rather than the main dlog directories.

When the bnmr group are happy with the new DAQ, we will make it the default, serve it on ports 8443/8444, write data to the main dlog directories, and retire the old DAQ completely.

What's changed

For analyzers

The MUD files produced by the new DAQ should be fully backwards-compatible with the old files, except:

  • The irrelevant L and R histograms are no longer saved for BNMR (all bins used to be 0 anyway).
  • Some ODB parameters saved as independent variables are no longer relevant, and so are not saved in the new MUD files. All other variables keep their old titles (even if their "real" name in the ODB has changed).
  • If a CAMP instrument is offline, the old DAQ would not save any values in the MUD file for TI modes, but would save the most recent value for TD modes (even if that value was very old). The new DAQ has the same behaviour for all modes - no data is saved for CAMP devices that are offline.
  • The "description" of some ODB parameters has changed slightly (e.g. "e00 beam on dwelltimes" is now "num beam on dwelltimes").
  • It is now possible to write comments during a run and have them saved in the MUD file.

For shifters

  • The new webpages are far more responsive, and include more explanations about the various modes and parameters. There are also tools to visualize the PPG cycle and the PSM I,Q pairs that will be used, if desired.
  • Mode changes happen instantly rather than taking several seconds.
  • You will not see any messages like "Run is on hold waiting for xxx" - data-taking will start as soon as the run starts.
  • The list of DAQ programs that need to be running has changed, but you can use the same start-all/kill-all commands to start/stop everything that's needed.

Non-DAQ tools

Many paths in the ODB have changed (e.g. there is no /Equipment/FIFO_acq any more!). I have new versions of lcrp, midas_tcl and mui that are aware of the new ODB locations (this includes the autorun script in mui). Any other tools that access the ODB during a run may need to be updated - let me know and I can help with the migration.

MUD bug

The new DAQ allows arbitrary comments to be saved in MUD files during a run. However, there is a bug in some versions of MUD that prevents these comments from being parsed (the file will still be read successfully, you just won't have access to the comments that were written). Derek has already updated the mudpy python tools, but other MUD installs will also need updating:

In the file mud_friendly.c at about line 686 there is a definition of #define _sea_cmtgrp( fd ). You need to change the middle two arguments from MUD_GRP_CMT_ID, (UINT32)1 to MUD_SEC_GRP_ID, MUD_GRP_CMT_ID.

New DAQ summary

The legacy DAQ code directly referred to mode names in many places in the code. This makes it very hard to introduce new modes, as one has to make changes in many places. The new DAQ code instead uses feature flags to define each mode (e.g. there are flags for whether the mode is time-differential/integral, whether the RF frequency is scanned/fixed etc). Only one piece of code (the "mode changer") knows what each mode name means, and sets the various feature flags appropriately. The rest of the DAQ code just looks to see whether the current mode is scanning the RF frequency or not, etc.
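
To make the feature-flag idea concrete, here is a minimal sketch in Python. It is not taken from mode_changer.py: the class, the field names and the two mode names are all hypothetical, and the real DAQ stores its flags in the ODB rather than in a Python table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeFlags:
    """Feature flags describing a mode (field names are hypothetical)."""
    time_differential: bool  # True = time-differential, False = time-integral
    scan_rf_frequency: bool  # True = RF frequency is scanned, False = fixed


# Only the "mode changer" knows what each mode name means.
# These mode names are placeholders, not real BNMR/BNQR modes.
MODE_TABLE = {
    "example_td_fixed_rf": ModeFlags(time_differential=True, scan_rf_frequency=False),
    "example_ti_rf_scan": ModeFlags(time_differential=False, scan_rf_frequency=True),
}


def flags_for_mode(mode_name: str) -> ModeFlags:
    """Look up the flags for a mode; raises KeyError for unknown names."""
    return MODE_TABLE[mode_name]


def describe_cycle(flags: ModeFlags) -> str:
    """Stand-in for the rest of the DAQ: it branches on flags, never on mode names."""
    rf = "step RF frequency" if flags.scan_rf_frequency else "keep RF frequency fixed"
    data = "time-differential data" if flags.time_differential else "time-integral data"
    return f"{rf}; {data}"
```

With this layout, adding a new mode means adding one entry to the table; code like `describe_cycle` does not need to change.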

The code is built upon a common framework that is used by BNMR, BNQR, EBIT, MPET and CPET. This "cycling framework" handles all the logic for running a PPG cycle a certain number of times, then changing some variable and running the PPG again. The core framework is very flexible.
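
The core loop described above can be sketched roughly as follows. This is only an illustration in Python: the function and parameter names are hypothetical, the hardware calls are replaced by stand-in callbacks, and the real framework handles much more (error recovery, readout, begin/end-of-run logic).

```python
from typing import Callable, Iterable, List, Tuple


def run_scan(
    scan_values: Iterable[float],
    cycles_per_value: int,
    set_variable: Callable[[float], None],
    run_ppg_cycle: Callable[[], int],
) -> List[Tuple[float, int]]:
    """Run the PPG `cycles_per_value` times at each scan value.

    Returns one (value, summed counts) pair per scan step.
    """
    results = []
    for value in scan_values:
        set_variable(value)  # change the scanned variable
        total = 0
        for _ in range(cycles_per_value):
            total += run_ppg_cycle()  # one PPG cycle, returning its counts
        results.append((value, total))
    return results


# Usage with stand-in "hardware" functions.
state = {"value": 0.0}


def fake_set(value: float) -> None:
    state["value"] = value


def fake_cycle() -> int:
    return int(state["value"]) + 1


demo = run_scan([1.0, 2.0], cycles_per_value=3, set_variable=fake_set, run_ppg_cycle=fake_cycle)
```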

All the differences between BNMR and BNQR are confined to a few central places:

  • febnmr.cxx/febnqr.cxx for the frontend (e.g. PSM version, scaler versions etc)
  • config_bnmr.js/config_bnqr.js for the web interface (e.g. colour scheme, which webpage features to enable etc)
  • mud_ind_var_links.py to ensure backwards-compatibility of the ODB variables written to MUD files
  • the ODB for configuring behaviour of all other programs.

Internally, the code is written using objects and C++ inheritance. This means that most of the code is agnostic to the type of hardware being controlled, and changing hardware does not have many knock-on effects. In many cases, once you have written a class to control the device (in 1 file), it is only a 1-line change to include it in the rest of the framework.
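
As a rough illustration of that pattern (written in Python for brevity, although the frontend itself is C++), the sketch below uses a hypothetical `Device` base class with two made-up device classes. None of these names come from the real code.

```python
from abc import ABC, abstractmethod
from typing import List


class Device(ABC):
    """Hypothetical common interface that the framework talks to."""

    @abstractmethod
    def begin_of_run(self) -> str:
        """Prepare the device for a run and report what was done."""


class ExampleScaler(Device):
    def begin_of_run(self) -> str:
        return "scaler armed"


class ExamplePulser(Device):
    def begin_of_run(self) -> str:
        return "pulser configured"


# Including a new device in the framework is a one-line change to this list.
DEVICES: List[Device] = [ExampleScaler(), ExamplePulser()]


def begin_of_run_all(devices: List[Device]) -> List[str]:
    """Framework code: it never needs to know the concrete device types."""
    return [device.begin_of_run() for device in devices]
```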

Programs

Note that midas now has first-class support for python - one can write clients/frontends in python and talk to midas natively. The python programs do not call "odbedit" etc to read the ODB, they read it directly, so are much faster than the legacy perl scripts! Midas provides an RPC (remote procedure call) system so clients can talk to each other directly if desired, rather than passing data around through the ODB/text files etc.

  • febnmr.exe/febnqr.exe - frontend that talks directly to the PPG, scalers and other hardware. Handles scanning variables (frequency, CAMP, EPICS) as needed. Sends histogram data to midas buffers. Runs on lxbnmr/lxbnqr.
  • kalliope_fe.exe - frontend that talks to kalliope devices and sends histogram data to midas buffers.
  • bnxr_logger.exe - writes data to MUD files. Reads information from midas banks (for scaler histograms), EPICS, and CAMP.
  • mode_changer.py - re-configures the ODB for different experimental modes, and changes between real/test run types.
  • rf_calculator_fe.py - generates PPG program based on user-specified ODB parameters; reports calculated quantities in the ODB; sets up links in the ODB that should be stored in MUD files.
  • ppg_compiler_fe.py - low-level conversion of PPG program to PPG bytecode.
  • run_comment_editor.py - responds to comments written by user in text box on the status page, saving to disk and/or midas banks.
  • mhttpd - Midas webserver.
  • mlogger - Midas logger that writes data to Midas files and to history.
  • mserver - Midas RPC server that enables communication between programs on different machines (e.g. lxbnmr and isdaq01).

Communication between programs:

  • RPC between febnmr.exe/febnqr.exe and ppg_compiler_fe.py to generate PPG bytecode at begin-of-run
  • RPC between rf_calculator_fe.py and ppg_compiler_fe.py to generate PPG bytecode when user clicks button on Settings webpage
  • Midas buffers between febnmr.exe/febnqr.exe and bnxr_logger.exe for data
  • Midas buffers between kalliope_fe.exe and bnxr_logger.exe for data
  • RPC between febnmr.exe/febnqr.exe and kalliope_fe.exe to inform it of PPG cycle state
  • run_comment_editor.py writes to a text file in the data directory, which is then read by bnxr_logger.exe when it's time to write a new MUD file. Text file is deleted at the end of the run.

New PPG compiler

The old PPG compiler was based on template files, which were very limited. One had to specify which channels should be on, and how long to hold that pattern for. Consider the case where you have pulses on 2 channels, but the timing and width of the pulses vary. Depending on how the pulses overlap, you may need to create up to 6 different templates. This is both quite labour intensive, and can lead to copy/paste errors giving subtly different behaviour between the different versions.

Old ppg timing.png


The new compiler is written in python, and is more flexible and intelligent. It is based on time offsets and pulses, so the 6 different options are now all the same - you have a pulse on each channel. The new compiler figures out if the pulses are overlapping automatically, and will generate the appropriate bytecode.

New ppg timing.png
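
The sketch below shows the general idea of turning "one pulse per channel" into the "pattern plus duration" steps that the old templates had to spell out by hand. It is not the real compiler and does not produce PPG bytecode. The function name, the channel names and the use of integer time units are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Pulse = Tuple[int, int]  # (start offset, width), in the same integer time units


def pulses_to_patterns(pulses: Dict[str, Pulse]) -> List[Tuple[int, FrozenSet[str]]]:
    """Convert one pulse per channel into (duration, channels_on) steps."""
    # Every pulse edge is a point where the output pattern may change.
    edges = {0}
    for start, width in pulses.values():
        edges.update((start, start + width))
    times = sorted(edges)

    steps: List[Tuple[int, FrozenSet[str]]] = []
    for t0, t1 in zip(times, times[1:]):
        # A channel is on for this interval if its pulse covers all of it.
        on = frozenset(
            ch for ch, (start, width) in pulses.items()
            if start <= t0 and t1 <= start + width
        )
        if steps and steps[-1][1] == on:
            # Merge adjacent intervals that have identical patterns.
            steps[-1] = (steps[-1][0] + (t1 - t0), on)
        else:
            steps.append((t1 - t0, on))
    return steps


# The same call handles overlapping and non-overlapping pulses.
overlapping = pulses_to_patterns({"ch1": (10, 20), "ch2": (20, 30)})
separate = pulses_to_patterns({"ch1": (10, 10), "ch2": (30, 10)})
```

The caller only ever describes pulses. Whether they overlap is worked out from the sorted edge times, so there is a single code path instead of one template per overlap case.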

Features of the core framework not used by BNMR/BNQR

The core "cycling framework" allows some flexibility that is not currently used by BNMR/BNQR:

  • 2D scanning - BNxR only ever scans one variable, but the framework allows a 2D scan (e.g. scanning N values of an EPICS variable and M values of a CAMP variable, covering all NxM combinations).
  • Multiple variable scanning - BNxR only ever scans one variable, but the framework allows for multiple variables to be scanned at once (e.g. N values of an EPICS variable and N values of a CAMP variable, incremented both at the same time). The framework also allows up to 5 EPICS variables and 5 CAMP variables to be scanned together.
  • PPG timing scanning - the framework allows for the PPG program to be re-compiled during a scan, with different timing offsets/pulse widths. As BNxR users do not specify the PPG program directly (a pre-step converts their answers to human-readable questions into a PPG program), a bit of work would be needed to adapt this to BNxR's specific needs.
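
The difference between the first two options can be illustrated with a short Python sketch. The function names are hypothetical, and the loop order in the 2D case (which variable changes fastest) is an assumption rather than something taken from the framework.

```python
from itertools import product
from typing import List, Sequence, Tuple


def scan_points_2d(epics_values: Sequence[float], camp_values: Sequence[float]) -> List[Tuple[float, float]]:
    """2D scan: every combination of the two variables (N x M points)."""
    return list(product(epics_values, camp_values))


def scan_points_together(epics_values: Sequence[float], camp_values: Sequence[float]) -> List[Tuple[float, float]]:
    """Multiple-variable scan: both variables step at the same time (N points)."""
    if len(epics_values) != len(camp_values):
        raise ValueError("variables scanned together need the same number of values")
    return list(zip(epics_values, camp_values))
```

For example, 3 EPICS values and 2 CAMP values give 6 points in a 2D scan, whereas scanning two variables together requires equal-length lists and gives one point per list index.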

Configuration locations

  • Mapping scaler channels to histogram bank names: febnmr.cxx / febnqr.cxx
  • Mapping histogram bank names to MUD histograms: Logging webpage
  • Features of each mode: mode_changer.py
  • Real/test run number ranges: /RealOrTest in ODB (read by mode_changer.py) and febnmr.cxx/febnqr.cxx (for sanity-checking at start of run)
  • Human-understandable questions for configuring the PPG: ppg_prog_helper.py
  • PPG program "templates" (now just python code): ppg_prog_helper.py
  • ODB paths saved in MUD file: mud_ind_var_links.py (used by rf_calculator_fe.py)
  • EPICS variables logged in MUD file: Logging webpage
  • CAMP variables logged in MUD file: Logging webpage (to set the log_action to search for on the CAMP server)

Error meanings

If the DAQ encounters an unrecoverable error, it will stop the run (or crash if things went really badly). Examples of unrecoverable errors include invalid ODB values, lost connections to other servers etc. Details will generally be reported on the Messages page.

The status page shows how often various recoverable errors have been encountered during a run:

  • Constant time - if running in "constant time" mode (where there is a constant time between PPG cycles), this is the number of times the DAQ had not finished reading out data / setting up the next cycle by the time the next cycle was due to start. The DAQ will stop the PPG, step back a value in the scan, then try again. If this happens often, consider increasing the "DAQ service time" PPG variable.
  • Flipping - if flipping helicity (or flipping between sample/reference for BNQR), this reports how many times the observed state didn't match the expected state at the end of the cycle. The DAQ will try again. (If the DAQ fails to set the correct state at the start of a cycle it will stop the run; the "flipping" error is how often the state unexpectedly changed during a cycle). If this happens often, it suggests that something else is controlling the same device in addition to the DAQ software.
  • Empty bin - the scalers are configured to count pulses from a 25 MHz clock on one of their channels. If any of the time bins on this channel report 0 counts, it indicates a failure of either the scaler hardware/firmware or the DAQ software. The DAQ will ignore the data from this cycle and try again. If this happens often, a DAQ expert should investigate the problem.
  • EPICS unstable - if scanning an EPICS variable (e.g. the Rb cell voltage), this is the number of cycles during which the measured value was unstable. The DAQ will ignore the data from such a cycle and try again. There are custom tolerance functions defined for the Rb cell, laser and field that should suffice. If the tolerance check fails often, it can be overridden on the "EPICS definitions" page (Settings page > Miscellaneous > EPICS device > Define more devices here).
  • NB out-of-tolerance - the total counts on the neutral beam monitors for each cycle are recorded, and compared to both a "reference" value (generally the first cycle of the run) and the total from the previous cycle. If the current value is far from the reference or previous values, it indicates that the beam intensity is unstable. The DAQ will ignore the data from this cycle and try again. The tolerances (in %) can be configured in the ODB (Status page > Neutral beam > Set tolerances). The reference values can be updated by clicking the "Re-ref" buttons on the status page.
  • RF trip - power to the PSM (RF generator) has tripped. The DAQ will ignore the data from this cycle, unset the trip, and try again. If it happens often, a DAQ hardware expert should investigate the problem.
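
As an illustration of the "NB out-of-tolerance" check, here is a minimal Python sketch of a percentage comparison against both the reference and the previous cycle. The function names, the use of two separate tolerances and the handling of a zero baseline are assumptions, not the frontend's actual logic.

```python
def within_tolerance(current: float, baseline: float, tolerance_percent: float) -> bool:
    """True if `current` is within +/- tolerance_percent of `baseline`."""
    if baseline == 0:
        # Assumption: with a zero baseline, only an exact match passes.
        return current == 0
    return abs(current - baseline) <= abs(baseline) * tolerance_percent / 100.0


def neutral_beam_ok(
    current: float,
    reference: float,
    previous: float,
    ref_tol_percent: float,
    prev_tol_percent: float,
) -> bool:
    """Accept the cycle only if it is close to both the reference and the previous cycle."""
    return within_tolerance(current, reference, ref_tol_percent) and within_tolerance(
        current, previous, prev_tol_percent
    )
```

A cycle that fails this kind of check would be discarded and repeated, as described in the list above.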

Real vs test data

In real data mode, run numbers start with a 4, data is archived to the central CMMS server at the end of the run, and the data is discoverable via the MUD run selection webpage.

In test data mode, run numbers start with a 3, data is not archived at the end of the run, and so is not discoverable from the CMMS website.
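
For analysis scripts that need to tell the two apart, a check on the leading digit is enough under the convention above. The helper below is a hypothetical sketch: the function name and example run numbers are made up, and the authoritative run number ranges are the ones in /RealOrTest in the ODB.

```python
def run_type(run_number: int) -> str:
    """Classify a run by its leading digit: 4 = real data, 3 = test data."""
    leading_digit = str(run_number)[0]
    if leading_digit == "4":
        return "real"
    if leading_digit == "3":
        return "test"
    raise ValueError(f"run number {run_number} is neither real (4...) nor test (3...)")
```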

Current status

Feature status

As of January 2021:

  • Features complete and fully tested:
    • PPG compilation produces same bytecode as legacy DAQ
    • Mode changing and configuration of frontend
    • RF computations
    • Autorun interactions
    • Manually saving and loading settings; loading settings from old MUD files
    • PPG and RF visualization on webpage
    • Saving run comments in the MUD file
    • MUD file creation, cleanup and archiving
  • Features partially tested:
    • Dual channel mode
    • lcrplot changes (code compiles, functionality not tested)
    • Kalliope (frontend code is integrated and data can be written to midas files but not MUD files)
  • To-do:
    • Fix any more user tools that access ODB
    • User testing and feedback
    • Deploy as the default DAQ (and change default login shell to bash)
    • Convert cronjob to use new beamtime monitor script

Final testing and deployment is planned for the 2020/2021 winter shutdown.

Testing performed

  • Low-level tests of the PPG compiler and simulator (automated tests in cycling_framework repo).
  • Comparing generated PPG bytecode between old and new DAQs (automated tests for all modes in bnxr repo). In most cases, the generated bytecode is identical. The only discrepancies were due to minor copy/paste errors between related templates in the old DAQ. I manually checked that the new bytecode makes sense.
  • Comparing computed IQ pairs, frequencies, bin ranges etc between old and new DAQs (automated tests for all modes in bnxr repo).
  • Comparing RF output on oscilloscope (done manually for all modulation modes on both experiments).
  • Comparing MUD file structure and content between old and new DAQs (done manually for all modes). I injected periodic test pulses into the scalers, then used a python script to highlight differences between the two MUD files.
  • Extensive manual testing on a dummy experiment on my laptop, and the two real DAQs.

Compiling

Code location

Currently the new DAQ code (and updated versions of external code) is in:

  • /home/bnmr/cycling_dev/packages
  • /home/bnqr/cycling_dev_bnqr/packages

When the bnmr group are happy with the new DAQ, these will migrate to the "main" locations in $HOME/packages.

Updating midas

Note that bash has been configured to see the new DAQ version by default; csh still sees the old version by default.

# On isdaq01:
bash

# Get updates
cd $MIDASSYS
git pull
git submodule update

# Compile 64-bit version
cd build
make install

# Cross-compile 32-bit version
cd ..
make linux32

# Change status page from midas default to a symlink
# (only if it has not already been replaced by a symlink)
if ! [ -L "$MIDASSYS/resources/status.html" ]; then
  mv "$MIDASSYS/resources/status.html" "$MIDASSYS/resources/status.html.orig"
  if [ "$USER" = "bnmr" ]; then
    ln -s ~/cycling_dev/packages/benmr/bnxr_common/custom/status.html "$MIDASSYS/resources/status.html"
  else
    ln -s ~/cycling_dev_bnqr/packages/benqr/bnxr_common/custom/status.html "$MIDASSYS/resources/status.html"
  fi
fi

Updating bnmr/bnqr

Note that bash has been configured to see the new DAQ version by default; csh still sees the old version by default.

# On isdaq01:
bash

# Get updates (as the bnqr user, use ~/cycling_dev_bnqr/packages/benqr instead)
cd ~/cycling_dev/packages/benmr
git pull
git submodule update

# Build 64-bit and 32-bit executables (no need to compile anything on lxbnmr/lxbnqr)
cd build
make install