BNMR: Troubleshooting

From DaqWiki

Introduction

When troubleshooting the BNMR/BNQR DAQ system, the first question to ask is "Has anything changed since the system last ran successfully?" If nothing has changed, the problem is most likely in the hardware. Start by checking that the VME crates and the required NIM bins are turned on.

Access to the VME CPUs

Because the BNMR/BNQR platforms are electrically isolated, their network link runs through optical couplers, and accessing the VME CPUs from off-platform can fail even if the VME crate is switched on. Check the network connection first. If the VMICs can be accessed but the CAMP MVME162 cannot, check the firewall.
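
The decision above can be sketched as a small script. This is only an illustration, not part of the DAQ software: the function names are made up here, and the TCP port used to probe each CPU is an assumption that has to be replaced with a port the device is known to listen on.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if `host` answers a TCP connection attempt on `port`.

    `port` is an assumption: use a service the CPU is known to run.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except ConnectionRefusedError:
        # A refusal usually means the host itself answered,
        # so it is on the network even though the port is closed.
        return True
    except OSError:
        # Timeout, unreachable network, unknown host name, etc.
        return False


def triage(vmic_ok, camp_ok):
    """Map the two reachability results onto the checks described above."""
    if vmic_ok and camp_ok:
        return "ok"
    if vmic_ok:
        # VMICs reachable, CAMP MVME162 not: the firewall is the suspect.
        return "check firewall (daqfire1/daqfire2)"
    if camp_ok:
        # This combination is not covered by this page.
        return "unexpected: CAMP reachable but VMIC not; check the VMIC itself"
    return "check crate power and platform network connection"
```

A typical call would be `triage(is_reachable(vmic_host, port), is_reachable(camp_host, port))`, where `vmic_host`, `camp_host` and `port` are placeholders for the real host names and a suitable port.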

Network connection to platform

The Ethernet is connected to both the BNQR and BNMR HV platforms via a pair of optical couplers (small grey boxes). The high-voltage ends are located close to the VME crates on each platform. They are connected via orange fibre-optic cables to the low-voltage (ground) ends, which are in one of the blue racks on the floor. The low-voltage boxes are also connected directly to the site Ethernet by a wall connection. On the platform, the high-voltage end is connected to an Ethernet switch, to which all the network devices on the platform are connected.

If the platform network fails, there will be no flashing lights on the Ethernet switch on the platform. Make sure that power is on to the Ethernet switch on the platform and to both optical couplers (platform and ground).

Next, check that the site Ethernet connection to the low-voltage box is still working. If it is, the Ethernet lights will be flashing on the grey optical coupler box. If they are not, the site connection may have been disconnected; a laptop may be useful for testing it.

If the site connection is working, the problem may be with one of the optical couplers. Swap in the spare, or borrow the box from the other platform, to diagnose the problem.

Firewall to platform

The CAMP MVME162 CPUs are isolated from the network by a firewall, provided by the machines daqfire1 and daqfire2, which are located off the platform in the blue racks. Both need to be up and running. The firewall was installed to prevent the CAMP MVME162 from causing trouble on the control group network (it is believed that its broadcasts were not handled properly by the control group CPUs).

Frontend not running

Restart the frontend either by pressing the button on the webserver Programs page, or by issuing start-all in an xterm. If the frontend dies after restarting, check the error messages in the frontend window and/or on the webserver Messages page.

Frontend stuck on first cycle

After a run has been started and the hardware has been initialized, the frontend window should show lines of the form:

Seen 1/100 bins
Seen 2/100 bins
Seen 4/100 bins
Seen 8/100 bins

This shows that the cycle has started and that the scaler is receiving "External Next" pulses from the PPG. The cycle ends when the expected number of pulses has been received from the PPG; the data is then read out and the PPG cycle is restarted.

If the scaler receives too few or no "External Next" pulses from the PPG, the frontend will become stuck (generally saying Seen 0/100 bins or similar). This may be due to bad cabling. The PPG output "MCS Next" is connected to the Scaler "External Next" Input via a NIM fan-out module. Check that the NIM bin is powered up, and that the fan-out module is still operational.
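
If the frontend output is captured to a file or pipe, a stuck cycle can be spotted automatically. The sketch below assumes the progress lines look exactly like the "Seen N/M bins" examples above and that a stuck frontend keeps printing the same count; both are assumptions, and the function names and the `min_repeats` threshold are made up for illustration.

```python
import re

# Matches progress lines of the form "Seen 4/100 bins".
SEEN_RE = re.compile(r"Seen\s+(\d+)/(\d+)\s+bins")


def parse_progress(lines):
    """Yield (seen, expected) for every line that contains a progress report."""
    for line in lines:
        match = SEEN_RE.search(line)
        if match:
            yield int(match.group(1)), int(match.group(2))


def looks_stuck(lines, min_repeats=3):
    """Return True if the last `min_repeats` progress reports are identical
    and the bin count is still below the expected number of bins."""
    progress = list(parse_progress(lines))
    if len(progress) < min_repeats:
        return False
    tail = progress[-min_repeats:]
    seen, expected = tail[-1]
    return seen < expected and all(entry == tail[-1] for entry in tail)
```

For example, three consecutive "Seen 0/100 bins" lines would be flagged, while the sequence 1, 2, 4, 8 shown above would not.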

If the scaler receives too many "External Next" pulses, the DAQ will run but the data will not be correct. Extra bins from the cycle may appear in subsequent cycles, so the data gets out of phase. There is a test in the frontend code (using scaler Ref Ch 1) that should detect this.
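
How extra pulses put the data out of phase can be seen in a toy model. It assumes that the scaler buffers bins in arrival order and that the readout takes a fixed number of bins per cycle; the real hardware and frontend code may behave differently, and `read_cycles` is a made-up name for this illustration only.

```python
from collections import deque


def read_cycles(pulses_per_cycle, expected_bins):
    """Toy model of the readout.

    Each pulse stores one bin, labelled (cycle, bin index), in a FIFO.
    At the end of each cycle the readout removes at most `expected_bins`
    bins from the front of the FIFO.
    """
    fifo = deque()
    readouts = []
    for cycle, n_pulses in enumerate(pulses_per_cycle):
        for bin_index in range(n_pulses):
            fifo.append((cycle, bin_index))
        n_read = min(expected_bins, len(fifo))
        readouts.append([fifo.popleft() for _ in range(n_read)])
    return readouts


# Cycle 1 receives one pulse too many (5 instead of 4).
result = read_cycles([4, 5, 4], expected_bins=4)
# result[2] now starts with the leftover bin (1, 4) from cycle 1,
# so every later bin is shifted by one position.
```

In this model a single extra pulse is enough to shift all subsequent cycles, which is the kind of error the reference-channel test in the frontend is meant to catch.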

In dual-channel mode, the PPG is started by an external signal from the kicker. Check that the signal is being received and that the operators have configured dual-channel mode correctly in EPICS. In single-channel mode, the PPG is started by a software signal from the frontend, so the kicker signal is not a concern.


Debugging VME DAQ Modules

Individual VME DAQ Modules can be debugged with test programs - see BNMR: Hardware Debugging.