VME-PPG32: Difference between revisions

From DaqWiki
== VME-PPG32 - pulse pattern generator VME FPGA board ==
= VME-PPG32 Pulse Pattern Generator =


=== References ===
The '''VME-PPG32''' is a VME FPGA board that generates precise digital pulse patterns on 32 NIM output channels. It is driven by a 100 MHz clock (10 ns resolution) and executes a small user-written program stored in on-board memory. Programs can contain loops, subroutines, and branches, making it possible to generate multi-period sequences that last from nanoseconds to hours from a few dozen instructions.


* [https://edev.triumf.ca/projects/edevel00022] VME-PPG32 Rev0 (REA 198) project page on edev.triumf.ca
This page covers hardware specifications, the VME register interface, the instruction set, and a step-by-step programming tutorial.
* [https://edev.triumf.ca/projects/edevel00067] VME-PPG32 Rev1 (REA 198) project page on edev.triumf.ca
* [https://edev.triumf.ca/projects/edevel00125] VME-PPG32 Rev2 (REA 198) project page on edev.triumf.ca
* [https://edev.triumf.ca/documents/27] Rev0 board schematics on edev site
* [https://edev.triumf.ca/documents/111] Rev1 board schematics on edev site
* [https://edev.triumf.ca/documents/158] Rev2 board schematics on edev site
* [[Image:VME-PPG32 Rev1.pdf]] Rev1 board schematics, local copy
* [[Image:VME-PPG32 Rev2.pdf]] Rev2 board schematics, local copy


VME-PPG32-IO32 firmware: (IO32 functions, no PPG functions)
This page has been re-written for clarity with the help of an AI. For the old instructions see this page: [[VME-PPG32 (legacy)]]
* (obsolete) [http://ladd00.triumf.ca/viewcvs/daqsvn/trunk/VME-PPG32-Rev0] Svn repository for VME-PPG32-Rev0 initial test firmware (IO32 function only, no PPG function)
* (obsolete) [http://ladd00.triumf.ca/viewcvs/daqsvn/trunk/VME-NIMIO32/VME-NIMIO32/PPG32-Rev1] Svn repository for VME-PPG32-Rev1 firmware (IO32 function only, no PPG function)
* https://bitbucket.org/ttriumfdaq/vme-nimio32/ - main firmware repository (IO32 function)
* https://bitbucket.org/ttriumfdaq/vme-nimio32/src/master/VME-NIMIO32/PPG32-Rev1/ (IO32 function for the PPG32)


VME-PPG32 VME CPLD firmware: (VME address decoder)
__TOC__
* (obsolete) [http://ladd00.triumf.ca/viewcvs/daqsvn/trunk/VME-NIMIO32/MAX3000A_Addr_decode] Svn repository for VME-PPG32 Rev0 and Rev1 VME address decoder CPLD (Altera EPM3032)
* https://bitbucket.org/ttriumfdaq/vme-nimio32/src/master/MAX3000A_Addr_decode/ - VME address decoder CPLD (PPG32 is same as IO32)


VME-PPG32 firmware: (PPG function)
----
* [http://daq-plone.triumf.ca/HR/VME/ppg32] Current PPG firmware Source/binary


=== General characteristics ===
== How the PPG Works ==


==== Available hardware ====
Conceptually the PPG32 is a tiny processor dedicated to driving 32 output pins. You load a short '''program''' into its on-board memory, then '''trigger''' it; the board executes the program top-to-bottom and drives its NIM outputs accordingly.


* Altera cyclone 3 FPGA: EP3C40Q240C8
Each '''instruction''' answers four questions:
* Serial flash for FPGA configuration: Altera EPCS16
* VME interface: VME-D[31..0] bidirectional, VME-A[23..0] input only, DTACK output, no BERR, no RETRY/RESP. VME-A[31..20] are input only and are connected only to the address decoder CPLD (Altera MAX 3000A, EPM3032). This permits all single-word transfer modes, 32-bit DMA (BLT32) and 2eVME DMA (which drives only the D lines, but is still faster than BLT32). 64-bit DMA (MBLT64) and 2eSST are impossible.
* 32 NIM outputs
* 4 NIM inputs
* 32 "NIM output" LEDs
* 4 "NIM input" LEDs
* 1 "VME access" LED
* 2 output serial DAC: AD5439YRUZ, output is unipolar 0..2.5V, 10-bit accuracy.
* Rev1 and Rev2 boards: inputs are switchable between NIM and TTL (JMP3)
* Rev2 boards: outputs are switchable between NIM and TTL (SW1 micro switches)


==== PPG characteristics ====
* '''Which outputs go HIGH?''' — the ''SET mask''.
* '''Which outputs go LOW?''' — the ''CLEAR mask''.
* '''For how long?''' — the ''delay count'', in units of 10 ns.
* '''What happens next?''' — the ''instruction type'': continue to the next slot, loop, call a subroutine, branch, or halt.


  4k words (128bit words) of program memory.
Instructions execute sequentially from slot 0 unless a loop, branch, or subroutine redirects the flow. A 256-entry hardware stack supports nested loops and subroutine calls.
  256 entry stack.
  Halt/Continue/Loop/Subroutine/Branch instructions                                   
  100 MHz clock (derived from either the 50 MHz internal crystal or an external 20 MHz clock input - see note 1)
  fixed 3 clk per instruction plus a 32-bit delay individually programmable for each instruction


  FPGA resource Usage : 1044 LE, 500 kbits Memory
A typical workflow is:


# '''Reset''' the board — puts it in a known, halted state.
# '''Write''' the program over the VME bus — one instruction per memory slot.
# '''Trigger''' execution, either by software (write a bit over VME) or by an external NIM pulse.
# '''Poll''' the status bit until the program halts, then read results or load the next program.


  Note 1 :
=== Output Model: SET and CLEAR ===
  Some modules allow an external clock frequency of 20-100 MHz. Divide-downs must be programmed when the external clock frequency is not 20 MHz.
  Examples of divide-down programming are shown at the end of this document.


=== Onboard jumper settings ===
Each instruction carries two independent 32-bit masks. A '''1''' in the SET mask drives that channel HIGH; a '''1''' in the CLEAR mask drives it LOW. Bit 0 corresponds to channel 1, bit 31 to channel 32.


* JMP1 - set to "INP" for input with 50 Ohm termination, set to "DAC" for DAC output
The hardware documentation does not specify what happens to a channel whose bit appears in '''neither''' mask, nor in '''both'''. The safe, conventional practice — used throughout this page and in the UCN sequencer code — is to assign every channel to '''exactly one''' mask in every instruction, so all 32 outputs are fully defined at each step. In code this usually appears as a value and its bitwise complement:
* JMP2 - set to "INP" same as JMP1
* JMP3 - set to "NIM" (pins 1-2) for NIM inputs or "TTL" (pins 2-3) for TTL inputs
* JMP4 - "MSEL1" jumper set to "ACT" for use with the active-serial flash
* IrqSel - leave open
* JTAG - leave open (not a jumper block!)
* SW1..3 - VME base address selectors (see below)


=== Firmware update procedure ===
<pre>
set_mask = my_outputs;
clr_mask = ~my_outputs;  // every other channel explicitly driven LOW
</pre>
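The complement convention can be wrapped in a small helper. The following is a minimal sketch, not code taken from the board's software: the type <code>ppg_masks</code> and the function <code>ppg_masks_from_channels</code> are hypothetical names introduced here. It takes 1-based channel numbers and assumes the mapping stated above (bit 0 is channel 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SET/CLEAR mask pair for one instruction (hypothetical helper type). */
typedef struct {
    uint32_t set;  /* channels driven HIGH */
    uint32_t clr;  /* channels driven LOW  */
} ppg_masks;

/* Build a fully defined mask pair from a list of 1-based channel numbers
 * that should be HIGH. Every other channel goes into the CLEAR mask, so
 * each of the 32 outputs appears in exactly one mask. */
ppg_masks ppg_masks_from_channels(const int *channels, size_t n)
{
    ppg_masks m = { 0u, 0u };
    for (size_t i = 0; i < n; i++) {
        assert(channels[i] >= 1 && channels[i] <= 32);
        m.set |= (uint32_t)1 << (channels[i] - 1);  /* bit 0 = channel 1 */
    }
    m.clr = ~m.set;
    return m;
}
```

The two masks returned by this helper never overlap and together cover all 32 bits, which is the property the convention relies on.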


(note1: right now the PPG firmware update procedure is unnecessarily complicated because there is no ppg.pof file in the PPG firmware distribution version "1mar12" and the PPG firmware does not include the Altera active-serial programmer block.)
----


(note2: VME-PPG32-IO32 firmware (sof file) is required to update the PPG firmware)
== Hardware Description ==


* obtain the PPG firmware zip file (from http://daq-plone.triumf.ca/HR/VME/ppg32)
=== FPGA and Memory ===
* extract the ppg.jic file
* obtain the VME-PPG32-IO32 sof file (VME-PPG32.sof) from https://ladd00.triumf.ca/viewvc/daqsvn/trunk/VME-NIMIO32/VME-NIMIO32/PPG32-Rev1/VME-PPG32.sof?view=log


==== Update using USB-Blaster and jic file ====
{| class="wikitable"
|-
! Component !! Part / Value
|-
| FPGA || Altera Cyclone 3 EP3C40Q240C8
|-
| Configuration flash || Altera EPCS16 serial flash
|-
| Program memory || 4096 × 128-bit words (4k instructions)
|-
| Call stack || 256 entries
|-
| FPGA resources || 1044 LEs, 500 kbits internal memory
|}
 
Programs are written to and read from program memory over the VME bus. Each 128-bit instruction occupies one slot; the program counter advances one slot per instruction executed.


* use Quartus programmer to burn the jic file through the EP3 active-serial flash loader:
=== I/O Characteristics ===
* start Quartus programmer (tools->programmer)
* say "auto detect" - EP3, EPM1270 and EPM3032A should be detected
* attach VME-PPG32.sof to the EP3 part (context menu "change file")
* select "program"
* say "start"
* observe that 2 "red" LEDs have turned on on the PPG board (40MHz clock on outputs 4 and 20).
* say "auto detect"
* an "EPCS16" part should show up attached to the EP3.
* attach ppg.jic to the EPCS16 part
* select "program"
* say "start"
* observe progress bar go from 0 to 100% in about 2 minutes.
* PPG firmware is now loaded into the board
* cycle the power on the board to reboot into the PPG firmware
* when running the PPG firmware, all LEDs are off after reboot.


==== Update using VME flash programmer ====
{| class="wikitable"
|-
! Feature !! Detail
|-
| NIM outputs || 32 channels (front panel)
|-
| NIM inputs || 4 channels (front panel)
|-
| Output LEDs || 32 (one per output channel)
|-
| Input LEDs || 4 (one per input channel) + 1 VME access LED
|-
| Serial DACs || 2 × AD5439YRUZ, unipolar 0–2.5 V, 10-bit
|-
| Rev1/Rev2 || Inputs switchable NIM/TTL via JMP3
|-
| Rev2 only || Outputs switchable NIM/TTL via SW1 micro switches
|}


(note: VME flash programmer interface does not work in ppg firmware version "1mar12").
=== Clock ===


* obtain the latest copy of srunner_vme (follow instructions here [[VME-NIMIO32#Firmware_update_procedure]]) (srunner_vme.cxx svn rev 214 or newer for jic support)
The PPG runs at '''100 MHz''' (10 ns per tick). The clock source is selected by the CSR:
* ./srunner_vme_gef.exe -program -16 ppg.jic 0x100024
* reboot the PPG


(note: the above does not work: b) srunner_vme treats 0x10xxxx as an A24 address, but PPG firmware does not respond to A24 addresses; c) the active serial interface does not seem to work anyway)
* '''Internal:''' 50 MHz crystal on board, doubled to 100 MHz via PLL.
* '''External:''' 20 MHz NIM signal on Input 3, scaled to 100 MHz via the on-board PLL (see [[#Clock Control Register (+0x30)|Clock Control Register]]).


(note: the reboot function is not available in the PPG firmware)
Every instruction takes a fixed '''3 clock cycles''' of overhead regardless of type, plus a programmable 32-bit delay. The actual dwell time for any instruction is therefore:


==== Update using VME flash programmer when running VME-PPG32-IO32 firmware ====
<blockquote>'''dwell = (3 + delay_count) × 10 ns'''</blockquote>
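Because of the fixed 3-cycle overhead, the delay count to program is always three ticks less than the desired dwell. The conversion can be sketched in C as follows; the function and macro names are hypothetical and the sketch assumes the 100 MHz clock (10 ns tick) described above.

```c
#include <assert.h>
#include <stdint.h>

#define PPG_TICK_NS        10u  /* one tick of the 100 MHz clock */
#define PPG_OVERHEAD_TICKS 3u   /* fixed cost of every instruction */

/* Delay count that yields the requested dwell time.
 * dwell_ns must be a multiple of 10 ns and at least 30 ns. */
uint32_t ppg_delay_count(uint64_t dwell_ns)
{
    assert(dwell_ns % PPG_TICK_NS == 0);
    uint64_t ticks = dwell_ns / PPG_TICK_NS;
    assert(ticks >= PPG_OVERHEAD_TICKS);
    ticks -= PPG_OVERHEAD_TICKS;
    assert(ticks <= UINT32_MAX);       /* must fit the 32-bit field */
    return (uint32_t)ticks;
}

/* Inverse: dwell time in ns produced by a given delay count. */
uint64_t ppg_dwell_ns(uint32_t delay_count)
{
    return ((uint64_t)delay_count + PPG_OVERHEAD_TICKS) * PPG_TICK_NS;
}
```

For instance, <code>ppg_delay_count(280)</code> gives 25, and a delay count of 0 still produces a 30 ns dwell.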


* ./test_VMENIMIO32_gef.exe --addr 0x100000 --read 0  ### to confirm VME-PPG32-IO32 firmware revision
The delay count is a 32-bit field, so the maximum single-instruction dwell is approximately '''42.9 seconds''' (2<sup>32</sup> × 10 ns).
* ./srunner_vme_gef.exe -program -16 ppg.jic 0x100020
* ./test_VMENIMIO32_gef.exe --addr 0x100000 --reboot ### PPG will stop responding
* ./test_a32.exe 0x100000 ### should read 0x00000000 (NOT 0xFFFFFFFF)


=== VME interface ===
'''Caveat:''' the source wiki states a 10-second maximum, which conflicts with the documented 32-bit field width and is unexplained. Until verified on hardware, treat ~10 s as a conservative practical limit, and use a [[#Looping for Long Durations|loop]] for anything longer.


VME A32/D32 access only. Rotary switches SW1, SW2 and SW3 set the upper 12 bits of the address.
=== VME Interface ===
VME registers are listed in the table below.


==== Registers ====
{| class="wikitable"
{| cellpadding="10" cellspacing="0" border="1"
! Number || Address || Name || Access || Description
|-
|-
|  0 || 0x00000 || CSR || RW || Control/Status Register
! Parameter !! Value
|-
|-
| 1 || 0x00004 || Test || RW || Test Register
| Address space || A32 only
|-
|-
| 2 || 0x00008 || Addr || RW || Program Address Register
| Data width || D32 only
|-
|-
| 3 || 0x0000C || Inst_Lo || RW || Instruction Register Part 1/4
| Transfer modes || Single-word, 32-bit DMA (BLT32), 2eVME DMA
|-
|-
| 4 || 0x00010 || Inst_Med || RW || Instruction Register Part 2/4
| Not supported || MBLT64, 2eSST
|-
|-
| 5 || 0x00014 || Inst_Hi || RW || Instruction Register Part 3/4
| Data direction || VME-D[31..0] bidirectional; VME-A[23..0] input only
|-
|-
| 6 || 0x00018 || Inst_Top || RW || Instruction Register Part 4/4
| Handshake || DTACK output
|}
 
=== Jumper Settings ===
 
{| class="wikitable"
|-
|-
| 7 || 0x0001C || Inv_Mask || RW || Output Inversion Mask
! Jumper !! Setting !! Function
|-
| JMP1 || INP / DAC || NIM input with 50 Ω termination, or DAC output
|-
| JMP2 || INP / DAC || Same as JMP1
|-
| JMP3 || NIM (1–2) / TTL (2–3) || Input signal standard selection (Rev1 and later)
|-
| JMP4 || ACT || Active-serial flash programming mode (leave set)
|-
| IrqSel || Open || Leave open
|-
| SW1–3 || — || VME base address selection (A20–A31)
|}
 
=== NIM Input Assignments ===
 
{| class="wikitable"
|-
! Input !! Function
|-
| 4 || External start — rising edge triggers program execution; subsequent edges while running are ignored
|-
| 3 || External 20 MHz clock input — used by PLL to generate 100 MHz PPG clock
|-
| 2 || Unassigned
|-
| 1 || Unassigned (outputs internal PPG clock in test mode)
|}
 
=== Input LED Indicators ===
 
{| class="wikitable"
|-
|-
|  8 || 0x00020 || Version || R || Firmware Version Register
! LED !! Meaning
|-
|-
| 9 || 0x00024 || Flash || RW || Serial Flash Control Register
| 4 || Clock source: lit = external clock selected
|-
|-
| 10 || 0x00028 || Serial || R || Serial Number Register
| 3 || NIM Input 2 signal status
|-
|-
| 11 || 0x0002C || Hardware || R || Hardware Identification Register
| 2 || External clock quality: lit = external clock present and good
|-
|-
| 12 || 0x00030 || Clock Control || RW || Clock Control Register
| 1 || Program running: lit = PPG is executing
|}
|}


===== CSR Register (0x00000) =====
----
 
== Hardware Setup ==
 
=== VME Address Configuration ===
 
The VME base address is set by rotary switches SW1–3 on the board, which select address bits A20–A31. Set the switch for A20–A23 to <code>1</code> and the switches for A24–A31 to <code>0</code> to place the board at <code>0x00100000</code> (the default used in firmware examples on this wiki). After setting switches, program the address-decoder CPLD (EPM3032) using a JTAG programmer.
 
=== Startup and Verification Checklist ===
 
# Verify power supply voltages — check for shorts on 1.2 V, ±3.3 V, 12 V, and 2.5 V rails before powering on.
# Set VME address switches.
# Program the address-decoder CPLD (EPM3032) via JTAG if the board is new or the CPLD was cleared.
# Load the <code>VME-PPG32</code> firmware (<code>.sof</code>) to the FPGA via Quartus Programmer or the VME flash method below.
# Verify board presence: <code>vmescan</code> should detect the board at the configured base address.
# Run the built-in register test (write <code>0xBEEFBEEF</code> to the Test register at offset <code>+0x04</code> and read it back).
# Exercise NIM outputs and inputs, and verify LEDs respond.
# Flash firmware to active-serial memory (EPCS16) for persistence across power cycles.
 
=== Firmware Update Methods ===
 
The board runs one of two firmware personalities: '''VME-PPG32-IO32''', a plain 32-channel I/O image with no sequencer, and '''VME-PPG32''', the full pulse-pattern generator described on this page. The PPG features documented here require the VME-PPG32 image.
 
'''Method 1 — USB-Blaster JTAG (preferred for initial programming):'''


The first 5 bits control the ppg, the remaining 26 bits provide read-only status information.
# Start Quartus Programmer and auto-detect.
(writes to the 26 bits of status information are ignored, and overwritten on the next status update)
# Attach <code>VME-PPG32.sof</code> to the EP3C40 device and program it.
# Auto-detect again, attach <code>ppg.jic</code> to the EPCS16 device, and program it (takes approximately 2 minutes).
# '''Power-cycle the board''' to reboot into the PPG firmware. When running the PPG firmware, all LEDs are off after reboot.


{| cellpadding="10" cellspacing="0" border="1"
'''Method 2 — VME flash programmer (requires IO32 firmware already loaded):'''
|+CSR Description
 
<pre>
./srunner_vme_gef.exe -program -16 ppg.jic 0x100020
./test_VMENIMIO32_gef.exe --addr 0x100000 --reboot
</pre>
 
----
 
== Register Reference ==
 
=== Register Map ===
 
All registers are accessed at 32-bit aligned offsets from the board's VME base address.
 
{| class="wikitable"
|-
! # !! Offset !! Name !! Access !! Description
|-
|-
!Bit || Name || Access || description
| 0 || <code>+0x00</code> || CSR || R/W || Control and Status Register
|-
|-
|0|| Run || R/W || Run Control/Status
| 1 || <code>+0x04</code> || Test || R/W || Test register (read-back verification)
|-
|-
|1|| Ext-Clk-Toggle || W || Toggles between PPG external and internal Clk
| 2 || <code>+0x08</code> || Addr || R/W || Program address — selects instruction slot to read/write
|-
|-
|2|| Ext-Start|| R/W || 1=Ext-PPG Start, 0=Int-PPG Start
| 3 || <code>+0x0C</code> || Inst_Lo || R/W || Instruction bits 0–31 (SET mask)
|-
|-
|3|| PPG-Reset || R/W || 1=Reset, 0=Normal operation
| 4 || <code>+0x10</code> || Inst_Med || R/W || Instruction bits 32–63 (CLEAR mask)
|-
|-
|4|| Test-Mode || R/W || 1=Test Mode, 0=Normal operation
| 5 || <code>+0x14</code> || Inst_Hi || R/W || Instruction bits 64–95 (delay count)
|-
|-
|16|| Ext-Clk-Sel || R || 1= External clock is selected, 0 = Internal clock selected
| 6 || <code>+0x18</code> || Inst_Top || R/W || Instruction bits 96–127 (type + data); '''writing this register commits the instruction to program memory'''
|-
|-
|17|| Ext-Clk good || R || 1= External clock is connected and is "good", 0= external clock not connected or is "bad"
| 7 || <code>+0x1C</code> || Inv_Mask || R/W || Output inversion mask — a 1 inverts the polarity of that output channel (one bit per output; useful for active-low signals)
|-
|-
|?-31|| Status || R || Readback of PC, SP, Current Delay Counter
| 8 || <code>+0x20</code> || Version || R || Firmware version (Unix timestamp)
|}
|-
| 9 || <code>+0x24</code> || Flash || R/W || Serial flash control
|-
| 10 || <code>+0x28</code> || Serial || R || Board serial number
|-
| 11 || <code>+0x2C</code> || Hardware || R || Hardware revision ID
|-
| 12 || <code>+0x30</code> || Clock Control || R/W || PLL reconfiguration (see [[#Clock Control Register (+0x30)|below]])
|}


RUN bit: Writing 1 here instructs the ppg to begin executing its program, reading this bit returns 1 if the program is still running, or 0 if halted.
=== CSR Register (<code>+0x00</code>) ===


Ext_Clk bit: Writing an edge here toggles the ppg logic between the internal and external CLK (connected to Nim_Input[3]). The VME interface always uses the internal clock.
The Control/Status Register is the main run-control interface.
Note that the correct divide-down must be programmed unless the external clock is at the default frequency of 20 MHz.


Ext_Start bit: Writing 1 here disables the CSR-Run-bit-start. (Reading this bit still returns the correct status), and switches control to the External Start input (NIM_INPUT[4])
{| class="wikitable"
|-
! Bit(s) !! Name !! Access !! Description
|-
| 0 || Run || R/W || '''Write 1''' to start program execution from slot 0. '''Read:''' 1 = running, 0 = halted.
|-
| 1 || Ext-Clk-Toggle || W || Toggle between external and internal clock source.
|-
| 2 || Ext-Start || R/W || '''1''' = wait for rising edge on NIM Input 4 to start; '''0''' = start via CSR Run bit (software trigger).
|-
| 3 || PPG-Reset || R/W || '''Write 1''' to reset the PPG (clears program counter, halts execution). This is '''not''' a full power-up reset. '''Must be cleared''' (write 0) after reset or the board will not operate.
|-
| 4 || Test-Mode || R/W || '''1''' = test mode: NIM Input 1 outputs the internal PPG clock, NIM Input 2 outputs the active PPG clock. '''0''' = normal operation.
|-
| 16 || Ext-Clk-Sel || R || 1 = external clock currently selected (LED 4 lit).
|-
| 17 || Ext-Clk-Good || R || 1 = external clock signal present and locked (LED 2 lit).
|-
| ?–31 || Status || R || Readback of program counter (PC), stack pointer (SP), and current delay counter.
|}


Reset bit: Set bit to Reset PPG. Stops PPG pgm even if executing a long delay. Does NOT do a full reset to power-up condition. Bit must be cleared after Reset or module will not operate.
Common CSR write values and their effects:


Test-Mode bit: Set this bit to enable Test Mode. When Test Mode is enabled, inputs 1 and 2 become outputs that carry the internal PPG clock and the clock the PPG is actually using, respectively. If the internal clock is selected, inputs 1 and 2 output identical clocks. If the external clock is selected and is "good", input 1 does not change, but input 2 shows the external clock frequency.
{| class="wikitable"
If Test-Mode bit is cleared, Normal Mode is enabled, where inputs 1 and 2 act as regular inputs.
|-
! Value written !! Effect
|-
| <code>0x8</code> || Reset — clears program counter and halts execution
|-
| <code>0x0</code> || Idle — clears reset; selects software trigger mode (Ext-Start bit cleared)
|-
| <code>0x4</code> || Arm for external trigger — waits for rising edge on NIM Input 4
|-
| <code>0x1</code> || Software start — immediately begins executing from instruction slot 0
|}
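The values in this table combine into the usual reset, start, poll cycle. The C sketch below illustrates that cycle under stated assumptions and is not the board's driver: <code>bus_write</code> and <code>bus_read</code> are hypothetical in-memory stand-ins so the example runs without hardware, and <code>ppg_run_and_wait</code> is a name introduced here. On a real system the two stand-ins would be replaced by your VME library's 32-bit write and read calls.

```c
#include <assert.h>
#include <stdint.h>

#define PPG_CSR        0x00u  /* CSR offset from the base address */
#define PPG_CSR_RUN    0x1u   /* bit 0 */
#define PPG_CSR_RESET  0x8u   /* bit 3 */

/* Hypothetical stand-in for the VME bus: records every value written. */
uint32_t fake_writes[8];
int      fake_nwrites = 0;

void bus_write(uint32_t addr, uint32_t value)
{
    (void)addr;
    if (fake_nwrites < 8)
        fake_writes[fake_nwrites++] = value;
}

uint32_t bus_read(uint32_t addr)
{
    (void)addr;
    return 0u;  /* pretend the program has already halted (Run bit = 0) */
}

/* Reset the PPG, release the reset, start by software, then poll the Run
 * bit. Returns 0 once the program has halted, -1 if max_polls is exceeded. */
int ppg_run_and_wait(uint32_t base, unsigned long max_polls)
{
    assert(max_polls > 0);
    bus_write(base + PPG_CSR, PPG_CSR_RESET);  /* 0x8: reset */
    bus_write(base + PPG_CSR, 0x0u);           /* 0x0: clear reset, software trigger mode */
    bus_write(base + PPG_CSR, PPG_CSR_RUN);    /* 0x1: software start */
    for (unsigned long i = 0; i < max_polls; i++) {
        if ((bus_read(base + PPG_CSR) & PPG_CSR_RUN) == 0u)
            return 0;
    }
    return -1;
}
```

Bounding the poll loop avoids hanging the host if the program never halts, for example when a program ends in a Branch back to an earlier slot.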


Ext-Clk-Sel: If bit is set, external clock is selected and LED 4 will be lit.  If clear, internal clock is selected
=== Clock Control Register (<code>+0x30</code>) ===


Ext-Clk Good: If bit set: external clock is connected to Nim_Input[3] and "good", LED 2 will be lit. If clear: external clock is either not connected or "bad".
This register reprograms the on-board PLL to scale an external input clock to 100 MHz. It is only needed when operating with an external clock source other than the default 20 MHz.


===== Test Register (0x00004) =====
The PLL relationship is:


Simple Test Register - Value Written is preserved and can be read back.
<blockquote>VCO frequency = F<sub>in</sub> × M / N<br>Clock output = VCO / C0</blockquote>


===== Address Register (0x00008) =====
Operating limits: F<sub>in</sub>: 5–472 MHz; F<sub>VCO</sub>: 600–1300 MHz; lock time < 1 ms.


Sets PPG Program Memory Address - next instruction will be written to this location.  Also when program is started, execution begins from this address.  Also in test-Mode, the NIM/LED outputs follow the state of this register.
Default configuration for a 20 MHz input: M = 30, N = 1 (bypassed), C0 = 6 → 100 MHz output.
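The PLL relationship can be checked numerically before writing any register. The following is a small sketch of that arithmetic only; the function name is hypothetical, and it does not check the VCO operating limits.

```c
#include <assert.h>

/* Output clock in MHz for a given input frequency and counter values:
 * VCO = f_in * M / N, output = VCO / C0. */
double ppg_pll_output_mhz(double f_in_mhz, unsigned m, unsigned n, unsigned c0)
{
    assert(n > 0 && c0 > 0);
    double vco_mhz = f_in_mhz * (double)m / (double)n;
    return vco_mhz / (double)c0;
}
```

With the default values (20 MHz input, M = 30, N = 1, C0 = 6) this evaluates to 100 MHz. A 100 MHz input with N changed to 5 and the other counters unchanged also gives 100 MHz, which is why the 100 MHz divide-down only reprograms N.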


===== Instruction Registers (0x0000C - 0x00018) =====
'''Bit layout:'''


Registers to hold the 128bit Program Instructions.  Writing the Upper register triggers the storing of the entire 128bit instruction to the address currently in the address register.
{| class="wikitable"
 
|-
The instructions format is as follows ...
! Bits !! Description
{| cellpadding="10" cellspacing="0" border="1"
|-
|+Instruction Format
| 30–28 || Phase counter select (0 = all; 1 = M; 2–6 = Clock 0–4)
|-
| 26–24 || Counter parameter (0 = HighCount; 1 = LowCount; 4 = Bypass; 5 = Mode)
|-
| 23–20 || Counter type (0 = N; 1 = M; 2 = Cp/LF; 3 = VCO; 4–8 = Clock 0–4)
|-
| 16–8 || 9-bit parameter data
|-
| 5 || PLL Reset
|-
| 4 || Up/Down (1 = up; 0 = down)
|-
| 3 || PhaseStep
|-
|-
!Bits 0-31 || Bits 32-63 || Bits 64-95 || Bits 96-115 || Bits 116-117 || Bits 118-127
| 2 || Write parameter
|-
|-
| 32 Output Set Bits || 32 Output Clear Bits || 32bit Delay Count || 20bit Data || 4bit instruction type || Ignored
| 1 || Reconfigure
|-
|-
|}  
| 0 || Control trigger (toggle to apply changes)
|}
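The register words used for PLL reconfiguration can be assembled from the fields in this table. The sketch below is illustrative: the macro and function names are hypothetical, and the field positions are taken directly from the bit layout above.

```c
#include <assert.h>
#include <stdint.h>

/* Control flags, bits 0-5 of the Clock Control register. */
#define CLK_TRIGGER     (1u << 0)  /* control trigger */
#define CLK_RECONFIGURE (1u << 1)
#define CLK_WRITE_PARAM (1u << 2)
#define CLK_PHASE_STEP  (1u << 3)
#define CLK_UP          (1u << 4)
#define CLK_PLL_RESET   (1u << 5)

/* Pack one Clock Control word from its fields. */
uint32_t ppg_clock_word(unsigned phase_sel,  /* bits 30-28 */
                        unsigned param,      /* bits 26-24 */
                        unsigned type,       /* bits 23-20 */
                        unsigned data,       /* bits 16-8, 9 bits */
                        uint32_t flags)      /* bits 5-0 */
{
    assert(phase_sel <= 0x7u && param <= 0x7u && type <= 0xFu);
    assert(data <= 0x1FFu && flags <= 0x3Fu);
    return ((uint32_t)phase_sel << 28) | ((uint32_t)param << 24) |
           ((uint32_t)type << 20) | ((uint32_t)data << 8) | flags;
}
```

As a cross-check, writing HighCount = 3 to the N counter, <code>ppg_clock_word(0, 0, 0, 3, CLK_WRITE_PARAM | CLK_TRIGGER)</code>, yields <code>0x00000305</code>, and <code>ppg_clock_word(0, 0, 0, 0, CLK_RECONFIGURE | CLK_TRIGGER)</code> yields <code>0x3</code>.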


The 32bit delay count at 100Mhz gives maximum delay of 10 seconds per instruction.
'''Example: 100 MHz external frequency divide-down:'''


  The instruction types are ...
<pre>
  0 - Halt
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
  1 - Continue
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x00000305
  2 - new Loop        ( 20 bit data used for count - i.e. maximum 1 million )
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
  3 - End Loop
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x01000205
  4 - Call Subroutine ( 20 bit data used for address )
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
  5 - Return from subroutine
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x05000105
  6 - Branch          ( 20 bit data used for address )
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x04000005
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x3
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
</pre>


===== Output Inversion Mask (0x0001C) =====
'''Example: return to the default 20 MHz external clock (N counter bypassed):'''
32 individual inversion bits (1 per output) a 1 inverts the state of that output.


===== Firmware Version (0x00020) =====
<pre>
Returns the 32bit unix timestamp corresponding to the date this firmware was compiled.
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x04000105
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x3
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
</pre>


===== Serial Flash Control (0x00024) =====
----
Used for reading/Writing to the Configuration Flash device (Reading/Updating firmware)


===== Serial Number (0x00028) =====
== Instruction Set ==
Returns the module serial number, and Board revision (if set) or 0xdead


===== hardware Type (0x0002C) =====
=== Instruction Format ===
Returns a 32 bit description of this hardware - which should confirm the identity of this module as a vme-ppg32


===== Clock Control (0x00030) =====
Every PPG program is a sequence of 128-bit instructions stored in program memory. Each instruction is written to the board as four consecutive 32-bit register writes (to <code>Inst_Lo</code>, <code>Inst_Med</code>, <code>Inst_Hi</code>, <code>Inst_Top</code>). Writing <code>Inst_Top</code> commits the instruction to the currently selected slot.
Set PLL parameters applied to external clock input. Default parameters are for 20Mhz External Clock, and multiply this by 5 to get 100Mhz ppg clock.


Register Contents as follows ...
{| class="wikitable"
{| cellpadding="10" cellspacing="0" border="1"
|+Instruction Format
|-
|-
!Bits 30-28 || Bits 26-24 || Bits 23-20 || Bits 16-8 || Bit 5 || Bit 4 || Bit 3 || Bit 2 || Bit 1 || Bit 0
! Bits !! Register !! Name !! Description
|-
|-
| Phase Counter Select || Counter Parameter || Counter Type || 9 bit Data || PLL Reset || Up/Down || PhaseStep || Write Parameter || reconfigure || Control Trigger
| 0–31 || <code>Inst_Lo</code> || SET mask || One bit per output channel (bit 0 = channel 1 … bit 31 = channel 32). A '''1''' drives that channel HIGH.
|-
|-
| 32–63 || <code>Inst_Med</code> || CLEAR mask || Same bit-to-channel mapping. A '''1''' drives that channel LOW.
|-
| 64–95 || <code>Inst_Hi</code> || Delay count || Number of additional 10 ns clock cycles to hold this state. Total dwell = (3 + delay_count) × 10 ns.
|-
| 96–115 || <code>Inst_Top</code> bits 0–19 || Data || 20-bit payload — loop count or branch/call address, depending on instruction type.
|-
| 116–118 || <code>Inst_Top</code> bits 20–22 || Type || 3-bit instruction opcode (see [[#Instruction Types|Instruction Types]]).
|-
| 119–127 || <code>Inst_Top</code> bits 23–31 || — || Ignored.
|}
|}


The Control Trigger Bit [bit 0] enables the four control signals in bits 1,2,3,5, and needs to be toggled from off to on, to apply these 4 signals.
'''SET and CLEAR masks:''' these follow the [[#Output Model: SET and CLEAR|output model]] described above — assign each channel to exactly one mask per instruction.


On rising edge of control trigger ...
'''Note on the type field:''' the board's own register summary labels this field as a "4-bit instruction type" at bits 116–117, which is internally inconsistent (that range is only two bits). The actual opcode values listed below occupy bits 20–22 of <code>Inst_Top</code> (bits 116–118 of the full instruction), a 3-bit field. This page uses the values, which are unambiguous.
  Phasestep=1      => clock phase is adjusted by 1 unit, in the direction selected by "Up/Down" [bit 4: 1=up,0=Down]
  Write Parameter=1 => 9bit-Data, Counter-Type and Counter-Param are written into the PLL reconfiguration registers
  Reconfigure=1    => PLL is reconfigured with the parameters currently in its reconfiguration registers
  PLL-Reset=1      => PLL is reset


=== Instruction Types ===


PhaseCounterSelect ...
The 3-bit opcode occupies bits 20–22 of the <code>Inst_Top</code> word (bits 116–118 of the full 128-bit instruction). The 20-bit data payload occupies bits 0–19 of <code>Inst_Top</code>.
  0 => All Clocks
  1 => M ?
  2-6 => Clock 0-4


phasestep is applied immediately (to clocks selected by PhaseCounterSelect), and does not require a reconfiguration or reset.
{| class="wikitable"
|-
! Type !! Opcode !! <code>Inst_Top</code> value !! Description
|-
| Halt || 0 || <code>0x000000</code> || Stop execution. The program counter does not advance; CSR bit 0 goes low.
|-
| Continue || 1 || <code>0x100000</code> || Advance to the next instruction slot.
|-
| New Loop || 2 || <code>0x200000 + N</code> || Begin a loop that repeats N times (maximum N = 1,048,575 = 0xFFFFF). Pushes the loop start address and count onto the stack.
|-
| End Loop || 3 || <code>0x300000</code> || Decrement the loop counter. If > 0, jump back to the start of the loop body (the instruction after the matching New Loop); otherwise fall through.
|-
| Call || 4 || <code>0x400000 + addr</code> || Push the next instruction address onto the stack and jump to the 20-bit address in the data field.
|-
| Return || 5 || <code>0x500000</code> || Pop the return address from the stack and jump to it.
|-
| Branch || 6 || <code>0x600000 + addr</code> || Unconditional jump to the 20-bit address in the data field (does not push a return address).
|}
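The <code>Inst_Top</code> values in this table are the opcode shifted left by 20 bits, combined with the 20-bit data field. A minimal encoder might look like the following; the enum and function names are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode values from the Instruction Types table. */
enum ppg_opcode {
    PPG_HALT     = 0,
    PPG_CONTINUE = 1,
    PPG_LOOP     = 2,  /* data = loop count */
    PPG_END_LOOP = 3,
    PPG_CALL     = 4,  /* data = subroutine address */
    PPG_RETURN   = 5,
    PPG_BRANCH   = 6   /* data = target address */
};

/* Encode the Inst_Top word: opcode in bits 20-22, data in bits 0-19. */
uint32_t ppg_inst_top(enum ppg_opcode op, uint32_t data)
{
    assert((unsigned)op <= 6u);
    assert(data <= 0xFFFFFu);  /* 20-bit payload */
    return ((uint32_t)op << 20) | data;
}
```

For example, a loop of 1000 iterations encodes as <code>ppg_inst_top(PPG_LOOP, 1000)</code>, which is <code>0x2003E8</code>.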


The parameters below need to be written (with writeparameter=1), and then require a reconfiguration to be applied.
=== Timing ===


Counter-Parameter ...
Every instruction occupies exactly '''3 clock cycles''' of overhead regardless of type. The delay count field adds additional cycles on top of that:
  0: HighCount [For VCO this parameter is: PostScale K=2 Yes/No]
  1: LowCount
  4: Bypass
  5: Mode Odd/Even


Counter-Type ...
<blockquote>
  0: N
'''dwell time = (3 + delay_count) × 10 ns'''
  1: M
</blockquote>
  2: Cp/LF
  3: VCO
  4-8: Clock 0-4
  9-D: Clock 5-9 (Stratix Only)
  E-F: Invalid


Clock frequencies are defined by ...
To produce a 280 ns pulse, write a delay count of 25: (3 + 25) × 10 = 280 ns.
  VCO Frequency = Fin * M/N
  The individual clock outputs [clock0-4] are given by ... VCO / C0-C4


  Each of M,N,C0-C4 are the sum of a high and low count.
The maximum delay count is 2<sup>32</sup> − 1 = 4,294,967,295, giving a maximum single-instruction dwell of approximately 42.9 seconds. Longer durations require a loop (see [[#Looping for Long Durations|Looping for Long Durations]]).
  Note - can get 50% duty cycle with odd count by setting mode=odd with high=low+1
  Each counter can be bypassed by setting bypass=1 (=> Scale=1)


Limits etc ...
'''Important:''' every instruction — including Halt, New Loop, End Loop, etc. — incurs the 3-cycle overhead. Loop and branch instructions with <code>delay_count = 0</code> still consume 30 ns.
  Fin  =   5 -  472 Mhz
  Fvco = 600 - 1300 Mhz
  Lock Time < 1ms


Examples ...
----
  For Fin = 20 MHz [default settings on power-up]
  Fvco=1200,K=2 => 600Mhz .. M=30, N=1(Bypassed), C0=6(3+3) Fin=20Mhz => C0=100Mhz


  For Fin=100Mhz [and M=30,N=1,C0=6] Need to Change N to 5 ...
== Programming Tutorial ==
  Write:  0x00000305  0  0x01000205  0  0x05000105      0    0x04000005        0  0x3          0
            type=0,hi=3    type=0,lo=2    type=0,mode=odd      type=0,bypass=no      Reconfigure


  Change back from 100 to 20 - need to change N to bypassed ...
This section walks through writing PPG programs from first principles, building up to a complete real-world example. All examples use the <code>mvme_write_value</code> VME library function:
  write:   0x04000105        0    0x3          0
            type=0,bypass=yes     Reconfigure


  For Deap 62.5 Mhz (20*25/8) Fvco=1000,k=2=>500 M=8 N=1 C0=5 gives 100Mhz
<pre>
                                Fvco=1000,k=2=>500 M=8 N=1 C0=8 gives 62.5 MHz
mvme_write_value(vme, address, value);
</pre>


**Note - to help debug clock-setting problems, the internal clock can be viewed on PPG output #1 (and the 20 MHz internal clock on output #0) if the test-mode bit is set in the CSR [bit 4]. It is then possible to check that the PPG clock is locked and at the correct frequency (100 MHz), and to see the relation between the external and internal clocks.
where <code>vme</code> is an open VME handle, <code>address</code> is an absolute 32-bit VME address, and <code>value</code> is a 32-bit unsigned integer.


=== Front Panel ===
The examples assume a base address of <code>BASE_ADDR = 0x00100000</code>. Adjust for your hardware configuration.


==== NIM Inputs ====
=== The set_command Helper ===


  The Input assignments are (as labelled on front panel) ...
All PPG programming is built from a single primitive: selecting an instruction slot and writing its four 32-bit fields. It is convenient to wrap this in a helper:
  4 - External Start - Rising Edge starts, multiple starts are (or should be) ignored.
  3 - External Clock (20 MHz, internally scaled to 100 MHz)
  2 - Unassigned
  1 - Unassigned


==== Input LEDs ====
<pre>
void set_command(int slot,
                unsigned int set_mask,    // Inst_Lo:  channels to drive HIGH
                unsigned int clr_mask,    // Inst_Med: channels to drive LOW
                unsigned int delay,      // Inst_Hi:  extra 10 ns ticks
                unsigned int type_data)  // Inst_Top: opcode + payload
{
    mvme_write_value(vme, BASE_ADDR + 0x08, slot);      // select slot
    mvme_write_value(vme, BASE_ADDR + 0x0C, set_mask);
    mvme_write_value(vme, BASE_ADDR + 0x10, clr_mask);
    mvme_write_value(vme, BASE_ADDR + 0x14, delay);
    mvme_write_value(vme, BASE_ADDR + 0x18, type_data);  // commits instruction
}
</pre>


  The Input LED assignments are (as labelled on front panel) ...
All higher-level programming is built from calls to <code>set_command</code>.
  4 - Clock setting (Lit => External Clk)
  3 - NimIn[2] status
  2 - External Clock Good indicator (Lit => Clock is good)
  1 - Program Running


=== Test Software ===
=== Step 1: Reset the Board ===


==== c-shell script to run 2 nested loops ====
Before writing any program, reset the board to put it in a known state:


  set dec_hex=( 0x0  0x1  0x2  0x3  0x4  0x5  0x6  0x7  \
<pre>
                0x8  0x9  0xa  0xb  0xc  0xd  0xe  0xf  \
mvme_write_value(vme, BASE_ADDR + 0x00, 0x8); // CSR bit 3: assert reset
                0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 \
mvme_write_value(vme, BASE_ADDR + 0x00, 0x0); // clear reset → idle
                0x18 0x19 0x1a 0x1b 0x1c 0x1d 0x1e 0x1f \
</pre>
                0x20 0x21 0x22 0x23 0x24 0x25 0x26 0x27 )
 
  # dly4,loop4,dly3,loop6,dly2,end,end
  set inst=(                                                \
      0xff  0x00  0x4  0x100000  0x0  0x0  0x0  0x200004  \
      0x1  0x1  0x3  0x100000  0x0  0x0  0x0  0x200006  \
      0x2  0x2  0x2  0x100000  0x0  0x0  0x0  0x300000  \
      0x0  0x0  0x0  0x300000  0x0  0x0  0x0  0x000000  )
 
  set i=0
  set j=0
  while( $i < $#inst )
      @ j++; vme_poke -a VME_A32UD -A 0x00100008 -d VME_D32 $dec_hex[$j]
      @ i++; vme_poke -a VME_A32UD -A 0x0010000c -d VME_D32 $inst[$i]
      @ i++; vme_poke -a VME_A32UD -A 0x00100010 -d VME_D32 $inst[$i]
      @ i++; vme_poke -a VME_A32UD -A 0x00100014 -d VME_D32 $inst[$i]
      @ i++; vme_poke -a VME_A32UD -A 0x00100018 -d VME_D32 $inst[$i]
  end
 
  # start program (from addr 0)...
  vme_poke -a VME_A32UD -A 0x00100008 -d VME_D32 0x0
  vme_poke -a VME_A32UD -A 0x00100000 -d VME_D32 0x1


Asserting reset clears the program counter and halts execution. The reset bit '''must''' then be cleared (write <code>0x0</code>) or the board will not operate. After this the board sits idle in software-trigger mode, waiting for a program to be loaded and started. It will not fire on any trigger until you write <code>0x1</code> (software start) or <code>0x4</code> (external trigger arm) to the CSR.


==== c-shell script to set divide-downs for 100MHz frequency  ====
=== Step 2: Write a Safety Halt at Slot 0 ===


This would be used with an external frequency input of 100MHz
Always write a Halt to slot 0 before writing the rest of your program. If an external trigger edge arrives while you are still writing instructions, the PPG executes this Halt immediately rather than running a partially-written or stale program:


  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
<pre>
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x00000305  # set hi counter to 3
set_command(0,
   vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
            0x00000000,   // SET:   no channels raised
   vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x01000205  # set lo counter to 2
            0xFFFFFFFF,   // CLEAR: all channels driven low
   vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
            0,            // delay: 0 extra ticks → 30 ns dwell
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x05000105  # set mode counter to odd
            0x000000);    // type:  Halt
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
</pre>
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x04000005  # set counter bypass to 0
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x3
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0


=== Step 3: Write Your Program ===


==== c-shell script to set divide-downs for 10MHz  frequency  ====
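Write instructions to consecutive slots starting from slot 1. Each instruction specifies which outputs are high, which are low, how long to hold that state, and what to do next. Because execution starts at slot 0, the Halt written there in Step 2 stops the program immediately; once the remaining slots are written, overwrite slot 0 with the program's first instruction (a Continue, for example) so that the sequence actually runs.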


This would be used when returning to the internal frequency of 10MHz :
'''Example: generate a single 280 ns pulse on channel 1, then halt.'''


  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x04000105 # set counter bypass to 1
Channel 1 is bit 0 of the output masks. The sequence is:
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0        # prepare for next cmd
 
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x3        # reconfig clock
# All outputs low for 30 ns (initial clear).
  vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0        # prepare for next cmd
# Channel 1 high for 280 ns.
# All outputs low, halt.
 
<pre>
// Slot 1: all outputs low for 30 ns (delay=0 → (3+0)×10 = 30 ns)
set_command(1, 0x00000000, 0xFFFFFFFF, 0, 0x100000);
 
// Slot 2: channel 1 (bit 0) high for 280 ns (delay=25 → (3+25)×10 = 280 ns)
set_command(2, 0x00000001, 0xFFFFFFFE, 25, 0x100000);
 
// Slot 3: all outputs low, halt
set_command(3, 0x00000000, 0xFFFFFFFF, 0, 0x000000);
</pre>
 
=== Step 4: Arm and Start ===
 
Once all instructions are written, arm or start the board by writing to the CSR:
 
<pre>
// Rewind program address to slot 0 before starting
mvme_write_value(vme, BASE_ADDR + 0x08, 0x0);
 
// Option A: software trigger — starts immediately
mvme_write_value(vme, BASE_ADDR + 0x00, 0x1);  // Run bit
 
// Option B: external trigger — waits for rising edge on NIM Input 4
mvme_write_value(vme, BASE_ADDR + 0x00, 0x4);  // Ext-Start bit
</pre>
 
=== Step 5: Detecting Completion ===
 
Poll CSR bit 0. When it falls to 0 the program has halted and the next sequence can be loaded:
 
<pre>
while (mvme_read_value(vme, BASE_ADDR + 0x00) & 0x1) {
    // still running — sleep or yield here
}
// Program has halted; safe to reload
</pre>
 
=== Timing Calculations ===
 
To produce a dwell of exactly T nanoseconds, the required delay count is:
 
<blockquote>delay_count = T / 10 − 3</blockquote>
 
For example, to hold an output high for exactly 500 ns:
: delay_count = 500 / 10 − 3 = '''47'''
 
The minimum possible dwell is '''30 ns''' (delay_count = 0). It is impossible to produce a dwell shorter than 30 ns with any single instruction.
 
When two consecutive instructions must sum to an exact total time, remember that each instruction carries its own 30 ns overhead:
 
: T<sub>total</sub> = (3 + d<sub>1</sub>) × 10 + (3 + d<sub>2</sub>) × 10 = (6 + d<sub>1</sub> + d<sub>2</sub>) × 10 ns
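
The same sum extends to any straight-line run of instructions. The following C sketch adds up the dwell of a list of delay counts; the function name <code>ppg_sequence_ns</code> is hypothetical and is not part of any PPG library. It assumes the 3-tick overhead and 10 ns tick stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: total duration, in ns, of a straight-line run of
 * instructions, given each instruction's delay count.
 * Each instruction dwells for (3 + delay_count) ticks of 10 ns. */
uint64_t ppg_sequence_ns(const uint32_t *delay_counts, size_t count)
{
    uint64_t ticks = 0;
    for (size_t i = 0; i < count; i++) {
        ticks += (uint64_t)delay_counts[i] + 3u;  /* 3-tick fixed overhead */
    }
    return ticks * 10u;                           /* 10 ns per tick */
}
```

For the three-instruction pulse program in Step 3 (delay counts 0, 25, 0) this gives 30 + 280 + 30 = 340 ns.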
 
=== Looping for Long Durations ===
 
A single delay count can produce at most ~42.9 seconds. For longer durations, split the period across multiple loop iterations. Each iteration holds for a fraction of the total time; the loop instruction repeats until the full duration is reached.
 
'''Strategy:''' choose a loop count N and compute the per-iteration delay count so that N iterations sum to the target duration. The example below uses N = 100:
 
<pre>
double total_time_s = 30.0;
int N = 100;
 
// Each iteration should cover total_time_s / N seconds. Convert to 10 ns ticks.
// We ignore the ~30 ns spent on the End Loop instruction each iteration (and the
// one-time New Loop): at second scale that error is < 0.001 %. For tight timing,
// subtract the overhead or measure on hardware.
unsigned int body_delay = (unsigned int)(total_time_s * 1e8 / N);
 
// Slot i:  New Loop, repeat N times  (0x200000 + N) — runs once
// Slot i+1: hold outputs for body_delay ticks, Continue — the loop body
// Slot i+2: End Loop — jumps back to the body until the count is exhausted
set_command(i,  0x0,              0x0,              0,          0x200000 + N);
set_command(i+1, my_output_mask,  ~my_output_mask, body_delay, 0x100000);
set_command(i+2, 0x0,              0x0,              0,          0x300000);
</pre>
 
New Loop initialises the counter once; End Loop runs once per iteration and jumps back to the loop body. Both are 3-cycle instructions, so the loop adds a fixed 30 ns plus 30 ns per iteration of overhead — negligible for periods of seconds or longer.
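
When the overhead does matter, it can be folded into the delay calculation. The sketch below is an illustration only: the function names are hypothetical, and it assumes New Loop and End Loop are written with <code>delay_count = 0</code> (as in the example above) and that the loop body is a single instruction. Under those assumptions the total is 3 + N × ((3 + d) + 3) ticks.

```c
#include <stdint.h>

#define PPG_OVERHEAD_TICKS 3u
#define PPG_MAX_LOOP_COUNT 0xFFFFFu

/* Hypothetical helper: total ticks for New Loop (delay 0), then n iterations
 * of a one-instruction body with the given delay plus End Loop (delay 0). */
uint64_t ppg_loop_total_ticks(uint32_t n, uint32_t body_delay)
{
    uint64_t per_iteration =
        ((uint64_t)body_delay + PPG_OVERHEAD_TICKS) + PPG_OVERHEAD_TICKS;
    return PPG_OVERHEAD_TICKS + (uint64_t)n * per_iteration;
}

/* Hypothetical helper: largest body delay whose loop does not exceed
 * total_ticks. Returns 0 on success, -1 if no valid delay exists. */
int ppg_loop_body_delay(uint64_t total_ticks, uint32_t n, uint32_t *body_delay)
{
    if (n == 0 || n > PPG_MAX_LOOP_COUNT || total_ticks < PPG_OVERHEAD_TICKS)
        return -1;
    uint64_t per_iteration = (total_ticks - PPG_OVERHEAD_TICKS) / n;
    if (per_iteration < 2 * PPG_OVERHEAD_TICKS)
        return -1;                      /* iterations too short to fit */
    uint64_t d = per_iteration - 2 * PPG_OVERHEAD_TICKS;
    if (d > UINT32_MAX)
        return -1;                      /* does not fit the 32-bit field */
    *body_delay = (uint32_t)d;
    return 0;
}
```

Integer division rounds down, so the programmed loop can come out slightly shorter than the target; <code>ppg_loop_total_ticks</code> reports the duration actually programmed.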
 
Loops may be '''nested''', and combined with subroutine calls, up to the 256-entry stack depth: place another New Loop / End Loop pair (or a [[#Subroutines|subroutine call]]) inside the body.
 
=== Subroutines ===
 
Use Call and Return instructions to share a block of instructions across multiple points in a program. The hardware call stack holds up to 256 return addresses.
 
<pre>
// --- Main program ---
set_command(0, 0x0, 0xFFFFFFFF, 0,  0x000000);        // slot 0: safety Halt
set_command(1, 0x0, 0xFFFFFFFF, 10, 0x100000);        // slot 1: setup, Continue
set_command(2, 0x0, 0x0,        0,  0x400000 + 10);    // slot 2: Call subroutine at slot 10
set_command(3, 0x0, 0xFFFFFFFF, 0,  0x000000);         // slot 3: Halt (reached after Return)
 
// --- Subroutine at slot 10 ---
set_command(10, 0x00000001, 0xFFFFFFFE, 25, 0x100000); // pulse channel 1
set_command(11, 0x00000000, 0xFFFFFFFF, 25, 0x100000); // all low
set_command(12, 0x0,        0x0,        0,  0x500000); // Return
</pre>
 
=== Complete Example: Timing Calibration Sequence ===
 
This is the timing calibration sequence used in the UCN sequencer frontend. It fires 10 pulses on output channel 29 (bit 28, mask <code>0x10000000</code>) at a 0.2 s cadence so that downstream DAQ systems can establish an absolute time reference. It uses a software trigger.
 
The program layout:
 
{| class="wikitable"
|-
! Slot !! Instruction !! Purpose
|-
| 0 || All LOW, Continue (190 ns) || Blank at start of sequence
|-
| 1 || New Loop ×10 || Repeat 10 times
|-
| 2 || CH29 HIGH, Continue (280 ns) || Rising-edge timing pulse
|-
| 3 || All LOW, Continue (~0.2 s) || Off time between pulses
|-
| 4 || End Loop || Jump back to slot 2 nine more times
|-
| 5 || All LOW, Halt || End of sequence
|}


=== Procedure for newly assembled board startup and test ===
<pre>
// Reset and select software trigger
mvme_write_value(vme, BASE_ADDR + 0x00, 0x8);  // reset
mvme_write_value(vme, BASE_ADDR + 0x00, 0x0);  // clear reset (software trigger mode)


(note1: VME-PPG32-IO32 firmware is used to test the board)
// Slot 0: blank — all outputs low, 190 ns dwell (delay=16 → (3+16)×10=190 ns)
set_command(0, 0x00000000, 0xFFFFFFFF, 0x10, 0x100000);


(note2: TTL inputs and outputs are not tested)
// Slot 1: New Loop, repeat 10 times
set_command(1, 0x0, 0x0, 0x0, 0x20000A);  // 0x200000 + 10


(note3: DAC outputs are not tested)
// Slot 2: channel 29 (bit 28) HIGH, 280 ns pulse (delay=25 → (3+25)×10=280 ns)
set_command(2, 0x10000000, 0xEFFFFFFF, 25, 0x100000);


(note4: shorts between NIM outputs are not tested)
// Slot 3: hold all LOW for the rest of the 0.2 s period.
// 0.2 s = 2e7 ticks of 10 ns. Subtract the 28 ticks used by the slot-2 pulse
// ((3+25)=28). The few ticks of loop/instruction overhead are negligible at
// this scale, so we approximate:
unsigned int off_time = (unsigned int)(0.2 * 1e8) - 28;
set_command(3, 0x00000000, 0xFFFFFFFF, off_time, 0x100000);


* check the board for shorts of power to ground. Use multimeter in "ohm" mode, measure resistance between "gnd" and "1.2V", "-3.3V", "VME +12V", "2.5V" and "3.3V". None should measure 0 ohm.
// Slot 4: End Loop
* set VME address jumper A20-23 to "1", jumpers A24-27 and A28-31 to "0"
set_command(4, 0x0, 0x0, 0x0, 0x300000);
* set inputs and outputs to "NIM" mode
* connect JTAG USB blaster
* power up the board (standalone or in a VME crate)
* start Quartus programmer
* select correct USB blaster
* run "auto detect", 3 devices should be detected: EP3C40Q240 (Cyclone3 FPGA), EPM1270 (parallel flash loader CPLD), EPM3032AT44 (VME address decoder CPLD)
* flash the VME address decoder pof file into the EPM3032 part (get pof file from here: https://ladd00.triumf.ca/viewvc/daqsvn/trunk/VME-NIMIO32/MAX3000A_Addr_decode/VME_Addr_decode.pof?view=log)
* (do not do this) flash the CFI parallel flash loader into the EPM1270 part (get pof file where?!?)
* load the VME-PPG32-IO32 firmware sof file into the EP3 part (get sof file here: https://ladd00.triumf.ca/viewvc/daqsvn/trunk/VME-NIMIO32/VME-NIMIO32/PPG32-Rev1/VME-PPG32.sof?view=log)
* run "vmescan_gef.exe", it should detect the IO32 board at A24 VME address 0x00100000, data should correspond to the sof file revision date code
* confirm VME access LED is working (flashes during vme scan).
* confirm VME Data bus is OK: "./test_VMENIMIO32_gef.exe --addr 0x100000 --testbits 4"
* (do not do this) follow the firmware update instructions to flash the firmware pof file using the VME flash programmer at [[VME-NIMIO32#Firmware_update_procedure]] (get VME-PPG32-IO32 pof file from here: https://ladd00.triumf.ca/viewvc/daqsvn/trunk/VME-NIMIO32/VME-NIMIO32/PPG32-Rev1/VME-PPG32.pof?view=log)
* (do not do this) confirm FPGA reboot is working - "./test_VMENIMIO32_gef.exe --addr 0x100000 --reboot" prints 0xFFFFFFFF on the second read of firmware revision
* test NIM inputs and LEDs: use a NIM pulse generator or the inverted output of any NIM module, connect it to each NIM input, and observe that the corresponding "green" LED lights up
* test NIM outputs and LEDs: "./test_VMENIMIO32_gef.exe --addr 0x100000 --nimout 3 1 --pulsenim", observe all "red" LEDs are flashing, connect NIM outputs to NIM scaler, observe scaler counts at each LED flash
* load PPG firmware into the active serial flash (follow instructions here: [[#Update_using_VME_flash_programmer_when_running_VME-PPG32-IO32_firmware]])
* unplug the board from VME, wait 10 sec, plug it back in, confirm that it is detected by vmescan if running VME-PPG32-IO32 firmware or test_a32 or PPG test tools if running the PPG firmware (confirms the flash memory contents are good)


=== VME-PPG32-IO32 firmware ===
// Slot 5: all LOW, Halt
set_command(5, 0x00000000, 0xFFFFFFFF, 0x1, 0x000000);


The VME-PPG32 board can run a special version of VME-NIMIO32 firmware (subproject "PPG32-Rev1" of the VME-NIMIO32 firmware). For instructions, please refer to the [[VME-NIMIO32]] documentation.
// Rewind address pointer and start
mvme_write_value(vme, BASE_ADDR + 0x08, 0x0);  // address = slot 0
mvme_write_value(vme, BASE_ADDR + 0x00, 0x1);  // CSR: software start


Update 2018-Dec-05 - use /daq/daqshare/olchansk/altera/13.1.3.178/quartus/bin/quartus to build this PPG32 firmware. K.O.
// Poll for completion (~2 seconds)
while (mvme_read_value(vme, BASE_ADDR + 0x00) & 0x1) { /* wait */ }
</pre>

Latest revision as of 19:49, 28 May 2026

VME-PPG32 Pulse Pattern Generator

The VME-PPG32 is a VME FPGA board that generates precise digital pulse patterns on 32 NIM output channels. It is driven by a 100 MHz clock (10 ns resolution) and executes a small user-written program stored in on-board memory. Programs can contain loops, subroutines, and branches, making it possible to generate multi-period sequences that last from nanoseconds to hours from a few dozen instructions.

This page covers hardware specifications, the VME register interface, the instruction set, and a step-by-step programming tutorial.

This page has been re-written for clarity with the help of an AI. For the old instructions see this page: VME-PPG32 (legacy)


How the PPG Works

Conceptually the PPG32 is a tiny processor dedicated to driving 32 output pins. You load a short program into its on-board memory, then trigger it; the board executes the program top-to-bottom and drives its NIM outputs accordingly.

Each instruction answers four questions:

  • Which outputs go HIGH? — the SET mask.
  • Which outputs go LOW? — the CLEAR mask.
  • For how long? — the delay count, in units of 10 ns.
  • What happens next? — the instruction type: continue to the next slot, loop, call a subroutine, branch, or halt.

Instructions execute sequentially from slot 0 unless a loop, branch, or subroutine redirects the flow. A 256-entry hardware stack supports nested loops and subroutine calls.

A typical workflow is:

  1. Reset the board — puts it in a known, halted state.
  2. Write the program over the VME bus — one instruction per memory slot.
  3. Trigger execution, either by software (write a bit over VME) or by an external NIM pulse.
  4. Poll the status bit until the program halts, then read results or load the next program.

Output Model: SET and CLEAR

Each instruction carries two independent 32-bit masks. A 1 in the SET mask drives that channel HIGH; a 1 in the CLEAR mask drives it LOW. Bit 0 corresponds to channel 1, bit 31 to channel 32.

The hardware documentation does not specify what happens to a channel whose bit appears in neither mask, nor in both. The safe, conventional practice — used throughout this page and in the UCN sequencer code — is to assign every channel to exactly one mask in every instruction, so all 32 outputs are fully defined at each step. In code this usually appears as a value and its bitwise complement:

set_mask = my_outputs;
clr_mask = ~my_outputs;   // every other channel explicitly driven LOW
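
As a minimal sketch of that convention, the helpers below turn front-panel channel numbers into mask bits and build a complementary SET/CLEAR pair. The names ppg_channel_bit and ppg_masks are hypothetical, introduced only for illustration; the bit-to-channel mapping (bit 0 = channel 1) is the one stated above.

```c
#include <stdint.h>

/* Hypothetical helper: mask bit for a front-panel channel number (1..32).
 * Returns 0 for an out-of-range channel. */
uint32_t ppg_channel_bit(int channel)
{
    if (channel < 1 || channel > 32)
        return 0;
    return (uint32_t)1 << (channel - 1);
}

/* Hypothetical helper: build a SET/CLEAR pair in which every channel
 * appears in exactly one of the two masks. */
void ppg_masks(uint32_t high_channels, uint32_t *set_mask, uint32_t *clr_mask)
{
    *set_mask = high_channels;
    *clr_mask = ~high_channels;   /* every other channel driven LOW */
}
```

For channel 29 this yields a SET mask of 0x10000000 and a CLEAR mask of 0xEFFFFFFF.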

Hardware Description

FPGA and Memory

Component Part / Value
FPGA Altera Cyclone 3 EP3C40Q240C8
Configuration flash Altera EPCS16 serial flash
Program memory 4096 × 128-bit words (4k instructions)
Call stack 256 entries
FPGA resources 1044 LEs, 500 kbits internal memory

Programs are written to and read from program memory over the VME bus. Each 128-bit instruction occupies one slot; the program counter advances one slot per instruction executed.

I/O Characteristics

Feature Detail
NIM outputs 32 channels (front panel)
NIM inputs 4 channels (front panel)
Output LEDs 32 (one per output channel)
Input LEDs 4 (one per input channel) + 1 VME access LED
Serial DACs 2 × AD5439YRUZ, unipolar 0–2.5 V, 10-bit
Rev1/Rev2 Inputs switchable NIM/TTL via JMP3
Rev2 only Outputs switchable NIM/TTL via SW1 micro switches

Clock

The PPG runs at 100 MHz (10 ns per tick). The clock source is selected by the CSR:

  • Internal: 50 MHz crystal on board, doubled to 100 MHz via PLL.
  • External: 20 MHz NIM signal on Input 3, scaled to 100 MHz via the on-board PLL (see Clock Control Register).

Every instruction takes a fixed 3 clock cycles of overhead regardless of type, plus a programmable 32-bit delay. The actual dwell time for any instruction is therefore:

dwell = (3 + delay_count) × 10 ns

The delay count is a 32-bit field, so the maximum single-instruction dwell is approximately 42.9 seconds (2^32 × 10 ns).

Caveat: the source wiki states a 10-second maximum, which conflicts with the documented 32-bit field width and is unexplained. Until verified on hardware, treat ~10 s as a conservative practical limit, and use a loop for anything longer.

VME Interface

Parameter Value
Address space A32 only
Data width D32 only
Transfer modes Single-word, 32-bit DMA (BLT32), 2eVME DMA
Not supported MBLT64, 2eSST
Data direction VME-D[31..0] bidirectional; VME-A[23..0] input only
Handshake DTACK output

Jumper Settings

Jumper Setting Function
JMP1 INP / DAC NIM input with 50 Ω termination, or DAC output
JMP2 INP / DAC Same as JMP1
JMP3 NIM (1–2) / TTL (2–3) Input signal standard selection (Rev1 and later)
JMP4 ACT Active-serial flash programming mode (leave set)
IrqSel Open Leave open
SW1–3 VME base address selection (A20–A31)

NIM Input Assignments

Input Function
4 External start — rising edge triggers program execution; subsequent edges while running are ignored
3 External 20 MHz clock input — used by PLL to generate 100 MHz PPG clock
2 Unassigned
1 Unassigned (in test mode the clocks appear on PPG outputs #0 and #1, not on this input)

Input LED Indicators

LED Meaning
4 Clock source: lit = external clock selected
3 NIM Input 2 signal status
2 External clock quality: lit = external clock present and good
1 Program running: lit = PPG is executing

Hardware Setup

VME Address Configuration

The VME base address is set by address switches SW1–3 on the board. Set the A20–A23 switch to the value 1 and the A24–A27 and A28–A31 switches to 0 to place the board at 0x00100000 (the default used in firmware examples on this wiki). After setting switches, program the address-decoder CPLD (EPM3032) using a JTAG programmer.

Startup and Verification Checklist

  1. Verify power supply voltages — check for shorts on 1.2 V, ±3.3 V, 12 V, and 2.5 V rails before powering on.
  2. Set VME address switches.
  3. Program the address-decoder CPLD (EPM3032) via JTAG if the board is new or the CPLD was cleared.
  4. Load the VME-PPG32 firmware (.sof) to the FPGA via Quartus Programmer or the VME flash method below.
  5. Verify board presence: vmescan should detect the board at the configured base address.
  6. Run the built-in register test (write 0xBEEFBEEF to the Test register at offset +0x04 and read it back).
  7. Exercise NIM outputs and inputs, and verify LEDs respond.
  8. Flash firmware to active-serial memory (EPCS16) for persistence across power cycles.

Firmware Update Methods

The board runs one of two firmware personalities: VME-PPG32-IO32, a plain 32-channel I/O image with no sequencer, and VME-PPG32, the full pulse-pattern generator described on this page. The PPG features documented here require the VME-PPG32 image.

Method 1 — USB-Blaster JTAG (preferred for initial programming):

  1. Start Quartus Programmer and auto-detect.
  2. Attach VME-PPG32.sof to the EP3C40 device and program it.
  3. Auto-detect again, attach ppg.jic to the EPCS16 device, and program it (takes approximately 2 minutes).
  4. Power-cycle the board to reboot into the PPG firmware. When running the PPG firmware, all LEDs are off after reboot.

Method 2 — VME flash programmer (requires IO32 firmware already loaded):

./srunner_vme_gef.exe -program -16 ppg.jic 0x100020
./test_VMENIMIO32_gef.exe --addr 0x100000 --reboot

Register Reference

Register Map

All registers are accessed at 32-bit aligned offsets from the board's VME base address.

# Offset Name Access Description
0 +0x00 CSR R/W Control and Status Register
1 +0x04 Test R/W Test register (read-back verification)
2 +0x08 Addr R/W Program address — selects instruction slot to read/write
3 +0x0C Inst_Lo R/W Instruction bits 0–31 (SET mask)
4 +0x10 Inst_Med R/W Instruction bits 32–63 (CLEAR mask)
5 +0x14 Inst_Hi R/W Instruction bits 64–95 (delay count)
6 +0x18 Inst_Top R/W Instruction bits 96–127 (type + data); writing this register commits the instruction to program memory
7 +0x1C Inv_Mask R/W Output inversion mask — a 1 inverts the polarity of that output channel (one bit per output; useful for active-low signals)
8 +0x20 Version R Firmware version (Unix timestamp)
9 +0x24 Flash R/W Serial flash control
10 +0x28 Serial R Board serial number
11 +0x2C Hardware R Hardware revision ID
12 +0x30 Clock Control R/W PLL reconfiguration (see below)

CSR Register (+0x00)

The Control/Status Register is the main run-control interface.

Bit(s) Name Access Description
0 Run R/W Write 1 to start program execution from slot 0. Read: 1 = running, 0 = halted.
1 Ext-Clk-Toggle W Toggle between external and internal clock source.
2 Ext-Start R/W 1 = wait for rising edge on NIM Input 4 to start; 0 = start via CSR Run bit (software trigger).
3 PPG-Reset R/W Write 1 to reset the PPG (clears program counter, halts execution). This is not a full power-up reset. Must be cleared (write 0) after reset or the board will not operate.
4 Test-Mode R/W 1 = test mode: the internal PPG clock is visible on PPG output #1 and the 20 MHz internal clock on output #0, so clock lock and frequency can be checked. 0 = normal operation.
16 Ext-Clk-Sel R 1 = external clock currently selected (LED 4 lit).
17 Ext-Clk-Good R 1 = external clock signal present and locked (LED 2 lit).
?–31 Status R Readback of program counter (PC), stack pointer (SP), and current delay counter.

Common CSR write values and their effects:

Value written Effect
0x8 Reset — clears program counter and halts execution
0x0 Idle — clears reset; selects software trigger mode (Ext-Start bit cleared)
0x4 Arm for external trigger — waits for rising edge on NIM Input 4
0x1 Software start — immediately begins executing from instruction slot 0
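
In code it can be convenient to name these bits rather than write raw hex values. The sketch below uses hypothetical constant and function names; only the bit positions come from the CSR table above.

```c
#include <stdint.h>

/* Hypothetical names for the CSR bits listed in the table above. */
enum {
    PPG_CSR_RUN          = 1u << 0,   /* write: start; read: 1 = running  */
    PPG_CSR_EXT_CLK_TGL  = 1u << 1,   /* toggle external/internal clock   */
    PPG_CSR_EXT_START    = 1u << 2,   /* arm for external trigger         */
    PPG_CSR_RESET        = 1u << 3,   /* PPG reset, must be cleared again */
    PPG_CSR_TEST_MODE    = 1u << 4,   /* test mode                        */
    PPG_CSR_EXT_CLK_SEL  = 1u << 16,  /* read-only: external clock in use */
    PPG_CSR_EXT_CLK_GOOD = 1u << 17   /* read-only: external clock good   */
};

/* Returns 1 if a CSR readback shows the program still running. */
int ppg_csr_is_running(uint32_t csr)
{
    return (csr & PPG_CSR_RUN) != 0;
}

/* Returns 1 if a CSR readback shows a good external clock. */
int ppg_csr_ext_clk_good(uint32_t csr)
{
    return (csr & PPG_CSR_EXT_CLK_GOOD) != 0;
}
```

With these names the reset sequence reads as a write of PPG_CSR_RESET followed by a write of 0.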

Clock Control Register (+0x30)

This register reprograms the on-board PLL to scale an external input clock to 100 MHz. It is only needed when operating with an external clock source other than the default 20 MHz.

The PLL relationship is:

VCO frequency = Fin × M / N
Clock output = VCO / C0

Operating limits: Fin: 5–472 MHz; FVCO: 600–1300 MHz; lock time < 1 ms.

Default configuration for a 20 MHz input: M = 30, N = 1 (bypassed), C0 = 6 → 100 MHz output.
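
A candidate set of counter values can be checked against these relations before it is written to the board. The following is a minimal sketch with a hypothetical function name and result type; it applies only the two formulas and the Fin and FVCO limits quoted above, and does not model any other PLL constraint.

```c
#include <stdbool.h>

/* Hypothetical result type for the check below. */
typedef struct {
    double fvco_mhz;   /* VCO frequency = Fin * M / N        */
    double fout_mhz;   /* clock output  = VCO / C0           */
    bool   in_range;   /* Fin and VCO within the quoted limits */
} ppg_pll_result;

/* Hypothetical helper: evaluate a PLL setting (all frequencies in MHz). */
ppg_pll_result ppg_pll_calc(double fin_mhz, unsigned m, unsigned n, unsigned c0)
{
    ppg_pll_result r = {0.0, 0.0, false};
    if (n == 0 || c0 == 0)
        return r;                       /* avoid division by zero */
    r.fvco_mhz = fin_mhz * m / n;
    r.fout_mhz = r.fvco_mhz / c0;
    r.in_range = fin_mhz >= 5.0 && fin_mhz <= 472.0 &&
                 r.fvco_mhz >= 600.0 && r.fvco_mhz <= 1300.0;
    return r;
}
```

With the default values (Fin = 20, M = 30, N = 1, C0 = 6) this gives a 600 MHz VCO and a 100 MHz output. A 100 MHz input with N left at 1 puts the VCO at 3000 MHz, outside the limits, which is why N has to be changed for that input.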

Bit layout:

Bits Description
30–28 Phase counter select (0 = all; 1 = M; 2–6 = Clock 0–4)
26–24 Counter parameter (0 = HighCount; 1 = LowCount; 4 = Bypass; 5 = Mode)
23–20 Counter type (0 = N; 1 = M; 2 = Cp/LF; 3 = VCO; 4–8 = Clock 0–4)
16–8 9-bit parameter data
5 PLL Reset
4 Up/Down (1 = up; 0 = down)
3 PhaseStep
2 Write parameter
1 Reconfigure
0 Control trigger (toggle to apply changes)

Example: 100 MHz external frequency divide-down:

vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x00000305
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x01000205
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x05000105
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x04000005
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x3
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0

Example: return to 10 MHz internal frequency:

vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x04000105
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x3
vme_poke -a VME_A32UD -A 0x00100030 -d VME_D32 0x0
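
The parameter words in the two examples above decompose according to the bit layout table. The sketch below rebuilds them from their fields; the function name is hypothetical, and setting both the "Write parameter" bit (bit 2) and the control trigger (bit 0) in each word is inferred from the low byte 0x05 of the example values, not from separate documentation.

```c
#include <stdint.h>

/* Hypothetical helper: build a Clock Control parameter-write word.
 *   counter_type: bits 23-20 (0 = N, 1 = M, 4-8 = Clock 0-4, ...)
 *   param:        bits 26-24 (0 = HighCount, 1 = LowCount, 4 = Bypass, 5 = Mode)
 *   data:         bits 16-8  (9-bit parameter value)
 * Bits 2 and 0 are set to match the example writes on this page. */
uint32_t ppg_clock_ctrl_word(unsigned counter_type, unsigned param, unsigned data)
{
    return ((uint32_t)(param & 0x7u) << 24) |
           ((uint32_t)(counter_type & 0xFu) << 20) |
           ((uint32_t)(data & 0x1FFu) << 8) |
           (1u << 2) |    /* write parameter */
           (1u << 0);     /* control trigger */
}
```

For example, ppg_clock_ctrl_word(0, 0, 3) reproduces 0x00000305 (N counter, high count 3), and ppg_clock_ctrl_word(0, 4, 1) reproduces 0x04000105 (N counter bypassed).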

Instruction Set

Instruction Format

Every PPG program is a sequence of 128-bit instructions stored in program memory. Each instruction is written to the board as four consecutive 32-bit register writes (to Inst_Lo, Inst_Med, Inst_Hi, Inst_Top). Writing Inst_Top commits the instruction to the currently selected slot.

Bits Register Name Description
0–31 Inst_Lo SET mask One bit per output channel (bit 0 = channel 1 … bit 31 = channel 32). A 1 drives that channel HIGH.
32–63 Inst_Med CLEAR mask Same bit-to-channel mapping. A 1 drives that channel LOW.
64–95 Inst_Hi Delay count Number of additional 10 ns clock cycles to hold this state. Total dwell = (3 + delay_count) × 10 ns.
96–115 Inst_Top bits 0–19 Data 20-bit payload — loop count or branch/call address, depending on instruction type.
116–118 Inst_Top bits 20–22 Type 3-bit instruction opcode (see Instruction Types).
119–127 Inst_Top bits 23–31 Ignored.

SET and CLEAR masks: these follow the output model described above — assign each channel to exactly one mask per instruction.

Note on the type field: the board's own register summary labels this field as a "4-bit instruction type" at bits 116–117, which is internally inconsistent (that range is only two bits). The actual opcode values listed below occupy bits 20–22 of Inst_Top (bits 116–118 of the full instruction), a 3-bit field. This page uses the values, which are unambiguous.

Instruction Types

The 3-bit opcode occupies bits 20–22 of the Inst_Top word (bits 116–118 of the full 128-bit instruction). The 20-bit data payload occupies bits 0–19 of Inst_Top.

Type Opcode Inst_Top value Description
Halt 0 0x000000 Stop execution. The program counter does not advance; CSR bit 0 goes low.
Continue 1 0x100000 Advance to the next instruction slot.
New Loop 2 0x200000 + N Begin a loop that repeats N times (maximum N = 1,048,575 = 0xFFFFF). Pushes the loop start address and count onto the stack.
End Loop 3 0x300000 Decrement the loop counter. If > 0, jump back to the start of the loop body (the instruction after the matching New Loop); otherwise fall through.
Call 4 0x400000 + addr Push the next instruction address onto the stack and jump to the 20-bit address in the data field.
Return 5 0x500000 Pop the return address from the stack and jump to it.
Branch 6 0x600000 + addr Unconditional jump to the 20-bit address in the data field (does not push a return address).
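
The Inst_Top values in this table all follow one pattern: the opcode shifted left by 20 bits, combined with a 20-bit payload. A minimal sketch of an encoder, using hypothetical enum and function names:

```c
#include <stdint.h>

/* Hypothetical names for the opcodes in the table above. */
enum ppg_opcode {
    PPG_HALT     = 0,
    PPG_CONTINUE = 1,
    PPG_NEW_LOOP = 2,
    PPG_END_LOOP = 3,
    PPG_CALL     = 4,
    PPG_RETURN   = 5,
    PPG_BRANCH   = 6
};

/* Hypothetical helper: build the Inst_Top word. The payload (loop count
 * or target address) is truncated to 20 bits, so callers should range-check
 * it first; a value above 0xFFFFF is silently masked here. */
uint32_t ppg_inst_top(enum ppg_opcode op, uint32_t data)
{
    return ((uint32_t)op << 20) | (data & 0xFFFFFu);
}
```

For example, ppg_inst_top(PPG_NEW_LOOP, 10) gives 0x20000A and ppg_inst_top(PPG_CALL, 10) gives 0x40000A.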

Timing

Every instruction occupies exactly 3 clock cycles of overhead regardless of type. The delay count field adds additional cycles on top of that:

dwell time = (3 + delay_count) × 10 ns

To produce a 280 ns pulse, write a delay count of 25: (3 + 25) × 10 = 280 ns.

The maximum delay count is 2^32 − 1 = 4,294,967,295, giving a maximum single-instruction dwell of approximately 42.9 seconds. Longer durations require a loop (see Looping for Long Durations).

Important: every instruction — including Halt, New Loop, End Loop, etc. — incurs the 3-cycle overhead. Loop and branch instructions with delay_count = 0 still consume 30 ns.
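
The dwell formula can be wrapped in a pair of conversion helpers so that durations are never hand-computed. This is a sketch with hypothetical function names; it assumes the 10 ns tick and 3-tick overhead stated above and rejects durations that a single instruction cannot represent.

```c
#include <stdint.h>

#define PPG_TICK_NS        10u
#define PPG_OVERHEAD_TICKS 3u

/* Hypothetical helper: delay count for an exact dwell in ns.
 * Returns 0 on success, or -1 if the dwell is not a multiple of 10 ns,
 * is shorter than 30 ns, or does not fit the 32-bit delay field. */
int ppg_delay_count(uint64_t dwell_ns, uint32_t *delay_count)
{
    if (dwell_ns % PPG_TICK_NS != 0)
        return -1;
    uint64_t ticks = dwell_ns / PPG_TICK_NS;
    if (ticks < PPG_OVERHEAD_TICKS)
        return -1;
    ticks -= PPG_OVERHEAD_TICKS;
    if (ticks > UINT32_MAX)
        return -1;
    *delay_count = (uint32_t)ticks;
    return 0;
}

/* Hypothetical helper: dwell in ns produced by a given delay count. */
uint64_t ppg_dwell_ns(uint32_t delay_count)
{
    return ((uint64_t)delay_count + PPG_OVERHEAD_TICKS) * PPG_TICK_NS;
}
```

A 280 ns request yields a delay count of 25, and a 20 ns request is rejected because it is below the 30 ns minimum.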


Programming Tutorial

This section walks through writing PPG programs from first principles, building up to a complete real-world example. All examples use the mvme_write_value VME library function:

mvme_write_value(vme, address, value);

where vme is an open VME handle, address is an absolute 32-bit VME address, and value is a 32-bit unsigned integer.

The examples assume a base address of BASE_ADDR = 0x00100000. Adjust for your hardware configuration.

The set_command Helper

All PPG programming is built from a single primitive: selecting an instruction slot and writing its four 32-bit fields. It is convenient to wrap this in a helper:

void set_command(int slot,
                 unsigned int set_mask,    // Inst_Lo:  channels to drive HIGH
                 unsigned int clr_mask,    // Inst_Med: channels to drive LOW
                 unsigned int delay,       // Inst_Hi:  extra 10 ns ticks
                 unsigned int type_data)   // Inst_Top: opcode + payload
{
    mvme_write_value(vme, BASE_ADDR + 0x08, slot);       // select slot
    mvme_write_value(vme, BASE_ADDR + 0x0C, set_mask);
    mvme_write_value(vme, BASE_ADDR + 0x10, clr_mask);
    mvme_write_value(vme, BASE_ADDR + 0x14, delay);
    mvme_write_value(vme, BASE_ADDR + 0x18, type_data);  // commits instruction
}

All higher-level programming is built from calls to set_command.

Step 1: Reset the Board

Before writing any program, reset the board to put it in a known state:

mvme_write_value(vme, BASE_ADDR + 0x00, 0x8);  // CSR bit 3: assert reset
mvme_write_value(vme, BASE_ADDR + 0x00, 0x0);  // clear reset → idle

Asserting reset clears the program counter and halts execution. The reset bit must then be cleared (write 0x0) or the board will not operate. After this the board sits idle in software-trigger mode, waiting for a program to be loaded and started. It will not fire on any trigger until you write 0x1 (software start) or 0x4 (external trigger arm) to the CSR.

Step 2: Write a Safety Halt at Slot 0

Always write a Halt to slot 0 before writing the rest of your program. If an external trigger edge arrives while you are still writing instructions, the PPG executes this Halt immediately rather than running a partially-written or stale program:

set_command(0,
            0x00000000,   // SET:   no channels raised
            0xFFFFFFFF,   // CLEAR: all channels driven low
            0,            // delay: 0 extra ticks → 30 ns dwell
            0x000000);    // type:  Halt

Step 3: Write Your Program

Write instructions to consecutive slots starting from slot 1. Each instruction specifies which outputs are high, which are low, how long to hold that state, and what to do next.

Example: generate a single 280 ns pulse on channel 1, then halt.

Channel 1 is bit 0 of the output masks. The sequence is:

  1. All outputs low for 30 ns (initial clear).
  2. Channel 1 high for 280 ns.
  3. All outputs low, halt.

// Slot 1: all outputs low for 30 ns (delay=0 → (3+0)×10 = 30 ns)
set_command(1, 0x00000000, 0xFFFFFFFF, 0, 0x100000);

// Slot 2: channel 1 (bit 0) high for 280 ns (delay=25 → (3+25)×10 = 280 ns)
set_command(2, 0x00000001, 0xFFFFFFFE, 25, 0x100000);

// Slot 3: all outputs low, halt
set_command(3, 0x00000000, 0xFFFFFFFF, 0, 0x000000);
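
Because channels are numbered from 1 while mask bits are numbered from 0, hand-written masks are easy to get off by one. The helper below is a sketch of the channel-to-bit rule stated above (channel 1 is bit 0); ppg_channel_mask is a hypothetical name, and returning 0 for an out-of-range channel is a choice made here for illustration.

```c
#include <stdint.h>

/* Mask for a 1-based output channel number (1..32).
   Returns 0 if the channel number is out of range. */
static uint32_t ppg_channel_mask(unsigned channel)
{
    if (channel < 1 || channel > 32)
        return 0;
    return (uint32_t)1 << (channel - 1);
}
```

The set mask for slot 2 above is then ppg_channel_mask(1), and the matching clear mask is its bitwise complement.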

Step 4: Arm and Start

Once all instructions are written, arm or start the board by writing to the CSR:

// Rewind program address to slot 0 before starting
mvme_write_value(vme, BASE_ADDR + 0x08, 0x0);

// Option A: software trigger — starts immediately
mvme_write_value(vme, BASE_ADDR + 0x00, 0x1);  // Run bit

// Option B: external trigger — waits for rising edge on NIM Input 4
mvme_write_value(vme, BASE_ADDR + 0x00, 0x4);  // Ext-Start bit

Step 5: Detecting Completion

Poll CSR bit 0. When it falls to 0 the program has halted and the next sequence can be loaded:

while (mvme_read_value(vme, BASE_ADDR + 0x00) & 0x1) {
    // still running — sleep or yield here
}
// Program has halted; safe to reload
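
The loop above never exits if the program does not halt, for instance because of a missing Halt instruction. One possible guard is to bound the number of polls. The sketch below is hardware-independent: the CSR read is passed in as a callback, so in real use the callback would wrap mvme_read_value(vme, BASE_ADDR + 0x00). The names ppg_wait_halt, ppg_read_csr_fn and fake_csr are hypothetical, and fake_csr only stands in for the board so the logic can be exercised without hardware.

```c
#include <stdint.h>

/* Callback that returns the current CSR value. */
typedef uint32_t (*ppg_read_csr_fn)(void *ctx);

/* Poll CSR bit 0 until it reads 0.
   Returns 0 once the program has halted, -1 if max_polls reads all
   showed the run bit still set. */
static int ppg_wait_halt(ppg_read_csr_fn read_csr, void *ctx,
                         unsigned long max_polls)
{
    for (unsigned long i = 0; i < max_polls; i++) {
        if ((read_csr(ctx) & 0x1u) == 0)
            return 0;
        /* a real frontend would sleep or yield here */
    }
    return -1;
}

/* Stand-in for the hardware: reports "running" for *ctx more reads,
   then reports "halted". */
static uint32_t fake_csr(void *ctx)
{
    unsigned *remaining = (unsigned *)ctx;
    if (*remaining == 0)
        return 0x0;
    (*remaining)--;
    return 0x1;
}
```

A poll limit is the simplest bound; a wall-clock timeout would serve the same purpose where a clock source is available.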

Timing Calculations

To produce a dwell of exactly T nanoseconds, the required delay count is:

delay_count = T / 10 − 3

For example, to hold an output high for exactly 500 ns:

delay_count = 500 / 10 − 3 = 47

The minimum possible dwell is 30 ns (delay_count = 0). It is impossible to produce a dwell shorter than 30 ns with any single instruction.

When two consecutive instructions must sum to an exact total time, remember that each instruction carries its own 30 ns overhead:

Ttotal = (3 + d1) × 10 + (3 + d2) × 10 = (6 + d1 + d2) × 10 ns
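
The formula delay_count = T / 10 − 3 only has a valid result when T is a multiple of 10 ns and at least 30 ns. A small checked conversion, sketched below, makes those conditions explicit. The function name ppg_delay_count and its error convention are hypothetical; the upper bound assumes the delay count is a 32-bit field, as the 32-bit Inst_Hi register suggests.

```c
#include <stdint.h>

#define PPG_TICK_NS        10u  /* one clock tick at 100 MHz */
#define PPG_OVERHEAD_TICKS 3u   /* fixed per-instruction overhead */

/* Convert a dwell time in nanoseconds to a delay count.
   Returns 0 and stores the count on success; returns -1 if the dwell is
   not a multiple of 10 ns, is shorter than 30 ns, or does not fit in
   32 bits. */
static int ppg_delay_count(uint64_t dwell_ns, uint32_t *delay_count)
{
    if (dwell_ns % PPG_TICK_NS != 0)
        return -1;
    uint64_t ticks = dwell_ns / PPG_TICK_NS;
    if (ticks < PPG_OVERHEAD_TICKS)
        return -1;
    uint64_t d = ticks - PPG_OVERHEAD_TICKS;
    if (d > UINT32_MAX)
        return -1;
    *delay_count = (uint32_t)d;
    return 0;
}
```

For the 500 ns example this stores 47; a request for 20 ns or 285 ns is rejected rather than silently rounded.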

Looping for Long Durations

A single instruction can dwell for at most about 42.9 seconds, because the delay count is a 32-bit number of 10 ns ticks (2^32 × 10 ns ≈ 42.95 s). For longer durations, split the period across multiple loop iterations. Each iteration holds for a fraction of the total time; the loop instruction repeats until the full duration is reached.

Strategy: choose a loop count N and compute the per-iteration delay count so that N iterations sum to the target duration. The example below uses N = 100:

double total_time_s = 30.0;
int N = 100;

// Each iteration should cover total_time_s / N seconds. Convert to 10 ns ticks.
// We ignore the fixed 3-cycle overhead of the body instruction and of End Loop
// (~60 ns per iteration in total) and the one-time New Loop (~30 ns): at second
// scale that error is < 0.001 %. For tight timing, subtract the overhead or
// measure on hardware.
unsigned int body_delay = (unsigned int)(total_time_s * 1e8 / N);

// Slot i:   New Loop, repeat N times  (0x200000 + N) — runs once
// Slot i+1: hold outputs for body_delay ticks, Continue — the loop body
// Slot i+2: End Loop — jumps back to the body until the count is exhausted
set_command(i,   0x0,              0x0,              0,          0x200000 + N);
set_command(i+1, my_output_mask,   ~my_output_mask,  body_delay, 0x100000);
set_command(i+2, 0x0,              0x0,              0,          0x300000);

New Loop initialises the counter once; End Loop runs once per iteration and jumps back to the loop body. Both are 3-cycle instructions, so the loop adds a fixed 30 ns plus 30 ns per iteration of overhead — negligible for periods of seconds or longer.

Loops may be nested, and combined with subroutine calls, up to the 256-entry stack depth: place another New Loop / End Loop pair (or a subroutine call) inside the body.
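
When the overhead does matter, it can be subtracted explicitly. The sketch below assumes a loop with a single body instruction, New Loop and End Loop both written with delay 0, and the 3-cycle cost per instruction described above, so one iteration lasts (3 + body_delay) + 3 ticks and New Loop adds 3 ticks once. The function name ppg_loop_body_delay is hypothetical, and the result is rounded down to whole ticks per iteration.

```c
#include <stdint.h>

/* Compute the body delay count for a loop of n iterations so that
   New Loop + n x (body + End Loop) lasts at most total_ticks (10 ns each).
   Returns 0 and stores the count on success, -1 if the request cannot
   be met. */
static int ppg_loop_body_delay(uint64_t total_ticks, uint32_t n,
                               uint32_t *body_delay)
{
    const uint64_t overhead = 3;  /* ticks per instruction */

    if (n == 0 || total_ticks < overhead)
        return -1;

    /* Ticks left for each iteration after the one-time New Loop. */
    uint64_t per_iter = (total_ticks - overhead) / n;

    /* Each iteration pays overhead for the body and for End Loop. */
    if (per_iter < 2 * overhead)
        return -1;
    uint64_t d = per_iter - 2 * overhead;
    if (d > UINT32_MAX)
        return -1;

    *body_delay = (uint32_t)d;
    return 0;
}
```

For a 30 s total (3,000,000,000 ticks) and n = 100 this gives a body delay of 29,999,993 ticks, 7 fewer than the uncorrected 30,000,000. Whether the hardware timing matches this model exactly is best confirmed on a scope.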

Subroutines

Use Call and Return instructions to share a block of instructions across multiple points in a program. The hardware call stack holds up to 256 return addresses.

// --- Main program ---
set_command(0, 0x0, 0xFFFFFFFF, 0,  0x000000);         // slot 0: safety Halt
set_command(1, 0x0, 0xFFFFFFFF, 10, 0x100000);         // slot 1: setup, Continue
set_command(2, 0x0, 0x0,        0,  0x400000 + 10);    // slot 2: Call subroutine at slot 10
set_command(3, 0x0, 0xFFFFFFFF, 0,  0x000000);         // slot 3: Halt (reached after Return)

// --- Subroutine at slot 10 ---
set_command(10, 0x00000001, 0xFFFFFFFE, 25, 0x100000); // pulse channel 1
set_command(11, 0x00000000, 0xFFFFFFFF, 25, 0x100000); // all low
set_command(12, 0x0,        0x0,        0,  0x500000); // Return

Complete Example: Timing Calibration Sequence

This is the timing calibration sequence used in the UCN sequencer frontend. It fires 10 pulses on output channel 29 (bit 28, mask 0x10000000) at a 0.2 s cadence so that downstream DAQ systems can establish an absolute time reference. It uses a software trigger.

The program layout:

Slot | Instruction                  | Purpose
0    | All LOW, Continue (190 ns)   | Blank at start of sequence
1    | New Loop ×10                 | Repeat the loop body 10 times
2    | CH29 HIGH, Continue (280 ns) | Rising-edge timing pulse
3    | All LOW, Continue (~0.2 s)   | Off time between pulses
4    | End Loop                     | Jump back to slot 2 nine more times
5    | All LOW, Halt                | End of sequence

// Reset and select software trigger
mvme_write_value(vme, BASE_ADDR + 0x00, 0x8);  // reset
mvme_write_value(vme, BASE_ADDR + 0x00, 0x0);  // clear reset (software trigger mode)

// Slot 0: blank — all outputs low, 190 ns dwell (delay=16 → (3+16)×10=190 ns)
set_command(0, 0x00000000, 0xFFFFFFFF, 16, 0x100000);

// Slot 1: New Loop, repeat 10 times
set_command(1, 0x0, 0x0, 0x0, 0x20000A);  // 0x200000 + 10

// Slot 2: channel 29 (bit 28) HIGH, 280 ns pulse (delay=25 → (3+25)×10=280 ns)
set_command(2, 0x10000000, 0xEFFFFFFF, 25, 0x100000);

// Slot 3: hold all LOW for the rest of the 0.2 s period.
// 0.2 s = 2e7 ticks of 10 ns. Subtract the 28 ticks used by the slot-2 pulse
// ((3+25)=28). The few ticks of loop/instruction overhead are negligible at
// this scale, so we approximate:
unsigned int off_time = (unsigned int)(0.2 * 1e8) - 28;
set_command(3, 0x00000000, 0xFFFFFFFF, off_time, 0x100000);

// Slot 4: End Loop
set_command(4, 0x0, 0x0, 0x0, 0x300000);

// Slot 5: all LOW, Halt
set_command(5, 0x00000000, 0xFFFFFFFF, 0x1, 0x000000);

// Rewind address pointer and start
mvme_write_value(vme, BASE_ADDR + 0x08, 0x0);  // address = slot 0
mvme_write_value(vme, BASE_ADDR + 0x00, 0x1);  // CSR: software start

// Poll for completion (~2 seconds)
while (mvme_read_value(vme, BASE_ADDR + 0x00) & 0x1) { /* wait */ }