.. vim: syntax=rst

Detailed Design Description
===========================

PST Beamformer Design
---------------------

Forming Pulsar Timing (PST) beams is one of the primary functions of the LOW Correlator and Beamformer product. Many requirements and interfaces relate to the formation of PST beams. The design that implements these functions is shown in Figure 1 and has two parts:

- In the first part, the SPS beams are steered to the subarray beam centre (3.6 kHz channeliser, delay correction to a common wavefront for all stations). Any processing needed to suppress RFI is implemented at the same time (see RD1; it is not discussed here). Delay correction is implemented as a delay by an integer number of SPS samples (coarse delay); to interpolate between samples, a separate fine delay is implemented as a phase slope across the band.

- Following this are 16 instances of the PST beamformer. Each is implemented as a phase-only beamformer, and the channelised bandwidth of 3.6 kHz has been chosen to allow this. Note that the PST requirements do not impose any requirement on filterbank channel width. To produce beams with high polarisation purity, each beamformer instance applies a Jones matrix to the data from each station before beamforming. To optimise the beam shape and sidelobe level, each station can be weighted (e.g. Gaussian weighting could be applied across the stations). At the output, the data is scaled to produce 16-bit data at a particular RMS level, and a quality metric is calculated that indicates how RFI has affected the beamforming. Finally, the resulting data is placed into UDP packets with timing information and metadata added.

PST Beamformer Firmware Implementation
--------------------------------------

The top level PST Beamformer Alveo personality is shown in Figure 1. The design contains multiple PST beamformer pipelines, each capable of processing a number of channels and stations.
For the Alveo U55C there are 3 pipelines (for the Alveo V80 the number is expected to be of the order of 8). Each pipeline receives identical input data, but each is independently configurable via the PCIe register interface. A data arbiter collects PST beam data from the pipelines in round-robin fashion and outputs it to PST via the P4 switch (the physical line rate to PST is set by the P4 switch, not the Alveo). The following paragraphs describe each block, working down through the firmware hierarchy.

.. image:: images/PST_firmware_overview.JPG

Figure 1. Highest level of the PST Beamformer firmware

Figure 2 shows the contents of each PST processing pipeline. Each pipeline can process an independent set of stations and channels. Each pipeline ingests SPS station data into an HBM memory buffer, which is processed by a number of filterbank instances. The “fine” channelised data is stored in HBM memory, then 16 beamformer instances (one for each beam) form beams, which are then packetised. Each pipeline has its own independent register set.

.. image:: images/PST_beamformer_pipeline.JPG

Figure 2. PST Beamformer pipeline firmware diagram

Figure 3 shows the SPS station ingest and the subsequent station and PST beam delay framework. Note that the SPS metadata contains the packet number (effectively time), which is used to write the station data to particular HBM memory locations. The virtual channel table determines what can be written to HBM memory for this particular pipeline. The size of each CT1 HBM buffer (for each pipeline and filterbank instance) is:

- Triple buffering
- 32 packets (each 2048 SPS samples)
- Dual-pol stations
- 2x8-bit complex data
- 1024 virtual channels
- ``3*32*2048*2*2*8*1024/8/1024/1024/1024`` = 0.75 GB

The maximum write data rate is 1024 virtual channels x 2 polarisations x 2x8 bits / 1080 ns = ~30 Gbps. With 3 pipelines per PST Alveo the overall data rate is ~90 Gbps.
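The CT1 buffer size and write rate quoted above can be checked with a short calculation. This is a minimal sketch: the constant and function names are illustrative only and are not taken from the firmware; the numeric values are those listed in the text.

```python
# Illustrative names; values come from the CT1 buffer description in the text.
BUFFERS = 3                    # triple buffering
PACKETS_PER_FRAME = 32         # SPS packets per corner turn frame
SAMPLES_PER_PACKET = 2048      # SPS samples per packet
POLARISATIONS = 2              # dual-pol stations
BITS_PER_SAMPLE = 2 * 8        # 8-bit real + 8-bit imaginary
VIRTUAL_CHANNELS = 1024
SPS_SAMPLE_PERIOD_S = 1080e-9  # 1080 ns per SPS sample
PIPELINES = 3                  # pipelines per PST Alveo


def ct1_buffer_bytes() -> int:
    """Size of one CT1 HBM buffer in bytes."""
    bits = (BUFFERS * PACKETS_PER_FRAME * SAMPLES_PER_PACKET
            * POLARISATIONS * BITS_PER_SAMPLE * VIRTUAL_CHANNELS)
    return bits // 8


def ct1_write_rate_bps() -> float:
    """Maximum write rate of one pipeline in bits per second."""
    bits_per_sps_sample = VIRTUAL_CHANNELS * POLARISATIONS * BITS_PER_SAMPLE
    return bits_per_sps_sample / SPS_SAMPLE_PERIOD_S


if __name__ == "__main__":
    print(f"CT1 buffer: {ct1_buffer_bytes() / 2**30:.2f} GiB")
    print(f"Write rate per pipeline: {ct1_write_rate_bps() / 1e9:.1f} Gbps")
    print(f"Write rate per Alveo: {PIPELINES * ct1_write_rate_bps() / 1e9:.1f} Gbps")
```

Note that the 0.75 figure is in units of 1024^3 bytes, matching the three divisions by 1024 in the expression above.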
Given that the SPS metadata controls the timing of the HBM memory writes, it is logical that it also controls the HBM reads. When reading HBM data, the coarse delay is implemented as a read address offset. Delay polynomials are evaluated in double-precision floating point for the first output time sample in the corner turn frame. The station beam polynomial is evaluated for every filterbank output time sample and split into two parts: whole SPS time samples and the residual. Coarse delays are fixed across a corner turn frame of 71 ms (= 32 packets), noting that the subsequent PST filterbank processes SPS data in bursts of 32 SPS packets. The residual delay is converted into phase at the sky frequency and applied as a phase offset plus a phase slope across the station channel. Delay polynomials for stations and PST beams can have independent start and validity times. When either delay polynomial is no longer valid, the output beams are flagged. Each delay stage has a different update rate:

- Whole samples are updated every 71 ms, which is the duration of 32 SPS packets
- Station beam and PST beamformer phase shifts are updated at the PST sample rate of ``781250/216*4/3`` = 4822 Hz, i.e. every 207 µs

.. image:: images/SPS_ingest_and_CT1.JPG

Figure 3. PST Beamformer data ingest firmware drawing

Figure 4 illustrates the filterbank signal processing chain for the PST beamformer. The data going into the filterbank is dual-polarisation data for the same station. The PST filterbank parameters are:

- 256 channels (of which ``256*27/32`` = 216 are useful)
- 12 taps
- Outputs are 4/3 oversampled
- Bandwidth is ``781250/216*(4/3)`` = 4.8 kHz oversampled, with a passband of ~3.6 kHz
- Output period is 207.36 µs

As the filterbank data comes out, it is phase shifted to complete the station delay. The data is then checked for RFI and, if RFI is present, blanked and marked as RFI. If one polarisation has RFI then both polarisations are marked as RFI.
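The two-stage delay handling described in this section (whole SPS samples applied as a read offset, with the residual completed as a phase shift on the filterbank output) can be sketched as follows. This is an illustration under stated assumptions, not the firmware implementation: the function names are hypothetical, and both rounding to the nearest sample and the sign of the phase are assumptions, since the text does not specify them.

```python
import cmath
import math
from typing import List, Tuple

# Value from the text: one SPS sample lasts 1080 ns.
SPS_SAMPLE_PERIOD_S = 1080e-9


def split_delay(delay_s: float) -> Tuple[int, float]:
    """Split a delay into whole SPS samples and a residual in seconds.

    Rounding to the nearest sample is an assumption; the text only says
    the delay is split into whole samples and a residual.
    """
    coarse_samples = round(delay_s / SPS_SAMPLE_PERIOD_S)
    residual_s = delay_s - coarse_samples * SPS_SAMPLE_PERIOD_S
    return coarse_samples, residual_s


def residual_phasors(residual_s: float, sky_freqs_hz: List[float]) -> List[complex]:
    """Unit phasors, one per fine channel, that apply the residual delay.

    The phase is 2*pi*f*tau at each sky frequency f, which amounts to a
    phase offset plus a linear slope across the station channel.
    The sign convention here is an assumption.
    """
    return [cmath.exp(2j * math.pi * f * residual_s) for f in sky_freqs_hz]


if __name__ == "__main__":
    coarse, residual = split_delay(10.25 * SPS_SAMPLE_PERIOD_S)
    print(coarse, residual)  # whole samples and leftover delay in seconds
    # Two adjacent fine channels; 781250/216 Hz spacing as in the text.
    freqs = [200e6, 200e6 + 781250 / 216]
    print(residual_phasors(residual, freqs))
```

In this sketch the coarse part would become the HBM read address offset and the phasors would multiply the filterbank output of each fine channel.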
To help the RFI algorithms set appropriate levels, a running power average of all channels is produced for internal use only. The filterbank data is then stored in an HBM buffer of duration 71 ms. This buffer holds all data ready for processing in the following beamformer stage. The size of the CT2 HBM buffer is:

- Double buffered
- 1024 virtual channels
- 216 PST channels
- Dual polarisation
- 2x8-bit complex data
- 341.3 samples
- ``2*1024*216*2*2*8*341/8/1024/1024/1024`` = 0.56 GB

.. image:: images/filterbank.JPG

Figure 4. PST beamformer Filterbank and RFI processing firmware drawing

The filterbank coefficients are provided by the PST product and are stored in a signed 18-bit format in a read-only memory. A 12-tap filterbank design with 256 channels results in a total of 3072 taps.

.. image:: images/CT2.JPG

Figure 5. Cornerturn 2 and delay polynomial evaluation drawing.

Figure 6 illustrates one of the 16 PST beamformer instances (one for each beam). The 16 beamformer instances are daisy-chained together, as each beamformer uses the same input data. Each instance creates one PST beam for one subarray of channels and stations. The input data is 8-bit dual-polarisation complex SPS data. The first operation of the PST beamformer is a 2x2 Jones correction matrix. Note that the antenna weights are combined with the Jones matrix (a zero weight results in that station's data not contributing to the beam). The beamformer weights are applied as a phase shift and then all stations in the subarray are summed. The packetiser converts the beam data to 16-bit integers. The beam data is scaled so that the standard deviation of the data values lies in the range 260 to 460. The quality of the beam data (called the relative weight) is calculated from the amount of RFI and the station weights.

.. image:: images/beamformer.JPG

Figure 6. PST Beamformer processing firmware drawing

The beam data is then packetised, as per the diagram shown in Figure 7.
There are three parts to the PSR packet: the UDP metadata, the PST beam metadata and the actual beam data. The payload of the packet is as follows:

- 32 time samples/packet
- Corner turn frame is 27 SPS packets
- 192 SPS samples per PST filterbank output (4/3 oversampled)
- ``27 * 2048 / 192`` = 288 time samples per corner turn frame
- 216 fine channels per filterbank output
- 24 fine channels/packet
- 9 packets per coarse channel
- 2 bytes per Stokes value
- 32 x 24 x (2 bytes) x (4 Stokes) = 6144 bytes of data per packet

.. image:: images/PSR_packetiser.JPG

Figure 7. PST Packetiser drawing
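The payload figures listed above follow from a few multiplications and divisions, reproduced in the sketch below. The constant and function names are illustrative only; the values are those given in the list.

```python
# Illustrative names; values come from the packet payload list in the text.
TIME_SAMPLES_PER_PACKET = 32
FINE_CHANNELS_PER_PACKET = 24
FINE_CHANNELS_PER_COARSE = 216
BYTES_PER_VALUE = 2            # 16-bit values
VALUES_PER_SAMPLE = 4          # the four values listed per sample in the text
SPS_PACKETS_PER_FRAME = 27     # corner turn frame length quoted for the packetiser
SPS_SAMPLES_PER_PACKET = 2048
SPS_SAMPLES_PER_OUTPUT = 192   # SPS samples per filterbank output sample


def payload_bytes() -> int:
    """Beam data bytes carried in one packet."""
    return (TIME_SAMPLES_PER_PACKET * FINE_CHANNELS_PER_PACKET
            * BYTES_PER_VALUE * VALUES_PER_SAMPLE)


def packets_per_coarse_channel() -> int:
    """Packets needed to cover all fine channels of one coarse channel."""
    return FINE_CHANNELS_PER_COARSE // FINE_CHANNELS_PER_PACKET


def time_samples_per_frame() -> int:
    """Filterbank output samples in one corner turn frame."""
    return SPS_PACKETS_PER_FRAME * SPS_SAMPLES_PER_PACKET // SPS_SAMPLES_PER_OUTPUT


if __name__ == "__main__":
    print(payload_bytes())
    print(packets_per_coarse_channel())
    print(time_samples_per_frame())
```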