SKA PST DSP Architecture
The Digital Signal Processing (DSP) component of the Pulsar Timing (PST) product processes the channelised voltages in the tied-array beam data generated by the Correlator Beam Former (CBF). It produces reduced data products that are transmitted to the Science Data Processor (SDP) for subsequent secondary analysis.
The PST DSP component is configured, controlled, and monitored via a gRPC interface with PST LMC.
DSP will consist of signal processing functionality that will be released incrementally throughout the array release schedule, delivered in the following order:
AA0.5 Disk Voltage Recorder
AA1 Dynamic Spectrum Mode (flow-through)
AA1 Dynamic Spectrum Mode (limited channelisation)
AA2 Pulsar Timing Mode (limited channelisation)
AA3 Dynamic Spectrum and Pulsar Timing Modes (complete)
AA0.5 Voltage Recorder
The AA0.5 release of DSP, also referred to as DSP.DISK, will simply read tied-array voltage timeseries from the Shared Memory Ring Buffer (SMRB) and write them to a set of files for persistent storage. The meta-data that describe the polarisation vectors are read from the header block of the data and weights ring buffers. The Data and Weights streams are read from separate ring buffers and written to a sequence of separate files.
Decomposition
The voltage recorder (DSP AA0.5) consists of a monitoring and control module (DSP.MGMT) that interacts with the MGMT component and controls the other sub-components of the DSP. DSP.MGMT controls instances of DataBlockManager, one for each sub-band. Each DataBlockManager contains DataBlock instances for the Data and Weights streams. Each DataBlock contains ring buffers for the Header (meta-data) and Data (time-series). Each of these ring buffers has a configurable number of elements and element size.
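The containment hierarchy described above can be pictured with a few Python dataclasses. This is only an illustrative sketch: the class names mirror the component names in the text, but the attribute names (`nbufs`, `bufsz`, `subband`) are hypothetical, and the sizes are placeholder values chosen for illustration. It is not the DSP.DISK implementation.

```python
from dataclasses import dataclass


@dataclass
class RingBuffer:
    """A ring buffer with a configurable number of elements and element size."""
    nbufs: int  # number of elements (hypothetical attribute name)
    bufsz: int  # size of each element in bytes (hypothetical attribute name)

    @property
    def total_bytes(self) -> int:
        return self.nbufs * self.bufsz


@dataclass
class DataBlock:
    """One stream (Data or Weights): a Header ring buffer and a Data ring buffer."""
    header: RingBuffer
    data: RingBuffer


@dataclass
class DataBlockManager:
    """Per-sub-band container holding the Data and Weights DataBlocks."""
    subband: int
    data: DataBlock
    weights: DataBlock


# Placeholder sizes, for illustration only.
manager = DataBlockManager(
    subband=1,
    data=DataBlock(RingBuffer(8, 4096), RingBuffer(8, 11059200)),
    weights=DataBlock(RingBuffer(8, 4096), RingBuffer(8, 93600)),
)
print(manager.data.data.total_bytes)
```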
Data Product Structure
The data and weights streams of each scan recorded by DSP.DISK are each written to a sequence of files. The data file sequence is written to a data subdirectory and the weights file sequence to a weights subdirectory. DSP.DISK splits each stream into contiguous files that each contain approximately 10 seconds of data. The file naming convention for each stream will adhere to:
<TIMESTAMP>_<BYTE_OFFSET>.<FILE_NUMBER>.dada
Where:
TIMESTAMP is the UTC timestamp of the first integer second of the observation, written in the format YYYY-MM-DD-HH:MM:SS.
BYTE_OFFSET is the byte offset of the first sample in the file from the first byte in the observation.
FILE_NUMBER is the index of the file in the sequence of files written.
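As a minimal sketch of how a reader might take such a file name apart, the following Python function splits a name into its three fields. The function name `parse_dada_filename` and the sample file name are hypothetical. The source does not specify any zero-padding of BYTE_OFFSET or FILE_NUMBER, so the pattern accepts any run of digits.

```python
import re
from datetime import datetime, timezone

# <TIMESTAMP>_<BYTE_OFFSET>.<FILE_NUMBER>.dada
_DADA_NAME = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}-\d{2}:\d{2}:\d{2})"
    r"_(?P<byte_offset>\d+)\.(?P<file_number>\d+)\.dada$"
)


def parse_dada_filename(name: str):
    """Return (utc_timestamp, byte_offset, file_number) for a file name."""
    match = _DADA_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised file name: {name!r}")
    timestamp = datetime.strptime(
        match["timestamp"], "%Y-%m-%d-%H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    return timestamp, int(match["byte_offset"]), int(match["file_number"])


# Hypothetical file name, for illustration only.
print(parse_dada_filename("2023-03-15-03:41:29_110592.1.dada"))
```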
File Structure
The files written by DSP.DISK for the data and weights streams use the same file structure, which follows the PSRDADA file format and is therefore compatible with existing Pulsar Timing software such as DSPSR. The file structure is simple: an ASCII header block, typically 4096 bytes, followed by a raw data block.
Header Block
The 4096-byte header block is a list of key/value pairs stored as ASCII characters. Pairs are delimited by newline characters, and within each pair the key and value are separated by one or more whitespace characters. The ASCII header is padded with null characters from the final valid header character to the final byte of the header block.
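A header block laid out this way can be parsed in a few lines. The sketch below is illustrative only: the function name `parse_header` is hypothetical, values are left as strings, and lines without both a key and a value are ignored (an assumption, since the source does not say how such lines should be treated).

```python
def parse_header(block: bytes) -> dict:
    """Parse a null-padded ASCII header block into a dict of strings."""
    # Everything from the first null byte onwards is padding.
    text = block.split(b"\0", 1)[0].decode("ascii")
    header = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split key from value on whitespace
        if len(parts) == 2:
            key, value = parts
            header[key] = value.strip()
    return header


# Build a small example header padded to 4096 bytes.
raw = b"HDR_SIZE 4096\nNBIT     16\nSOURCE J0437-4715\n".ljust(4096, b"\0")
header = parse_header(raw)
print(header["NBIT"], header["SOURCE"])
```

A reader would typically read HDR_SIZE bytes from the start of the file, parse them as above, and treat the remainder of the file as the raw data block.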
Header Block Fields
Key | Description | Units | Examples |
---|---|---|---|
HDR_SIZE | Size of the header | bytes | 4096 |
HDR_VERSION | Version of the header | | 1.0 |
TELESCOPE | Name of the telescope observatory | | SKALow,SKAMid |
RECEIVER | Name of the receiver used | | LFAA,SPFRX |
INSTRUMENT | TBD | | LowCBF |
NBIT | Number of bits per real or imag sample | | 8,16 |
NANT | Number of antenna present | | 1 |
NPOL | Number of polarisations present | | 2 |
NDIM | Number of dimensions, 1=Real 2=Complex | | 1,2 |
TSAMP | Sampling interval | microseconds | 207.36 |
BAND | Name of the observing band | | Low,Band1,Band2 |
WT_NSAMP | Number of samples per relative weight | | 32 |
UDP_NSAMP | Number of samples per UDP packet | | 32 |
UDP_NCHAN | Number of channels per UDP packet | | 24 |
UDP_FORMAT | Name of the UDP format | | LowPST,MidBand1,…,MidBand5 |
OS_FACTOR | Over-sampling ratio of data | | 4/3,8/7 |
NCHAN | Number of channels | | 432 |
FREQ | Centre frequency all channels | MHz | 51.19357639 |
BW | Bandwidth of all channels | MHz | 1.562499999936 |
START_CHANNEL | First absolute channel number | | 0 |
END_CHANNEL | Final absolute channel number | | 432 |
RESOLUTION | Minimum coherent block of data | bytes | 110592 |
BYTES_PER_SECOND | Data rate | bytes/sec | 16666666 |
DATA_HOST | Network address at which data is recorded | IPv4 | 127.0.0.1 |
DATA_PORT | Network UDP port at which data is recorded | | 9510 |
LOCAL_HOST | Local address at which data is recorded | IPv4 | 127.0.0.1 |
CALFREQ | Frequency of pulsed noise diode - unused | Hz | 11.1111111111 |
OBS_OFFSET | Offset of UTC_START of first sample | bytes | 0 |
SOURCE | Name of astronomical source | | J0437-4715 |
DATA_KEY * | TBD | | a000 |
WEIGHTS_KEY * | TBD | | a010 |
NUMA_NODE | TBD | | 0 |
HB_NBUFS * | TBD | | 8 |
HB_BUFSZ * | TBD | bytes | 4096 |
DB_NBUFS * | TBD | | 8 |
DB_BUFSZ * | TBD | bytes | 11059200 |
WB_NBUFS * | TBD | | 8 |
WB_BUFSZ * | TBD | bytes | 93600 |
SCAN_ID | Unique scan identifier provided by TM | | 123234325345 |
BEAM_ID | Beam number | | 1-16 |
DATA_GENERATOR | Name of test vector that is present | | Sine,Random |
SCANLEN_MAX | Length of the entire scan | seconds | 10 |
SINUSOID_FREQ | Frequency of Sine test vector, if present | MHz | 51.3 |
UTC_START | Integer second timestamp of first sample | UTC | 2023-03-15-03:41:29 |
PICOSECONDS | Offset from UTC_START of first sample | picoseconds | 0 |
FILE_NUMBER | File number in the file sequence | | 0 |
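Several of the example values in the table appear to be related by simple arithmetic, which the short Python check below works through. The relationships used here are inferred from the example values rather than stated by the source: RESOLUTION is assumed to be the size of one block spanning all channels for UDP_NSAMP samples, and BYTES_PER_SECOND is assumed to follow from TSAMP. The variable names are illustrative.

```python
# Example values from the header table (Low configuration).
nchan, npol, ndim, nbit = 432, 2, 2, 16
udp_nsamp = 32
tsamp_us = 207.36

# Bytes needed for one time sample across all channels and polarisations.
bytes_per_time_sample = nchan * npol * ndim * nbit // 8

# Assumed meaning of RESOLUTION: all channels for udp_nsamp samples.
block_bytes = bytes_per_time_sample * udp_nsamp

# Assumed meaning of BYTES_PER_SECOND: one time sample every TSAMP.
bytes_per_second = bytes_per_time_sample / (tsamp_us * 1e-6)

print(block_bytes)            # 110592, equal to the RESOLUTION example
print(int(bytes_per_second))  # 16666666, equal to the BYTES_PER_SECOND example
print(10 * bytes_per_second / 1e6, "MB in a file of about 10 seconds")
```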
Data Block
The data block segment of a PSRDADA file does not imply any specific sample ordering; it is merely a data container. A data reader must read the key/value pairs from the header block and, if it understands the meta-data, interpret the data block accordingly. The data packing format used by DSP.DISK is defined by the UDP packet sequencing and structure in the CBF to PST data interface.
Common:
npol = 2
ndim = 2
Low CBF / PST:
nchan_per_packet = 24
nsamp_per_packet = 32
nbit = 16
Mid CBF / PST:
nchan_per_packet = 185
nsamp_per_packet = 4
nbit = 8 or 16
Interpretation Algorithm:
# input: the data block, indexed in units of nbit-wide values
# idx: running index of the next value in the data block
idx = 0
heap_bytes = nchan * nsamp_per_packet * npol * ndim * nbit / 8
nheap = data_bytes / heap_bytes
packets_per_heap = nchan / nchan_per_packet
for heap in range(nheap):
    for packet in range(packets_per_heap):
        for ipol in range(npol):
            for ichan in range(nchan_per_packet):
                channel = packet * nchan_per_packet + ichan
                for isamp in range(nsamp_per_packet):
                    sample = heap * nsamp_per_packet + isamp
                    for idim in range(ndim):
                        value = float(input[idx]) / scale_factor
                        idx += 1
Note that scale_factor is not defined in the algorithm above. When unpacking the 8- or 16-bit data samples, the scale_factor needed to correctly denormalise the quantised samples in the Data Block is retrieved from the Weights Block.
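The nested loop order of the interpretation algorithm implies a closed-form position for any value in the data block. The Python sketch below computes that position, counted in values rather than bytes. The function name `sample_index` and its signature are hypothetical, and the defaults are the Low CBF / PST parameters listed above.

```python
def sample_index(channel, sample, ipol, idim, *, nchan,
                 npol=2, ndim=2, nchan_per_packet=24, nsamp_per_packet=32):
    """Index (in values, not bytes) of one value in the data block.

    Follows the loop order heap > packet > pol > channel > sample > dim.
    """
    heap, isamp = divmod(sample, nsamp_per_packet)
    packet, ichan = divmod(channel, nchan_per_packet)
    packets_per_heap = nchan // nchan_per_packet

    idx = heap
    idx = idx * packets_per_heap + packet
    idx = idx * npol + ipol
    idx = idx * nchan_per_packet + ichan
    idx = idx * nsamp_per_packet + isamp
    idx = idx * ndim + idim
    return idx


# Second polarisation of channel 0, sample 0: one polarisation's worth
# of values (24 channels * 32 samples * 2 dims) into the first packet.
print(sample_index(0, 0, 1, 0, nchan=432))  # 1536
```

Multiplying the result by nbit / 8 gives the byte offset of the value within the data block.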
Weights Block
The Weights Block contains two vectors that are extracted from the meta-data in the CBF/PSR data stream:
Block of relative weights: these describe the per-antenna RFI mitigation that was performed by the beam-former during signal processing.
Scale factors: these are floating point values by which the quantised data must be rescaled prior to subsequent signal processing.
To faithfully denormalise the samples in the Data Block, only the per-packet scale factor is required. For more information on the structure of the relative weights, refer to the CBF to PSR Interface Control Document. In summary, each UDP packet carries a 16-bit relative weight for each channel in the packet and a single 32-bit floating point scale factor. These are stored sequentially in the Weights Block for each packet, with the scale factor preceding the weights, as the iteration algorithm below shows.
Common:
weights_nbit = 16
packet_weights_size = nchan_per_packet * weights_nbit / 8
packet_scales_size = 32 / 8
combined_size = packet_weights_size + packet_scales_size
Iteration Algorithm:
# pointer to weights data block
char * weights_ptr
for heap in range(nheap):
    for packet in range(packets_per_heap):
        scales = weights_ptr
        weights = weights_ptr + packet_scales_size
        weights_ptr += combined_size
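The iteration above can be sketched in Python with the standard `struct` module. The function name `iter_packet_weights` is hypothetical. The byte order (little-endian) and the treatment of each 16-bit weight as an unsigned integer are assumptions, since the source defers those details to the Interface Control Document. The ordering (scale factor first, then the weights) follows the iteration algorithm.

```python
import struct


def iter_packet_weights(block: bytes, nchan_per_packet: int = 24):
    """Yield (scale_factor, weights) for each packet in a weights block.

    Assumes little-endian data: one 32-bit float scale factor followed by
    nchan_per_packet unsigned 16-bit relative weights.
    """
    record = struct.Struct(f"<f{nchan_per_packet}H")  # 4 + 2 * nchan bytes
    for offset in range(0, len(block) - record.size + 1, record.size):
        scale, *weights = record.unpack_from(block, offset)
        yield scale, weights


# Two synthetic packets, for illustration only.
block = struct.pack("<f24H", 2.0, *range(24)) + struct.pack("<f24H", 0.5, *([1] * 24))
for scale, weights in iter_packet_weights(block):
    print(scale, weights[:4])
```

With the Low CBF / PST parameters, each record is 52 bytes (a 4-byte scale factor plus 24 two-byte weights), matching combined_size above.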