PST metadata mapping

The following sections record how the scan configuration and scan request map to the header keys in the output DADA files during PST digital signal processing. They also cover the mapping to the metadata keys.

General mapping

This section covers the mapping from the scan configuration to the DADA header keys, and the direct mappings to the metadata. Where a mapping involves a simple conversion or calculation, the details are in the Notes column. The Mode column is for the processing mode:

  • VR - Voltage recorder

  • FT - Flow through

  • DF - Detected filterbank (currently not supported)

  • PT - Pulsar timing (currently not supported)

  • All - all modes

| Config Key / Default | DADA Header Key | Metadata Key | Notes |
| --- | --- | --- | --- |
| eb_id | EB_ID | execution_block | |
| frequency_band | UDP_FORMAT | | Frequency band maps to UDP format: "low" -> "PstLow"; for mid, "5a" and "5b" map to "MidPSTBand5", and for bands 1 to 4 the value is "MidPSTBandX", where X is the band number |
| scan_id | SCAN_ID | obscore/obs_id | Scan ID comes from the scan, not the scan configuration |
| receiver_id | FRONTEND | | |
| receptors | ANTENNAE | | Input is an array of values converted to string and joined by a "," |
| receptor_weights | ANT_WEIGHTS | | Input is an array of values converted to string and joined by a "," |
| timing_beam_id | BEAM_ID | | If timing_beam_id is not set it is the BEAM id of the TANGO device |
| observer_id | OBSERVER | context/observer | |
| project_id | PROJID | | |
| subarray_id | SUBARRAY_ID | | |
| source | SOURCE | obscore/target_name | |
| delay_centre | DELAY_CENTRE | | The input is an array but the header value is a string joined by a "," |
| target/target_name | SOURCE | obscore/target_name | |
| target/attrs/crd1 | STT_CRD1 | obscore/s_ra | This is in deg |
| target/attrs/crd2 | STT_CRD2 | obscore/s_dec | This is in deg |
| target/attrs/epoch | EQUINOX | | |
| max_scan_length | SCANLEN_MAX | | max_scan_length is a float so we do an int() conversion |
| total_bandwidth | BW / BW_OUT | | total_bandwidth is in Hz and bandwidth is in MHz |
| centre_frequency | FREQ | | Config is in MHz, output is in MHz |
| ft/channel_polarisation_selection/channels | CHAN_FT | | The config has this as a tuple/array that is joined with a "," |
| ft/channel_polarisation_selection/polarisations | POLN_FT | | |
| ft/requantisation/num_bits_out | NBIT_OUT | | |
| ft/requantisation/scale | DIGITIZER_SCALE | | |
| ft/rescale/timescale | RESCALE_TIMESCALE | | Units are seconds |
| ft/rescale/algorithm | RESCALE_ALGORITHM | | |
| ft/rescale/periodic_update | RESCALE_PERIODIC_UPDATE | | |
| rfi_frequency_masks | NMASK / FREQ_MASK / CHANNEL_MASK | | |

In the above table, if the config key starts with ft/ then it is specific to flow through mode. All other values are valid for all modes. As more modes are added, the prefixes df/ and pt/ will be added to config keys that are specific to the detected filterbank and pulsar timing modes respectively.
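
The conversions described in the Notes column can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the PST implementation: the function names are hypothetical, and the mid bands 1 to 4 are assumed to arrive as the strings "1" to "4".

```python
from typing import Sequence


def udp_format_for_band(frequency_band: str) -> str:
    """Map a frequency_band value to the UDP_FORMAT header value."""
    if frequency_band == "low":
        return "PstLow"
    if frequency_band in ("5a", "5b"):
        return "MidPSTBand5"
    if frequency_band in ("1", "2", "3", "4"):  # assumed spelling of bands 1 to 4
        return f"MidPSTBand{frequency_band}"
    raise ValueError(f"unknown frequency band: {frequency_band!r}")


def join_values(values: Sequence[object]) -> str:
    """Convert each array element to a string and join with a comma."""
    return ",".join(str(v) for v in values)


def general_header(config: dict) -> dict:
    """Build a few DADA header values from a scan configuration (hypothetical helper)."""
    bandwidth_mhz = config["total_bandwidth"] / 1e6  # config is in Hz, header is in MHz
    return {
        "UDP_FORMAT": udp_format_for_band(config["frequency_band"]),
        "ANTENNAE": join_values(config["receptors"]),
        "ANT_WEIGHTS": join_values(config["receptor_weights"]),
        "SCANLEN_MAX": int(config["max_scan_length"]),  # float -> int
        "BW": bandwidth_mhz,
        "BW_OUT": bandwidth_mhz,
    }
```

Only keys with a conversion are shown; keys such as eb_id or observer_id would be copied through unchanged.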

Deprecated Scan Config Mappings

The following scan config values have been deprecated and will be removed in version 3.0 of the PST Scan configuration schema.

| Config Key / Default | DADA Header Key | Notes |
| --- | --- | --- |
| activation_time | ACTIVATION_TIME | Never used by PST. |
| pointing_id | PNT_ID | Never used by PST. |
| test_vector_id | TEST_VECTOR | Never used by PST. |
| num_frequency_channels | NCHAN | Can be determined by frequency_band and bandwidth |
| udp_nsamp | UDP_NSAMP | Can be determined by frequency_band |
| wt_nsamp | WT_NSAMP | Can be determined by frequency_band |
| udp_nchan | UDP_NCHAN | Can be determined by frequency_band |
| num_of_polarizations | NPOL | Can be determined by frequency_band |
| bits_per_sample | NBIT | Can be determined by frequency_band |
| oversampling_ratio | OS_FACTOR | Can be determined by frequency_band |
| feed_polarization | FD_POLN | Can be determined by receiver_id |
| feed_handedness | FD_HAND | Can be determined by receiver_id |
| feed_angle | FD_SANG | Can be determined by receiver_id |
| feed_tracking_mode | FD_MODE | Can be determined by receiver_id |
| feed_position_angle | FA_REQ | Can be determined by receiver_id |
| itrf | ITRF | This has been replaced with delay_centre |
| source | SOURCE | This has been replaced with target/target_name |
| coordinates/ra | STT_CRD1 | This has been replaced with target/attrs/c1 |
| coordinates/dec | STT_CRD2 | This has been replaced with target/attrs/c2 |
| coordinates/equinox | EQUINOX | This has been replaced with target/attrs/epoch |
| num_rfi_frequency_masks | NMASK | Can be determined by length of rfi_frequency_masks |
| num_channelization_stages | | Can be determined by frequency_band |
| channelization_stages | | Can be determined by frequency_band |

Fixed/Calculated mapping values

Frequency Band Configuration

The following values are fixed values based on the telescope and the frequency band used. For more details on the specific values see PST Static Configuration.

| DADA Header Key | Notes |
| --- | --- |
| TSAMP | The time, in microseconds, used to sample the complex voltage |
| UDP_FORMAT | The format of the UDP packets coming from the CBF to PST |
| UDP_NCHAN | The number of PST fine channels in a UDP packet |
| UDP_NSAMP | The number of time samples in a UDP packet |
| WT_NSAMP | The number of samples per weight in a packet |
| NBIT | The number of bits used per value dimension, if complex data the total number of bits is twice this |
| NDIM | The number of dimensions of a sample value. 1 for real, 2 for complex. Weights are real valued. |
| NPOL | The number of polarisations for each sample. For SKA this is always 2 |
| OS_FACTOR | The oversampling factor, expressed as a fraction. |

Other fixed/calculated mapping values

The following values are calculated or set to a fixed value by PST. For example, LMC calculates the subband resources from which the *X*_OUT values are determined, even though at the moment we only use 1 subband. Similarly, values like RESOLUTION are calculated from other values based on the scan configuration.

| DADA Header Key | Subband? | Value | Notes |
| --- | --- | --- | --- |
| NSUBBAND | N | 1 | PST only uses 1 subband at the moment; in the future this will depend on the amount of data throughput |
| BMAJ | N | 0.0 | There is currently no mapping. Not available in scan configuration |
| BMIN | N | 0.0 | There is currently no mapping. Not available in scan configuration |
| COORD_MD | N | "J2000" | We only support J2000 |
| TRK_MODE | N | "TRACK" | Only support tracking mode of "TRACK" |
| START_CHANNEL | Y | determine from BW and centre_frequency | This is per subband |
| END_CHANNEL | Y | START_CHANNEL + NCHAN | This is per subband |
| START_CHANNEL_OUT | Y | determine from BW and centre_frequency | This is per subband |
| END_CHANNEL_OUT | Y | START_CHANNEL_OUT + NCHAN_OUT | This is per subband |
| NCHAN_OUT | Y | NCHAN | The number of channels out, currently set to nchan |
| FREQ_OUT | Y | FREQ | Config is in MHz, output is in MHz |
| NDIM | N | 1 for weights, 2 for data | PST only supports complex valued data but the weights file is real valued |
| BYTES_PER_SECOND | N | VR: NCHAN * NPOL * NBIT * NDIM / 8e6 / TSAMP; FT: NCHAN_FT * NPOL_OUT * NBIT_OUT * NDIM / 8e6 / TSAMP | |
| DATA_HOST | N | specific to env | This is specific to RECV.CORE and the host IP it is on |
| DATA_PORT | N | specific to env | This is specific to RECV.CORE and port number to receive data on |
| TELESCOPE | N | "SKALow" or "SKAMid" | This depends on which telescope and the UDP format used |
| UTC_START | N | | The start time in UTC of the scan to the precision of a second. |
| PICOSECONDS | N | | The fractional time in picoseconds after the UTC_START that the scan actually started |
| RESOLUTION | N | (NSAMP_PER_PACKET * NCHAN * NDIM * NPOL * NBIT) / 8 | The number of bytes to have samples for all the channels. Note that this value in a header file is calculated differently. |
| OBS_OFFSET | N | | The data offset from the start of a scan that the file is for, this should be a multiple of the RESOLUTION closest at or beyond 10 seconds |
| FILE_NUMBER | N | | The file number, starting from 0 |
| NANT | N | len(ANTENNAE) | This is equal to len(receptors) |
| NMASK | N | len(rfi_frequency_masks) | This is equal to len(rfi_frequency_masks) |
| FREQ_MASK | N | determine from rfi_frequency_masks | See below about how this is encoded. |
| CHANNEL_MASK | N | determine from rfi_frequency_masks | See below about how this is encoded. |
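
As a rough illustration of the calculated values, the sketch below evaluates the RESOLUTION, END_CHANNEL, NANT and NMASK formulas from the table. The function names and the example numbers are hypothetical, and the total number of bits is assumed to be a multiple of 8.

```python
def resolution_bytes(nsamp_per_packet: int, nchan: int, ndim: int, npol: int, nbit: int) -> int:
    """RESOLUTION = (NSAMP_PER_PACKET * NCHAN * NDIM * NPOL * NBIT) / 8."""
    total_bits = nsamp_per_packet * nchan * ndim * npol * nbit
    return total_bits // 8  # assumes total_bits is a multiple of 8


def end_channel(start_channel: int, nchan: int) -> int:
    """END_CHANNEL = START_CHANNEL + NCHAN (per subband)."""
    return start_channel + nchan


def counts(receptors: list, rfi_frequency_masks: list) -> dict:
    """NANT and NMASK are the lengths of the corresponding config arrays."""
    return {"NANT": len(receptors), "NMASK": len(rfi_frequency_masks)}


# Example values only; real values come from the frequency band configuration.
print(resolution_bytes(nsamp_per_packet=32, nchan=432, ndim=2, npol=2, nbit=16))
```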

RFI Frequency Masks mapping

The rfi_frequency_masks value in the PST scan configuration schema is defined as a list of pairs of start and end frequencies with units of Hertz (Hz).

Both FREQ_MASK and CHANNEL_MASK are encoded in the DADA header file as a comma-separated list of pairs, with a colon (:) between the two values of each pair. If the input list is [[V1, V2], [V3, V4], ...] then the value in the DADA header file is encoded as V1:V2,V3:V4,....

In the case of FREQ_MASK the values are stored in MHz even though the values in rfi_frequency_masks are in Hz. For CHANNEL_MASK the values are the offset from the START_CHANNEL value.
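
The encoding described above can be sketched as follows. The helper names are hypothetical, and the exact number formatting used in the real header files is not specified here, so the default Python string conversion is assumed. How mask frequencies are converted to channel numbers is also not covered, so the CHANNEL_MASK helper takes absolute channel pairs as its input.

```python
from typing import Iterable, Sequence


def encode_pairs(pairs: Iterable[Sequence]) -> str:
    """Encode [[V1, V2], [V3, V4], ...] as "V1:V2,V3:V4,..."."""
    return ",".join(f"{start}:{end}" for start, end in pairs)


def encode_freq_mask(rfi_frequency_masks_hz: Iterable[Sequence[float]]) -> str:
    """FREQ_MASK: config values are in Hz, header values are in MHz."""
    in_mhz = [(start / 1e6, end / 1e6) for start, end in rfi_frequency_masks_hz]
    return encode_pairs(in_mhz)


def encode_channel_mask(channel_pairs: Iterable[Sequence[int]], start_channel: int) -> str:
    """CHANNEL_MASK: values are offsets from START_CHANNEL."""
    offsets = [(first - start_channel, last - start_channel) for first, last in channel_pairs]
    return encode_pairs(offsets)
```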

Metadata fields

The following fields don’t have a direct mapping back to the scan configuration

| Key | Value / Default | CSP Sources | Notes |
| --- | --- | --- | --- |
| execution_block | EB_ID | execution_block_id | |
| context/intent | "Tied-array beam observation of " | source | |
| context/notes | "Unknown" | N/A | If INTENT is in the headers we could get this |
| config/image | "artefact.skao.int/ska-pst/ska-pst" | N/A | This is the CONFIG_IMAGE constant within SEND code |
| config/version | "0.1.3" | N/A | This is the CONFIG_IMAGE constant within SEND code - should check if this should just be the version of the application |
| obscore/dataproduct_type | "timeseries" | N/A | For different processing modes we will need to update this |
| obscore/dataproduct_subtype | "voltages" | N/A | For different processing modes we will need to update this |
| obscore/calib_level | 0 | N/A | |
| obscore/obs_id | SCAN_ID | scan_id | |
| obscore/access_estsize | sum of just the data files data sections | N/A | |
| obscore/s_ra | STT_CRD1 | coordinates/ra | This is in deg |
| obscore/s_dec | STT_CRD2 | coordinates/dec | This is in deg |
| obscore/t_min | UTC_START | N/A | Value is in MJD calculated from string. Note the PICOSECONDS is missing. |
| obscore/t_max | t_min + scan_length | N/A | |
| obscore/t_resolution | TSAMP | frequency_band | |
| obscore/t_exptime | scan_length | N/A | |
| obscore/facility_name | "SKA-Observatory" | N/A | |
| obscore/instrument_name | TELESCOPE | frequency_band | Frequency band can be used to work out value. SDP puts a - after the SKA (i.e. SKA-Low or SKA-Mid) |
| obscore/pol_xel | NPOL | frequency_band | |
| obscore/pol_states | "null" | N/A | |
| obscore/em_xel | NCHAN | frequency_band and total_bandwidth | |
| obscore/em_unit | "Hz" | N/A | |
| obscore/em_min | (FREQ - BW/2) * 1e6 | centre_frequency and total_bandwidth | Units in Hz not MHz so there is a factor of 1e6 |
| obscore/em_max | (FREQ + BW/2) * 1e6 | centre_frequency and total_bandwidth | Units in Hz not MHz so there is a factor of 1e6 |
| obscore/em_res_power | "null" | N/A | |
| obscore/em_resolution | (BW / NCHAN) * 1e6 | frequency_band, total_bandwidth and centre_frequency | Units in Hz not MHz so there is a factor of 1e6 |
| obscore/o_ucd | "null" | N/A | |
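
The spectral fields in the table follow directly from the FREQ, BW and NCHAN header values. A minimal sketch, with a hypothetical function name and example numbers chosen only for illustration:

```python
def em_fields(freq_mhz: float, bw_mhz: float, nchan: int) -> dict:
    """Compute the obscore/em_* values in Hz from header values in MHz."""
    return {
        "em_xel": nchan,
        "em_unit": "Hz",
        "em_min": (freq_mhz - bw_mhz / 2) * 1e6,    # lower band edge
        "em_max": (freq_mhz + bw_mhz / 2) * 1e6,    # upper band edge
        "em_resolution": (bw_mhz / nchan) * 1e6,    # width of one channel
    }


# Example values only: a 50 MHz band centred on 100 MHz with 100 channels.
print(em_fields(freq_mhz=100.0, bw_mhz=50.0, nchan=100))
```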

Potential bugs in mapping

  • Should config/version be the version of ska-pst that sent it, rather than being hardcoded to 0.1.3?

  • obscore/dataproduct_type and obscore/dataproduct_subtype are currently hardcoded for VR mode

  • obscore/t_min does not include the fractional seconds of the start time
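
One way the fractional start time could be folded into obscore/t_min is sketched below. This is not the current SEND behaviour: the function name is hypothetical, UTC_START is assumed to use the YYYY-MM-DD-HH:MM:SS layout (the format is not stated in this document), and leap seconds are ignored.

```python
from datetime import datetime, timezone

# MJD 0 corresponds to 1858-11-17 00:00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)
SECONDS_PER_DAY = 86400.0


def utc_start_to_mjd(utc_start: str, picoseconds: int = 0) -> float:
    """Convert UTC_START plus PICOSECONDS to a Modified Julian Date."""
    # Assumed header layout, e.g. "2000-01-01-00:00:00".
    start = datetime.strptime(utc_start, "%Y-%m-%d-%H:%M:%S").replace(tzinfo=timezone.utc)
    whole_seconds = (start - MJD_EPOCH).total_seconds()
    return (whole_seconds + picoseconds * 1e-12) / SECONDS_PER_DAY
```

A 64-bit float MJD resolves to roughly a microsecond at current dates, so the full picosecond precision would not survive this conversion.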

Config values currently not mapped

The following keys are in the scan configuration but are not sent to RECV or DSP, which means they do not make it into the output files:

  • destination_address - this should be where to send data to for SDP. Currently we use a volume mount

  • subint_duration - this should map to OUTSUBINT. It should also affect how often a file is written out; currently the voltage recorder and flow through modes write out 10 seconds of data per file.

Documentation mismatch

  • source - the documentation says this should map to SRC_NAME, but we use SOURCE