PST metadata mapping
The following sections record the mapping between the scan configuration and scan request to the header keys within the output DADA files during PST digital signal processing. It also includes the mapping to the metadata keys.
General mapping
This section covers the mapping from the scan configuration to the DADA header
keys, together with any direct mappings to the metadata. Where the mapping is a
simple conversion or calculation, the details are given in the Notes column. The
Mode column is for the processing mode:
VR - Voltage recorder
FT - Flow through
DF - Detected filterbank (currently not supported)
PT - Pulsar timing (currently not supported)
All - all modes
| Config Key / Default | DADA Header Key | Metadata Key | Notes |
|---|---|---|---|
| eb_id | EB_ID | execution_block | |
| frequency_band | UDP_FORMAT | | Frequency band maps to UDP format: "low" -> "PstLow"; for Mid, "5a" and "5b" map to "MidPSTBand5", and for bands 1 to 4 the value is "MidPSTBandX", where X is the band number |
| scan_id | SCAN_ID | obscore/obs_id | Scan ID comes from the scan not the scan configuration |
| receiver_id | FRONTEND | | |
| receptors | ANTENNAE | | Input is an array of values converted to string and joined by a "," |
| receptor_weights | ANT_WEIGHTS | | Input is an array of values converted to string and joined by a "," |
| timing_beam_id | BEAM_ID | | If timing_beam_id is not set it is the BEAM id of the TANGO device |
| observer_id | OBSERVER | context/observer | |
| project_id | PROJID | | |
| subarray_id | SUBARRAY_ID | | |
| source | SOURCE | obscore/target_name | |
| delay_centre | DELAY_CENTRE | | The input is an array but the header value is a string joined by a "," |
| target/target_name | SOURCE | obscore/target_name | |
| target/attrs/crd1 | STT_CRD1 | obscore/s_ra | This is in deg |
| target/attrs/crd2 | STT_CRD2 | obscore/s_dec | This is in deg |
| target/attrs/epoch | EQUINOX | | |
| max_scan_length | SCANLEN_MAX | | max_scan_length is a float so we do an int() conversion |
| total_bandwidth | BW / BW_OUT | | total_bandwidth is in Hz; BW and BW_OUT are in MHz |
| centre_frequency | FREQ | | Config is in MHz, output is in MHz |
| ft/channel_polarisation_selection/channels | CHAN_FT | | The config has this as a tuple/array that is joined with a "," |
| ft/channel_polarisation_selection/polarisations | POLN_FT | | |
| ft/requantisation/num_bits_out | NBIT_OUT | | |
| ft/requantisation/scale | DIGITIZER_SCALE | | |
| ft/rescale/timescale | RESCALE_TIMESCALE | | Units are seconds. |
| ft/rescale/algorithm | RESCALE_ALGORITHM | | |
| ft/rescale/periodic_update | RESCALE_PERIODIC_UPDATE | | |
| rfi_frequency_masks | NMASK / FREQ_MASK / CHANNEL_MASK | | |
In the above table, if the config key starts with ft/ then it is specific to flow through mode. All other values are valid for all modes. As more modes are added, the prefixes df/ and pt/ will be added to config keys that are specific to the detected filterbank and pulsar timing modes respectively.
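The conversions described in the Notes column can be sketched in Python as follows. This is a minimal illustration rather than the PST implementation: the function names (udp_format, join_values, general_header_values) are hypothetical, the Mid bands 1 to 4 are assumed to arrive as the strings "1" to "4", and header values are assumed to be written as strings.

```python
from typing import Any, Dict, Iterable


def udp_format(frequency_band: str) -> str:
    """Map a frequency band to its UDP format, following the table above."""
    if frequency_band == "low":
        return "PstLow"
    if frequency_band in ("5a", "5b"):
        return "MidPSTBand5"
    if frequency_band in ("1", "2", "3", "4"):  # assumed string encoding of bands 1 to 4
        return f"MidPSTBand{frequency_band}"
    raise ValueError(f"unknown frequency band: {frequency_band!r}")


def join_values(values: Iterable[Any]) -> str:
    """Convert each value to a string and join them with a comma."""
    return ",".join(str(v) for v in values)


def general_header_values(config: Dict[str, Any]) -> Dict[str, str]:
    """Build a few DADA header values from a scan configuration dict."""
    bw_mhz = config["total_bandwidth"] / 1e6  # Hz in the config, MHz in the header
    return {
        "UDP_FORMAT": udp_format(config["frequency_band"]),
        "ANTENNAE": join_values(config["receptors"]),
        "ANT_WEIGHTS": join_values(config["receptor_weights"]),
        "SCANLEN_MAX": str(int(config["max_scan_length"])),  # float truncated by int()
        "BW": str(bw_mhz),
    }
```

Only a handful of keys are shown; the remaining direct mappings in the table copy the config value to the header key unchanged.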
Deprecated Scan Config Mappings
The following scan config values have been deprecated and will be removed in version 3.0 of the PST Scan configuration schema.
| Config Key / Default | DADA Header Key | Notes |
|---|---|---|
| activation_time | ACTIVATION_TIME | Never used by PST. |
| pointing_id | PNT_ID | Never used by PST. |
| test_vector_id | TEST_VECTOR | Never used by PST. |
| num_frequency_channels | NCHAN | Can be determined by frequency_band and bandwidth |
| udp_nsamp | UDP_NSAMP | Can be determined by frequency_band |
| wt_nsamp | WT_NSAMP | Can be determined by frequency_band |
| udp_nchan | UDP_NCHAN | Can be determined by frequency_band |
| num_of_polarizations | NPOL | Can be determined by frequency_band |
| bits_per_sample | NBIT | Can be determined by frequency_band |
| oversampling_ratio | OS_FACTOR | Can be determined by frequency_band |
| feed_polarization | FD_POLN | Can be determined by receiver_id |
| feed_handedness | FD_HAND | Can be determined by receiver_id |
| feed_angle | FD_SANG | Can be determined by receiver_id |
| feed_tracking_mode | FD_MODE | Can be determined by receiver_id |
| feed_position_angle | FA_REQ | Can be determined by receiver_id |
| itrf | ITRF | This has been replaced with delay_centre |
| source | SOURCE | This has been replaced with target/target_name |
| coordinates/ra | STT_CRD1 | This has been replaced with target/attrs/c1 |
| coordinates/dec | STT_CRD2 | This has been replaced with target/attrs/c2 |
| coordinates/equinox | EQUINOX | This has been replaced with target/attrs/epoch |
| num_rfi_frequency_masks | NMASK | Can be determined by length of rfi_frequency_masks |
| num_channelization_stages | | Can be determined by frequency_band |
| channelization_stages | | Can be determined by frequency_band |
Fixed/Calculated mapping values
Frequency Band Configuration
The following values are fixed values based on the telescope and the frequency band used. For more details on the specific values see PST Static Configuration.
| DADA Header Key | Notes |
|---|---|
| TSAMP | The time, in microseconds, used to sample the complex voltage |
| UDP_FORMAT | The format of the UDP packets coming from the CBF to PST |
| UDP_NCHAN | The number of PST fine channels in a UDP packet |
| UDP_NSAMP | The number of time samples in a UDP packet |
| WT_NSAMP | The number of samples per weight in a packet |
| NBIT | The number of bits used per value dimension, if complex data the total number of bits is twice this |
| NDIM | The number of dimensions of a sample value. 1 for real, 2 for complex. Weights are real valued. |
| NPOL | The number of polarisations for each sample. For SKA this is always 2 |
| OS_FACTOR | The oversampling factor, expressed as a fraction. |
Other fixed/calculated mapping values
The following are calculated or set to a fixed value by PST. For example, LMC calculates the subband resources from which the *X*_OUT values are determined,
even though only 1 subband is used at the moment. Similarly, values like RESOLUTION are calculated from other values based on the scan configuration.
| DADA Header Key | Subband? | Value | Notes |
|---|---|---|---|
| NSUBBAND | N | 1 | PST only uses 1 subband at the moment; in the future this will depend on the amount of data throughput |
| BMAJ | N | 0.0 | There is currently no mapping. Not available in scan configuration |
| BMIN | N | 0.0 | There is currently no mapping. Not available in scan configuration |
| COORD_MD | N | "J2000" | We only support J2000 |
| TRK_MODE | N | "TRACK" | Only support tracking mode of "TRACK" |
| START_CHANNEL | Y | determine from BW and centre_frequency | This is per subband |
| END_CHANNEL | Y | START_CHANNEL + NCHAN | This is per subband |
| START_CHANNEL_OUT | Y | determine from BW and centre_frequency | This is per subband |
| END_CHANNEL_OUT | Y | START_CHANNEL_OUT + NCHAN_OUT | This is per subband |
| NCHAN_OUT | Y | NCHAN | The number of channels out, currently set to NCHAN |
| FREQ_OUT | Y | FREQ | Config is in MHz, output is in MHz |
| NDIM | N | 1 for weights, 2 for data | PST only supports complex valued data but the weights file is real valued |
| BYTES_PER_SECOND | N | VR: NCHAN * NPOL * NBIT * NDIM / 8e6 / TSAMP; FT: NCHAN_FT * NPOL_OUT * NBIT_OUT * NDIM / 8e6 / TSAMP | |
| DATA_HOST | N | specific to env | This is specific to RECV.CORE and the host IP it is on |
| DATA_PORT | N | specific to env | This is specific to RECV.CORE and port number to receive data on |
| TELESCOPE | N | "SKALow" or "SKAMid" | This depends on which telescope and the UDP format used |
| UTC_START | N | The start time in UTC of the scan to the precision of a second. | |
| PICOSECONDS | N | The fractional time in picoseconds after the UTC_START that the scan actually started | |
| RESOLUTION | N | (NSAMP_PER_PACKET * NCHAN * NDIM * NPOL * NBIT) / 8 | The number of bytes to have samples for all the channels. Note that this value in a header file is calculated differently. |
| OBS_OFFSET | N | The data offset from the start of a scan that the file is for, this should be a multiple of the RESOLUTION closest at or beyond 10 seconds | |
| FILE_NUMBER | N | The file number, starting from 0 | |
| NANT | N | len(ANTENNAE) | This is equal to len(receptors) |
| NMASK | N | len(rfi_frequency_masks) | This is equal to len(rfi_frequency_masks) |
| FREQ_MASK | N | determine from rfi_frequency_masks | See below about how this is encoded. |
| CHANNEL_MASK | N | determine from rfi_frequency_masks | See below about how this is encoded. |
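A few of the calculated values in this table can be sketched in Python as below. The function names are hypothetical and the numbers in the usage note are purely illustrative, not taken from the PST static configuration. The RESOLUTION sketch follows the formula in the table only; as noted there, the value in a header file is calculated differently.

```python
from typing import Sequence


def resolution_bytes(nsamp_per_packet: int, nchan: int, ndim: int, npol: int, nbit: int) -> int:
    """RESOLUTION: bytes needed to hold samples for all the channels."""
    total_bits = nsamp_per_packet * nchan * ndim * npol * nbit
    return total_bits // 8


def end_channel(start_channel: int, nchan: int) -> int:
    """END_CHANNEL (per subband): START_CHANNEL + NCHAN."""
    return start_channel + nchan


def nant(receptors: Sequence[str]) -> int:
    """NANT: the number of receptors, i.e. len(ANTENNAE)."""
    return len(receptors)
```

For instance, with illustrative values of 32 samples per packet, 432 channels, complex data (NDIM of 2), 2 polarisations and 16 bits per value, resolution_bytes(32, 432, 2, 2, 16) gives 110592 bytes.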
RFI Frequency Masks mapping
The rfi_frequency_masks value in the PST scan configuration schema is defined as a list of pairs of start and end frequencies with units of Hertz (Hz).
Both FREQ_MASK and CHANNEL_MASK are encoded in the DADA header file as a comma-separated list of pairs, with a colon (:) between the two values of each pair.
If the input list is [[V1, V2], [V3, V4], ...] then the value in the DADA header file is encoded as V1:V2,V3:V4,....
In the case of FREQ_MASK the values are stored in MHz even though the values in rfi_frequency_masks are in Hz. For CHANNEL_MASK the values are the offset from the START_CHANNEL
value.
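The pair encoding can be sketched in Python as follows. The helper names (encode_pairs, freq_mask) are hypothetical, and the exact number formatting written by PST is not specified here, so this sketch simply uses Python's default string conversion. CHANNEL_MASK is not shown because it depends on how frequencies are assigned to channels, which this section does not describe.

```python
from typing import Iterable, Sequence


def encode_pairs(pairs: Iterable[Sequence[float]]) -> str:
    """Encode [[V1, V2], [V3, V4], ...] as "V1:V2,V3:V4,..."."""
    return ",".join(f"{start}:{end}" for start, end in pairs)


def freq_mask(rfi_frequency_masks: Iterable[Sequence[float]]) -> str:
    """Encode FREQ_MASK: config values are in Hz, header values are in MHz."""
    pairs_mhz = [(start_hz / 1e6, end_hz / 1e6) for start_hz, end_hz in rfi_frequency_masks]
    return encode_pairs(pairs_mhz)
```

With these helpers, encode_pairs([[1, 2], [3, 4]]) yields "1:2,3:4", and NMASK is simply the number of pairs in the input list.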
Metadata fields
The following fields don’t have a direct mapping back to the scan configuration.
| Key | Value / Default | CSP Sources | Notes |
|---|---|---|---|
| execution_block | EB_ID | execution_block_id | |
| context/intent | "Tied-array beam observation of | source | |
| context/notes | "Unknown" | N/A | If INTENT is in the headers we could get this |
| config/image | "artefact.skao.int/ska-pst/ska-pst" | N/A | This is the CONFIG_IMAGE constant within SEND code |
| config/version | "0.1.3" | N/A | This is the CONFIG_IMAGE constant within SEND code - should check if this should just be the version of the application |
| obscore/dataproduct_type | "timeseries" | N/A | For different processing modes we will need to update this |
| obscore/dataproduct_subtype | "voltages" | N/A | For different processing modes we will need to update this |
| obscore/calib_level | 0 | N/A | |
| obscore/obs_id | SCAN_ID | scan_id | |
| obscore/access_estsize | sum of just the data files' data sections | N/A | |
| obscore/s_ra | STT_CRD1 | coordinates/ra | This is in deg |
| obscore/s_dec | STT_CRD2 | coordinates/dec | This is in deg |
| obscore/t_min | UTC_START | N/A | Value is in MJD calculated from string. Note the PICOSECONDS is missing. |
| obscore/t_max | t_min + scan_length | N/A | |
| obscore/t_resolution | TSAMP | frequency_band | |
| obscore/t_exptime | scan_length | N/A | |
| obscore/facility_name | "SKA-Observatory" | N/A | |
| obscore/instrument_name | TELESCOPE | frequency_band | Frequency band can be used to work out value. SDP puts a - after the SKA (i.e. SKA-Low or SKA-Mid) |
| obscore/pol_xel | NPOL | frequency_band | |
| obscore/pol_states | "null" | N/A | |
| obscore/em_xel | NCHAN | frequency_band and total_bandwidth | |
| obscore/em_unit | "Hz" | N/A | |
| obscore/em_min | (FREQ - BW/2) * 1e6 | centre_frequency and total_bandwidth | Units in Hz not MHz so there is a factor of 1e6 |
| obscore/em_max | (FREQ + BW/2) * 1e6 | centre_frequency and total_bandwidth | Units in Hz not MHz so there is a factor of 1e6 |
| obscore/em_res_power | "null" | N/A | |
| obscore/em_resolution | (BW / NCHAN) * 1e6 | frequency_band, total_bandwidth and centre_frequency | Units in Hz not MHz so there is a factor of 1e6 |
| obscore/o_ucd | "null" | N/A | |
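The spectral (em_*) fields in the table are simple arithmetic on the header values. A minimal Python sketch is given below; the function name obscore_em_fields is hypothetical, and it assumes FREQ and BW have already been read from the header as numbers in MHz.

```python
from typing import Dict, Union


def obscore_em_fields(freq_mhz: float, bw_mhz: float, nchan: int) -> Dict[str, Union[str, int, float]]:
    """Compute the obscore em_* values in Hz from FREQ, BW (both MHz) and NCHAN."""
    return {
        "em_unit": "Hz",
        "em_xel": nchan,
        "em_min": (freq_mhz - bw_mhz / 2) * 1e6,  # lower band edge in Hz
        "em_max": (freq_mhz + bw_mhz / 2) * 1e6,  # upper band edge in Hz
        "em_resolution": (bw_mhz / nchan) * 1e6,  # channel width in Hz
    }
```

As an illustration, a centre frequency of 100 MHz with a 50 MHz bandwidth over 100 channels gives em_min of 75e6 Hz, em_max of 125e6 Hz and em_resolution of 500000 Hz.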
Potential bugs in mapping
Should config/version be the version of ska-pst that sent it, rather than being hardcoded to 0.1.3?
obscore/dataproduct_type and obscore/dataproduct_subtype are currently hardcoded for VR mode
obscore/t_min does not include the fractional seconds of the start time
Config values currently not mapped
The following scan configuration keys are not sent to RECV or DSP, which means they don’t get into the output files:
destination_address - this should be where to send data to for SDP. Currently we use a volume mount
subint_duration - this should map to OUTSUBINT. This should also affect how often a file is written out; the voltage recorder and flow through modes currently write out 10 seconds of data per file.
Documentation mismatch
source - the documentation says this should map to SRC_NAME, but we use SOURCE