Handling Incomplete Data

It is possible to have incomplete data where one or more Measurement Sets are missing visibilities, dish pointings, or other data. The pipeline has been tested on the following scenarios:

Missing Visibilities

Two cases have been tested where the MAIN table (containing visibilities) has missing data. These cases are:

  1. An entire row of visibilities is missing for one or more timestamps. In this case, the number of existing rows does not match the expected number of data points. The Measurement Set reader in ska-sdp-datamodels, which the pipeline uses, first creates a MAIN table filled with zeros and then updates it with the data in the Measurement Set being read. In effect, any missing rows are automatically filled with zeros, which resolves the mismatch between the expected number of data points and the existing rows. This feature of the reader could be problematic if a badly corrupted Measurement Set is populated with zeros and the reader raises no errors. However, the pipeline is expected to return invalid fits where necessary because of zero visibilities or gain amplitudes (after gain calibration).

  2. Visibilities are missing for one or more frequency channels for some timestamps. In an SKA observation, the Measurement Set writer adds incorrect visibility values for that row and flags it (in the FLAG sub-table). The pipeline respects the flags and applies them accordingly, so this kind of scenario should not cause issues in the pipeline.
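
The two behaviours described above can be illustrated with a minimal NumPy sketch. This is not the code of the ska-sdp-datamodels reader or of the pipeline: the function names `fill_missing_rows` and `apply_flags`, the array shapes, and the exact matching of timestamps are all assumptions made for illustration only.

```python
import numpy as np


def fill_missing_rows(expected_times, present_times, present_vis):
    """Zero-initialise a (time, channel) array and copy in the rows that exist.

    Hypothetical helper: timestamps are matched exactly, which is a
    simplification compared to real Measurement Set time stamps.
    """
    n_chan = present_vis.shape[1]
    vis = np.zeros((len(expected_times), n_chan), dtype=complex)
    # Map each expected timestamp to its row index in the output array.
    row_of = {t: i for i, t in enumerate(expected_times)}
    for t, row in zip(present_times, present_vis):
        vis[row_of[t]] = row
    return vis


def apply_flags(vis, flags):
    """Mask flagged samples so that later computations ignore them."""
    return np.ma.masked_array(vis, mask=flags)


# Case 1: the row for timestamp 2.0 is missing and stays zero-filled.
expected_times = [0.0, 1.0, 2.0, 3.0]
present_times = [0.0, 1.0, 3.0]
present_vis = np.ones((3, 2), dtype=complex)
vis = fill_missing_rows(expected_times, present_times, present_vis)

# Case 2: one channel at timestamp 1.0 is flagged and therefore masked.
flags = np.zeros(vis.shape, dtype=bool)
flags[1, 0] = True
masked = apply_flags(vis, flags)
```

Note that the zero-filled row in the first case is not flagged in this sketch, which mirrors the concern raised above: zeros introduced by the reader are indistinguishable from data unless a later step rejects them.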

Missing Dish Pointings

Two cases where the POINTING sub-table has missing data have been tested. These cases are:

  1. The sub-table is missing data for one or more timestamps, so the number of rows does not match the expected number of data points. The pipeline needs dish pointings for a single timestamp to calibrate for the pointing offsets, so data points for all dishes at the missing timestamps are discarded.

  2. The sub-table has data for some antennas but not for others for a given timestamp. As in the first case, the number of rows does not match the expected number of data points, so the same fix should apply here too. The pipeline needs dish pointings for a single timestamp to calibrate for the pointing offsets, so data points for all dishes at the affected timestamps are discarded.
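
A minimal sketch of the selection described in both cases is given below. It is an illustration under stated assumptions rather than the pipeline's implementation: the function name `keep_complete_timestamps` is hypothetical, and the POINTING rows are represented simply as parallel lists of timestamps and antenna IDs.

```python
import numpy as np


def keep_complete_timestamps(times, antennas, n_antennas):
    """Return the timestamps at which every antenna has a pointing row.

    Hypothetical helper: `times` and `antennas` are parallel sequences,
    one entry per row of a simplified POINTING table.
    """
    times = np.asarray(times)
    antennas = np.asarray(antennas)
    complete = []
    for t in np.unique(times):
        # Antennas that report a pointing at this timestamp.
        present = np.unique(antennas[times == t])
        if present.size == n_antennas:
            complete.append(t)
    return np.array(complete)


# Timestamp 2.0 has no rows at all (case 1), and timestamp 1.0 lacks
# antenna 1 (case 2). Both are dropped for every dish.
times = [0.0, 0.0, 0.0, 1.0, 1.0, 3.0, 3.0, 3.0]
antennas = [0, 1, 2, 0, 2, 0, 1, 2]
good = keep_complete_timestamps(times, antennas, n_antennas=3)
```

Only the timestamps in `good` would then be used to select data for all dishes, so a single incomplete timestamp removes that timestamp for the whole array rather than for the affected antenna alone.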

Data from Slews

The pipeline has only been tested on Measurement Sets from which slew data have already been discarded. This is expected to be the case for SKA observations. Users who intend to run the pipeline on Measurement Sets containing data from slews should contact us and offer to make the datasets available for testing.