ska_sdp_instrumental_calibration.data_managers.visibility module
- ska_sdp_instrumental_calibration.data_managers.visibility.create_template_vis_from_ms(msname, ack=False, datacolumn='DATA', field_ids=None, data_desc_ids=None)
Create empty "template" visibility objects from a Measurement Set.
This function inspects the provided Measurement Set (MS) to determine the shapes, types, and metadata required to create Visibility objects. It returns a list of these objects where the data arrays (vis, flags, weights, uvw) are initialized as empty Dask arrays. These templates can be populated later.
- Parameters:
msname (str) -- The file path to the Measurement Set.
ack (bool, optional) -- If True, print an acknowledgement message when opening the table. Default is False.
datacolumn (str, optional) -- The name of the column in the MS to use for determining the data type of the visibility data. Default is "DATA".
field_ids (list[int], optional) -- A list of field IDs to process. If None, defaults to [0].
data_desc_ids (list[int], optional) -- A list of data description IDs to process. If None, defaults to [0].
- Returns:
A list of Visibility objects corresponding to the selected field and data description IDs. The data arrays within are empty Dask arrays.
- Return type:
list[Visibility]
- Raises:
ValueError -- If the selection for a specific Field ID or Data Description ID yields no rows in the MS.
KeyError -- If the polarization configuration in the MS is not recognized.
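The default selection described above (None meaning [0]) and the one-template-per-combination result can be pictured with a small sketch. The helper name resolve_selection and the field-major ordering of the pairs are assumptions made for illustration; neither is taken from the module itself.

```python
from itertools import product


def resolve_selection(field_ids=None, data_desc_ids=None):
    """Return the (field_id, data_desc_id) pairs a call would cover.

    Hypothetical helper: it mirrors the documented defaults, where
    None means [0]. The field-major ordering is an assumption.
    """
    field_ids = [0] if field_ids is None else list(field_ids)
    data_desc_ids = [0] if data_desc_ids is None else list(data_desc_ids)
    return list(product(field_ids, data_desc_ids))


# Default call: a single (field, data description) pair.
pairs_default = resolve_selection()

# Two fields and two data description IDs: four pairs in total.
pairs_custom = resolve_selection(field_ids=[0, 1], data_desc_ids=[0, 2])
```

Under these assumptions, the returned list of templates would have one entry per pair.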
- ska_sdp_instrumental_calibration.data_managers.visibility.get_col_from_ms(msname, colname, start_time_idx, ntimes, num_baselines, ack=False, field_ids=None, data_desc_ids=None)
Extract data from a specific column in a Measurement Set.
This function reads a slice of data from the specified column, determined by a starting time index, a duration (number of times), and the number of baselines. It iterates over the specified Field IDs and Data Description IDs, returning the extracted data for each combination.
- Parameters:
msname (str) -- The file path to the Measurement Set.
colname (str) -- The name of the column to retrieve (e.g., "DATA", "UVW", "FLAG").
start_time_idx (int) -- The index of the starting time step to read. This is used to calculate the starting row offset: start_time_idx * num_baselines.
ntimes (int) -- The number of time steps to read.
num_baselines (int) -- The number of baselines per time step. Used to calculate the total number of rows to read.
ack (bool, optional) -- If True, print an acknowledgement message when opening the table. Default is False.
field_ids (list[int], optional) -- A list of Field IDs to query. If None, defaults to [0].
data_desc_ids (list[int], optional) -- A list of Data Description IDs to query. If None, defaults to [0].
- Returns:
A list of NumPy arrays containing the column data. Each element in the list corresponds to the data extracted for a specific combination of Field ID and Data Description ID.
- Return type:
list[numpy.ndarray]
- Raises:
ValueError -- If the query for a specific Field ID or Data Description ID returns zero rows (empty selection).
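The row arithmetic implied by start_time_idx, ntimes and num_baselines can be sketched as follows. The helper name row_slice is hypothetical, and the sketch assumes the selected rows are ordered by time with exactly num_baselines rows per time step.

```python
def row_slice(start_time_idx, ntimes, num_baselines):
    """Return (start_row, num_rows) for a time-contiguous read.

    Assumes rows are ordered time-major with exactly num_baselines
    rows per time step, as the parameter descriptions imply.
    """
    if start_time_idx < 0 or ntimes < 0:
        raise ValueError("start_time_idx and ntimes must be non-negative")
    if num_baselines <= 0:
        raise ValueError("num_baselines must be positive")
    start_row = start_time_idx * num_baselines
    num_rows = ntimes * num_baselines
    return start_row, num_rows


# Read 10 time steps starting at step 4, with 6 baselines per step.
start_row, num_rows = row_slice(start_time_idx=4, ntimes=10, num_baselines=6)
```

Here the read would start at row 24 and span 60 rows.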
- ska_sdp_instrumental_calibration.data_managers.visibility.load_ms_as_dataset_with_time_chunks(ms_name, times_per_chunk, ack=False, datacolumn='DATA', field_id=0, data_desc_id=0)
Load MSv2 data into a Visibility dataset using distributed time chunks.
This function loads data for a specific field and data description ID into a Visibility object. The loading is distributed, chunking the data along the time axis to facilitate parallel processing (e.g., with Dask).
- Parameters:
ms_name (str) -- The file path to the Measurement Set.
times_per_chunk (int) -- The number of time steps to include in each Dask chunk.
ack (bool, optional) -- If True, print an acknowledgement message when opening the table. Default is False.
datacolumn (str, optional) -- The name of the column to read (e.g., "DATA"). Default is "DATA".
field_id (int, optional) -- The Field ID to load. Default is 0.
data_desc_id (int, optional) -- The Data Description ID to load. Default is 0.
- Returns:
The loaded Visibility dataset with dask-backed arrays.
- Return type:
Visibility
Notes
The baselines dimension in the returned dataset is simplified to a NumPy array of baseline IDs, rather than the standard Pandas MultiIndex used by the Visibility class. This modification is necessary because xarray operations like map_blocks do not support Pandas MultiIndex coordinates.
Important: You must restore the baselines to the original Pandas MultiIndex format before passing this object to any functions in ska-sdp-func-python.
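To picture how times_per_chunk divides the time axis, the sketch below lists the (start_time_idx, ntimes) pair each chunk would cover. The helper time_chunks is hypothetical, and the shorter final chunk is an assumption based on how chunked arrays usually handle a remainder, not a statement about this function's internals.

```python
def time_chunks(total_times, times_per_chunk):
    """Yield (start_time_idx, ntimes) pairs covering total_times steps.

    Hypothetical helper: the last chunk is shorter when total_times
    is not a multiple of times_per_chunk.
    """
    if times_per_chunk <= 0:
        raise ValueError("times_per_chunk must be positive")
    for start in range(0, total_times, times_per_chunk):
        yield start, min(times_per_chunk, total_times - start)


# 10 time steps split into chunks of 4: two full chunks and a remainder.
chunks = list(time_chunks(total_times=10, times_per_chunk=4))
```

Each pair has the same meaning as the start_time_idx and ntimes arguments of get_col_from_ms().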
- ska_sdp_instrumental_calibration.data_managers.visibility.write_ms_to_zarr(input_ms_paths, vis_cache_directory, zarr_chunks, ack=False, datacolumn='DATA', field_id=0, data_desc_id=0)
Convert an MSv2 into a Visibility dataset and write it to Zarr. Note: the baselines coordinate in the resulting Visibility is simplified; see the Notes section of load_ms_as_dataset_with_time_chunks().
- ska_sdp_instrumental_calibration.data_managers.visibility.write_visibility_to_zarr(directory_to_write, zarr_chunks, visibility)
Write a Visibility dataset to a Zarr store in the provided directory.
Because the native xarray to_zarr() method cannot write attributes and coordinates that hold arbitrary Python objects, this function first writes the attributes and the "baselines" coordinate values as Python pickle files and removes them from the Visibility. It then writes the rest of the Visibility to Zarr.
- Returns:
A Dask delayed Zarr writer task. The visibilities are only written when the caller calls compute() on it.
- Return type:
dask.delayed
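The side-file approach described above, where object-valued metadata is pickled next to the main store and reloaded later, can be sketched with the standard library alone. The file names, the helper names write_sidecars and read_sidecars, and the sample data are all hypothetical; this reference does not state the names the module actually uses.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical file names; this reference does not state the real ones.
ATTRS_FILE = "attrs.pkl"
BASELINES_FILE = "baselines.pkl"


def write_sidecars(attrs, baselines, directory):
    """Pickle the attributes and baseline values next to the main store."""
    directory = Path(directory)
    for name, payload in ((ATTRS_FILE, attrs), (BASELINES_FILE, baselines)):
        with (directory / name).open("wb") as handle:
            pickle.dump(payload, handle)


def read_sidecars(directory):
    """Load the pickled attributes and baseline values back."""
    directory = Path(directory)
    loaded = []
    for name in (ATTRS_FILE, BASELINES_FILE):
        with (directory / name).open("rb") as handle:
            loaded.append(pickle.load(handle))
    return tuple(loaded)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for Visibility attributes and the baseline pairs.
    attrs = {"source": "demo", "config": {"nested": (1, 2)}}
    baselines = [(0, 0), (0, 1), (1, 1)]
    write_sidecars(attrs, baselines, tmp)
    restored_attrs, restored_baselines = read_sidecars(tmp)
```

Pickle is used here because it round-trips arbitrary Python objects, which is the property the description relies on.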
- ska_sdp_instrumental_calibration.data_managers.visibility.read_visibility_from_zarr(vis_cache_directory, vis_chunks)
Read a Visibility dataset from a Zarr cache directory.
This function reconstructs a Visibility object by opening the main Zarr storage and manually reloading metadata that cannot be natively stored in Zarr (such as complex object attributes and Pandas MultiIndex baselines) from separate pickle files.
- Parameters:
- Returns:
The fully reconstructed Visibility dataset with attributes and baseline coordinates restored.
- Return type:
Visibility
- ska_sdp_instrumental_calibration.data_managers.visibility.check_if_cache_files_exist(vis_cache_directory)
Verify if the necessary cache files exist in the specified directory.
This function checks for the presence of three specific artifacts required to reconstruct a Visibility dataset: the attributes pickle file, the baselines pickle file, and the Zarr directory itself.
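A check of this kind can be sketched as below. The three artifact names and the boolean return value are assumptions made for illustration, since this reference gives neither the real file names nor the return type.

```python
import tempfile
from pathlib import Path

# Hypothetical artifact names; the real names are not given in this reference.
ATTRS_FILE = "attrs.pkl"
BASELINES_FILE = "baselines.pkl"
ZARR_DIR = "vis.zarr"


def cache_is_complete(cache_directory):
    """Return True only if all three cache artifacts are present."""
    cache = Path(cache_directory)
    return (
        (cache / ATTRS_FILE).is_file()
        and (cache / BASELINES_FILE).is_file()
        and (cache / ZARR_DIR).is_dir()
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    empty_result = cache_is_complete(root)      # nothing written yet
    (root / ATTRS_FILE).write_bytes(b"")
    (root / BASELINES_FILE).write_bytes(b"")
    partial_result = cache_is_complete(root)    # Zarr directory still missing
    (root / ZARR_DIR).mkdir()
    full_result = cache_is_complete(root)       # all three artifacts present
```

Requiring all three artifacts guards against reading a cache left half-written by an interrupted run.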