ska_pst.stat.hdf5

This module is used for handling an HDF5 STAT file.

class ska_pst.stat.hdf5.Dimension(value)[source]

An enum used to represent the complex dimension/component within the data.

IMAG = 1

REAL = 0

property text: str

Map dimension enum value to text used in data frames.

Returns: ‘Real’ if value is REAL else ‘Imag’
Return type: str
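As a rough illustration of the mapping described above, the following sketch defines a local stand-in enum with the documented member values and a text property. The name DimensionSketch is hypothetical and is not part of the package; with the package installed, the real class would be imported from ska_pst.stat.hdf5 instead.

```python
from enum import Enum


class DimensionSketch(Enum):
    """Local stand-in mirroring the documented members of Dimension."""

    REAL = 0
    IMAG = 1

    @property
    def text(self) -> str:
        # Documented behaviour: 'Real' if the value is REAL, else 'Imag'.
        return "Real" if self is DimensionSketch.REAL else "Imag"


# Members iterate in definition order, so REAL comes before IMAG.
labels = [dim.text for dim in DimensionSketch]
print(labels)
```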

class ska_pst.stat.hdf5.Polarisation(value)[source]

An enum used to represent polarisation indexes within the data.

POL_A = 0

POL_B = 1

static as_string(polarisations: List[Polarisation]) → str[source]

Return a valid string representation of a list of polarisations.

For a single polarisation the value is the same as Polarisation.text, but for a list of both polarisations it is “Both”. The polarisations in the list must be unique.

This is the dual of Polarisation.from_string.

Parameters: polarisations (List[Polarisation]) – the list of polarisations to turn into a string representation.
Returns: a valid string representation of a list of polarisations.
Return type: str
Raises: AssertionError – incorrect list of polarisations provided.

static from_string(value: str) → List[Polarisation][source]

Return a list of polarisation enum values based on input string.

Valid values of input string are: A, B or Both. All other values are invalid. This maps to what is expected in the PST Scan Configuration schema.

Parameters: value (str) – the value to convert to a list of polarisations.
Returns: a list of polarisation enum values based on input string.
Return type: List[Polarisation]
Raises: AssertionError – if string is invalid.

property text: str

Map polarisation enum value to text used in data frames.

Returns: ‘A’ if value is POL_A else ‘B’
Return type: str
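The relationship between text, from_string and as_string can be sketched with a local stand-in class. This is an assumption-laden illustration, not the library's implementation: PolarisationSketch is a hypothetical name, and the internal checks are only inferred from the documented behaviour (valid strings A, B or Both, unique list entries, AssertionError on bad input).

```python
from enum import Enum
from typing import List


class PolarisationSketch(Enum):
    """Local stand-in mirroring the documented members of Polarisation."""

    POL_A = 0
    POL_B = 1

    @property
    def text(self) -> str:
        # Documented behaviour: 'A' if the value is POL_A, else 'B'.
        return "A" if self is PolarisationSketch.POL_A else "B"

    @staticmethod
    def from_string(value: str) -> List["PolarisationSketch"]:
        # Only A, B and Both are documented as valid inputs.
        mapping = {
            "A": [PolarisationSketch.POL_A],
            "B": [PolarisationSketch.POL_B],
            "Both": [PolarisationSketch.POL_A, PolarisationSketch.POL_B],
        }
        assert value in mapping, f"invalid polarisation string: {value!r}"
        return mapping[value]

    @staticmethod
    def as_string(polarisations: List["PolarisationSketch"]) -> str:
        # Assumed checks: one or two entries, with no duplicates.
        assert 1 <= len(polarisations) <= 2, "expected one or two polarisations"
        assert len(set(polarisations)) == len(polarisations), "duplicates found"
        if len(polarisations) == 1:
            return polarisations[0].text
        return "Both"


# Round trip: each valid string should map back to itself.
round_trip = {
    s: PolarisationSketch.as_string(PolarisationSketch.from_string(s))
    for s in ("A", "B", "Both")
}
print(round_trip)
```

Because the two functions are duals, converting a valid string to a list and back should return the original string.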

class ska_pst.stat.hdf5.StatFileFormat(*, header_dtype: 'npt.Void', has_weights: 'bool', has_polarisations: 'bool')[source]

has_polarisations: bool: Indicator for whether file format includes the selected polarisations.

has_weights: bool: Indicator for whether file format has weights statistics or not.

header_dtype: nptyping.Void: The Numpy structured array data type.

class ska_pst.stat.hdf5.StatisticsData(*, mean_frequency_avg: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32), mean_frequency_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32), variance_frequency_avg: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32), variance_frequency_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32), mean_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.Float32), variance_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.Float32), mean_spectral_power: nptyping.NDArray.(typing.Literal['NPol, NChan'], nptyping.Float32), max_spectral_power: nptyping.NDArray.(typing.Literal['NPol, NChan'], nptyping.Float32), histogram_1d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NDim, NBin'], nptyping.UInt32), histogram_1d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim, NBin'], nptyping.UInt32), rebinned_histogram_2d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NRebin, NRebin'], nptyping.UInt32), rebinned_histogram_2d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NRebin, NRebin'], nptyping.UInt32), rebinned_histogram_1d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NDim, NRebin'], nptyping.UInt32), rebinned_histogram_1d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim, NRebin'], nptyping.UInt32), num_clipped_samples_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.UInt32), num_clipped_samples: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.UInt32), num_clipped_samples_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.UInt32), spectrogram: nptyping.NDArray.(typing.Literal['NPol, NFreqBin, NTimeBin'], nptyping.Float32), timeseries: nptyping.NDArray.(typing.Literal['NPol, NTimeBin, 3'], nptyping.Float32), timeseries_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NTimeBin, 3'], 
nptyping.Float32), min_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32), max_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32), mean_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32))[source]

A data class that represents the statistics loaded from the HDF5 file.

histogram_1d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NDim, NBin'], nptyping.UInt32): Histogram of the input data integer states for each polarisation and dimension, averaged over all channels.

histogram_1d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim, NBin'], nptyping.UInt32): Histogram of the input data integer states for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

max_spectral_power: nptyping.NDArray.(typing.Literal['NPol, NChan'], nptyping.Float32): Maximum power spectra of the data for each polarisation and channel.

max_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32): The maximum of the weights for each channel.

mean_frequency_avg: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32): The mean of the data for each polarisation and dimension, averaged over all channels.

mean_frequency_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32): The mean of the data for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

mean_spectral_power: nptyping.NDArray.(typing.Literal['NPol, NChan'], nptyping.Float32): Mean power spectra of the data for each polarisation and channel.

mean_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.Float32): The mean of the data for each polarisation, dimension and channel.

mean_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32): The mean of the weights for each channel.

min_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32): The minimum of the weights for each channel.

num_clipped_samples: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.UInt32): Number of clipped input samples (maximum level) for each polarisation and dimension, averaged over all channels.

num_clipped_samples_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.UInt32): Number of clipped input samples (maximum level) for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

num_clipped_samples_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.UInt32): Number of clipped input samples (maximum level) for each polarisation, dimension and channel.

rebinned_histogram_1d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NDim, NRebin'], nptyping.UInt32): Rebinned histogram of the input data integer states for each polarisation and dimension, averaged over all channels.

rebinned_histogram_1d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim, NRebin'], nptyping.UInt32): Rebinned histogram of the input data integer states for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

rebinned_histogram_2d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NRebin, NRebin'], nptyping.UInt32): Rebinned 2D histogram of the input data integer states for each polarisation, averaged over all channels.

rebinned_histogram_2d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NRebin, NRebin'], nptyping.UInt32): Rebinned 2D histogram of the input data integer states for each polarisation, averaged over all channels, except those flagged for RFI.

spectrogram: nptyping.NDArray.(typing.Literal['NPol, NFreqBin, NTimeBin'], nptyping.Float32): Spectrogram of the data for each polarisation, averaged over a configurable number of temporal and spectral bins (default ~1000).

timeseries: nptyping.NDArray.(typing.Literal['NPol, NTimeBin, 3'], nptyping.Float32): Time series of the data for each polarisation, rebinned in time to ntime_bins, averaged over all frequency channels.

timeseries_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NTimeBin, 3'], nptyping.Float32): Time series of the data for each polarisation, re-binned in time.

variance_frequency_avg: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32): The variance of the data for each polarisation and dimension, averaged over all channels.

variance_frequency_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32): The variance of the data for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

variance_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.Float32): The variance of the data for each polarisation, dimension and channel.
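The shape annotations above give the index order of each array. As a minimal sketch of how an (NPol, NDim) array such as mean_frequency_avg could be flattened into labelled rows, the example below uses nested lists in place of NumPy arrays so that it runs without dependencies. The numeric values and the row keys (Polarisation, Dimension, Mean) are hypothetical; the label strings follow the documented Polarisation.text and Dimension.text values.

```python
# Hypothetical values standing in for StatisticsData.mean_frequency_avg,
# documented shape (NPol, NDim); here NPol = 2 and NDim = 2.
mean_frequency_avg = [
    [0.5, -0.25],  # POL_A: [REAL, IMAG]
    [1.5, 0.75],   # POL_B: [REAL, IMAG]
]

POL_TEXT = {0: "A", 1: "B"}        # mirrors Polarisation.text
DIM_TEXT = {0: "Real", 1: "Imag"}  # mirrors Dimension.text

# First index selects the polarisation, second index selects the dimension.
rows = [
    {"Polarisation": POL_TEXT[ipol], "Dimension": DIM_TEXT[idim], "Mean": value}
    for ipol, per_pol in enumerate(mean_frequency_avg)
    for idim, value in enumerate(per_pol)
]

for row in rows:
    print(row)
```

The same index order would apply to a NumPy array, where a single element would be read as mean_frequency_avg[ipol, idim].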

class ska_pst.stat.hdf5.StatisticsMetadata(*, file_format_version: str = '1.1.0', eb_id: str, telescope: str, scan_id: int, beam_id: str, utc_start: str, t_min: float, t_max: float, frequency_mhz: float, bandwidth_mhz: float, start_chan: int, npol: int, ndim: int, nchan: int, nchan_ds: int, ndat_ds: int, histogram_nbin: int, nrebin: int, channel_freq_mhz: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float64), timeseries_bins: nptyping.NDArray.(typing.Literal['NTimeBin'], nptyping.Float64), frequency_bins: nptyping.NDArray.(typing.Literal['NFreqBin'], nptyping.Float64), num_samples: int, num_samples_rfi_excised: int, num_samples_spectrum: nptyping.NDArray.(typing.Literal['NChan'], nptyping.UInt32), num_invalid_packets: int, num_weight_samples: int = 0, has_weights: bool = False, polarisations: str = 'Both')[source]

Data class modeling the metadata from an HDF5 STAT data file.

bandwidth_mhz: float: The bandwidth of the data (MHz).

beam_id: str: The beam id for the generated data file

channel_freq_mhz: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float64): The centre frequencies of each channel (MHz).

eb_id: str: The execution block id the file relates to.

property end_chan: int: Get the last channel that the header is for.

file_format_version: str = '1.1.0': The format of the HDF5 STAT file. Default is 1.1.0.

frequency_bins: nptyping.NDArray.(typing.Literal['NFreqBin'], nptyping.Float64): The frequency bins used for the spectrogram attribute (MHz).

frequency_mhz: float: The centre frequency for the data as a whole (MHz).

has_weights: bool = False: Indicator of whether weights are included in the statistics or not.

histogram_nbin: int: The number of bins in the histogram data.

nchan: int: Number of channels in the data.

nchan_ds: int: The number of frequency bins in the spectrogram data.

ndat_ds: int: The number of temporal bins in the spectrogram and timeseries data.

ndim: int: Number of dimensions in the data (should be 2 for complex data).

npol: int: Number of polarisations.

nrebin: int: Number of bins to use for rebinned histograms

num_invalid_packets: int: The number of invalid packets received while calculating the statistics.

num_samples: int: The total number of samples used to calculate the sample statistics.

num_samples_rfi_excised: int: The total number of samples used to calculate the sample statistics, except those flagged for RFI.

num_samples_spectrum: nptyping.NDArray.(typing.Literal['NChan'], nptyping.UInt32): The number of samples, per channel, used to calculate the sample statistics.

num_weight_samples: int = 0: The number of samples used to calculate the weight statistics.

polarisations: str = 'Both'

Get a string representation of the polarisations.

Values are either A, B or Both.

property polarisations_list: List[Polarisation]

Get a list of polarisations of the STAT HDF5.

For version 1.0.0 it is assumed both Pol A and Pol B are valid. However, since version 1.1.0 and Flow Through stats the output stats could be for Pol A, Pol B or both.
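The version rule described above can be sketched as a small function. This is only an illustration under stated assumptions: polarisations_list_sketch is a hypothetical helper, it returns text labels rather than Polarisation members, and the exact version comparison used by the library is not documented here.

```python
from typing import List


def polarisations_list_sketch(
    file_format_version: str, polarisations: str = "Both"
) -> List[str]:
    """Return polarisation labels for a file (hypothetical helper)."""
    if file_format_version == "1.0.0":
        # Version 1.0.0: both Pol A and Pol B are assumed to be valid.
        return ["A", "B"]
    # Later versions: the stored string selects A, B or Both.
    mapping = {"A": ["A"], "B": ["B"], "Both": ["A", "B"]}
    assert polarisations in mapping, f"invalid value: {polarisations!r}"
    return mapping[polarisations]


v1_0 = polarisations_list_sketch("1.0.0")
v1_1_single = polarisations_list_sketch("1.1.0", "B")
v1_1_default = polarisations_list_sketch("1.1.0")
print(v1_0, v1_1_single, v1_1_default)
```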

scan_id: int: The scan id for the generated data file

start_chan: int: The starting channel number.

t_max: float: The time offset, in seconds, from the UTC start time, representing the time at the end of the file.

t_min: float: The time offset, in seconds, from the UTC start time, representing the time at the start of the file.

telescope: str: The telescope the data were collected for. Should be SKALow or SKAMid

timeseries_bins: nptyping.NDArray.(typing.Literal['NTimeBin'], nptyping.Float64): The timestamp offsets for each temporal bin.

utc_start: str: The UTC ISO-formatted start time of the scan, to the nearest second.
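Since t_min and t_max are offsets in seconds from utc_start, absolute times for the span of a file can be derived with the standard library. The sketch below uses hypothetical values, and the exact string format stored in utc_start is an assumption (a plain ISO 8601 timestamp without a timezone suffix).

```python
from datetime import datetime, timedelta, timezone

# Hypothetical metadata values; the stored utc_start format is assumed.
utc_start = "2024-01-01T00:00:00"
t_min = 10.0  # seconds after utc_start at the start of the file
t_max = 40.0  # seconds after utc_start at the end of the file

# Parse the timestamp and mark it explicitly as UTC.
start = datetime.fromisoformat(utc_start).replace(tzinfo=timezone.utc)

file_start = start + timedelta(seconds=t_min)
file_end = start + timedelta(seconds=t_max)
duration_s = t_max - t_min

print(file_start.isoformat(), file_end.isoformat(), duration_s)
```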

class ska_pst.stat.hdf5.TimeseriesDimension(value)[source]

An enum used to represent which index to use for max/min/mean in timeseries data.

MAX = 0

MEAN = 2

MIN = 1
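The enum values above are indexes into the last axis of the timeseries arrays, whose documented shape is (NPol, NTimeBin, 3). A minimal sketch of that lookup follows, using nested lists in place of NumPy arrays so it runs without dependencies. TimeseriesDimensionSketch and select are hypothetical names, and the data values are made up.

```python
from enum import Enum


class TimeseriesDimensionSketch(Enum):
    """Local stand-in mirroring the documented members of TimeseriesDimension."""

    MAX = 0
    MIN = 1
    MEAN = 2


# Hypothetical timeseries with shape (NPol=1, NTimeBin=2, 3).
timeseries = [
    [
        [9.0, 1.0, 4.0],  # time bin 0: [max, min, mean]
        [8.0, 2.0, 5.0],  # time bin 1: [max, min, mean]
    ],
]


def select(ts, ipol, which):
    """Pick one statistic for every time bin of a single polarisation."""
    return [time_bin[which.value] for time_bin in ts[ipol]]


max_series = select(timeseries, 0, TimeseriesDimensionSketch.MAX)
mean_series = select(timeseries, 0, TimeseriesDimensionSketch.MEAN)
print(max_series, mean_series)
```

With a NumPy array the equivalent selection would be a slice of the form timeseries[ipol, :, which.value].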

ska_pst.stat.hdf5.get_stat_file_format(version: str) → StatFileFormat[source]

ska_pst.stat.hdf5.map_hdf5_key(hdf5_key: str) → str[source]: Map a key from an HDF5 attribute/dataset to a model dataclass property.