ska_pst.stat.hdf5

This module is used for handling a HDF5 STAT file.

class ska_pst.stat.hdf5.Dimension(value)[source]

An enum used to represent the complex dimension/component within the data.

IMAG = 1
REAL = 0
property text: str

Map dimension enum value to text used in data frames.

Returns

‘Real’ if value is REAL else ‘Imag’

Return type

str
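As an illustration of how the enum values line up with the component axis, the sketch below re-declares a local stand-in for Dimension so that it runs without ska_pst installed. Using IntEnum as the base class is an assumption made here so the members can index a sequence directly; the real class in ska_pst.stat.hdf5 may be implemented differently.

```python
from enum import IntEnum


class Dimension(IntEnum):
    """Local stand-in mirroring the documented values (not the library source)."""

    REAL = 0
    IMAG = 1

    @property
    def text(self) -> str:
        # Documented behaviour: 'Real' if value is REAL else 'Imag'.
        return "Real" if self is Dimension.REAL else "Imag"


# A single complex sample stored as [real, imag]; the enum picks the component.
sample = [0.25, -1.5]
labelled = {dim.text: sample[dim] for dim in Dimension}
print(labelled)
```

The text values are the labels the documentation says are used in data frames, so a mapping like this is one way to name columns consistently.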

class ska_pst.stat.hdf5.Polarisation(value)[source]

An enum used to represent polarisation indexes within the data.

POL_A = 0
POL_B = 1
static as_string(polarisations: List[Polarisation]) → str[source]

Return a valid string representation of a list of polarisations.

For a single polarisation the value is the same as Polarisation.text; for a list of both polarisations it is “Both”. The list must not contain duplicate polarisations.

This is the dual of Polarisation.from_string.

Parameters

polarisations (List[Polarisation]) – the list of polarisations to turn into a string representation.

Returns

a valid string representation of a list of polarisations.

Return type

str

Raises

AssertionError – incorrect list of polarisations provided.

static from_string(value: str) → List[Polarisation][source]

Return a list of polarisation enum values based on input string.

Valid values of input string are: A, B or Both. All other values are invalid. This maps to what is expected in the PST Scan Configuration schema.

Parameters

value (str) – the value to convert to a list of polarisations.

Returns

a list of polarisation enum values based on input string.

Return type

List[Polarisation]

Raises

AssertionError – if string is invalid.

property text: str

Map polarisation enum value to text used in data frames.

Returns

‘A’ if value is POL_A else ‘B’

Return type

str
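The relationship between from_string, as_string and text can be sketched with a local stand-in class. This is a minimal re-creation from the descriptions above, not the library's code: the IntEnum base, the lookup table and the assertion messages are all assumptions.

```python
from enum import IntEnum
from typing import List


class Polarisation(IntEnum):
    """Local stand-in for illustration; the library's implementation may differ."""

    POL_A = 0
    POL_B = 1

    @property
    def text(self) -> str:
        return "A" if self is Polarisation.POL_A else "B"

    @staticmethod
    def from_string(value: str) -> List["Polarisation"]:
        # Only 'A', 'B' and 'Both' are documented as valid inputs.
        mapping = {
            "A": [Polarisation.POL_A],
            "B": [Polarisation.POL_B],
            "Both": [Polarisation.POL_A, Polarisation.POL_B],
        }
        assert value in mapping, f"invalid polarisation string: {value!r}"
        return mapping[value]

    @staticmethod
    def as_string(polarisations: List["Polarisation"]) -> str:
        # The list must be non-empty and free of duplicates.
        assert 0 < len(polarisations) == len(set(polarisations)), "invalid list"
        return polarisations[0].text if len(polarisations) == 1 else "Both"


# Round trip: each valid string survives from_string followed by as_string.
for label in ("A", "B", "Both"):
    assert Polarisation.as_string(Polarisation.from_string(label)) == label
```

Because both directions raise AssertionError on bad input, callers that run Python with assertions disabled would need their own validation.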

class ska_pst.stat.hdf5.StatFileFormat(*, header_dtype: 'npt.Void', has_weights: 'bool', has_polarisations: 'bool')[source]
has_polarisations: bool

Indicator for whether file format includes the selected polarisations.

has_weights: bool

Indicator for whether file format has weights statistics or not.

header_dtype: nptyping.Void

The NumPy structured array data type of the file header.

class ska_pst.stat.hdf5.StatisticsData(*, mean_frequency_avg: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32), mean_frequency_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32), variance_frequency_avg: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32), variance_frequency_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32), mean_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.Float32), variance_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.Float32), mean_spectral_power: nptyping.NDArray.(typing.Literal['NPol, NChan'], nptyping.Float32), max_spectral_power: nptyping.NDArray.(typing.Literal['NPol, NChan'], nptyping.Float32), histogram_1d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NDim, NBin'], nptyping.UInt32), histogram_1d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim, NBin'], nptyping.UInt32), rebinned_histogram_2d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NRebin, NRebin'], nptyping.UInt32), rebinned_histogram_2d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NRebin, NRebin'], nptyping.UInt32), rebinned_histogram_1d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NDim, NRebin'], nptyping.UInt32), rebinned_histogram_1d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim, NRebin'], nptyping.UInt32), num_clipped_samples_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.UInt32), num_clipped_samples: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.UInt32), num_clipped_samples_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.UInt32), spectrogram: nptyping.NDArray.(typing.Literal['NPol, NFreqBin, NTimeBin'], nptyping.Float32), timeseries: nptyping.NDArray.(typing.Literal['NPol, NTimeBin, 3'], nptyping.Float32), timeseries_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NTimeBin, 3'], nptyping.Float32), min_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32), max_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32), mean_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32))[source]

A data class that represents the statistics loaded from the HDF5 file.

histogram_1d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NDim, NBin'], nptyping.UInt32)

Histogram of the input data integer states for each polarisation and dimension, averaged over all channels.

histogram_1d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim, NBin'], nptyping.UInt32)

Histogram of the input data integer states for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

max_spectral_power: nptyping.NDArray.(typing.Literal['NPol, NChan'], nptyping.Float32)

Maximum power spectra of the data for each polarisation and channel.

max_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32)

The maximum of the weights for each channel.

mean_frequency_avg: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32)

The mean of the data for each polarisation and dimension, averaged over all channels.

mean_frequency_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32)

The mean of the data for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

mean_spectral_power: nptyping.NDArray.(typing.Literal['NPol, NChan'], nptyping.Float32)

Mean power spectra of the data for each polarisation and channel.

mean_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.Float32)

The mean of the data for each polarisation, dimension and channel.

mean_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32)

The mean of the weights for each channel.

min_weights: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float32)

The minimum of the weights for each channel.

num_clipped_samples: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.UInt32)

Number of clipped input samples (maximum level) for each polarisation and dimension, averaged over all channels.

num_clipped_samples_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.UInt32)

Number of clipped input samples (maximum level) for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

num_clipped_samples_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.UInt32)

Number of clipped input samples (maximum level) for each polarisation, dimension and channel.

rebinned_histogram_1d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NDim, NRebin'], nptyping.UInt32)

Rebinned histogram of the input data integer states for each polarisation and dimension, averaged over all channels.

rebinned_histogram_1d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim, NRebin'], nptyping.UInt32)

Rebinned histogram of the input data integer states for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

rebinned_histogram_2d_freq_avg: nptyping.NDArray.(typing.Literal['NPol, NRebin, NRebin'], nptyping.UInt32)

Rebinned 2D histogram of the input data integer states for each polarisation, averaged over all channels.

rebinned_histogram_2d_freq_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NRebin, NRebin'], nptyping.UInt32)

Rebinned 2D histogram of the input data integer states for each polarisation, averaged over all channels, except those flagged for RFI.

spectrogram: nptyping.NDArray.(typing.Literal['NPol, NFreqBin, NTimeBin'], nptyping.Float32)

Spectrogram of the data for each polarisation, averaged over a configurable number of temporal and spectral bins (default ~1000).

timeseries: nptyping.NDArray.(typing.Literal['NPol, NTimeBin, 3'], nptyping.Float32)

Time series of the data for each polarisation, rebinned in time to ntime_bins, averaged over all frequency channels.

timeseries_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NTimeBin, 3'], nptyping.Float32)

Time series of the data for each polarisation, rebinned in time to ntime_bins, averaged over all frequency channels, except those flagged for RFI.

variance_frequency_avg: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32)

The variance of the data for each polarisation and dimension, averaged over all channels.

variance_frequency_avg_rfi_excised: nptyping.NDArray.(typing.Literal['NPol, NDim'], nptyping.Float32)

The variance of the data for each polarisation and dimension, averaged over all channels, except those flagged for RFI.

variance_spectrum: nptyping.NDArray.(typing.Literal['NPol, NDim, NChan'], nptyping.Float32)

The variance of the data for each polarisation, dimension and channel.
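The shape literals in the field types give the axis order, so 'NPol, NDim, NChan' is indexed by polarisation, then dimension, then channel. The sketch below shows that ordering on a tiny synthetic stand-in for variance_spectrum built from nested lists; the numbers are invented for illustration, and with the real NumPy arrays the equivalent lookup would be variance_spectrum[pol, dim].

```python
import math

# Axis sizes for a tiny synthetic example (real files are far larger).
NPOL, NDIM, NCHAN = 2, 2, 3
POL_A, POL_B = 0, 1  # documented Polarisation values
REAL, IMAG = 0, 1    # documented Dimension values

# Synthetic stand-in for variance_spectrum, laid out as [NPol][NDim][NChan].
variance_spectrum = [
    [[4.0, 9.0, 16.0], [1.0, 4.0, 9.0]],      # POL_A: real row, imag row
    [[25.0, 36.0, 49.0], [0.25, 1.0, 2.25]],  # POL_B: real row, imag row
]

# Per-channel standard deviation of the imaginary component of polarisation B.
std_pol_b_imag = [math.sqrt(v) for v in variance_spectrum[POL_B][IMAG]]
print(std_pol_b_imag)
```

The same indexing pattern applies to the other per-channel fields, with the dimension axis dropped for 'NPol, NChan' arrays such as mean_spectral_power.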

class ska_pst.stat.hdf5.StatisticsMetadata(*, file_format_version: str = '1.1.0', eb_id: str, telescope: str, scan_id: int, beam_id: str, utc_start: str, t_min: float, t_max: float, frequency_mhz: float, bandwidth_mhz: float, start_chan: int, npol: int, ndim: int, nchan: int, nchan_ds: int, ndat_ds: int, histogram_nbin: int, nrebin: int, channel_freq_mhz: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float64), timeseries_bins: nptyping.NDArray.(typing.Literal['NTimeBin'], nptyping.Float64), frequency_bins: nptyping.NDArray.(typing.Literal['NFreqBin'], nptyping.Float64), num_samples: int, num_samples_rfi_excised: int, num_samples_spectrum: nptyping.NDArray.(typing.Literal['NChan'], nptyping.UInt32), num_invalid_packets: int, num_weight_samples: int = 0, has_weights: bool = False, polarisations: str = 'Both')[source]

Data class modeling the metadata from a HDF5 STAT data file.

bandwidth_mhz: float

The bandwidth of the data (MHz).

beam_id: str

The beam id for the generated data file.

channel_freq_mhz: nptyping.NDArray.(typing.Literal['NChan'], nptyping.Float64)

The centre frequencies of each channel (MHz).

eb_id: str

The execution block id the file relates to.

property end_chan: int

Get the last channel that the header is for.

file_format_version: str = '1.1.0'

The format of the HDF5 STAT file. Default is 1.1.0.

frequency_bins: nptyping.NDArray.(typing.Literal['NFreqBin'], nptyping.Float64)

The frequency bins used for the spectrogram attribute (MHz).

frequency_mhz: float

The centre frequency for the data as a whole (MHz).

has_weights: bool = False

Indicator of whether weights are included in the statistics or not.

histogram_nbin: int

The number of bins in the histogram data.

nchan: int

Number of channels in the data.

nchan_ds: int

The number of frequency bins in the spectrogram data.

ndat_ds: int

The number of temporal bins in the spectrogram and timeseries data.

ndim: int

Number of dimensions in the data (should be 2 for complex data).

npol: int

Number of polarisations.

nrebin: int

Number of bins to use for rebinned histograms.

num_invalid_packets: int

The number of invalid packets received while calculating the statistics.

num_samples: int

The total number of samples used to calculate the sample statistics.

num_samples_rfi_excised: int

The total number of samples used to calculate the sample statistics, except those flagged for RFI.

num_samples_spectrum: nptyping.NDArray.(typing.Literal['NChan'], nptyping.UInt32)

The number of samples, per channel, used to calculate the sample statistics.

num_weight_samples: int = 0

The number of samples used to calculate the weight statistics.

polarisations: str = 'Both'

Get a string representation of the polarisations.

Values are either A, B or Both.

property polarisations_list: List[Polarisation]

Get a list of polarisations of the STAT HDF5.

For version 1.0.0 it is assumed both Pol A and Pol B are valid. However, since version 1.1.0 and Flow Through stats the output stats could be for Pol A, Pol B or both.
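The version-dependent behaviour described above can be sketched as follows. MetadataSketch is a hypothetical two-field stand-in for StatisticsMetadata, and it returns plain strings where the real property returns Polarisation values; the branching is inferred from the description, not taken from the library source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MetadataSketch:
    """Hypothetical two-field stand-in for StatisticsMetadata."""

    file_format_version: str = "1.1.0"
    polarisations: str = "Both"

    @property
    def polarisations_list(self) -> List[str]:
        # Version 1.0.0: both polarisations are assumed to be valid.
        if self.file_format_version == "1.0.0":
            return ["A", "B"]
        # From 1.1.0 the stored string is 'A', 'B' or 'Both'.
        return ["A", "B"] if self.polarisations == "Both" else [self.polarisations]


legacy = MetadataSketch(file_format_version="1.0.0")
single = MetadataSketch(polarisations="A")
print(legacy.polarisations_list, single.polarisations_list)
```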

scan_id: int

The scan id for the generated data file.

start_chan: int

The starting channel number.

t_max: float

The time offset, in seconds, from the UTC start time to the time at the end of the file.

t_min: float

The time offset, in seconds, from the UTC start time to the time at the start of the file.

telescope: str

The telescope the data were collected for. Should be SKALow or SKAMid.

timeseries_bins: nptyping.NDArray.(typing.Literal['NTimeBin'], nptyping.Float64)

The timestamp offsets for each temporal bin.

utc_start: str

The UTC ISO-formatted start time of the scan, to the nearest second.
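Since t_min and t_max are offsets in seconds from utc_start, absolute timestamps for the span of a file can be derived with the standard library. The values below are invented, and the exact layout of the utc_start string (here without a timezone suffix) is an assumption.

```python
from datetime import datetime, timedelta, timezone

# Illustrative values only; the exact utc_start string layout is an assumption.
utc_start = "2024-01-01T00:00:00"
t_min, t_max = 10.0, 12.5  # offsets in seconds from utc_start

# Parse the start time and mark it explicitly as UTC.
start = datetime.fromisoformat(utc_start).replace(tzinfo=timezone.utc)
file_begin = start + timedelta(seconds=t_min)
file_end = start + timedelta(seconds=t_max)
duration = (file_end - file_begin).total_seconds()

print(file_begin.isoformat(), file_end.isoformat(), duration)
```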

class ska_pst.stat.hdf5.TimeseriesDimension(value)[source]

An enum used to represent which index to use for max/min/mean in timeseries data.

MAX = 0
MEAN = 2
MIN = 1
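The final axis of the timeseries arrays has length 3, and this enum names its slots. A minimal sketch with a local stand-in enum and synthetic data for one polarisation is shown below; the IntEnum base and the numbers are assumptions made for illustration.

```python
from enum import IntEnum


class TimeseriesDimension(IntEnum):
    """Local stand-in mirroring the documented values."""

    MAX = 0
    MIN = 1
    MEAN = 2


# Synthetic timeseries for one polarisation: [NTimeBin][3] = [max, min, mean].
timeseries_pol_a = [
    [9.0, 1.0, 4.0],
    [8.0, 2.0, 5.0],
    [7.0, 3.0, 6.0],
]

# Pull one statistic per temporal bin by indexing the last axis with the enum.
mean_over_time = [row[TimeseriesDimension.MEAN] for row in timeseries_pol_a]
peak = max(row[TimeseriesDimension.MAX] for row in timeseries_pol_a)
print(mean_over_time, peak)
```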
ska_pst.stat.hdf5.get_stat_file_format(version: str) → StatFileFormat[source]
ska_pst.stat.hdf5.map_hdf5_key(hdf5_key: str) → str[source]

Map a key from a HDF5 attribute/dataset to a model dataclass property.