ska_pst.testutils.stats

Submodule for STAT-related code.

class ska_pst.testutils.stats.SampleStatistics(mean: float, variance: float, num_samples: int)[source]

Data class that models the statistics of a sample.

Variables
  • mean (float) – the mean of the sample

  • variance (float) – the variance of the sample

  • num_samples (int) – the number of samples used to calculate the statistics

mean: float
num_samples: int
variance: float
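
As a minimal sketch of how such a record might be populated, the snippet below computes the three fields from a list of values using only the standard library. SampleStatisticsSketch and compute_sample_statistics are hypothetical stand-ins defined here for illustration; they are not part of ska_pst.testutils. This documentation does not say whether the real class expects the population variance or the sample (n-1) variance, so the population form used here is an assumption.

```python
from dataclasses import dataclass
from statistics import fmean, pvariance
from typing import Sequence


@dataclass
class SampleStatisticsSketch:
    """Hypothetical stand-in mirroring the documented fields of SampleStatistics."""

    mean: float
    variance: float
    num_samples: int


def compute_sample_statistics(values: Sequence[float]) -> SampleStatisticsSketch:
    """Summarise a sequence of values (assumption: population variance is wanted)."""
    mean = fmean(values)
    return SampleStatisticsSketch(
        mean=mean,
        variance=pvariance(values, mu=mean),  # divides by n, not n - 1
        num_samples=len(values),
    )


stats = compute_sample_statistics([1.0, 2.0, 3.0, 4.0])
print(stats)
```
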
class ska_pst.testutils.stats.ScanStatFileWatcher(*args: Any, **kwargs: Any)[source]

Class to watch for the creation of STAT files.

Instances of this class watch a scan directory for the creation of real-time monitoring STAT HDF5 files and store the events for later retrieval.

event_time_diffs() → List[StatFileEventDifference][source]

Get a list of differences between file creation events.

property events: List[StatFileCreatedEvent]

Get the list of file created events.

on_created(event: watchdog.events.FileSystemEvent) → None[source]

Handle an on created system event.

The event comes from watchdog. This method converts it to a StatFileCreatedEvent instance and stores it so that it can later be retrieved from events.

stop() → None[source]

Stop watching for STAT files.

watch() → None[source]

Start watching for STAT files.

class ska_pst.testutils.stats.StatFileCreatedEvent(*, file_path: Path, create_datetime: float)[source]

Data class capturing a file creation event.

Variables
  • file_path (pathlib.Path) – the full path to the file that was created.

  • create_datetime (float) – the time, in seconds since the epoch, when the file was created.

create_datetime: float
file_path: Path
class ska_pst.testutils.stats.StatFileEventDifference(*, first_file_event: InitVar[StatFileCreatedEvent], second_file_event: InitVar[StatFileCreatedEvent])[source]

A data class used to calculate differences in file creation events.

Variables
  • first_file_path (pathlib.Path) – the path to the file that was created first

  • second_file_path (pathlib.Path) – the path to the file that was created second

  • creation_time_difference (float) – the difference in creation time of the files

creation_time_difference: float
first_file_event: InitVar[StatFileCreatedEvent]
first_file_path: pathlib.Path
second_file_event: InitVar[StatFileCreatedEvent]
second_file_path: pathlib.Path
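
The InitVar fields in the signature suggest that the two events are consumed at construction time and only the derived values are stored. The sketch below illustrates that dataclass pattern with hypothetical stand-in classes (FileCreatedSketch, EventDifferenceSketch) and a hypothetical helper (consecutive_differences); it is not the library's implementation. Taking the difference as second minus first and pairing consecutive events are both assumptions.

```python
from dataclasses import InitVar, dataclass, field
from pathlib import Path
from typing import List


@dataclass
class FileCreatedSketch:
    """Hypothetical stand-in for StatFileCreatedEvent."""

    file_path: Path
    create_datetime: float  # seconds since the epoch


@dataclass
class EventDifferenceSketch:
    """Hypothetical stand-in for StatFileEventDifference."""

    # InitVar fields are passed to __post_init__ but not stored on the instance.
    first_file_event: InitVar[FileCreatedSketch]
    second_file_event: InitVar[FileCreatedSketch]
    # Derived fields are excluded from __init__ and filled in below.
    first_file_path: Path = field(init=False)
    second_file_path: Path = field(init=False)
    creation_time_difference: float = field(init=False)

    def __post_init__(
        self,
        first_file_event: FileCreatedSketch,
        second_file_event: FileCreatedSketch,
    ) -> None:
        self.first_file_path = first_file_event.file_path
        self.second_file_path = second_file_event.file_path
        # Assumption: difference is second minus first.
        self.creation_time_difference = (
            second_file_event.create_datetime - first_file_event.create_datetime
        )


def consecutive_differences(events: List[FileCreatedSketch]) -> List[EventDifferenceSketch]:
    """Pair each event with the next one (assumed pairing strategy)."""
    return [
        EventDifferenceSketch(first_file_event=a, second_file_event=b)
        for a, b in zip(events, events[1:])
    ]


events = [
    FileCreatedSketch(file_path=Path("scan/a.h5"), create_datetime=10.0),
    FileCreatedSketch(file_path=Path("scan/b.h5"), create_datetime=12.5),
    FileCreatedSketch(file_path=Path("scan/c.h5"), create_datetime=17.5),
]
diffs = consecutive_differences(events)
```
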
ska_pst.testutils.stats.assert_statistics(population_mean: float, population_var: float, sample_stats: SampleStatistics, channel: int, pol: int, tolerance: float = 6.0) → None[source]

Assert that the sample mean and variance are within a given tolerance of the population statistics.

Parameters
  • population_mean (float) – the mean of the population

  • population_var (float) – the variance of the population

  • sample_stats (SampleStatistics) – the sample statistics (mean, variance and number of samples) to assert against the population statistics.

  • channel (int) – the channel that is being tested

  • pol (int) – the polarisation that is being tested

  • tolerance (float, optional) – the maximum allowed deviation from the population value, in units of sigma; defaults to 6.0
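
To make the "number of sigma" tolerance concrete, the sketch below shows one way such a check could be written. It assumes Gaussian-distributed samples, so the standard error of the mean is sqrt(var / n) and the standard error of the variance is var * sqrt(2 / (n - 1)). The function name check_within_sigma is hypothetical, and the library's actual formulas may differ.

```python
import math


def check_within_sigma(
    population_mean: float,
    population_var: float,
    sample_mean: float,
    sample_var: float,
    num_samples: int,
    tolerance: float = 6.0,
) -> None:
    """Assert sample statistics lie within `tolerance` standard errors (Gaussian assumption)."""
    # Standard error of the sample mean.
    mean_sigma = math.sqrt(population_var / num_samples)
    # Standard error of the sample variance for Gaussian data.
    var_sigma = population_var * math.sqrt(2.0 / (num_samples - 1))

    assert abs(sample_mean - population_mean) <= tolerance * mean_sigma, (
        f"mean {sample_mean} is more than {tolerance} sigma from {population_mean}"
    )
    assert abs(sample_var - population_var) <= tolerance * var_sigma, (
        f"variance {sample_var} is more than {tolerance} sigma from {population_var}"
    )


# With 10,000 unit-variance samples, a mean of 0.01 is one sigma from zero.
check_within_sigma(0.0, 1.0, sample_mean=0.01, sample_var=1.02, num_samples=10_000)
```

A wide default such as 6 sigma keeps such a test from failing on ordinary statistical fluctuations while still catching gross errors in the data.
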

ska_pst.testutils.stats.assert_statistics_for_channels(channel_data: pandas.DataFrame, population_mean: float, population_var: float, pol: str, tolerance: float = 6.0) → None[source]

Assert that the sample mean and variance are within a given tolerance of the population statistics for each channel.

Parameters
  • channel_data (pd.DataFrame) – a data frame with statistics split by channel. This must include the following columns: “Mean”, “Var.”, “Num Samples”. This should also be specific for a given polarisation and complex data dimension (e.g. for Pol A real data).

  • population_mean (float) – the mean of the population

  • population_var (float) – the variance of the population

  • pol (str) – the polarisation to be tested, A or B

  • tolerance (float, optional) – the maximum allowed deviation from the population value, in units of sigma; defaults to 6.0

ska_pst.testutils.stats.assert_statistics_for_digitised_data(data: numpy.ndarray, nbit: int, tolerance: float = 9.0) → None[source]

Assert that the sample mean and variance are within a given tolerance of the population statistics for TFP data.

This function asserts that the given NumPy array has a mean and variance within a given tolerance of the population mean and variance, which are derived from the number of bits used to digitise the data.

Parameters
  • data (np.ndarray) – an array of either real- or complex-valued floating-point data.

  • nbit (int) – the number of bits used in the digitisation of the data.

  • tolerance (float, optional) – the maximum allowed deviation from the population value, in units of sigma; defaults to 9.0