ska_pst.testutils.stats
Submodule for STAT-related code.
- class ska_pst.testutils.stats.SampleStatistics(mean: float, variance: float, num_samples: int)[source]
Data class that models the statistics of a sample.
- Variables
mean (float) – the mean of the sample
variance (float) – the variance of the sample
num_samples (int) – the number of samples used to calculate the statistics
- mean: float
- num_samples: int
- variance: float
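A minimal sketch of how such a record might be filled in from raw samples. `SampleStatisticsSketch` and `summarise` are hypothetical stand-ins that mirror the three documented fields; they are not the library's own code, and the use of the unbiased (n - 1) variance is an assumption.

```python
import statistics
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SampleStatisticsSketch:
    """Hypothetical stand-in mirroring the documented fields."""

    mean: float
    variance: float
    num_samples: int


def summarise(samples: Sequence[float]) -> SampleStatisticsSketch:
    """Reduce a sequence of samples to mean, variance and sample count."""
    # Assumption: unbiased sample variance (divides by n - 1).
    return SampleStatisticsSketch(
        mean=statistics.fmean(samples),
        variance=statistics.variance(samples),
        num_samples=len(samples),
    )


stats = summarise([1.0, 2.0, 3.0, 4.0])
```

Keeping `num_samples` next to the mean and variance lets a later check scale its tolerance by the sample size.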
- class ska_pst.testutils.stats.ScanStatFileWatcher(*args: Any, **kwargs: Any)[source]
Class to watch for when STAT files are created.
Instances of this class watch a scan directory for real-time monitoring STAT HDF5 files to be created and store the events for later retrieval.
- event_time_diffs() → List[StatFileEventDifference][source]
Get a list of differences between file creation events.
- property events: List[StatFileCreatedEvent]
Get the list of file created events.
- on_created(event: watchdog.events.FileSystemEvent) → None[source]
Handle an on created system event.
The event comes from watchdog. This method converts it to a
StatFileCreatedEvent instance and saves it so that it can later be retrieved from events.
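The handler pattern described above can be sketched without watchdog installed. Everything here is a hypothetical stand-in: `WatcherSketch`, `CreatedEventSketch` and `FakeFileSystemEvent` are not library classes, the file paths are made up, and taking the creation time from the clock when the callback fires is an assumption. The only detail borrowed from watchdog is that its file system events expose a `src_path` attribute.

```python
import time
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class CreatedEventSketch:
    """Hypothetical stand-in for StatFileCreatedEvent."""

    file_path: Path
    create_datetime: float


@dataclass
class FakeFileSystemEvent:
    """Stand-in for a watchdog event; only src_path is used here."""

    src_path: str


class WatcherSketch:
    """Stores one record per created file so a test can inspect them later."""

    def __init__(self) -> None:
        self._events: List[CreatedEventSketch] = []

    @property
    def events(self) -> List[CreatedEventSketch]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._events)

    def on_created(self, event: FakeFileSystemEvent) -> None:
        # Assumption: the creation time is the wall-clock time of the callback.
        record = CreatedEventSketch(
            file_path=Path(event.src_path),
            create_datetime=time.time(),
        )
        self._events.append(record)


watcher = WatcherSketch()
watcher.on_created(FakeFileSystemEvent(src_path="/tmp/scan/stat_0.h5"))
watcher.on_created(FakeFileSystemEvent(src_path="/tmp/scan/stat_1.h5"))
```

In a real test the callback would be driven by a watchdog observer scheduled on the scan directory rather than called by hand.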
- class ska_pst.testutils.stats.StatFileCreatedEvent(*, file_path: Path, create_datetime: float)[source]
Data class capturing a file creation event.
- Variables
file_path (pathlib.Path) – the full path to the file that was created.
create_datetime (float) – the time, in seconds from epoch, when the file was created.
- create_datetime: float
- file_path: Path
- class ska_pst.testutils.stats.StatFileEventDifference(*, first_file_event: InitVar[StatFileCreatedEvent], second_file_event: InitVar[StatFileCreatedEvent])[source]
A data class used to calculate differences in file creation events.
- Variables
first_file_path (pathlib.Path) – the path to the file that was created first
second_file_path (pathlib.Path) – the path to the file that was created second
creation_time_difference (float) – the difference in creation time of the files
- creation_time_difference: float
- first_file_event: InitVar[StatFileCreatedEvent]
- first_file_path: pathlib.Path
- second_file_event: InitVar[StatFileCreatedEvent]
- second_file_path: pathlib.Path
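The signature above takes the two events as `InitVar` arguments while the stored fields are paths and a time difference, which suggests the fields are derived during initialisation. A minimal sketch of that dataclass pattern follows. `EventDifferenceSketch`, `CreatedEventSketch` and `pairwise_differences` are hypothetical names, and the sign convention (second minus first) is an assumption.

```python
from dataclasses import InitVar, dataclass, field
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class CreatedEventSketch:
    """Hypothetical stand-in for StatFileCreatedEvent."""

    file_path: Path
    create_datetime: float


@dataclass
class EventDifferenceSketch:
    """Derives its stored fields from two init-only event arguments."""

    first_file_event: InitVar[CreatedEventSketch]
    second_file_event: InitVar[CreatedEventSketch]
    first_file_path: Path = field(init=False)
    second_file_path: Path = field(init=False)
    creation_time_difference: float = field(init=False)

    def __post_init__(
        self,
        first_file_event: CreatedEventSketch,
        second_file_event: CreatedEventSketch,
    ) -> None:
        self.first_file_path = first_file_event.file_path
        self.second_file_path = second_file_event.file_path
        # Assumption: difference is second minus first, in seconds.
        self.creation_time_difference = (
            second_file_event.create_datetime - first_file_event.create_datetime
        )


def pairwise_differences(
    events: List[CreatedEventSketch],
) -> List[EventDifferenceSketch]:
    """Build one difference per pair of consecutive events."""
    return [
        EventDifferenceSketch(first_file_event=a, second_file_event=b)
        for a, b in zip(events, events[1:])
    ]


events = [
    CreatedEventSketch(Path("stat_0.h5"), 100.0),
    CreatedEventSketch(Path("stat_1.h5"), 110.5),
    CreatedEventSketch(Path("stat_2.h5"), 120.0),
]
diffs = pairwise_differences(events)
```

Because the events are `InitVar` arguments, they are consumed by `__post_init__` and are not kept as attributes on the instance.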
- ska_pst.testutils.stats.assert_statistics(population_mean: float, population_var: float, sample_stats: SampleStatistics, channel: int, pol: int, tolerance: float = 6.0) → None[source]
Assert that sample mean and var are within a given tolerance of population stats.
- Parameters
population_mean (float) – the mean of the population
population_var (float) – the variance of the population
sample_stats (SampleStatistics) – the sample statistics to assert against the population stats; the sample size is carried in its num_samples field rather than passed as a separate argument.
channel (int) – the channel that is being tested
pol (int) – the polarisation that is being tested
tolerance (float, optional) – the number of sigma to allow being away from population value, defaults to 6.0
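One way to read a tolerance expressed as a "number of sigma" is as a multiple of the standard error of each estimate. The sketch below is an illustration under stated assumptions, not the library's implementation: `within_sigma` is a hypothetical helper, the standard error of the mean is taken as sqrt(var / n), and the standard error of the variance uses the Gaussian result var * sqrt(2 / (n - 1)).

```python
import math


def within_sigma(
    population_mean: float,
    population_var: float,
    sample_mean: float,
    sample_var: float,
    num_samples: int,
    tolerance: float = 6.0,
) -> bool:
    """Return True if both sample estimates lie within tolerance sigma."""
    # Standard error of the sample mean.
    mean_sigma = math.sqrt(population_var / num_samples)
    # Standard error of the sample variance (assumes Gaussian samples).
    var_sigma = population_var * math.sqrt(2.0 / (num_samples - 1))

    mean_ok = abs(sample_mean - population_mean) <= tolerance * mean_sigma
    var_ok = abs(sample_var - population_var) <= tolerance * var_sigma
    return mean_ok and var_ok


# Sample mean is 1 sigma from the population mean: accepted.
close = within_sigma(0.0, 1.0, 0.01, 1.02, 10_000)
# Sample mean is 10 sigma from the population mean: rejected.
far = within_sigma(0.0, 1.0, 0.1, 1.0, 10_000)
```

Both standard errors shrink as the sample size grows, so the same number of sigma corresponds to a tighter absolute bound for larger samples.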
- ska_pst.testutils.stats.assert_statistics_for_channels(channel_data: pandas.DataFrame, population_mean: float, population_var: float, pol: str, tolerance: float = 6.0) → None[source]
Assert that sample mean and var are within a given tolerance of population stats for each channel.
- Parameters
channel_data (pd.DataFrame) – a data frame with statistics split by channel. This must include the following columns: “Mean”, “Var.”, “Num Samples”. This should also be specific for a given polarisation and complex data dimension (e.g. for Pol A real data).
population_mean (float) – the mean of the population
population_var (float) – the variance of the population
pol (str) – the polarisation to be tested, A or B
tolerance (float, optional) – the number of sigma to allow being away from population value, defaults to 6.0
- ska_pst.testutils.stats.assert_statistics_for_digitised_data(data: numpy.ndarray, nbit: int, tolerance: float = 9.0) → None[source]
Assert that sample mean and var are within a given tolerance of population stats for TFP data.
This function asserts that the given Numpy array of data has a mean and variance within a given tolerance of the population mean and variance based on the number of bits used in the digitisation of the data.
- Parameters
data (np.ndarray) – an array of either real-valued or complex-valued floating-point data.
nbit (int) – the number of bits used in the digitisation of the data.
tolerance (float, optional) – the number of sigma to allow being away from population value, defaults to 9.0