ska_pydada

Module init code.

class ska_pydada.AsciiHeader(header_size: int = 4096, **kwargs: Any)

A utility class to abstract over a DADA header.

This class extends an ordered dictionary to allow inserting and retrieving values. Values are stored as strings.

static from_bytes(data: bytes) → AsciiHeader

Get an instance of an AsciiHeader from supplied bytes.

This decodes the data to a string and then calls AsciiHeader.from_str().

Parameters: data (bytes) – the data to parse.
Returns: the bytes parsed as an AsciiHeader
Return type: AsciiHeader

static from_file(file: pathlib.Path | str) → AsciiHeader

Load a header from a file.

The input file must be a text file, such as a config file. This will not handle a DADA file that has binary data.

Parameters: file (pathlib.Path | str) – the path to the file to load.
Returns: the file parsed as an AsciiHeader
Return type: AsciiHeader
Raises: AssertionError – if the file is larger than 512KB.

static from_str(data: str) → AsciiHeader

Get an instance of an AsciiHeader from supplied string.

Entries in the header file are delimited by a newline character, and the key and value of each entry are in turn delimited by a whitespace character.

Parameters: data (str) – the data to parse.
Returns: the string parsed as an AsciiHeader
Return type: AsciiHeader
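
As an illustration of the format described above, the following sketch parses newline-delimited records whose key and value are separated by whitespace. It is not the library's implementation: parse_header_text is a hypothetical helper, and skipping blank lines is an assumption.

```python
def parse_header_text(text: str) -> dict[str, str]:
    """Parse 'KEY value' lines into a dict of strings (illustrative only)."""
    records: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no record
        # Split on the first run of whitespace; the remainder is the value.
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        records[key] = value  # values stay as strings
    return records


header = parse_header_text("NCHAN 432\nNPOL 2\nSOURCE J0437-4715\n")
```

Insertion order is preserved, which matches the ordered dictionary behaviour described for AsciiHeader.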

get_float(key: str) → float

Get the header value as a float.

Parameters

key (str) – the key of the record to get.

Returns

the value of the record as a float value.

Return type

float

Raises

KeyError – if key does not exist
ValueError – if value cannot be converted to a float.

get_int(key: str) → int

Get the header value as an integer.

Parameters

key (str) – the key of the record to get.

Returns

the value of the record as an int value.

Return type

int

Raises

KeyError – if key does not exist
ValueError – if value cannot be converted to an integer.
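
Because values are stored as strings, the typed getters amount to a lookup followed by a conversion. The sketch below mimics the documented error behaviour with a plain dict. The header contents are made-up sample data and describe_failure is a hypothetical helper, not part of the library.

```python
header = {"NCHAN": "432", "FREQ": "1400.5", "SOURCE": "J0437-4715"}

nchan = int(header["NCHAN"])   # analogous to get_int("NCHAN")
freq = float(header["FREQ"])   # analogous to get_float("FREQ")


def describe_failure(key: str) -> str:
    """Name the exception raised when reading key as an integer."""
    try:
        int(header[key])
    except KeyError:
        return "KeyError"    # the key is missing
    except ValueError:
        return "ValueError"  # the value exists but is not an integer
    return "ok"
```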

get_value(key: str) → str

Get the value of a header record given a key.

Parameters: key (str) – the key of the record to get.
Returns: the value of the record as a string value.
Return type: str
Raises: KeyError – if key does not exist

property header_size: int

The size of header in bytes.

This is the number of bytes that a serialised header occupies when it is part of a DADA file. If the serialised output is shorter than this value, the header is NULL filled.

Returns: the header size, in bytes.
Return type: int

property resolution: int

Get the calculated resolution based on values in the header.

This is the number of bytes in a stride of data.

If the RESOLUTION key exists in the header then that value is used, else this is determined by NDIM, NBIT, NPOL, NCHAN and UDP_NSAMP. If not all the values exist then a value of 1 is returned.

Returns: the number of bytes for a stride of data.
Return type: int
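
The fallback order described above can be sketched as follows. The documentation does not give the formula, so taking the product of the five values and dividing by 8 bits per byte is an assumption, as is the helper name stride_bytes.

```python
def stride_bytes(header: dict[str, str]) -> int:
    """Sketch of the documented fallback order (the formula is an assumption)."""
    if "RESOLUTION" in header:
        return int(header["RESOLUTION"])  # an explicit value takes priority
    try:
        bits = 1
        for key in ("NDIM", "NBIT", "NPOL", "NCHAN", "UDP_NSAMP"):
            bits *= int(header[key])
    except KeyError:
        return 1  # documented default when any value is missing
    return bits // 8  # assumed: total bits per stride converted to bytes
```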

set_value(key: str, value: Any) → None

Set a value in the header.

Parameters

key (str) – the key of the header record to set.
value (Any) – the value of the record.

to_bytes() → bytes

Convert the header to bytes.

Returns: the header converted to bytes, padded with NULL characters to be header_size bytes long.
Return type: bytes
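
A minimal sketch of the NULL padding, assuming records are serialised as "KEY value" lines encoded as ASCII. The helper serialise_header, the line layout and the overflow check are illustrative assumptions. Only the padding to header_size is taken from the description above.

```python
def serialise_header(records: dict[str, str], header_size: int = 4096) -> bytes:
    """Serialise records and right-pad with NULL bytes (illustrative only)."""
    body = "".join(f"{key} {value}\n" for key, value in records.items())
    encoded = body.encode("ascii")
    if len(encoded) > header_size:
        # Assumption: an oversized header is treated as an error here.
        raise ValueError("header does not fit in header_size bytes")
    # Pad on the right with NULL bytes up to exactly header_size.
    return encoded.ljust(header_size, b"\x00")


blob = serialise_header({"HDR_SIZE": "4096", "NCHAN": "432"})
```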

class ska_pydada.DadaFile(header: Optional[AsciiHeader] = None, raw_data: Optional[bytes] = None, logger: Optional[Logger] = None, **kwargs: Any)

Class that can be used to read a PSR DADA file.

as_time_freq_pol() → numpy.ndarray

Get the data as a 3-dimensional array of time, frequency and polarisation.

This returns the raw data as a 3 dimensional Numpy array with the following dimensions:

time

frequency

polarisation

The NCHAN header value defines the number of frequency channels. The NPOL parameter defines the number of polarisations.

This may return real or complex values based on the NDIM value in the header. If NDIM is 1 then real floating point data is returned; if NDIM is 2 then complex-valued data is returned. In both cases NBIT is assumed to be 32. For all other values of NDIM an assertion error is raised.

The number of time samples is defined as a free dimension of the shape of the data. This uses Numpy’s standard of passing -1 as the size of the dimension and lets Numpy determine the shape.

Returns: the raw data converted into a TFP Numpy array.
Return type: np.ndarray
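
The shape arithmetic can be illustrated without Numpy. The sketch below assumes NDIM of 1 and 32-bit floats in native byte order. It derives the free time dimension from the byte count and indexes a flat buffer in time, frequency, polarisation order. The helper tfp_index and the sample values are hypothetical.

```python
import struct

nchan, npol = 3, 2                      # stand-ins for NCHAN and NPOL
values = [float(i) for i in range(24)]  # made-up sample data
raw = struct.pack(f"={len(values)}f", *values)

# The free dimension: what -1 in a Numpy shape would resolve to.
ntime = len(raw) // (4 * nchan * npol)


def tfp_index(t: int, f: int, p: int) -> int:
    """Flat offset of element (t, f, p) in row-major (C) order."""
    return (t * nchan + f) * npol + p


flat = struct.unpack(f"={len(values)}f", raw)
sample = flat[tfp_index(2, 1, 0)]
```

Polarisation varies fastest and time slowest, which is what a row-major reshape to (time, frequency, polarisation) implies.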

data(shape: np._ShapeType | None = None, dtype: npt.DTypeLike = numpy.uint8) → np.ndarray

Get the data as a numpy array.

This will return the raw byte data as a Numpy array with a data type of dtype.

If the shape parameter is specified then the array will be reshaped using row major (i.e. ‘C’ format). If no shape is provided a 1-dimensional array is returned and the client will need to perform the reshaping themselves.

Parameters

shape (np._ShapeType | None, optional) – the required shape of the output array, defaults to None. If no shape provided then a 1D array is returned.
dtype (npt.DTypeLike, optional) – the data type to have the raw bytes converted to, defaults to np.uint8.

Returns

the raw data converted to a Numpy array with a given type and shape.

Return type

np.ndarray
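
Row-major reinterpretation of raw bytes can be demonstrated with the standard library alone. This sketch uses memoryview.cast as a stand-in for the Numpy conversion described above. It is not how the library does it, and the buffer contents are made up.

```python
import struct

raw = struct.pack("=6h", 0, 1, 2, 3, 4, 5)  # six native 16-bit integers

# No shape given: a 1D view over every element, as with shape=None.
flat = memoryview(raw).cast("h")

# Shape given: the same bytes viewed as 2 rows of 3, filled row by row.
grid = memoryview(raw).cast("h", shape=[2, 3])
```

In both cases the bytes are untouched and only the interpretation changes, which is why the element type must match how the data was written.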

data_bytes(shape: np._ShapeType | None = None) → np.ndarray

Get the raw data as a Numpy byte array.

This gets the raw data bytes and converts it to a Numpy array with an optional shape.

Parameters: shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
Returns: the raw data as a Numpy byte array.
Return type: np.ndarray

data_c64(shape: np._ShapeType | None = None) → np.ndarray

Get the data as a 64-bit complex valued Numpy array.

Numpy’s complex64 is stored as two 32-bit floating point numbers. The method is named c64 because 64 bits in total are used to represent each number.

This parses the raw data as 64-bit complex numbers and returns it as a Numpy array with an optional shape.

Parameters: shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
Returns: the raw data as a 64-bit complex value Numpy array.
Return type: np.ndarray
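
The layout of a 64-bit complex value can be shown with the struct module. This sketch assumes native byte order and the real part stored before the imaginary part, which is how Numpy's complex64 is laid out. The sample values are made up.

```python
import struct

# Two complex samples stored as (real, imaginary) pairs of 32-bit floats.
raw = struct.pack("=4f", 1.0, -2.0, 0.5, 0.25)

floats = struct.unpack(f"={len(raw) // 4}f", raw)
# Pair up even-indexed (real) and odd-indexed (imaginary) floats.
samples = [complex(re, im) for re, im in zip(floats[0::2], floats[1::2])]

bytes_per_sample = len(raw) // len(samples)
```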

data_f32(shape: np._ShapeType | None = None) → np.ndarray

Get the data as a 32-bit floating point Numpy array.

This parses the raw data as 32-bit floating point numbers and returns it as a Numpy array with an optional shape.

Parameters: shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
Returns: the raw data as a 32-bit floating point Numpy array.
Return type: np.ndarray

data_i16(shape: np._ShapeType | None = None) → np.ndarray

Get the data as a signed 16-bit integer Numpy array.

This parses the raw data as signed 16-bit integers and returns it as a Numpy array with an optional shape.

Parameters: shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
Returns: the raw data as a signed 16-bit integer Numpy array.
Return type: np.ndarray

data_i32(shape: np._ShapeType | None = None) → np.ndarray

Get the data as a signed 32-bit integer Numpy array.

This parses the raw data as signed 32-bit integers and returns it as a Numpy array with an optional shape.

Parameters: shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
Returns: the raw data as a signed 32-bit integer Numpy array.
Return type: np.ndarray

data_i8(shape: np._ShapeType | None = None) → np.ndarray

Get the data as a signed 8-bit integer Numpy array.

This parses the raw data as signed 8-bit integers and returns it as a Numpy array with an optional shape.

Parameters: shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
Returns: the raw data as a signed 8-bit integer Numpy array.
Return type: np.ndarray

property data_size: int

Get the overall size of the data block of the DADA file.

This value is equal to the total file size minus the size of the header. If this instance was not loaded from a file (i.e. a file is being created before it is dumped to the file system) then this value is the size of the raw data that has been added to the instance.

Returns: the size of the data block within the output file, in bytes.
Return type: int

dump(file: pathlib.Path | str) → None

Dump the data to an external file.

This method takes the path of the file to write to. If the file already exists it is overwritten.

Parameters: file (pathlib.Path | str) – the path to the file to write to.

est_num_chunks(chunk_size: int = 4194304) → int

Get an estimate of number of data chunks given the chunk_size.

This method calculates the estimated number of chunks of data in the whole file for the given chunk_size. It does not take into account that the load_next() method rounds this value up to the nearest RESOLUTION.

Parameters: chunk_size (int, optional) – the size of a chunk in bytes, defaults to DEFAULT_DATA_CHUNK_SIZE
Returns: the estimated number of chunks of data.
Return type: int
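
One plausible way to compute such an estimate is a ceiling division of the data size by the chunk size. Whether the library rounds up or down is not stated here, so the rounding and the helper name estimate_chunks are assumptions.

```python
DEFAULT_DATA_CHUNK_SIZE = 4 * 1024 * 1024  # 4194304, the documented default


def estimate_chunks(data_size: int, chunk_size: int = DEFAULT_DATA_CHUNK_SIZE) -> int:
    """Estimate how many chunks cover data_size bytes (rounding is assumed)."""
    # Integer ceiling division avoids floating point error on large files.
    return -(-data_size // chunk_size)
```

Because each read may be rounded up to a multiple of RESOLUTION, the real number of reads can be lower than this estimate.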

get_header_float(key: str) → float

Get the header value as a float value.

Parameters

key (str) – the header key to get the value of.

Returns

the value as a float

Return type

float

Raises

KeyError – if key doesn’t exist
ValueError – if value cannot be converted to a float.

get_header_int(key: str) → int

Get the header value as an integer value.

Parameters

key (str) – the header key to get the value of.

Returns

the value as an integer

Return type

int

Raises

KeyError – if key doesn’t exist
ValueError – if value cannot be converted to an integer.

get_header_value(key: str) → str

Get a header value as a string value.

Parameters: key (str) – the header key to get the value of.
Returns: the value as a string
Return type: str
Raises: KeyError – if key doesn’t exist

property header: AsciiHeader

Get the header for the DADA file.

Returns: the header of the file.
Return type: AsciiHeader

property header_size: int

Get the size of the header, in bytes.

Returns: the size of the header in bytes.
Return type: int

static load_from_file(file: pathlib.Path | str, chunk_size: int = 4194304, logger: Optional[Logger] = None) → DadaFile

Load a DADA file and create an instance of a DadaFile.

Parameters

file (pathlib.Path | str) – a path to the file to load.
chunk_size (int, optional) – the maximum amount of data to load, defaults to DEFAULT_DATA_CHUNK_SIZE. If the file is more than the maximum amount then more data can be read by calling load_next() on the instance returned.
logger (logging.Logger | None, optional) – the logger to use for debugging, defaults to None

Returns

an instance of a DadaFile in which the header and data can be read.

Return type

DadaFile

load_next(*, chunk_size: int = 4194304) → int

Load the next chunk of data.

This will load the next chunk of data as a multiple of the RESOLUTION of the data, which comes from AsciiHeader.resolution. The amount of data to load can be set with the chunk_size parameter; the default value is 4MB of data.

Parameters: chunk_size (int, optional) – the amount of data to load, defaults to DEFAULT_DATA_CHUNK_SIZE. This method rounds the value up to the nearest multiple of RESOLUTION, or reads to the end of the file if less data than that remains.
Returns: the amount of data loaded.
Return type: int
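
The rounding rule described above can be sketched as a small function. The name next_read_size and the remaining argument (the bytes left in the file) are hypothetical. The library's internal bookkeeping may differ.

```python
def next_read_size(chunk_size: int, resolution: int, remaining: int) -> int:
    """Round chunk_size up to a multiple of resolution, capped at remaining."""
    strides = -(-chunk_size // resolution)  # ceiling division
    return min(strides * resolution, remaining)
```

For example, with a resolution of 110592 bytes the default 4194304-byte chunk rounds up to 38 strides, or 4202496 bytes, unless fewer bytes remain in the file.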

property raw_data: bytes

Get the currently loaded data as a byte array.

Returns: the currently loaded data as a byte array.
Return type: bytes

property resolution: int

Get the calculated resolution of the file.

See AsciiHeader.resolution for details.

Returns: the resolution of the data.
Return type: int

set_data(data: numpy.ndarray) → None

Set the data of the file using a Numpy array.

This does not persist the data. A call to dump() is required to store the data.

Note that data is serialised to bytes using native endianness.

Parameters: data (np.ndarray) – a Numpy array of the data to store. This can be in any shape or have any data type that can be converted to numerical data as bytes.
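
Native endianness means the byte order of the machine that writes the file. The sketch below uses the struct module rather than Numpy to show how the same value serialises under each byte order. It illustrates the concept only and makes no claim about how set_data is implemented.

```python
import struct
import sys

value = 1
native = struct.pack("=H", value)  # "=" selects the machine's byte order
little = struct.pack("<H", value)  # least significant byte first
big = struct.pack(">H", value)     # most significant byte first

# The native encoding matches whichever order this machine uses.
expected = little if sys.byteorder == "little" else big
```

A file written on one machine and read on a machine with the opposite byte order would need its multi-byte values byte-swapped.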

set_header_value(key: str, value: Any) → None

Set a header value to a given value.

See AsciiHeader.set_value() for more details.

Parameters

key (str) – the key of the header to set.
value (Any) – the value to set the header record to.