ska_pydada
Module init code.
- class ska_pydada.AsciiHeader(header_size: int = 4096, **kwargs: Any)[source]
A utility class to abstract over a DADA header.
This class extends an ordered dictionary to allow inserting and retrieving values. Values are stored as strings.
- static from_bytes(data: bytes) AsciiHeader [source]
Get an instance of an AsciiHeader from the supplied bytes.
This converts the data to a string using decode and then calls AsciiHeader.from_str().
- Parameters
data (bytes) – the data to parse.
- Returns
the bytes parsed as an AsciiHeader.
- Return type
AsciiHeader
- static from_file(file: pathlib.Path | str) AsciiHeader [source]
Load a header from a file.
The input file must be a text file, such as a config file. This will not handle a DADA file that contains binary data.
- Parameters
file (pathlib.Path | str) – the path to the file to load.
- Returns
the file parsed as an AsciiHeader.
- Return type
AsciiHeader
- Raises
AssertionError – if the file is larger than 512KB.
- static from_str(data: str) AsciiHeader [source]
Get an instance of an AsciiHeader from the supplied string.
Entries in the header are delimited by a newline character, and the key and value of each entry are in turn delimited by a whitespace character.
- Parameters
data (str) – the data to parse.
- Returns
the string parsed as an AsciiHeader.
- Return type
AsciiHeader
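The parsing rule described for from_str() (newline-delimited entries, key and value separated by whitespace) can be sketched with the standard library alone. This is only an illustration of that rule, not the library's implementation: the function name parse_header_text, the skipping of blank lines and the handling of values that contain spaces are all assumptions.

```python
from collections import OrderedDict


def parse_header_text(text: str) -> "OrderedDict[str, str]":
    """Hypothetical helper: parse newline-delimited ``KEY VALUE`` records."""
    header: "OrderedDict[str, str]" = OrderedDict()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        # Split on the first run of whitespace only, so a value may contain spaces.
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        header[key] = value  # values stay as strings, as the class description states
    return header


example = parse_header_text("NCHAN 4\nNPOL 2\nSOURCE J0437-4715\n")
print(list(example.items()))
```

Insertion order is preserved, which matches the description of the class as an ordered dictionary.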
- get_float(key: str) float [source]
Get the header value as a float.
- Parameters
key (str) – the key of the record to get.
- Returns
the value of the record as a float value.
- Return type
float
- Raises
KeyError – if key does not exist
ValueError – if value cannot be converted to a float.
- get_int(key: str) int [source]
Get the header value as an integer.
- Parameters
key (str) – the key of the record to get.
- Returns
the value of the record as an int value.
- Return type
int
- Raises
KeyError – if key does not exist
ValueError – if value cannot be converted to an integer.
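Because values are stored as strings, the typed getters amount to a lookup followed by a conversion, which is where the two documented exceptions come from. The sketch below uses a hypothetical stand-in class, StringHeader, rather than the real AsciiHeader, and assumes the conversion is a plain int() or float() call.

```python
from collections import OrderedDict


class StringHeader(OrderedDict):
    """Hypothetical stand-in: an ordered dict whose values are strings."""

    def get_int(self, key: str) -> int:
        # self[key] raises KeyError for a missing key; int() raises
        # ValueError when the stored string is not an integer.
        return int(self[key])

    def get_float(self, key: str) -> float:
        # Same pattern: KeyError from the lookup, ValueError from float().
        return float(self[key])


header = StringHeader(NCHAN="432", BW="0.5", SOURCE="J0437-4715")
print(header.get_int("NCHAN"), header.get_float("BW"))
```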
- property header_size: int
The size of header in bytes.
This is the number of bytes that the serialised header occupies when it is part of a DADA file. If the header text is shorter than this value, the remainder is filled with NULL bytes.
- Returns
the header size, in bytes.
- Return type
int
- property resolution: int
Get the calculated resolution based on values in the header.
This is the number of bytes in a stride of data.
If the RESOLUTION key exists in the header then that value is used, otherwise it is determined by NDIM, NBIT, NPOL, NCHAN and UDP_NSAMP. If not all of these values exist then a value of 1 is returned.
- Returns
the number of bytes for a stride of data.
- Return type
int
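The fallback order for the resolution can be sketched as follows. The text names the keys involved but not the formula, so the product used here (all five values multiplied together, then divided by 8 bits per byte) is an assumption, as is the function name stride_bytes.

```python
def stride_bytes(header: dict) -> int:
    """Hypothetical helper following the documented fallback order."""
    # 1. An explicit RESOLUTION value wins.
    if "RESOLUTION" in header:
        return int(header["RESOLUTION"])
    # 2. Otherwise all five keys are needed; if any is missing, return 1.
    try:
        ndim = int(header["NDIM"])
        nbit = int(header["NBIT"])
        npol = int(header["NPOL"])
        nchan = int(header["NCHAN"])
        udp_nsamp = int(header["UDP_NSAMP"])
    except KeyError:
        return 1
    # Assumption: bits per stride is the product of the five values.
    return ndim * nbit * npol * nchan * udp_nsamp // 8


explicit = stride_bytes({"RESOLUTION": "2048"})
derived = stride_bytes(
    {"NDIM": "2", "NBIT": "16", "NPOL": "2", "NCHAN": "4", "UDP_NSAMP": "32"}
)
fallback = stride_bytes({"NDIM": "2", "NBIT": "16"})
print(explicit, derived, fallback)
```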
- set_value(key: str, value: Any) None [source]
Set a value in the header.
- Parameters
key (str) – the key of the header record to set.
value (Any) – the value of the record.
- to_bytes() bytes [source]
Convert the header to bytes.
- Returns
the header converted to bytes, padded with NULL characters to be header_size bytes long.
- Return type
bytes
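NULL padding to a fixed header_size can be illustrated with bytes.ljust. This is a sketch under stated assumptions, not the library's serialiser: the "KEY VALUE" line layout mirrors what from_str() parses, and the ASCII encoding, the function name serialise_header and the error raised on overflow are hypothetical.

```python
def serialise_header(records: dict, header_size: int = 4096) -> bytes:
    """Hypothetical helper: write ``KEY VALUE`` lines and NULL-pad the result."""
    text = "".join(f"{key} {value}\n" for key, value in records.items())
    encoded = text.encode("ascii")  # assumption: header text is ASCII
    if len(encoded) > header_size:
        # Assumption: an oversized header is treated as an error.
        raise ValueError("header text does not fit in header_size bytes")
    # Pad on the right with NULL bytes up to exactly header_size bytes.
    return encoded.ljust(header_size, b"\x00")


blob = serialise_header({"NCHAN": "4", "NPOL": "2"})
print(len(blob), blob[:15])
```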
- class ska_pydada.DadaFile(header: Optional[AsciiHeader] = None, raw_data: Optional[bytes] = None, logger: Optional[Logger] = None, **kwargs: Any)[source]
Class that can be used to read a PSR DADA file.
- as_time_freq_pol() numpy.ndarray [source]
Get the data as a 3-dimensional array of time, frequency and polarisation.
This returns the raw data as a 3 dimensional Numpy array with the following dimensions:
time
frequency
polarisation
The NCHAN header value defines the number of frequency channels. The NPOL header value defines the number of polarisations.
This may return real or complex values based on the NDIM value in the header. If NDIM is 1 then real floating point data is returned; if NDIM is 2 then complex valued data is returned. In both cases NBIT is assumed to be 32. For all other values of NDIM an assertion error is raised.
The number of time samples is left as a free dimension in the shape of the data. This uses Numpy’s convention of passing -1 as the size of a dimension and letting Numpy determine it.
- Returns
the raw data converted into a TFP Numpy array.
- Return type
np.ndarray
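The time, frequency, polarisation layout with a free time dimension can be sketched without Numpy by using the standard library's memoryview.cast, which also reshapes in row-major order. This stands in for the Numpy reshape the method performs and covers only the real-valued (NDIM of 1) case; the sample values and variable names are made up for illustration.

```python
import struct

nchan, npol = 2, 2  # would come from the NCHAN and NPOL header values
values = [float(v) for v in range(12)]
raw = struct.pack(f"{len(values)}f", *values)  # native-endian float32 samples

flat = memoryview(raw).cast("f")
# The free dimension: whatever is left once NCHAN and NPOL are fixed.
ntime = len(flat) // (nchan * npol)
tfp = memoryview(raw).cast("f", shape=[ntime, nchan, npol]).tolist()

# Row-major order means element [t][f][p] sits at flat index
# (t * nchan + f) * npol + p.
print(ntime, tfp[1][0][1], tfp[2][1][0])
```

With Numpy the same shape would be requested as (-1, nchan, npol), leaving Numpy to work out the number of time samples.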
- data(shape: np._ShapeType | None = None, dtype: npt.DTypeLike = numpy.uint8) np.ndarray [source]
Get the data as a numpy array.
This will return the raw byte data as a Numpy array with a data type of dtype.
If the shape parameter is specified then the array will be reshaped using row-major order (i.e. ‘C’ order). If no shape is provided, a 1-dimensional array is returned and the client will need to perform the reshaping themselves.
- Parameters
shape (np._ShapeType | None, optional) – the required shape of the output array, defaults to None. If no shape provided then a 1D array is returned.
dtype (npt.DTypeLike, optional) – the data type to have the raw bytes converted to, defaults to np.uint8.
- Returns
the raw data converted to a Numpy array with a given type and shape.
- Return type
np.ndarray
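Reinterpreting the same raw bytes under different element types, with or without a shape, can be illustrated using memoryview.cast from the standard library. This is a dependency-free analogue of the dtype and shape parameters, not the method itself; the eight example bytes are arbitrary.

```python
raw = bytes(range(8))

# Default view: one unsigned byte per element, 1-dimensional.
as_u8 = memoryview(raw).cast("B").tolist()

# Same bytes read as native-endian signed 16-bit integers: half as many elements.
as_i16 = memoryview(raw).cast("h").tolist()

# Supplying a shape reshapes in row-major ('C') order.
as_rows = memoryview(raw).cast("B", shape=[2, 4]).tolist()

print(as_u8, len(as_i16), as_rows)
```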
- data_bytes(shape: np._ShapeType | None = None) np.ndarray [source]
Get the raw data as a Numpy byte array.
This gets the raw data bytes and converts it to a Numpy array with an optional shape.
- Parameters
shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
- Returns
the raw data as a Numpy byte array.
- Return type
np.ndarray
- data_c64(shape: np._ShapeType | None = None) np.ndarray [source]
Get the data as a 64-bit complex valued Numpy array.
Numpy’s complex64 is stored as two 32-bit floating point numbers, which is why this method is named c64: 64 bits are used to represent each number.
This parses the raw data as 64-bit complex numbers and returns it as a Numpy array with an optional shape.
- Parameters
shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
- Returns
the raw data as a 64-bit complex value Numpy array.
- Return type
np.ndarray
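The "two 32-bit floats per complex number" layout can be shown with the struct module. The sketch packs real and imaginary parts as consecutive native-endian float32 values, real part first, which is how complex64 is laid out in memory; the sample values are arbitrary and chosen to be exactly representable in 32 bits.

```python
import struct

pairs = [(1.0, -2.0), (0.5, 0.25)]  # (real, imaginary) sample values

# Each complex sample occupies 8 bytes: float32 real followed by float32 imaginary.
raw = b"".join(struct.pack("2f", real, imag) for real, imag in pairs)

floats = struct.unpack(f"{len(raw) // 4}f", raw)
samples = [complex(floats[i], floats[i + 1]) for i in range(0, len(floats), 2)]

print(len(raw), samples)
```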
- data_f32(shape: np._ShapeType | None = None) np.ndarray [source]
Get the data as a 32-bit floating point Numpy array.
This parses the raw data as 32-bit floating point numbers and returns it as a Numpy array with an optional shape.
- Parameters
shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
- Returns
the raw data as a 32-bit floating point Numpy array.
- Return type
np.ndarray
- data_i16(shape: np._ShapeType | None = None) np.ndarray [source]
Get the data as a signed 16-bit integer Numpy array.
This parses the raw data as signed 16-bit integers and returns it as a Numpy array with an optional shape.
- Parameters
shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
- Returns
the raw data as a signed 16-bit integer Numpy array.
- Return type
np.ndarray
- data_i32(shape: np._ShapeType | None = None) np.ndarray [source]
Get the data as a signed 32-bit integer Numpy array.
This parses the raw data as signed 32-bit integers and returns it as a Numpy array with an optional shape.
- Parameters
shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
- Returns
the raw data as a signed 32-bit integer Numpy array.
- Return type
np.ndarray
- data_i8(shape: np._ShapeType | None = None) np.ndarray [source]
Get the data as a signed 8-bit integer Numpy array.
This parses the raw data as signed 8-bit integers and returns it as a Numpy array with an optional shape.
- Parameters
shape (np._ShapeType | None, optional) – the desired output shape, defaults to None. If not set this will return a 1D array of the full size.
- Returns
the raw data as a signed 8-bit integer Numpy array.
- Return type
np.ndarray
- property data_size: int
Get the overall size of the data block of the DADA file.
This value is equal to the total file size minus the size of the header. If this instance was not loaded from a file (i.e. a file is being created before it is dumped to the file system) then this returns the size of the raw data that has been added to the instance.
- Returns
the size of the data block within the output file, in bytes.
- Return type
int
- dump(file: pathlib.Path | str) None [source]
Dump the data to an external file.
This method takes the path of the file to write to and overwrites the file if it already exists.
- Parameters
file (pathlib.Path | str) – the path to the file to write to.
- est_num_chunks(chunk_size: int = 4194304) int [source]
Get an estimate of the number of data chunks given the chunk_size.
This method calculates the estimated number of chunks of data in the whole file given the chunk_size parameter. This does not take into account that the load_next() method rounds this value up to the nearest multiple of RESOLUTION.
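A small worked example shows why the estimate can differ from the number of loads actually needed. Both helpers are hypothetical: the use of ceiling division for the estimate is an assumption, while rounding the chunk size up to a multiple of the resolution follows the description of load_next(). The data size and resolution are made-up numbers.

```python
def estimate_num_chunks(data_size: int, chunk_size: int = 4194304) -> int:
    """Hypothetical estimate: ceiling of data_size / chunk_size."""
    return -(-data_size // chunk_size)


def round_up_to_resolution(chunk_size: int, resolution: int) -> int:
    """Round chunk_size up to the nearest multiple of resolution."""
    return -(-chunk_size // resolution) * resolution


data_size = 10_000_000   # made-up data block size in bytes
resolution = 3_000_000   # made-up stride size in bytes

estimate = estimate_num_chunks(data_size)
effective_chunk = round_up_to_resolution(4194304, resolution)
actual_loads = -(-data_size // effective_chunk)

print(estimate, effective_chunk, actual_loads)
```

Here the estimate is 3 chunks, but because each load is rounded up to 6,000,000 bytes only 2 loads are needed.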
- get_header_float(key: str) float [source]
Get the header value as a float value.
- Parameters
key (str) – the header key to get the value of.
- Returns
the value as a float
- Return type
float
- Raises
KeyError – if key doesn’t exist
ValueError – if value cannot be converted to a float.
- get_header_int(key: str) int [source]
Get the header value as an integer value.
- Parameters
key (str) – the header key to get the value of.
- Returns
the value as an integer
- Return type
int
- Raises
KeyError – if key doesn’t exist
ValueError – if value cannot be converted to an integer.
- property header: AsciiHeader
Get the header for the DADA file.
- Returns
the header of the file.
- Return type
AsciiHeader
- property header_size: int
Get the size of the header, in bytes.
- Returns
the size of the header in bytes.
- Return type
int
- static load_from_file(file: pathlib.Path | str, chunk_size: int = 4194304, logger: Optional[Logger] = None) DadaFile [source]
Load a DADA file and create an instance of a DadaFile.
- Parameters
file (pathlib.Path | str) – a path to the file to load.
chunk_size (int, optional) – the maximum amount of data to load, defaults to DEFAULT_DATA_CHUNK_SIZE. If the file holds more than this amount then more data can be read by calling load_next() on the instance returned.
logger (logging.Logger | None, optional) – the logger to use for debugging, defaults to None.
- Returns
an instance of a DadaFile from which the header and data can be read.
- Return type
DadaFile
- load_next(*, chunk_size: int = 4194304) int [source]
Load the next chunk of data.
This will load the next chunk of data as a multiple of the RESOLUTION of the data, which comes from AsciiHeader.resolution. The amount of data to load can be set by passing a chunk_size parameter; the default value is 4MB of data.
- property raw_data: bytes
Get the currently loaded data as a byte array.
- Returns
the currently loaded data as a byte array.
- Return type
bytes
- property resolution: int
Get the calculated resolution of the file.
See AsciiHeader.resolution for details.
- Returns
the resolution of the data.
- Return type
int
- set_data(data: numpy.ndarray) None [source]
Set the data of the file using a Numpy array.
This does not persist the data. A call to dump() is required to store the data.
Note that data is serialised to bytes using native endianness.
- Parameters
data (np.ndarray) – a Numpy array of the data to store. This can be in any shape or have any data type that can be converted to numerical data as bytes.
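The relationship between native-endian serialisation, the padded header and data_size can be sketched with the standard library. The file layout assumed here (a NULL-padded header immediately followed by the data bytes) is consistent with data_size being the file size minus the header size, but it is an assumption of this sketch, as are the file name and sample values.

```python
import array
import tempfile
from pathlib import Path

header_size = 4096
# Assumption: the header is written first, NULL-padded to header_size bytes.
header_bytes = b"NCHAN 4\n".ljust(header_size, b"\x00")

# array.array serialises with the machine's native byte order,
# mirroring the native-endianness note for set_data().
samples = array.array("f", [1.0, 2.0, 3.0])
data_bytes = samples.tobytes()

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.dada"  # hypothetical file name
    path.write_bytes(header_bytes + data_bytes)  # replaces the file if it exists
    data_size = path.stat().st_size - header_size

print(len(data_bytes), data_size)
```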