SDP Data Product Metadata
SDP Data Product Metadata is a Python package to record SKA-specific metadata alongside the data products. It creates metadata files containing:
- Execution block ID
- The context of the execution block provided by the Observation Execution Tool (OET)
- The configuration of the processing block used to generate the data products
- A list of data product files with description, status, size and a cyclic redundancy check (CRC) for error detection
- Associated IVOA ObsCore attributes used for querying astronomical observations
Contents of Metadata file
The metadata file contains the following fields, each with a brief description:
interface
: Giving the schema. Currently there is no schema for this (just a placeholder). It will be added to the Telescope Model.
execution_block
: Identifies the observation
context
: This contains free-form data provided by OET. The data is meant to be passed verbatim through from OET/TMC as part of AssignResources (SDP) or Configure (other sub-systems). Currently this is just a placeholder.
config
: This is the configuration used for executing the processing script
files
: This contains a list of data products (typically MS or HDF files) which have been generated.
    crc
    : A cyclic redundancy check (CRC) generated from the contents of the file and used to detect errors in the data
    description
    : Useful details about the file
    path
    : Path where the data product file is located
    size
    : Size of the file (in KB)
    status
    : Indicate current file status
        working
        : Processes still running, files might be missing or incomplete
        done
        : Processes finished, files should be complete
        failure
        : Not finished successfully, files might be incomplete or corrupt
obscore
: This contains attributes as specified by the IVOA recommendation for Observation Data Models. It defines core components that are necessary to perform data discovery when querying for astronomical observations. Details of the attributes can be found at IVOA ObsCore.
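The layout described above can be pictured as a nested mapping. The following is a minimal sketch, not output from the package: the key names come from the field list and from the schema shown under `validator` in the API section, while every value, the `set_file_status` helper and the `ALLOWED_STATUSES` name are made up for illustration.

```python
# Status values listed under "status" in the field descriptions.
ALLOWED_STATUSES = ("working", "done", "failure")

# Illustrative values only: the key names follow the field list,
# everything on the right-hand side is invented for this sketch.
metadata = {
    "interface": "http://example.org/placeholder-schema",
    "execution_block": "eb-example-00000000-00000",
    "context": {"observer": "example-observer", "intent": "example", "notes": ""},
    "config": {
        "processing_block": "pb-example-00000000-00000",
        "processing_script": "example-script",
        "version": "0.0.0",
        "image": None,
        "commit": None,
        "cmdline": None,
    },
    "files": [
        {
            "path": "example/vis.ms",
            "description": "Example visibilities",
            "size": 0,
            "crc": None,
            "status": "working",
        },
    ],
    "obscore": {"dataproduct_type": "MS", "calib_level": 0},
}


def set_file_status(metadata, path, status):
    """Hypothetical helper: update the status of the file entry at `path`."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    for entry in metadata["files"]:
        if entry["path"] == path:
            entry["status"] = status
            return entry
    raise KeyError(path)


# A file starts as "working" and is marked "done" once processing finishes.
set_file_status(metadata, "example/vis.ms", "done")
```

The real package manages these entries through its own classes; the helper here only illustrates the working, done and failure states of a file entry.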
More details can be found in ADR-55
Note: To change the metadata filename, set the METADATA_FILENAME environment variable.
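The usual pattern behind an override like this is an environment lookup with a fallback. The sketch below only illustrates that pattern: `resolve_metadata_filename` and `HYPOTHETICAL_DEFAULT` are invented names, and how and when the package itself reads `METADATA_FILENAME` is an assumption, not something this page documents.

```python
import os

# Hypothetical fallback name, used only for this sketch.
HYPOTHETICAL_DEFAULT = "metadata.yaml"


def resolve_metadata_filename(environ=None):
    """Return the METADATA_FILENAME override if set, else the fallback."""
    if environ is None:
        environ = os.environ
    return environ.get("METADATA_FILENAME", HYPOTHETICAL_DEFAULT)


# Passing a plain dict instead of os.environ keeps the example side-effect free.
custom = resolve_metadata_filename({"METADATA_FILENAME": "my-product.yaml"})
fallback = resolve_metadata_filename({})
```

In a real deployment the variable would be set in the environment of the processing script before the package is used.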
API
MetaData
- class ska_sdp_dataproduct_metadata.metadata.MetaData(path=None)[source]
Class for generating the metadata file
- Parameters:
path – location of the metadata file to read
- exception ValidationError(message, errors)[source]
An exception indicating an error during validation of metadata against the schema.
- load_processing_block(pb_id=None, mount_path=None)[source]
Configure a MetaData object based on the data in a processing block
- Parameters:
pb_id – processing block ID
- metadata_schema = <_io.TextIOWrapper name='…/src/ska_sdp_dataproduct_metadata/schema/metadata.json' mode='r' encoding='utf-8'>
- new_file(dp_path=None, description=None, crc=None)[source]
Adds a new file entry to the metadata and records its current status.
- Parameters:
dp_path – path of the data product (not to be confused with the path of the metadata file)
description – Description of the file
crc – CRC (Cyclic Redundancy Check) checksum for the file. NB: CRC is supplied, not calculated
- Returns:
instance of the File class
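Because the CRC is supplied by the caller rather than calculated, the caller has to compute it before calling new_file. The following is a minimal sketch of one way to do that. It assumes a CRC-32 (via Python's standard `zlib`) formatted as an 8-digit hex string; this page does not say which CRC variant or string format the package expects, and `crc32_of_file` is a hypothetical helper name.

```python
import zlib


def crc32_of_file(path, chunk_size=1 << 20):
    """Compute the CRC-32 of a file, reading it in chunks to bound memory use.

    Assumption: CRC-32 as an 8-digit lowercase hex string is acceptable;
    the schema only requires the crc field to be a string or null.
    """
    crc = 0
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08x}"
```

The resulting string could then be passed as the crc argument of new_file. For a data product that is a directory (such as a Measurement Set), a per-file or archive-level checksum scheme would need to be agreed separately.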
- property output_path
Output metadata path
- read(file)[source]
Read the input metadata file and load it as YAML.
- Parameters:
file – input metadata file
- Returns:
The contents of the metadata file, loaded from YAML
- runtime_abspath(path)[source]
The absolute path of path relative to the standard prefix. This value is valid at runtime; i.e., it maps to the filesystem in use.
- Parameters:
path – A path relative to the standard prefix.
- set_config(script)[source]
Set configuration of generating software.
- Parameters:
script – Processing script details
- set_execution_block_id(execution_block_id)[source]
Set the execution_block_id for this MetaData object. NB: If this MetaData object describes a data product that was not generated from an execution block, any SKA Unique Identifier (https://gitlab.com/ska-telescope/ska-ser-skuid) can be used instead.
- Parameters:
execution_block_id – an execution_block_id
- validate() → list [source]
Validate the current contents of the metadata against the schema.
- Returns:
A list of errors.
- validator = {'additionalProperties': True, 'properties': {'config': {'additionalProperties': True, 'properties': {'cmdline': {'type': ['string', 'null']}, 'commit': {'type': ['string', 'null']}, 'image': {'type': ['string', 'null']}, 'processing_block': {'type': ['string', 'null']}, 'processing_script': {'type': ['string', 'null']}, 'version': {'type': ['string', 'null']}}, 'required': [], 'type': 'object'}, 'context': {'additionalProperties': True, 'properties': {'intent': {'type': 'string'}, 'notes': {'type': 'string'}, 'observer': {'type': 'string'}}, 'required': [], 'type': 'object'}, 'execution_block': {'type': 'string'}, 'files': {'items': {'additionalProperties': True, 'properties': {'crc': {'type': ['string', 'null']}, 'description': {'type': 'string'}, 'path': {'type': 'string'}, 'size': {'type': 'integer'}, 'status': {'enum': ['done', 'failure', 'working'], 'type': 'string'}}, 'required': [], 'type': 'object'}, 'type': 'array'}, 'interface': {'format': 'uri', 'type': 'string'}, 'obscore': {'additionalProperties': False, 'properties': {'access_estsize': {'type': 'integer'}, 'access_format': {'type': 'string'}, 'access_url': {'format': 'uri', 'qt-uri-protocols': ['https'], 'type': 'string'}, 'bib_reference': {'type': 'string'}, 'calib_level': {'enum': [0, 1, 2, 3, 4], 'type': 'integer'}, 'data_rights': {'type': 'string'}, 'dataproduct_subtype': {'type': 'string'}, 'dataproduct_type': {'type': 'string'}, 'em_calib_status': {'type': 'string'}, 'em_max': {'type': 'number'}, 'em_min': {'type': 'number'}, 'em_res_power': {'type': 'number'}, 'em_res_power_max': {'type': 'number'}, 'em_res_power_min': {'type': 'number'}, 'em_resolution': {'type': 'number'}, 'em_stat_error': {'type': 'number'}, 'em_ucd': {'type': 'string'}, 'em_unit': {'type': 'string'}, 'em_xel': {'type': 'integer'}, 'facility_name': {'type': 'string'}, 'instrument_name': {'type': 'string'}, 'o_calib_status': {'type': 'string'}, 'o_stat_error': {'type': 'number'}, 'o_ucd': {'type': 'string'}, 
'o_unit': {'type': 'string'}, 'obs_collection': {'type': 'string'}, 'obs_creation_date': {'type': 'string'}, 'obs_creator_did': {'type': 'string'}, 'obs_creator_name': {'type': 'string'}, 'obs_id': {'type': 'string'}, 'obs_publisher_did': {'type': 'string'}, 'obs_release_date': {'type': 'string'}, 'obs_title': {'type': 'string'}, 'pol_states': {'type': 'string'}, 'pol_xel': {'type': 'integer'}, 'proposal_id': {'type': 'string'}, 'publisher_id': {'type': 'string'}, 's_calib_status': {'type': 'string'}, 's_dec': {'type': 'number'}, 's_fov': {'type': 'number'}, 's_pixel_scale': {'type': 'number'}, 's_ra': {'type': 'number'}, 's_region': {'type': 'string'}, 's_resolution': {'type': 'number'}, 's_resolution_max': {'type': 'number'}, 's_resolution_min': {'type': 'number'}, 's_stat_error': {'type': 'number'}, 's_ucd': {'type': 'string'}, 's_unit': {'type': 'string'}, 's_xel1': {'type': 'integer'}, 's_xel2': {'type': 'integer'}, 't_calib_status': {'type': 'string'}, 't_exptime': {'type': 'number'}, 't_max': {'type': 'number'}, 't_min': {'type': 'number'}, 't_refpos': {'type': 'string'}, 't_resolution': {'type': 'number'}, 't_stat_error': {'type': 'number'}, 't_xel': {'type': 'integer'}, 'target_class': {'type': 'string'}, 'target_name': {'type': 'string'}}, 'required': [], 'type': 'object'}}, 'required': ['execution_block'], 'type': 'object'}
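To make the schema above easier to read, here is a hand-written sketch of two of its rules: `execution_block` is the only required property and must be a string, and each file's `status`, when present, must be one of `done`, `failure` or `working`. This is not the package's validator (which works from the JSON schema); `check_metadata` and `ALLOWED_STATUS` are hypothetical names, and the function covers only these two rules.

```python
ALLOWED_STATUS = {"done", "failure", "working"}


def check_metadata(metadata):
    """Return a list of error strings; an empty list means both checks passed."""
    errors = []

    # Rule 1: 'execution_block' is required and must be a string.
    if "execution_block" not in metadata:
        errors.append("'execution_block' is a required property")
    elif not isinstance(metadata["execution_block"], str):
        errors.append("'execution_block' must be a string")

    # Rule 2: a file's 'status' is optional, but if given it must be in the enum.
    for index, entry in enumerate(metadata.get("files", [])):
        status = entry.get("status")
        if status is not None and status not in ALLOWED_STATUS:
            errors.append(
                f"files[{index}].status: {status!r} is not one of {sorted(ALLOWED_STATUS)}"
            )
    return errors


ok = check_metadata({"execution_block": "eb-example-00000000-00000"})
bad = check_metadata({"files": [{"path": "vis.ms", "status": "pending"}]})
```

Returning a list of messages instead of raising on the first problem mirrors the documented return value of validate(), which is a list of errors.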
File
ObsCore
- class ska_sdp_dataproduct_metadata.obscore.ObsCore[source]
SKA-specific possible values for ObsCore attributes
- class AccessFormat(value)[source]
The format (mime-type) of the data product if downloaded as a file
- BINARY = 'application/octet-stream'
- FITS = 'image/fits'
- HDF5 = 'application/x-hdf5'
- JPEG = 'image/jpeg'
- PNG = 'image/png'
- TAR_GZ = 'application/x-tar-gzip'
- UNKNOWN = 'application/unknown'
- class CalibrationLevel(value)[source]
The amount of calibration processing that has been applied to create the data product. Refer to the IVOA standard for a full description of the categories.
- LEVEL_0 = 0
- LEVEL_1 = 1
- LEVEL_2 = 2
- LEVEL_3 = 3
- LEVEL_4 = 4
- class DataProductType(value)[source]
A simple string value describing the primary nature of the data product
- MS = 'MS'
- POINTING = 'POINTING-OFFSETS'
- UNKNOWN = 'Unknown'
- class ObservationCollection(value)[source]
A string identifying the data collection to which the data product belongs
- SIMULATION = 'Simulation'
- UNKNOWN = 'Unknown'
- SKA = 'SKA-Observatory'
- SKA_LOW = 'SKA-LOW'
- SKA_MID = 'SKA-MID'
- class UCD(value)[source]
A list of Unified Content Descriptors (Preite Martinez, et al. 2007) describing the nature of the observable within the data product. See https://www.ivoa.net/documents/latest/UCDlist.html
- COUNT = 'phot.count'
- FLUX_DENSITY = 'phot.flux.density'
- FOURIER = 'stat.fourier'
- UNKNOWN = 'Unknown'
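The values listed above are the strings and integers that end up in the obscore section of the metadata. The sketch below shows that mapping with local stand-in enums so that it runs without the package installed. It assumes the real classes behave like standard Python `Enum` types, which this page suggests but does not state, and it copies only a few of the members.

```python
from enum import Enum


# Local stand-ins mirroring a subset of the members listed above.
class AccessFormat(Enum):
    FITS = "image/fits"
    HDF5 = "application/x-hdf5"
    UNKNOWN = "application/unknown"


class CalibrationLevel(Enum):
    LEVEL_0 = 0
    LEVEL_1 = 1


class DataProductType(Enum):
    MS = "MS"
    UNKNOWN = "Unknown"


class ObservationCollection(Enum):
    SIMULATION = "Simulation"
    UNKNOWN = "Unknown"


class UCD(Enum):
    FOURIER = "stat.fourier"
    UNKNOWN = "Unknown"


# Using .value yields plain strings and integers, matching the types
# the schema declares for these obscore attributes.
obscore = {
    "access_format": AccessFormat.HDF5.value,
    "calib_level": CalibrationLevel.LEVEL_1.value,
    "dataproduct_type": DataProductType.MS.value,
    "obs_collection": ObservationCollection.SIMULATION.value,
    "o_ucd": UCD.FOURIER.value,
}
```

With the package installed, the same members would be accessed through ska_sdp_dataproduct_metadata.obscore.ObsCore (for example ObsCore.AccessFormat.HDF5) rather than through these stand-ins.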