.. _configuration:

*******************
Configuration Guide
*******************

.. note::

    This page covers standalone mode. For running the pipeline as part of the
    end-to-end SDP production system via a Science Data Model directory, see
    :ref:`sdm_mode`.

Format
======

The batch pre-processing pipeline application translates the configuration file
into a sequence of calls to DP3, one per frequency interval of each input
MeasurementSet, and executes them as subprocesses. The configuration file schema
reflects this: it provides the means to specify a list of DP3 steps and their
parameters.

Example
-------

.. literalinclude:: ../../config/config.yaml
    :language: yaml

Schema
------

The config file layout rules are:

- There *must* be a ``steps`` section, which must be a list of step
  specifications (see below) or an empty list. An empty list corresponds to a
  pipeline that just copies the input data.
- Each step is specified as a dictionary with a ``step`` key indicating the
  step type, followed by the step parameters:

  .. code-block:: yaml

      # Customise params
      steps:
        - step: aoflagger
          strategy:
            kind: preset
            name: ska_low_sharp

- Steps are executed in the order they are specified.
- An optional ``input`` section controls parameters for reading the input data
  (e.g. the data column).
- An optional ``output`` section controls parameters for writing the output
  data (e.g. Dysco compression settings).

The list of available steps, along with the parameters that each accepts, is
described in :ref:`pipeline_steps`. The ``input`` and ``output`` sections are
documented at :ref:`input` and :ref:`output` respectively.

Notes on ApplyCal
=================

DP3 can apply existing calibration solutions stored in so-called H5Parm files,
which are HDF5 files following a certain schema. There are a few things to be
aware of:

- H5Parm files can store an arbitrary number of solution tables, and DP3 needs
  to be told which one(s) to apply.
- The exact ApplyCal options that must be given to DP3 depend on the type of
  solution table to apply -- there are at least 3 different cases to handle.

The caller of DP3 must therefore know precisely what is inside an H5Parm file
to properly configure ApplyCal step(s). The good news is that the batch
pre-processing pipeline takes care of this process; when specifying an ApplyCal
step, one only needs to provide the path of the H5Parm file to apply, via the
``path`` configuration parameter. Here are two valid examples:

.. code-block:: yaml

    steps:
      - step: applycal
        table:
          kind: h5parm
          path: /absolute/path/to/somefile.h5

.. code-block:: yaml

    steps:
      - step: applycal
        table:
          kind: h5parm
          # Relative paths are resolved against the current working directory
          path: somefile.h5

**This ease of use, however, comes at the following price:**

.. warning::

    The batch pre-processing pipeline will only accept H5Parm files with a
    schema/layout such that there is only one possible way of applying them.
    An error is raised if the ApplyCal configuration cannot be deduced from the
    contents of the H5Parm.

H5Parm restrictions
-------------------

Some documentation about H5Parm and its schema can be found in the
`LOFAR Imaging Cookbook `_.

The batch pre-processing pipeline enforces the following additional
restrictions on the H5Parm files it accepts for its ApplyCal steps:

- Only one solution set (solset).
- Either 1 or 2 solution tables (soltab) in the solset.
- Soltab types must be either "amplitude" or "phase"; the soltab type is
  stored in its ``TITLE`` attribute.
- If there are 2 soltabs, they must represent amplitude and phase, and their
  number of polarisations must be identical.
- If there is only 1 soltab, it can only represent the phase or amplitude part
  of a scalar or diagonal solution table.

Notes on Demixing
=================

The Demixer step is one way of performing subtraction of distant bright sources.


Demixing for large bandwidths
-----------------------------

Internally, the Demixer step fits gains to the model visibilities of each
bright source, as an unconstrained Jones matrix that is allowed to vary as a
function of time (see the ``demix_timestep`` parameter), but **not** frequency,
despite the existence of a ``demix_freqstep`` parameter.

Demixer thus implicitly expects the input dataset to have a bandwidth small
enough for gains to be considered uniform in frequency, including the primary
beam response; for LOFAR or SKA Low, "small enough" means no more than a few
MHz.

For datasets with a wider band, you will need to specify the
``--frequency-chunk-hz`` argument so that processing is split along the
frequency dimension into chunks small enough for Demixing to work as
advertised.

Practical details
-----------------

Demixing requires a sky model in `SourceDB format `_. SourceDB contains two
types of entries:

- Sky components, which are either points or gaussians, each with parameters
  such as position, flux and spectral index, as well as the "patch" it belongs
  to.
- So-called "patches", which are special entries that are effectively
  associated with one group of sky components and one calibration direction /
  gain table.

Below is a basic example of a SourceDB sky model to use for bright source
subtraction:

.. code-block:: text

    FORMAT = Name, Type, Patch, Ra, Dec, I, SpectralIndex, LogarithmicSI, ReferenceFrequency
    , , bright_a, 52.052625deg, -28.5875deg
    , , bright_b, 53.052625deg, -27.5875deg
    point_a, POINT, bright_a, 52.052625deg, -28.5875deg, 1.0, [0.0], true, 959969726.5625
    point_b, POINT, bright_b, 53.052625deg, -27.5875deg, 1.0, [0.0], true, 959969726.5625

.. note::

    We may implement a more user-friendly data schema for bright source sky
    models in the future.

Here is what a ``Demixer`` step configuration may look like:

.. code-block:: yaml

    steps:
      - step: demixer
        sky_model:
          kind: sourcedb
          # A relative path is resolved against the current working directory
          path: bright_sources.txt
        # List of sources to subtract, must all refer to existing "patches" in the sky model file
        sources_to_subtract: ["bright_a", "bright_b"]
        # Internal averaging factors when fitting bright source gains
        demix_timestep: 4
        demix_freqstep: 8

Please also refer to the :ref:`demixer` step documentation for the full
parameter reference.
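
Because every entry of ``sources_to_subtract`` must name an existing patch, it
can be handy to check a sky model file before launching a run. The following is
a minimal sketch of such a check, not part of the pipeline: the helper names
``patch_names`` and ``missing_patches`` are hypothetical, and the code assumes
that the ``FORMAT`` line starts with ``Name, Type, Patch`` as in the example
above, that patch definitions are the lines whose ``Name`` and ``Type`` fields
are empty, and that lines starting with ``#`` are comments.

```python
def patch_names(sourcedb_text):
    """Return the set of patch names defined in a SourceDB-format text.

    Assumption: columns are ordered Name, Type, Patch, ... and a patch
    definition is a line whose Name and Type fields are both empty.
    """
    names = set()
    for line in sourcedb_text.splitlines():
        line = line.strip()
        # Skip blank lines, comment lines and the FORMAT header
        if not line or line.startswith("#") or line.upper().startswith("FORMAT"):
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) >= 3 and fields[0] == "" and fields[1] == "" and fields[2]:
            names.add(fields[2])
    return names


def missing_patches(sourcedb_text, sources_to_subtract):
    """Return the requested sources that are not defined as patches."""
    defined = patch_names(sourcedb_text)
    return [name for name in sources_to_subtract if name not in defined]


SKY_MODEL = """\
FORMAT = Name, Type, Patch, Ra, Dec, I, SpectralIndex, LogarithmicSI, ReferenceFrequency
, , bright_a, 52.052625deg, -28.5875deg
, , bright_b, 53.052625deg, -27.5875deg
point_a, POINT, bright_a, 52.052625deg, -28.5875deg, 1.0, [0.0], true, 959969726.5625
point_b, POINT, bright_b, 53.052625deg, -27.5875deg, 1.0, [0.0], true, 959969726.5625
"""

# "bright_c" is not defined as a patch in SKY_MODEL
print(missing_patches(SKY_MODEL, ["bright_a", "bright_c"]))  # ['bright_c']
```

An empty list means that all requested sources refer to patches present in the
file. Only the first three fields of each line are inspected, so the check does
not validate the sky components themselves.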