Pipeline Steps
This page documents the parameters accepted by each pipeline step.
Note
Parameters labelled required have no default value and must always be provided. All other parameters are optional and fall back to the listed default when omitted.
Preflagger
- pydantic model ska_sdp_batch_preprocess.config.steps.Preflagger
Configuration for Preflagger step.
- Fields:
step (Literal['preflagger'])
frequency_ranges_mhz (list[ska_sdp_batch_preprocess.config.steps.preflagger.FrequencyRangeMHz])
- field step: Literal['preflagger'] = 'preflagger'
- Preflagger.frequency_ranges_mhz: list[FrequencyRangeMHz] = required
List of frequency ranges to flag (at least one required). Each entry is a FrequencyRangeMHz with start and stop values in MHz.
FrequencyRangeMHz
- pydantic model ska_sdp_batch_preprocess.config.steps.preflagger.FrequencyRangeMHz
A frequency range defined by a start and stop value in MHz.
- Fields:
- field start: float [Required]
Start of the frequency range in MHz.
- Validated by:
start_must_be_strictly_less_than_stop
- field stop: float [Required]
End of the frequency range in MHz.
- Validated by:
start_must_be_strictly_less_than_stop
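The two Preflagger rules above (at least one range, and start strictly less than stop) can be sketched in plain Python. This is a minimal illustration, not the package's implementation: FrequencyRange and build_preflagger_step are hypothetical names, and the frequency values in the usage note are arbitrary.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FrequencyRange:
    """Hypothetical stand-in for FrequencyRangeMHz (not the real model)."""

    start: float  # MHz
    stop: float  # MHz

    def __post_init__(self) -> None:
        # Mirrors the documented rule: start must be strictly less than stop.
        if not self.start < self.stop:
            raise ValueError(
                f"start ({self.start}) must be strictly less than stop ({self.stop})"
            )


def build_preflagger_step(ranges: list[tuple[float, float]]) -> dict:
    """Build a plain-dict 'preflagger' step (hypothetical helper)."""
    # The documentation asks for at least one frequency range.
    if not ranges:
        raise ValueError("frequency_ranges_mhz needs at least one entry")
    checked = [FrequencyRange(start, stop) for start, stop in ranges]
    return {
        "step": "preflagger",
        "frequency_ranges_mhz": [
            {"start": r.start, "stop": r.stop} for r in checked
        ],
    }
```

For example, build_preflagger_step([(137.0, 138.0)]) returns a step dict with one range, while an empty list or a range such as (138.0, 138.0) raises ValueError.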
AOFlagger
- pydantic model ska_sdp_batch_preprocess.config.steps.AOFlagger
Configuration for AOFlagger step.
- Fields:
step (Literal['aoflagger'])
strategy (ska_sdp_batch_preprocess.config.steps.aoflagger.PresetStrategy | ska_sdp_batch_preprocess.config.steps.aoflagger.FileStrategy)
memory_max_gb (float | None)
- field memory_max_gb: float | None = None
Maximum amount of memory in GB that AOFlagger should use. No limit if omitted.
- Constraints:
gt = 0
- field step: Literal['aoflagger'] = 'aoflagger'
- AOFlagger.strategy: PresetStrategy | FileStrategy = required
AOFlagger strategy, discriminated by kind. Use PresetStrategy to select a bundled preset, or FileStrategy to supply a custom file.
PresetStrategy
- pydantic model ska_sdp_batch_preprocess.config.steps.aoflagger.PresetStrategy
Use a bundled AOFlagger strategy preset.
- Fields:
- field kind: Literal['preset'] [Required]
- field name: Literal['lofar_default', 'ska_low_extended', 'ska_low_sharp'] [Required]
Name of the bundled preset (e.g. ska_low_sharp).
Note
See AOFlagger Strategy Presets for the list of available preset names.
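As an illustration of how the AOFlagger fields fit together, the sketch below assembles a step dictionary that uses a preset strategy. The helper build_aoflagger_step is hypothetical and only mirrors the constraints listed on this page (the three preset names and memory_max_gb > 0); it does not call the real pydantic models.

```python
from __future__ import annotations

# Preset names as listed for PresetStrategy.name on this page.
PRESET_NAMES = ("lofar_default", "ska_low_extended", "ska_low_sharp")


def build_aoflagger_step(preset: str, memory_max_gb: float | None = None) -> dict:
    """Build a plain-dict 'aoflagger' step with a preset strategy (hypothetical helper)."""
    if preset not in PRESET_NAMES:
        raise ValueError(f"unknown preset {preset!r}; expected one of {PRESET_NAMES}")
    # memory_max_gb is optional, but must be greater than 0 when given.
    if memory_max_gb is not None and not memory_max_gb > 0:
        raise ValueError("memory_max_gb must be greater than 0")

    step: dict = {
        "step": "aoflagger",
        # 'kind' is the discriminator that selects PresetStrategy.
        "strategy": {"kind": "preset", "name": preset},
    }
    if memory_max_gb is not None:
        step["memory_max_gb"] = memory_max_gb
    return step
```

Leaving memory_max_gb out of the dictionary corresponds to the documented default of no memory limit.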
FileStrategy
Applycal
- pydantic model ska_sdp_batch_preprocess.config.steps.Applycal
Configuration for Applycal step.
- Fields:
step (Literal['applycal'])
table (ska_sdp_batch_preprocess.config.steps.applycal.H5ParmTable | ska_sdp_batch_preprocess.config.steps.applycal.SDMTable)
- field step: Literal['applycal'] = 'applycal'
- Applycal.table: H5ParmTable | SDMTable = required
Calibration table source, discriminated by kind. Use H5ParmTable in standalone mode or SDMTable in SDM mode.
H5ParmTable (standalone mode)
SDMTable (SDM mode)
Averager
- pydantic model ska_sdp_batch_preprocess.config.steps.Averager
Configuration for Averager step.
- Fields:
step (Literal['averager'])
timestep (int)
freqstep (int)
- field freqstep: int = 1
Averaging factor in frequency.
- Constraints:
gt = 0
- field step: Literal['averager'] = 'averager'
- field timestep: int = 1
Averaging factor in time.
- Constraints:
gt = 0
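To make the effect of the two averaging factors concrete, the sketch below estimates the output dimensions for a given input. It assumes a trailing partial group of samples is kept as its own output sample; that is an assumption about the backend, not something this page states, and averaged_shape is a hypothetical helper.

```python
import math


def averaged_shape(
    n_times: int, n_chans: int, timestep: int = 1, freqstep: int = 1
) -> tuple:
    """Estimate (times, channels) after averaging (hypothetical helper).

    Assumption: an incomplete trailing group still produces one output sample.
    """
    # Both factors carry the documented constraint gt = 0.
    if timestep <= 0 or freqstep <= 0:
        raise ValueError("timestep and freqstep must be greater than 0")
    return math.ceil(n_times / timestep), math.ceil(n_chans / freqstep)
```

With the defaults timestep = 1 and freqstep = 1 the shape is unchanged, so the step performs no averaging unless at least one factor is raised.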
Demixer
- pydantic model ska_sdp_batch_preprocess.config.steps.Demixer
Configuration for Demixer step.
- Fields:
step (Literal['demixer'])
sky_model (ska_sdp_batch_preprocess.config.steps.demixer.SourceDbSkyModel)
subtract_timestep (int)
subtract_freqstep (int)
demix_timestep (int | None)
demix_freqstep (int | None)
time_chunk_size (int)
- field demix_freqstep: int | None = None
Internal averaging factor in frequency when fitting bright source gains. Defaults to subtract_freqstep if not set. Must be a multiple of subtract_freqstep.
- Constraints:
gt = 0
- Validated by:
demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep
time_chunk_size_must_align_with_averaging_timesteps
- field demix_timestep: int | None = None
Internal averaging factor in time when fitting bright source gains. Defaults to subtract_timestep if not set.
- Constraints:
gt = 0
- Validated by:
demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep
time_chunk_size_must_align_with_averaging_timesteps
- field step: Literal['demixer'] = 'demixer'
- Validated by:
demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep
time_chunk_size_must_align_with_averaging_timesteps
- field subtract_freqstep: int = 1
Output averaging factor in frequency when subtracting.
- Constraints:
gt = 0
- Validated by:
demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep
time_chunk_size_must_align_with_averaging_timesteps
- field subtract_timestep: int = 1
Output averaging factor in time when subtracting.
- Constraints:
gt = 0
- Validated by:
demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep
time_chunk_size_must_align_with_averaging_timesteps
- field time_chunk_size: int = 1
Number of time samples (after averaging) that are processed jointly. Larger values improve performance at the cost of higher RAM usage. Maps to ntimechunk in DP3. Must satisfy: (time_chunk_size * demix_timestep) % subtract_timestep == 0
- Constraints:
gt = 0
- Validated by:
demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep
time_chunk_size_must_align_with_averaging_timesteps
- Demixer.sky_model: SourceDbSkyModel = required
Sky model for demixing, discriminated by kind. Currently only SourceDbSkyModel is available.
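The defaults and cross-field rules described for the Demixer averaging fields can be checked with a few lines of plain Python. The sketch below is a hypothetical helper, resolve_demixer_steps, that follows the rules as written on this page; the order in which the real validators run, and their error messages, are not documented here and are assumed.

```python
from __future__ import annotations


def resolve_demixer_steps(
    subtract_timestep: int = 1,
    subtract_freqstep: int = 1,
    demix_timestep: int | None = None,
    demix_freqstep: int | None = None,
    time_chunk_size: int = 1,
) -> dict[str, int]:
    """Apply the documented defaults and checks (hypothetical helper)."""
    # Unset demix factors fall back to the matching subtract factors.
    if demix_timestep is None:
        demix_timestep = subtract_timestep
    if demix_freqstep is None:
        demix_freqstep = subtract_freqstep

    resolved = {
        "subtract_timestep": subtract_timestep,
        "subtract_freqstep": subtract_freqstep,
        "demix_timestep": demix_timestep,
        "demix_freqstep": demix_freqstep,
        "time_chunk_size": time_chunk_size,
    }
    # Every factor carries the constraint gt = 0.
    for name, value in resolved.items():
        if value <= 0:
            raise ValueError(f"{name} must be greater than 0")

    # demix_freqstep must be a multiple of subtract_freqstep.
    if demix_freqstep % subtract_freqstep != 0:
        raise ValueError("demix_freqstep must be a multiple of subtract_freqstep")

    # (time_chunk_size * demix_timestep) % subtract_timestep == 0
    if (time_chunk_size * demix_timestep) % subtract_timestep != 0:
        raise ValueError("time_chunk_size does not align with the averaging timesteps")

    return resolved
```

For instance, with subtract_timestep = 4 and demix_timestep = 6, a time_chunk_size of 2 passes because 12 % 4 == 0, whereas a time_chunk_size of 1 fails because 6 % 4 == 2.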
SourceDbSkyModel
- pydantic model ska_sdp_batch_preprocess.config.steps.demixer.SourceDbSkyModel
Sky model in SourceDB format.
- field kind: Literal['sourcedb'] [Required]
- field path: Path [Required]
Path to the local sky model file in SourceDB format. Relative paths are resolved against the current working directory.
- field sources_to_subtract: list[str] [Required]
List of source patch names to subtract from the data.
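The path rule above (relative paths resolve against the current working directory) can be illustrated with pathlib. This is a sketch only: build_sourcedb_sky_model is a hypothetical helper, and the file name and patch names in the usage note are made up for illustration.

```python
from __future__ import annotations

from pathlib import Path


def build_sourcedb_sky_model(path: str, sources_to_subtract: list[str]) -> dict:
    """Build a plain-dict SourceDB sky model entry (hypothetical helper)."""
    resolved = Path(path)
    # Relative paths are interpreted against the current working directory.
    if not resolved.is_absolute():
        resolved = Path.cwd() / resolved
    return {
        "kind": "sourcedb",  # discriminator value for SourceDbSkyModel
        "path": resolved,
        "sources_to_subtract": list(sources_to_subtract),
    }
```

Calling build_sourcedb_sky_model("sky.sourcedb", ["PatchA", "PatchB"]) from a given directory yields a path inside that directory, so running the pipeline from a different directory changes which file a relative path refers to.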
Input
Output
- pydantic model ska_sdp_batch_preprocess.config.output.OutputConfig
Output configuration (controls the implicit DP3 msout step).
- Fields:
- field tile_nchan: int | None = None
Maximum number of channels per tile in output MS.
- Constraints:
gt = 0
- field tile_size_kb: int | None = None
Tile size in KB for the data columns in the output MS.
- Constraints:
gt = 0
- OutputConfig.storage_manager: StorageManager | None = None
Optional Dysco storage manager settings. Set this block to enable compression; see StorageManager.
StorageManager
- pydantic model ska_sdp_batch_preprocess.config.output.StorageManager
Dysco storage manager settings for the output Measurement Set.
- Fields:
- field data_bits_per_sample: int | None = None
Number of bits per float used for columns containing visibilities. Set to zero to compress weights only.
- Constraints:
ge = 0
- field dist_truncation: float | None = None
Truncation level for compression with the Truncated Gaussian distribution.
- field distribution: Literal['Uniform', 'TruncatedGaussian', 'Gaussian', 'StudentsT'] | None = None
Assumed distribution for compression.
- field kind: Literal['dysco'] [Required]
Storage manager type. Currently only dysco is supported.
- field normalization: Literal['AF', 'RF', 'Row'] | None = None
Compression normalization method.
- field weight_bits_per_sample: int | None = None
Number of bits per float used for the WEIGHT_SPECTRUM column. Set to zero to compress data only.
- Constraints:
ge = 0
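A storage_manager block could be assembled along the following lines. The helper build_dysco_settings is hypothetical and only mirrors the literals and ge = 0 constraints listed above; fields left as None are omitted from the result, on the assumption that the backend then applies its own defaults.

```python
from __future__ import annotations

# Allowed literals as listed for StorageManager on this page.
DISTRIBUTIONS = ("Uniform", "TruncatedGaussian", "Gaussian", "StudentsT")
NORMALIZATIONS = ("AF", "RF", "Row")


def build_dysco_settings(
    data_bits_per_sample: int | None = None,
    weight_bits_per_sample: int | None = None,
    distribution: str | None = None,
    dist_truncation: float | None = None,
    normalization: str | None = None,
) -> dict:
    """Build a plain-dict Dysco storage manager block (hypothetical helper)."""
    # Bit counts may be zero (ge = 0) but not negative.
    for name, bits in (
        ("data_bits_per_sample", data_bits_per_sample),
        ("weight_bits_per_sample", weight_bits_per_sample),
    ):
        if bits is not None and bits < 0:
            raise ValueError(f"{name} must be greater than or equal to 0")
    if distribution is not None and distribution not in DISTRIBUTIONS:
        raise ValueError(f"distribution must be one of {DISTRIBUTIONS}")
    if normalization is not None and normalization not in NORMALIZATIONS:
        raise ValueError(f"normalization must be one of {NORMALIZATIONS}")

    settings: dict = {"kind": "dysco"}  # the only supported kind
    optional = {
        "data_bits_per_sample": data_bits_per_sample,
        "weight_bits_per_sample": weight_bits_per_sample,
        "distribution": distribution,
        "dist_truncation": dist_truncation,
        "normalization": normalization,
    }
    # Keep only the fields that were explicitly set; zero is a valid value.
    settings.update({k: v for k, v in optional.items() if v is not None})
    return settings
```

Note that a value of 0 is kept in the output, since the field descriptions give zero a specific meaning (compress weights only, or compress data only).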