Pipeline Steps

This page documents the parameters accepted by each pipeline step.

Note

Parameters labelled required have no default value and must always be provided. All other parameters are optional and fall back to the listed default when omitted.

Preflagger

pydantic model ska_sdp_batch_preprocess.config.steps.Preflagger

Configuration for Preflagger step.

Fields:
  • step (Literal['preflagger'])

  • frequency_ranges_mhz (list[ska_sdp_batch_preprocess.config.steps.preflagger.FrequencyRangeMHz])

field step: Literal['preflagger'] = 'preflagger'
field frequency_ranges_mhz: list[FrequencyRangeMHz] [Required]

List of frequency ranges to flag (at least one required). Each entry is a FrequencyRangeMHz with start and stop values in MHz.

FrequencyRangeMHz

pydantic model ska_sdp_batch_preprocess.config.steps.preflagger.FrequencyRangeMHz

A frequency range defined by a start and stop value in MHz.

Fields:
field start: float [Required]

Start of the frequency range in MHz.

Validated by:
  • start_must_be_strictly_less_than_stop

field stop: float [Required]

End of the frequency range in MHz.

Validated by:
  • start_must_be_strictly_less_than_stop
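To make the two rules above concrete (at least one range, and start strictly less than stop), here is a minimal sketch in plain Python. It is not the package's implementation: the helper name check_frequency_ranges, the dict representation of the step, and the frequency values are assumptions made for illustration only.

```python
def check_frequency_ranges(ranges):
    """Return a list of error strings; an empty list means the ranges look valid."""
    errors = []
    if not ranges:
        errors.append("frequency_ranges_mhz needs at least one entry")
    for i, r in enumerate(ranges):
        # Mirrors the documented rule: start must be strictly less than stop.
        if not r["start"] < r["stop"]:
            errors.append(f"range {i}: start must be strictly less than stop")
    return errors


# Hypothetical preflagger step written as a Python dict (values are illustrative).
preflagger_step = {
    "step": "preflagger",
    "frequency_ranges_mhz": [
        {"start": 137.0, "stop": 138.0},
        {"start": 240.0, "stop": 270.0},
    ],
}

print(check_frequency_ranges(preflagger_step["frequency_ranges_mhz"]))
print(check_frequency_ranges([{"start": 150.0, "stop": 150.0}]))
```

The second call reports an error because a range whose start equals its stop is not strictly ordered.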

AOFlagger

pydantic model ska_sdp_batch_preprocess.config.steps.AOFlagger

Configuration for AOFlagger step.

Fields:
  • step (Literal['aoflagger'])

  • strategy (ska_sdp_batch_preprocess.config.steps.aoflagger.PresetStrategy | ska_sdp_batch_preprocess.config.steps.aoflagger.FileStrategy)

  • memory_max_gb (float | None)

field memory_max_gb: float | None = None

Maximum amount of memory in GB that AOFlagger should use. No limit if omitted.

Constraints:
  • gt = 0

field step: Literal['aoflagger'] = 'aoflagger'
field strategy: PresetStrategy | FileStrategy [Required]

AOFlagger strategy, discriminated by kind. Use PresetStrategy to select a bundled preset, or FileStrategy to supply a custom file.

PresetStrategy

pydantic model ska_sdp_batch_preprocess.config.steps.aoflagger.PresetStrategy

Use a bundled AOFlagger strategy preset.

Fields:
field kind: Literal['preset'] [Required]
field name: Literal['lofar_default', 'ska_low_extended', 'ska_low_sharp'] [Required]

Name of the bundled preset (e.g. ska_low_sharp).

Note

See AOFlagger Strategy Presets for the list of available preset names.

FileStrategy

pydantic model ska_sdp_batch_preprocess.config.steps.aoflagger.FileStrategy

Use a custom AOFlagger strategy file.

Fields:
field kind: Literal['file'] [Required]
field path: Path [Required]

Path to the AOFlagger strategy .lua file. Relative paths are resolved against the current working directory.
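The kind field acts as a discriminator between the two strategy models. The sketch below shows one way such a dispatch could look in plain Python. The function describe_strategy, the dict layout, and the file path are hypothetical; only the kind values and preset names come from this page.

```python
from pathlib import Path

# Preset names as listed for PresetStrategy.name on this page.
PRESET_NAMES = ("lofar_default", "ska_low_extended", "ska_low_sharp")


def describe_strategy(strategy):
    """Dispatch on the 'kind' discriminator and return a short description."""
    kind = strategy.get("kind")
    if kind == "preset":
        if strategy["name"] not in PRESET_NAMES:
            raise ValueError(f"unknown preset: {strategy['name']}")
        return f"preset:{strategy['name']}"
    if kind == "file":
        # Relative paths are resolved against the current working directory.
        return f"file:{Path(strategy['path']).resolve()}"
    raise ValueError(f"unknown strategy kind: {kind!r}")


# Hypothetical AOFlagger step as a Python dict.
aoflagger_step = {
    "step": "aoflagger",
    "strategy": {"kind": "preset", "name": "ska_low_sharp"},
    "memory_max_gb": 8.0,  # optional; must be greater than 0 when set
}

print(describe_strategy(aoflagger_step["strategy"]))
print(describe_strategy({"kind": "file", "path": "strategies/custom.lua"}))
```

The second call prints an absolute path whose prefix depends on the directory the script is run from.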

Applycal

pydantic model ska_sdp_batch_preprocess.config.steps.Applycal

Configuration for Applycal step.

Fields:
  • step (Literal['applycal'])

  • table (ska_sdp_batch_preprocess.config.steps.applycal.H5ParmTable | ska_sdp_batch_preprocess.config.steps.applycal.SDMTable)

field step: Literal['applycal'] = 'applycal'
field table: H5ParmTable | SDMTable [Required]

Calibration table source, discriminated by kind. Use H5ParmTable in standalone mode or SDMTable in SDM mode.

H5ParmTable (standalone mode)

pydantic model ska_sdp_batch_preprocess.config.steps.applycal.H5ParmTable

Calibration table stored as an H5Parm file (standalone mode).

Fields:
field kind: Literal['h5parm'] [Required]
field path: Path [Required]

Path to the H5Parm file. Relative paths are resolved against the current working directory.

SDMTable (SDM mode)

pydantic model ska_sdp_batch_preprocess.config.steps.applycal.SDMTable

Calibration table resolved from the SDM directory (SDM mode).

Fields:
field field_id: str [Required]

Observed field identifier.

field kind: Literal['sdm'] [Required]
field purpose: str [Required]

Calibration purpose (e.g. bandpass).
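As an illustration of the two table kinds, the following sketch lists the required keys for each and reports any that are missing. The helper missing_table_keys, the dict representation, the H5Parm path, and the field_id value are assumptions for this example, not part of the package.

```python
# Required keys per table kind, as documented for H5ParmTable and SDMTable.
REQUIRED_KEYS = {
    "h5parm": {"kind", "path"},
    "sdm": {"kind", "field_id", "purpose"},
}


def missing_table_keys(table):
    """Return the sorted list of required keys absent from a table dict."""
    kind = table.get("kind")
    if kind not in REQUIRED_KEYS:
        raise ValueError(f"unknown table kind: {kind!r}")
    return sorted(REQUIRED_KEYS[kind] - set(table))


# Standalone mode: hypothetical path to an H5Parm file.
standalone_table = {"kind": "h5parm", "path": "solutions/bandpass.h5"}

# SDM mode: hypothetical field identifier; 'bandpass' is the purpose example from this page.
sdm_table = {"kind": "sdm", "field_id": "field_a", "purpose": "bandpass"}

print(missing_table_keys(standalone_table))
print(missing_table_keys(sdm_table))
print(missing_table_keys({"kind": "sdm", "purpose": "bandpass"}))
```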

Averager

pydantic model ska_sdp_batch_preprocess.config.steps.Averager

Configuration for Averager step.

Fields:
  • step (Literal['averager'])

  • timestep (int)

  • freqstep (int)

field freqstep: int = 1

Averaging factor in frequency.

Constraints:
  • gt = 0

field step: Literal['averager'] = 'averager'
field timestep: int = 1

Averaging factor in time.

Constraints:
  • gt = 0
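A short arithmetic sketch of what the two factors imply, assuming that an averaging factor of N combines N adjacent input samples into one output sample. The function averaged_resolution and the input resolution values are hypothetical.

```python
def averaged_resolution(interval_s, channel_width_khz, timestep=1, freqstep=1):
    """Return (time interval, channel width) after averaging.

    Assumes each output sample spans `timestep` input intervals and each
    output channel spans `freqstep` input channels.
    """
    if timestep <= 0 or freqstep <= 0:
        raise ValueError("timestep and freqstep must be greater than 0")
    return interval_s * timestep, channel_width_khz * freqstep


# Illustrative input: 1 s integrations and 10 kHz channels.
print(averaged_resolution(1.0, 10.0, timestep=4, freqstep=8))  # (4.0, 80.0)
```

With the defaults (timestep=1, freqstep=1) the resolution is unchanged.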

Demixer

pydantic model ska_sdp_batch_preprocess.config.steps.Demixer

Configuration for Demixer step.

Fields:
  • step (Literal['demixer'])

  • sky_model (ska_sdp_batch_preprocess.config.steps.demixer.SourceDbSkyModel)

  • subtract_timestep (int)

  • subtract_freqstep (int)

  • demix_timestep (int | None)

  • demix_freqstep (int | None)

  • time_chunk_size (int)

field demix_freqstep: int | None = None

Internal averaging factor in frequency when fitting bright source gains. Defaults to subtract_freqstep if not set. Must be a multiple of subtract_freqstep.

Constraints:
  • gt = 0

Validated by:
  • demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep

  • time_chunk_size_must_align_with_averaging_timesteps

field demix_timestep: int | None = None

Internal averaging factor in time when fitting bright source gains. Defaults to subtract_timestep if not set.

Constraints:
  • gt = 0

Validated by:
  • demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep

  • time_chunk_size_must_align_with_averaging_timesteps

field step: Literal['demixer'] = 'demixer'
Validated by:
  • demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep

  • time_chunk_size_must_align_with_averaging_timesteps

field subtract_freqstep: int = 1

Output averaging factor in frequency when subtracting.

Constraints:
  • gt = 0

Validated by:
  • demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep

  • time_chunk_size_must_align_with_averaging_timesteps

field subtract_timestep: int = 1

Output averaging factor in time when subtracting.

Constraints:
  • gt = 0

Validated by:
  • demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep

  • time_chunk_size_must_align_with_averaging_timesteps

field time_chunk_size: int = 1

Number of time samples (after averaging by demix_timestep) that are processed jointly. Larger values improve performance at the cost of higher RAM usage. Maps to ntimechunk in DP3. Must satisfy (time_chunk_size * demix_timestep) % subtract_timestep == 0, so that each chunk covers a whole number of output time samples.

Constraints:
  • gt = 0

Validated by:
  • demix_freqstep_must_be_none_or_a_multiple_of_subtract_freqstep

  • time_chunk_size_must_align_with_averaging_timesteps
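The two cross-field rules named above can be sketched in plain Python as follows. This only re-expresses the documented conditions; the function name check_demixer_steps is hypothetical and the code is not taken from the package's validators.

```python
def check_demixer_steps(
    subtract_timestep=1,
    subtract_freqstep=1,
    demix_timestep=None,
    demix_freqstep=None,
    time_chunk_size=1,
):
    """Return a list of error strings for the documented cross-field rules."""
    # Unset demix factors fall back to the corresponding subtract factors.
    if demix_timestep is None:
        demix_timestep = subtract_timestep
    if demix_freqstep is None:
        demix_freqstep = subtract_freqstep

    errors = []
    if demix_freqstep % subtract_freqstep != 0:
        errors.append("demix_freqstep must be a multiple of subtract_freqstep")
    if (time_chunk_size * demix_timestep) % subtract_timestep != 0:
        errors.append(
            "time_chunk_size * demix_timestep must be a multiple of subtract_timestep"
        )
    return errors


# 2 * 6 = 12 is a multiple of 4, so this combination passes.
print(check_demixer_steps(subtract_timestep=4, demix_timestep=6, time_chunk_size=2))
# 1 * 6 = 6 is not a multiple of 4, so this one is rejected.
print(check_demixer_steps(subtract_timestep=4, demix_timestep=6, time_chunk_size=1))
```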

field sky_model: SourceDbSkyModel [Required]

Sky model for demixing, discriminated by kind. Currently only SourceDbSkyModel is available.

SourceDbSkyModel

pydantic model ska_sdp_batch_preprocess.config.steps.demixer.SourceDbSkyModel

Sky model in SourceDB format.

Fields:
field kind: Literal['sourcedb'] [Required]
field path: Path [Required]

Path to the local sky model file in SourceDB format. Relative paths are resolved against the current working directory.

field sources_to_subtract: list[str] [Required]

List of source patch names to subtract from the data.
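The sketch below puts a complete Demixer step together and shows how omitted optional parameters fall back to the defaults listed on this page, while sky_model stays mandatory. The helper with_demixer_defaults, the dict layout, the sky model path, and the patch names are all hypothetical.

```python
# Defaults as listed for the optional Demixer fields on this page.
DEMIXER_DEFAULTS = {
    "subtract_timestep": 1,
    "subtract_freqstep": 1,
    "demix_timestep": None,
    "demix_freqstep": None,
    "time_chunk_size": 1,
}


def with_demixer_defaults(step):
    """Return a copy of the step with documented defaults filled in."""
    if "sky_model" not in step:
        raise ValueError("sky_model is required and has no default")
    # Values given in `step` take precedence over the defaults.
    return {**DEMIXER_DEFAULTS, **step}


demixer_step = {
    "step": "demixer",
    "sky_model": {
        "kind": "sourcedb",
        "path": "sky_models/bright_sources.sourcedb",  # hypothetical path
        "sources_to_subtract": ["CasA", "CygA"],  # hypothetical patch names
    },
    "subtract_timestep": 4,
}

full = with_demixer_defaults(demixer_step)
print(full["subtract_timestep"], full["subtract_freqstep"], full["demix_timestep"])
```

Here subtract_timestep keeps the supplied value of 4, subtract_freqstep takes its default of 1, and demix_timestep stays None, which the field description says is treated as equal to subtract_timestep.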

Input

pydantic model ska_sdp_batch_preprocess.config.input.InputConfig

Input configuration (controls the implicit DP3 msin step).

Fields:
field data_column: str = 'DATA'

Name of the data column to read from the input Measurement Set.

Output

pydantic model ska_sdp_batch_preprocess.config.output.OutputConfig

Output configuration (controls the implicit DP3 msout step).

Fields:
field tile_nchan: int | None = None

Maximum number of channels per tile in the output MS.

Constraints:
  • gt = 0

field tile_size_kb: int | None = None

Tile size in KB for the data columns in the output MS.

Constraints:
  • gt = 0

field storage_manager: StorageManager | None = None

Optional Dysco storage manager settings. Set this block to enable compression; see StorageManager.

StorageManager

pydantic model ska_sdp_batch_preprocess.config.output.StorageManager

Dysco storage manager settings for the output Measurement Set.

Fields:
field data_bits_per_sample: int | None = None

Number of bits per float used for columns containing visibilities. Set to zero to compress weights only.

Constraints:
  • ge = 0

field dist_truncation: float | None = None

Truncation level for compression with the Truncated Gaussian distribution.

field distribution: Literal['Uniform', 'TruncatedGaussian', 'Gaussian', 'StudentsT'] | None = None

Assumed distribution for compression.

field kind: Literal['dysco'] [Required]

Storage manager type. Currently only dysco is supported.

field normalization: Literal['AF', 'RF', 'Row'] | None = None

Compression normalization method.

field weight_bits_per_sample: int | None = None

Number of bits per float used for the WEIGHT_SPECTRUM column. Set to zero to compress data only.

Constraints:
  • ge = 0
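For orientation, here is a hedged sketch of an output block with Dysco compression enabled, together with a plain-Python check of the constraints listed above. The helper check_storage_manager, the dict layout, and the numeric values are illustrative assumptions; the allowed literals are copied from this page.

```python
# Allowed literal values as documented for StorageManager.
DISTRIBUTIONS = ("Uniform", "TruncatedGaussian", "Gaussian", "StudentsT")
NORMALIZATIONS = ("AF", "RF", "Row")


def check_storage_manager(sm):
    """Return a list of error strings for a storage manager dict."""
    errors = []
    if sm.get("kind") != "dysco":
        errors.append("kind must be 'dysco'")
    for key in ("data_bits_per_sample", "weight_bits_per_sample"):
        value = sm.get(key)
        # Zero is allowed: it disables compression for that column type.
        if value is not None and value < 0:
            errors.append(f"{key} must be >= 0")
    if sm.get("distribution") is not None and sm["distribution"] not in DISTRIBUTIONS:
        errors.append("unknown distribution")
    if sm.get("normalization") is not None and sm["normalization"] not in NORMALIZATIONS:
        errors.append("unknown normalization")
    return errors


# Hypothetical output configuration (numeric values are illustrative only).
output_config = {
    "tile_size_kb": 1024,
    "storage_manager": {
        "kind": "dysco",
        "data_bits_per_sample": 10,
        "weight_bits_per_sample": 12,
        "distribution": "TruncatedGaussian",
        "dist_truncation": 2.5,
        "normalization": "AF",
    },
}

print(check_storage_manager(output_config["storage_manager"]))
print(check_storage_manager({"kind": "dysco", "data_bits_per_sample": -1}))
```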