CSP.LMC scan consistency policy

Purpose

The CSP.LMC scan consistency policy defines the CSP-specific rules used to validate subsystem behaviour while a subarray is scanning.

Its purpose is to detect subsystem states that are unsafe, unexpected, or inconsistent with the current observation, and to determine whether the scan can continue normally, can continue in a degraded condition, or must end with a transition to FAULT.

The policy can be configured to:

determine whether non-recoverable inconsistencies should trigger a hard fault;
specify which subsystems are required for a scan to be considered valid.

Behaviour

The policy is evaluated while the aggregated subarray ObsState is SCANNING.

It is also evaluated when the aggregated state unexpectedly collapses from SCANNING to EMPTY or IDLE. This allows the policy to classify restart-like conditions even when the aggregation has already dropped out of SCANNING.

During evaluation, the policy:

refines the set of required subsystems according to the active observing modes;
filters out subsystem components that are not relevant to the current subarray scan;
classifies subsystems whose ObsState is not SCANNING;
produces a final decision containing:
- the final subarray ObsState to apply,
- whether the inconsistency is a hard fault,
- a diagnostic message,
- a severity level.

Required subsystems

The default required subsystem set is:

cbf
pss
pst

This set is refined dynamically from the active observing modes:

if PULSAR_TIMING is not active, pst is not required;
if neither PULSAR_SEARCH nor TRANSIENT_SEARCH is active, pss is not required.

This ensures that only subsystems relevant to the current observation are validated by the scan consistency policy.
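
The refinement rules above can be sketched in Python as follows. This is a minimal illustration and not the CSP.LMC implementation: the ObsMode enum, its IMAGING member, and the function name refine_required_subsystems are assumptions introduced here for the example.

```python
from enum import Enum, auto


class ObsMode(Enum):
    """Hypothetical observing-mode enum; IMAGING is only an illustrative extra mode."""
    IMAGING = auto()
    PULSAR_TIMING = auto()
    PULSAR_SEARCH = auto()
    TRANSIENT_SEARCH = auto()


# Default required set, as listed in the text.
DEFAULT_REQUIRED = frozenset({"cbf", "pss", "pst"})


def refine_required_subsystems(active_modes, required=DEFAULT_REQUIRED):
    """Return the required subsystems for the given set of active modes."""
    refined = set(required)
    # pst is required only when PULSAR_TIMING is active.
    if ObsMode.PULSAR_TIMING not in active_modes:
        refined.discard("pst")
    # pss is required only when a search mode is active.
    search_modes = {ObsMode.PULSAR_SEARCH, ObsMode.TRANSIENT_SEARCH}
    if search_modes.isdisjoint(active_modes):
        refined.discard("pss")
    return refined
```

With this sketch, a scan whose only active mode is PULSAR_TIMING requires cbf and pst, while a scan with no pulsar or transient mode requires cbf alone.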

Subsystem filtering

Before validation, the policy filters out subsystem components that are not assigned to the current subarray scan.

In particular, PST beams with subarray_id == 0 are ignored, because they are not part of the active scan and must not influence the subarray consistency decision.
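
A minimal sketch of the PST beam filter described above is shown below. The PstBeamInfo record, its field names other than subarray_id, and the example FQDNs are hypothetical and are used only to make the snippet self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PstBeamInfo:
    """Hypothetical snapshot entry for one PST beam."""
    fqdn: str
    subarray_id: int
    obs_state: str


def filter_pst_beams(beams):
    """Drop PST beams with subarray_id == 0, since they are not part of the active scan."""
    return [beam for beam in beams if beam.subarray_id != 0]


beams = [
    PstBeamInfo("pst/beam/01", 1, "SCANNING"),
    PstBeamInfo("pst/beam/02", 0, "IDLE"),  # unassigned: must not affect the decision
]
relevant = filter_pst_beams(beams)
```

Only the first beam remains in relevant, so the IDLE state of the unassigned beam cannot be reported as a scan inconsistency.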

Classification of inconsistencies

Each required subsystem whose ObsState differs from SCANNING is classified into a structured inconsistency containing:

the subsystem FQDN,
the observed ObsState,
a diagnostic code,
a human-readable description,
a severity.

Examples of classified conditions include:

subsystem timing mismatches;
unexpected restarts;
subsystem faults;
generic state mismatches.

Severity levels

The policy classifies inconsistencies using three severity levels:

LOW: the scan can continue and the inconsistency is considered recoverable or transient;
MEDIUM: the scan can continue, but in degraded conditions;
HIGH: the inconsistency is considered non-recoverable and forces a transition to FAULT.

The final decision is derived from the highest severity found across all invalid subsystems.
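
As an illustration, the structured inconsistency and the "highest severity wins" rule could be modelled as below. The class and field names are assumptions chosen to mirror the list above, and the numeric values assigned to the severity levels are arbitrary apart from their ordering.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Ordered severity levels; only the ordering matters here."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Inconsistency:
    """Hypothetical record for one classified subsystem inconsistency."""
    fqdn: str
    obs_state: str
    code: str
    description: str
    severity: Severity


def highest_severity(inconsistencies):
    """Return the highest severity found, or None when nothing is inconsistent."""
    return max((item.severity for item in inconsistencies), default=None)
```

Using an ordered enum keeps the aggregation to a single max() call and makes the empty case explicit through the default value.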

PST handling

PST inconsistencies are handled separately from other subsystems because their impact depends on the active observing modes.

PULSAR_TIMING-only scans

When the active observing mode set is exactly {PULSAR_TIMING}, PST beams are treated as fault-critical for the scan.

The policy computes a PST failure threshold from the number of PST beams participating in the scan:

if only one PST beam is present, the threshold is 0 and failure of that beam is sufficient to trigger FAULT;
otherwise, the threshold is pst_count // 2.

A FAULT decision is generated only when the number of failing PST beams is strictly greater than the threshold.

If the threshold is not exceeded, the scan can continue and the individual PST beam inconsistencies are reported with their own severity.
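
The threshold rule can be written as a short worked sketch. The function names below are hypothetical; the arithmetic follows the rule stated above.

```python
def pst_failure_threshold(pst_count: int) -> int:
    """Threshold of tolerated failing PST beams for a PULSAR_TIMING-only scan."""
    if pst_count <= 1:
        # A single beam has no redundancy: its failure alone triggers FAULT.
        return 0
    return pst_count // 2


def pst_failures_force_fault(pst_count: int, failing_count: int) -> bool:
    """FAULT only when failing beams are strictly greater than the threshold."""
    return failing_count > pst_failure_threshold(pst_count)
```

For example, with four PST beams the threshold is 2, so two failing beams still allow the scan to continue and a third failing beam triggers FAULT. With three beams the threshold is 1, so two failing beams are enough.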

Commensal observing modes

When PULSAR_TIMING is active together with one or more additional observing modes, PST inconsistencies do not force the subarray ObsState to FAULT on their own.

In this case, PST failures are treated as degraded commensal scan conditions:

PST beam inconsistencies are downgraded to MEDIUM severity;
a warning is logged explaining that scanning can continue because the observation is not PULSAR_TIMING-only;
the scan continues for the non-PST observing modes that remain active.
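
The commensal downgrade might look like the sketch below. It is written under explicit assumptions: observing modes are represented as plain strings, PstInconsistency is a hypothetical record, and "downgraded to MEDIUM" is read as capping the severity at MEDIUM, so entries that are already LOW are left unchanged.

```python
import logging
from dataclasses import dataclass, replace
from enum import IntEnum

logger = logging.getLogger(__name__)


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class PstInconsistency:
    """Hypothetical record for one failing PST beam."""
    fqdn: str
    severity: Severity


def downgrade_for_commensal(active_modes, pst_issues):
    """Cap PST severities at MEDIUM when PULSAR_TIMING is not the only active mode."""
    commensal = "PULSAR_TIMING" in active_modes and len(active_modes) > 1
    if not commensal or not pst_issues:
        return list(pst_issues)
    logger.warning(
        "PST inconsistencies detected, but scanning continues because the "
        "observation is not PULSAR_TIMING-only (active modes: %s)",
        ", ".join(sorted(active_modes)),
    )
    # Assumption: severities above MEDIUM are lowered; LOW entries stay LOW.
    return [
        replace(issue, severity=min(issue.severity, Severity.MEDIUM))
        for issue in pst_issues
    ]
```

Returning new records instead of mutating the inputs keeps the original classification available for logging or later comparison.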

Diagnostic reporting

When inconsistencies are detected, the policy builds a human-readable diagnostic message containing:

the active observing modes;
the list of inconsistent subsystems;
the description of each inconsistency;
the associated severity.

This message is used both for logging and for publication by the observation supervisor through the subarray diagnostic attributes.
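
A possible way to assemble such a message is sketched below. The exact wording and layout of the real diagnostic message are not specified in this document, so the format, the function name, and the tuple-based input are purely illustrative.

```python
def build_diagnostic_message(active_modes, inconsistencies):
    """Build a readable message from mode names and (fqdn, description, severity) tuples."""
    modes = ", ".join(sorted(active_modes))
    lines = [f"Scan inconsistencies detected (active modes: {modes})"]
    for fqdn, description, severity in inconsistencies:
        # One line per inconsistent subsystem, with its description and severity.
        lines.append(f"- {fqdn}: {description} [severity={severity}]")
    return "\n".join(lines)


message = build_diagnostic_message(
    {"PULSAR_TIMING", "IMAGING"},
    [("pst/beam/01", "unexpected restart", "MEDIUM")],
)
```

Sorting the mode names makes the message deterministic, which helps when the same text is compared across log entries and attribute updates.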

Interaction with the observation supervisor

The CspObservationSupervisor is responsible for applying the policy decision to the subarray observation model and for reporting the result of each evaluation cycle back to the generic supervision loop.

For each evaluation cycle, the supervisor:

takes an atomic snapshot of subsystem ObsState values from the state store;
computes the aggregated candidate ObsState through the subarray observation model;
evaluates the scan consistency policy using:
- the candidate aggregated state,
- the active observing modes,
- the subsystem snapshot,
- the previously published state,
- the active command context selected by the observation supervisor;
updates diagnostic attributes related to scan consistency;
interprets the returned Decision.action;
executes the corresponding domain-specific behaviour;
returns an EvaluationOutcome to the generic supervisor.

Action handling

The supervisor interprets policy actions as follows:

APPLY: The final observation state is applied to the observation model. Diagnostic attributes are updated accordingly, and the evaluation cycle completes successfully.
WAIT: No final state is applied yet. The supervisor keeps the evaluation cycle pending so that the generic supervision loop can trigger a new evaluation later.
FAULT: The supervisor transitions the subarray into FAULT and reports the corresponding diagnostic information. The evaluation cycle then completes in fault.
REFRESH_AND_REEVALUATE: No final state is applied yet. The supervisor requests a refresh of the subsystem information and then performs a new policy evaluation using the refreshed snapshot. This action is used when the current state is still inconclusive after the reconciliation timeout, but an additional refresh may provide enough information to reach a final decision.

This action-based handling makes the policy intent explicit and avoids inferring control flow indirectly from severity or fault flags alone.
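
The action handling described above can be sketched as an explicit dispatch on the action value. Everything in this example is an assumption made for illustration: the members of EvaluationOutcome, the shape of Decision, the SupervisorSketch class, and the bounded number of refreshes are not taken from the CSP.LMC code.

```python
from dataclasses import dataclass
from enum import Enum, auto


class PolicyAction(Enum):
    APPLY = auto()
    WAIT = auto()
    FAULT = auto()
    REFRESH_AND_REEVALUATE = auto()


class EvaluationOutcome(Enum):
    """Hypothetical outcome values returned to the generic supervisor."""
    COMPLETED = auto()
    PENDING = auto()
    FAULTED = auto()


@dataclass(frozen=True)
class Decision:
    action: PolicyAction
    obs_state: str
    message: str = ""


class SupervisorSketch:
    def __init__(self, evaluate_policy, refresh_snapshot):
        self._evaluate = evaluate_policy   # callable(snapshot) -> Decision
        self._refresh = refresh_snapshot   # callable() -> refreshed snapshot
        self.applied_state = None
        self.fault_message = None

    def run_cycle(self, snapshot, max_refreshes=1):
        decision = self._evaluate(snapshot)
        refreshes = 0
        # Assumption: refreshes are bounded so that a cycle always terminates.
        while (decision.action is PolicyAction.REFRESH_AND_REEVALUATE
               and refreshes < max_refreshes):
            snapshot = self._refresh()
            decision = self._evaluate(snapshot)
            refreshes += 1
        if decision.action is PolicyAction.APPLY:
            self.applied_state = decision.obs_state
            return EvaluationOutcome.COMPLETED
        if decision.action is PolicyAction.FAULT:
            self.applied_state = "FAULT"
            self.fault_message = decision.message
            return EvaluationOutcome.FAULTED
        # WAIT, or still inconclusive after the allowed refreshes: stay pending.
        return EvaluationOutcome.PENDING
```

Note that the control flow depends only on Decision.action; severity never appears in the dispatch, which is the point made in the preceding paragraph.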

The command context supplied to the policy is the supervisor’s current active context. Internally, the supervisor may retain more than one CommandContext so that a delayed outcome can still be correlated to an older command. However, policy evaluation continues to operate on the single active context that best represents the current supervision focus.

Hard consistency faults

If the policy returns an action of FAULT, the supervisor latches the subarray into FAULT with cause CONSISTENCY.

This condition is reported through:

scanConsistencyErrorFlag = True
scanConsistencyErrorMsg containing the policy diagnostic message

The consistency fault is automatically cleared only when the inconsistency is no longer present and the current fault cause is CONSISTENCY.
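
The latch-and-clear rule can be illustrated with the following sketch. The FaultLatch class and its method names are hypothetical, and the Python attribute names are snake_case stand-ins for scanConsistencyErrorFlag and scanConsistencyErrorMsg.

```python
class FaultLatch:
    """Hypothetical holder for the fault cause and the scan consistency attributes."""

    def __init__(self):
        self.fault_cause = None
        self.scan_consistency_error_flag = False
        self.scan_consistency_error_msg = ""

    def latch_consistency_fault(self, message):
        """Latch FAULT with cause CONSISTENCY and publish the diagnostic message."""
        self.fault_cause = "CONSISTENCY"
        self.scan_consistency_error_flag = True
        self.scan_consistency_error_msg = message

    def maybe_clear(self, inconsistency_present):
        """Clear only a CONSISTENCY fault, and only once the inconsistency is gone."""
        if self.fault_cause == "CONSISTENCY" and not inconsistency_present:
            self.fault_cause = None
            self.scan_consistency_error_flag = False
            self.scan_consistency_error_msg = ""
            return True
        return False
```

Checking the fault cause before clearing ensures that a fault latched for a different reason is never released by the scan consistency logic.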

Medium-severity inconsistencies

If the policy returns MEDIUM severity, the operational behaviour depends on the associated PolicyAction.

A medium-severity condition may either:

be applied immediately as a degraded but valid state (for example, SCANNING with degraded subsystem participation); or
remain pending if the policy determines that the evaluation is not yet conclusive.

In both cases, the diagnostic attributes report the degraded condition.

Low-severity inconsistencies

If the policy reports only LOW-severity inconsistencies and the selected action is APPLY, the final ObsState remains SCANNING and the scan continues normally.

As with medium severity, low severity is diagnostic in nature and should not be interpreted on its own as a control-flow instruction.

Summary

The CSP.LMC scan consistency policy actively validates subsystem behaviour during scanning, adapts the set of required subsystems to the active observing modes, handles PST beams with mode-specific logic, and cooperates with the observation supervisor to:

continue scanning when inconsistencies are tolerable,
mark degraded scans through diagnostic information,
force a transition to FAULT when inconsistencies are non-recoverable.