Aggregation Policy
==================

This section describes how the Task Tracker aggregates subtask updates into a
single command outcome. The aggregation logic is delegated to the
:class:`~ska_csp_lmc_common.commands.aggregation.TaskAggregationPolicy` and
operates on collections of
:class:`~ska_csp_lmc_common.commands.aggregation.SubtaskResult` objects.

Aggregation rules and precedence
--------------------------------

.. note::

   The current aggregation policy represents an initial proposal and is
   expected to be reviewed and refined. Consumers should treat the output
   semantics as stable at a high level (completion vs progress, device lists,
   and presence of contextual fields), but should not rely on the exact
   wording or precedence rules of the aggregated message.

The current policy applies a set of precedence rules to determine the final
outcome. The main goals are:

- ensure critical subsystem failures dominate the task outcome;
- support partial-success semantics for PST group operations;
- report non-critical failures as degraded outcomes where applicable.

The key rules are:

1. **Aborted dominates**

   If any subtask is ``ABORTED``, the overall outcome is ``ABORTED``.

2. **In-progress dominates**

   If any subtask is ``IN_PROGRESS``, the overall outcome is ``IN_PROGRESS``
   (result typically ``STARTED``).

3. **CBF failures dominate**

   If a CBF-associated subtask fails or is rejected, the overall outcome is
   ``FAILED`` / ``REJECTED`` respectively, regardless of other subsystem
   results.

4. **PST group tasks apply quorum-like logic**

   For tasks identified as PST group tasks (based on task name patterns),
   partial success may be treated as acceptable. The current identification
   relies on task naming conventions and may be refined in future iterations.


   - if all PST subtasks fail → overall failure (severity depends on whether
     failures are abnormal);
   - if some PST subtasks succeed → overall ``COMPLETED`` with a ``FAILED``
     result and a message indicating degraded/partial execution.

5. **Remaining failures**

   For failures outside of the above rules:

   - severe failures lead to overall ``FAILED`` with health ``FAILED``;
   - other failures may lead to ``COMPLETED`` with a failure result and
     health ``DEGRADED``.

6. **No failures**

   If no subtask failures are detected, the overall outcome is ``COMPLETED``
   with an ``OK`` result.

CBF rejection result normalization
----------------------------------

When a CBF-associated subtask is rejected, the aggregation policy always
produces an aggregated outcome with:

- ``TaskStatus.REJECTED``;
- a non-``UNKNOWN`` ``ResultCode``.

If the originating CBF subtask reports ``ResultCode.UNKNOWN`` (or does not
provide a meaningful result code), the aggregation policy normalizes the
result to ``ResultCode.REJECTED``. This ensures that aggregated outcomes are
semantically consistent and do not propagate ``UNKNOWN`` result codes in
terminal states.

Any more specific CBF result codes (e.g. ``NOT_ALLOWED``) are preserved when
available.

This normalization currently applies only to CBF rejections and is intended
to avoid ambiguous outcomes (``TaskStatus.REJECTED`` with an ``UNKNOWN``
result code).

.. note::

   At present, result-code normalization is implemented only for CBF
   rejection outcomes. Other subsystems may still propagate ``UNKNOWN``
   result codes depending on the producer behaviour.

Device classification and failure reporting
-------------------------------------------

For reporting and diagnostics, failed devices are classified by subsystem.
Classification is based on substring matching against the device name:

- devices containing ``cbf`` → ``CBF``
- devices containing ``pst`` → ``PST``
- devices containing ``pss`` → ``PSS``
- otherwise → ``OTHER``

The policy builds deterministic device lists:

- duplicates are removed while preserving order;
- final payloads use stable ordering (typically sorted by the tracker).

Failure causes extraction
-------------------------

In addition to device lists, the policy extracts human-readable failure
causes from the subtask messages and normalizes them into a single
consolidated list.

Normalization rules include:

- stripping an outer ``"Causes:"`` header when present;
- flattening bullet lists (lines starting with ``-``);
- removing empty lines;
- removing duplicates while preserving order.

The aggregated message uses a single consolidated ``Causes:`` block, with one
bullet per normalized cause line.

Further details on aggregated message semantics are described in
:ref:`aggregated-message-semantics`.
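
The device classification and failure-cause normalization rules described in
this section can be illustrated with a minimal sketch. It is not the
``TaskAggregationPolicy`` implementation: the function names
(``classify_device``, ``dedupe_preserving_order``, ``normalize_causes``,
``format_causes_block``) and the device names in the usage example are
hypothetical. The sketch also assumes that matching is case-insensitive and
that, when a name contains several markers, the first one in the listed order
wins. Neither point is stated by the policy description.

```python
from typing import Iterable, List

# Assumption: markers are checked in the order listed in the documentation,
# and the first match wins.
_SUBSYSTEM_MARKERS = (("cbf", "CBF"), ("pst", "PST"), ("pss", "PSS"))


def classify_device(device_name: str) -> str:
    """Classify a device by substring match against its name."""
    lowered = device_name.lower()  # assumption: case-insensitive matching
    for marker, subsystem in _SUBSYSTEM_MARKERS:
        if marker in lowered:
            return subsystem
    return "OTHER"


def dedupe_preserving_order(items: Iterable[str]) -> List[str]:
    """Remove duplicates while keeping the first occurrence of each item."""
    # dict keys keep insertion order (guaranteed since Python 3.7).
    return list(dict.fromkeys(items))


def normalize_causes(messages: Iterable[str]) -> List[str]:
    """Flatten subtask messages into one list of unique cause lines."""
    causes: List[str] = []
    for message in messages:
        text = message.strip()
        # Strip an outer "Causes:" header when present.
        if text.startswith("Causes:"):
            text = text[len("Causes:"):]
        for raw_line in text.splitlines():
            line = raw_line.strip()
            # Flatten bullet lines by dropping the leading "-".
            if line.startswith("-"):
                line = line[1:].strip()
            # Skip empty lines.
            if line:
                causes.append(line)
    return dedupe_preserving_order(causes)


def format_causes_block(causes: Iterable[str]) -> str:
    """Render a single consolidated ``Causes:`` block, one bullet per cause."""
    return "\n".join(["Causes:"] + [f"- {cause}" for cause in causes])


if __name__ == "__main__":
    # Hypothetical device names, used only for illustration.
    failed = ["mid-csp/cbf/01", "low-pst/beam/01", "mid-csp/cbf/01"]
    for name in dedupe_preserving_order(failed):
        print(name, classify_device(name))

    messages = [
        "Causes:\n- timeout on cbf\n\n- link down",
        "link down",
    ]
    print(format_causes_block(normalize_causes(messages)))
```

Deduplicating through ``dict.fromkeys`` keeps first-seen order, so repeated
calls on the same input yield the same list. Any final sorting applied by the
tracker would be a separate step.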