Health Diagnostics
Overview
The HealthDiagnostics component is responsible for generating
HealthInfo messages that explain the aggregated HealthState.
It does not participate in the computation of the health condition.
Instead, it interprets an already evaluated HealthState and
produces operator-facing diagnostic information describing the
current system condition.
This design enforces a clear separation between:
Health classification (performed by the evaluation model)
Health explanation (performed by the diagnostics component)
HealthDiagnostics produces only the local diagnostics payload for the
target CSP.LMC device entry. Forwarded subsystem HealthInfo payloads
are merged separately by the supervision/publication layer.
Execution Flow
HealthDiagnostics is executed after the Health Evaluation Model
computes the aggregated HealthState.
Inputs:
A stable snapshot of subsystem HealthSample objects
The final aggregated HealthState
Optional contextual information provided by the model
Output:
A mapping device_fqdn -> List[str] containing the diagnostic messages to be published as HealthInfo.
If the aggregated health state is OK, the diagnostics payload for the
target device is empty ({device_fqdn: []}).
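The inputs and output above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the HealthSample fields (fqdn, health), the HealthState members, the function name generate_health_info, and the message wording are hypothetical and are not taken from the actual implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional, Sequence


class HealthState(Enum):
    # Hypothetical stand-in for the real HealthState enumeration.
    OK = 0
    DEGRADED = 1
    FAILED = 2
    UNKNOWN = 3


@dataclass(frozen=True)
class HealthSample:
    # Hypothetical fields; the real HealthSample may carry different data.
    fqdn: str
    health: HealthState


def generate_health_info(
    device_fqdn: str,
    snapshot: Sequence[HealthSample],
    aggregated: HealthState,
    context: Optional[str] = None,  # accepted but unused in this sketch
) -> Dict[str, List[str]]:
    """Return the local diagnostics payload for device_fqdn."""
    if aggregated is HealthState.OK:
        # An OK aggregated state yields an empty payload for the target device.
        return {device_fqdn: []}
    messages = [
        f"{sample.fqdn} reports HealthState {sample.health.name}"
        for sample in snapshot
        if sample.health is not HealthState.OK
    ]
    return {device_fqdn: messages}
```

The function only reads its arguments and returns a new mapping, so the aggregated HealthState passed in is never modified.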
Design Principles
Deterministic Behaviour
Diagnostics are derived exclusively from the provided snapshot and the computed health state. The component does not manage timing, retries, or transitional states.
Given the same inputs, it produces the same output.
Forced Conditions
The Health Evaluation Model may signal forced operational conditions, such as fault or disabled states.
When such context is provided:
The explicit reason takes precedence.
Snapshot-based diagnostics are not evaluated.
The forced reason becomes the published HealthInfo message.
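The precedence rule for forced conditions could look like the following sketch. The function name select_messages and the idea of passing the snapshot analysis as a callable are assumptions chosen to show that the snapshot is not evaluated when a forced reason is present.

```python
from typing import Callable, List, Optional


def select_messages(
    forced_reason: Optional[str],
    diagnose_snapshot: Callable[[], List[str]],
) -> List[str]:
    """Give an explicit forced reason precedence over snapshot diagnostics."""
    if forced_reason is not None:
        # The forced reason is published as-is; the snapshot is not evaluated.
        return [forced_reason]
    return diagnose_snapshot()
```

Passing the snapshot analysis lazily keeps the forced path free of any snapshot work, which matches the rule that snapshot-based diagnostics are skipped entirely in that case.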
Critical Infrastructure (CBF)
CBF devices represent critical infrastructure within CSP operation.
When a CBF device affects the aggregated health state, its condition is explicitly reflected in the diagnostic output.
If the aggregated health is FAILED and no CBF device is present
in the snapshot, a specific diagnostic message is generated to
indicate the absence of required critical infrastructure.
This ensures that failures or absence of CBF components are always visible to operators.
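A hedged sketch of the CBF rules follows. The is_cbf flag, the function name cbf_messages, and the message texts are hypothetical; the source does not say how CBF devices are identified or how the messages are worded.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Sequence


class HealthState(Enum):
    # Hypothetical stand-in for the real HealthState enumeration.
    OK = 0
    DEGRADED = 1
    FAILED = 2


@dataclass(frozen=True)
class HealthSample:
    # Hypothetical fields for illustration only.
    fqdn: str
    health: HealthState
    is_cbf: bool = False


def cbf_messages(
    snapshot: Sequence[HealthSample], aggregated: HealthState
) -> List[str]:
    """Report non-OK CBF devices, or the absence of any CBF device on FAILED."""
    cbf_samples = [s for s in snapshot if s.is_cbf]
    if aggregated is HealthState.FAILED and not cbf_samples:
        # Required critical infrastructure is missing from the snapshot.
        return ["No CBF device present in snapshot"]
    return [
        f"CBF {s.fqdn} is {s.health.name}"
        for s in cbf_samples
        if s.health is not HealthState.OK
    ]
```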
Non-Critical Components
Non-critical components contribute to diagnostics according to their operational status.
When administratively disabled, they do not influence the
aggregated HealthState and do not generate diagnostic messages.
When administratively online, their state and health condition may contribute to both aggregation and diagnostics.
Their impact on the aggregated health is typically limited.
Issues in non-critical components generally result in a
DEGRADED condition rather than a FAILED state.
Diagnostic messages are generated only when their condition meaningfully affects the aggregated result, ensuring visibility of operational degradation without unnecessary escalation.
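The handling of non-critical components might be sketched as below. The NonCriticalSample type, its admin_enabled and health_ok fields, and the message text are assumptions. The sketch also simplifies "meaningfully affects the aggregated result" to "administratively enabled and not healthy", which may be narrower or broader than the actual rule.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class NonCriticalSample:
    # Hypothetical fields for illustration only.
    fqdn: str
    admin_enabled: bool
    health_ok: bool


def non_critical_messages(snapshot: Sequence[NonCriticalSample]) -> List[str]:
    """Emit messages only for enabled components that are not healthy."""
    return [
        f"{s.fqdn} is not healthy"
        for s in snapshot
        # Administratively disabled components are skipped entirely.
        if s.admin_enabled and not s.health_ok
    ]
```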
Diagnostic Generation
Diagnostic messages are derived from the subsystem snapshot in a manner consistent with the aggregation semantics.
The component:
Identifies components in a problematic state.
Identifies components reporting non-OK HealthState.
Applies rules related to critical infrastructure.
Ensures messages are unique and consistently ordered.
Messages are plain strings intended for direct publication
in the HealthInfo attribute.
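One simple way to make messages unique and consistently ordered is to deduplicate and sort them. The source only requires a consistent order, so the lexicographic sort and the function name normalize_messages are assumptions.

```python
from typing import Iterable, List


def normalize_messages(messages: Iterable[str]) -> List[str]:
    """Drop duplicate messages and return them in a stable, sorted order."""
    return sorted(set(messages))
```

Because the result depends only on the set of messages, the same inputs always give the same list regardless of the order in which the messages were collected.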
Interaction with the Health Evaluation Model
The interaction between the two components follows a strict sequence:
The Health Evaluation Model computes the aggregated state.
The model optionally provides contextual information.
HealthDiagnostics generates explanatory messages.
The supervision layer publishes both attributes.
The diagnostics component never re-evaluates or overrides
the aggregated HealthState.
Operational Characteristics
Stateless: no internal persistence across evaluations.
Deterministic: identical inputs produce identical output.
Snapshot-based: operates only on stable supervision data.
Independent: unaware of supervision timing or transitions.
Summary
HealthDiagnostics translates the aggregated HealthState
into clear, operator-facing diagnostic information.
While the Health Evaluation Model determines what the system health is, the diagnostics component explains why it is in that condition, without influencing the aggregation logic.