Component Snapshots in Health Supervision

Snapshot-based evaluation

Health evaluation is performed on a coherent view of the system rather than on individual asynchronous events.

Subsystem updates (health, operational state, admin mode, observation state) can arrive at different times, so evaluating directly on event streams would produce non-deterministic and potentially misleading intermediate results.

For this reason, the health supervision pipeline evaluates snapshots: stable, point-in-time views of the latest known values for all relevant components.

This approach follows the principles described in "StateStore and snapshot-based evaluation" (evaluation on snapshots backed by thread-safe storage), while extending the stored payload to support health-specific needs.
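The difference between evaluating on events and evaluating on a snapshot can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the component names, the severity ordering, and the "worst state wins" rule are hypothetical and are not taken from the actual pipeline.

```python
from typing import Dict, List, Tuple

# Hypothetical severity ordering, used only for this illustration.
SEVERITY = {"OK": 0, "DEGRADED": 1, "FAILED": 2, "UNKNOWN": 3}


def evaluate(snapshot: Dict[str, str]) -> str:
    """Return the worst health state in the snapshot (illustrative rule)."""
    if not snapshot:
        return "UNKNOWN"
    return max(snapshot.values(), key=SEVERITY.__getitem__)


def latest_values(events: List[Tuple[str, str]]) -> Dict[str, str]:
    """Fold a stream of (component, health) events into last-known values."""
    latest: Dict[str, str] = {}
    for component, health in events:
        latest[component] = health
    return latest


# Hypothetical event stream: "dish" reports FAILED, then recovers.
events = [("dish", "FAILED"), ("correlator", "OK"), ("dish", "OK")]

# Evaluating after the first event alone would report a failure that is
# no longer true once all updates have been applied.
intermediate = evaluate(latest_values(events[:1]))
result = evaluate(latest_values(events))
```

Because evaluate() is a pure function of the snapshot, the same snapshot always yields the same result, regardless of how the events that produced it were interleaved.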

Why component snapshots

Health evaluation requires more context than a single scalar state.

In particular, the diagnostic information reported through HealthInfo may depend on:

  • component health state,

  • component operational state,

  • administrative mode,

  • component role/weight in aggregation logic,

  • optional observation state context.

To support this, the health supervision pipeline stores component snapshots that combine all relevant last-known values into a single, immutable record per component.

This ensures that each evaluation cycle operates on a consistent set of inputs for every component.
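One way to model such a record in Python is a frozen dataclass. The sketch below is illustrative only: the class name, field names, example values, and defaults are assumptions and do not reflect the actual payload definition.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Optional


@dataclass(frozen=True)
class ComponentSnapshot:
    """Last-known values for one component (all names are hypothetical)."""

    name: str
    health_state: Optional[str] = None   # None means "not yet reported"
    op_state: Optional[str] = None
    admin_mode: Optional[str] = None
    weight: float = 1.0                  # role/weight used by aggregation
    obs_state: Optional[str] = None      # optional observation context


snap = ComponentSnapshot(
    name="subarray-1",
    health_state="OK",
    op_state="ON",
    admin_mode="ONLINE",
)

# A frozen dataclass rejects attribute assignment, so a snapshot handed to
# the evaluator cannot change while an evaluation cycle is in progress.
try:
    snap.health_state = "FAILED"
except FrozenInstanceError:
    pass
```

Freezing the record means a change always produces a new snapshot object instead of mutating one that an evaluation may still be reading.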

Partial updates and stability

Subsystem events often update only a subset of the tracked fields (e.g., operational state changes without a health state change).

The snapshot store supports partial updates by merging new information into the previously stored snapshot, preserving existing values when a field is not updated.

This avoids accidental loss of context and prevents evaluations based on incomplete records.
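A merge of this kind can be sketched with dataclasses.replace, which builds a new record from an existing one. The record layout and the convention that None means "field not present in this event" are assumptions for illustration, not the store's actual behavior.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ComponentSnapshot:
    """Reduced, hypothetical record with three tracked fields."""

    health_state: Optional[str] = None
    op_state: Optional[str] = None
    admin_mode: Optional[str] = None


def merge_update(
    previous: ComponentSnapshot, **fields: Optional[str]
) -> ComponentSnapshot:
    """Return a new snapshot with only the reported fields overwritten."""
    # Assumption: a value of None means the event did not carry that field,
    # so the previously stored value is preserved.
    changes = {key: value for key, value in fields.items() if value is not None}
    return replace(previous, **changes)


before = ComponentSnapshot(health_state="OK", op_state="ON", admin_mode="ONLINE")

# The event reports only an operational state change.
after = merge_update(before, op_state="FAULT", health_state=None)
```

Here after keeps the earlier health_state and admin_mode values, and before is left unchanged.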

Revision tracking

To reduce unnecessary evaluations, the store maintains a monotonically increasing revision counter.

The revision is incremented only when the effective snapshot content changes. This allows the supervision layer to skip evaluations when incoming events do not materially affect the stored view of the system.
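The following sketch shows one possible shape for a store with revision tracking. The class and method names are hypothetical, snapshots are simplified to plain dictionaries, and a single lock stands in for whatever thread-safety mechanism the real store uses.

```python
import threading
from typing import Dict, Tuple


class SnapshotStore:
    """Hypothetical store: merges partial updates and tracks a revision."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, Dict[str, str]] = {}
        self._revision = 0

    def update(self, component: str, **fields: str) -> bool:
        """Merge fields into the stored record; return True if it changed."""
        with self._lock:
            current = self._data.get(component, {})
            merged = {**current, **fields}
            if merged == current:
                return False          # no effective change: keep revision
            self._data[component] = merged
            self._revision += 1
            return True

    def snapshot(self) -> Tuple[int, Dict[str, Dict[str, str]]]:
        """Return the revision and a copy of all records, taken atomically."""
        with self._lock:
            return self._revision, {k: dict(v) for k, v in self._data.items()}


store = SnapshotStore()
store.update("dish", health_state="OK")
store.update("dish", health_state="OK")   # duplicate event, no new revision
store.update("dish", op_state="ON")
revision, view = store.snapshot()
```

A supervision loop could remember the last revision it evaluated and skip the cycle when snapshot() returns the same number.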

Summary

Component snapshots provide the data foundation for stable and deterministic health evaluation.

They enable:

  • consistent aggregation across asynchronous subsystem updates,

  • reliable generation of HealthInfo explanations,

  • reduced redundant evaluations through revision tracking,

  • a clear separation between event ingestion and evaluation logic.