Getting started
Installation
- To install this package use:
pip install --extra-index-url=https://artefact.skao.int/repository/pypi-internal/simple ska-sdp-realtime-calibration
The RCalProcessor derives from the SDP Receive Processors class BaseProcessor
and requires a number of SKA SDP packages for real-time data access and
calibration routines.
See the SDP Receive Processors documentation for detailed information about the class and how it interacts with other processing components.
Running the RCal processor
The SDP Receive Processors package provides plasma-processor, the program
used to load and run processors. It loads the user-supplied processor class
and sets up the necessary infrastructure to connect it to the Plasma store.
See the processors documentation for more information.
With access to the Plasma store taken care of, a Data Queues producer can be
enabled for transmission of calibration datasets from a new RCalProcessor:
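from aiokafka import AIOKafkaProducer

# Run inside an async context: AIOKafkaProducer.start() is a coroutine.
producer = AIOKafkaProducer(bootstrap_servers=KAFKA_HOST)
await producer.start()
rcal_processor = RCalProcessor(
    max_calibration_intervals=NUM_TIMESTAMPS,
    rcal_producer=RCalProducer(
        producer,
        bandpass_topic,
    ),
)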
This processor can be used within a Processor runner:
rcal_runner = Runner(
    PLASMA_SOCKET,
    rcal_processor,
    polling_rate=0.001,
    use_sdp_metadata=False,
)
and run alongside the Plasma store and receiver. Another processing component, such as the CBF beamformer, can launch a Data Queues consumer to receive the calibration datasets.
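As a rough illustration of the receiving side, the sketch below collects raw messages from a Kafka topic using aiokafka's AIOKafkaConsumer directly. It is not the Data Queues consumer API: the function name consume_calibration, its parameters, and the choice to return undecoded payloads are all assumptions made for this example, and decoding the payloads depends on the encoding used by the producer, which is not covered here.

```python
async def consume_calibration(
    kafka_host: str, topic: str, max_messages: int = 1
) -> list[bytes]:
    """Collect raw calibration payloads from a Kafka topic (illustrative only)."""
    # Imported lazily so this sketch can be loaded without aiokafka installed.
    from aiokafka import AIOKafkaConsumer

    consumer = AIOKafkaConsumer(topic, bootstrap_servers=kafka_host)
    await consumer.start()
    payloads: list[bytes] = []
    try:
        async for message in consumer:
            # message.value is the serialised dataset as sent by the producer;
            # decoding is deliberately left out of this sketch.
            payloads.append(message.value)
            if len(payloads) >= max_messages:
                break
    finally:
        # Always release the connection, even if iteration fails.
        await consumer.stop()
    return payloads
```

With a broker running, this could be driven with asyncio.run(consume_calibration(KAFKA_HOST, bandpass_topic)), reusing the host and topic names from the producer example.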
Calibration datasets
Data Queues
The RCalProcessor fills a
GainTable dataset
with antenna-based gain solutions of jones_type “B”. These can be
re-channelised to have spectral sampling that is appropriate for CBF
beamforming, as determined by SDP func-python
calibration.beamformer_utils. To help determine what sampling is appropriate,
an additional RCalProcessor constructor argument, array, can be set to
“LOW” or “MID”.
The bandpass Jones matrices are combined with antenna beam matrices (and, in the future, with separate ionospheric delay and differential Faraday rotation fits), then inverted to form correction matrices. The correction matrices can also be scaled to a suitable range for application in CBF. Any matrices with zero weights are set to zero so that CBF interprets them as flagged and excludes the associated voltages from beamforming.
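The combine, invert, scale and flag steps can be sketched for a single antenna and channel as follows. This is a minimal pure-Python illustration, not the RCal implementation: the helper names, the multiplication order (bandpass times beam), and the scalar scale argument are assumptions.

```python
Matrix = tuple[tuple[complex, complex], tuple[complex, complex]]

ZERO: Matrix = ((0j, 0j), (0j, 0j))


def matmul2(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 2x2 complex matrices."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def inverse2(m: Matrix) -> Matrix:
    """Inverse of a 2x2 complex matrix; raises ZeroDivisionError if singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return (
        (m[1][1] / det, -m[0][1] / det),
        (-m[1][0] / det, m[0][0] / det),
    )


def correction_matrix(
    bandpass: Matrix, beam: Matrix, weight: float, scale: float = 1.0
) -> Matrix:
    """Invert the combined Jones matrix, or return zeros when flagged."""
    if weight == 0.0:
        # Zero weight: emit an all-zero matrix so it reads as flagged downstream.
        return ZERO
    combined = matmul2(bandpass, beam)  # assumed ordering: bandpass times beam
    inv = inverse2(combined)
    return tuple(tuple(scale * x for x in row) for row in inv)


# Example with a diagonal bandpass matrix and an identity beam matrix.
bandpass = ((2 + 0j, 0j), (0j, 0.5j))
beam = ((1 + 0j, 0j), (0j, 1 + 0j))
corr = correction_matrix(bandpass, beam, weight=1.0)
flagged = correction_matrix(bandpass, beam, weight=0.0)
```

Multiplying corr by the combined matrix recovers the identity, while flagged is the all-zero matrix.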
The stripped-back xarray dataset that is sent to the Data Queues, for a small test dataset, has the following form:
<xarray.Dataset> Size: 1kB
Dimensions:    (antenna: 4, frequency: 8, receptor1: 2, receptor2: 2)
Coordinates:
  * antenna    (antenna) int64 32B 0 1 2 3
  * frequency  (frequency) float64 64B 1.5e+08 1.508e+08 ... 1.547e+08 1.555e+08
  * receptor1  (receptor1) <U1 8B 'X' 'Y'
  * receptor2  (receptor2) <U1 8B 'X' 'Y'
Data variables:
    gain       (antenna, frequency, receptor1, receptor2) complex64 1kB (-0.1...
Attributes:
    time:             5246337590.452263
    solution_number:  5
    antenna_names:    ['s8-1', 's8-6', 's9-2', 's10-3']
QA Data Queues
RCal now publishes bandpass calibration solutions to a dedicated QA Kafka topic
after each solve. The QA flow is discovered via the SDP Config DB using the
function name vis-receive:rcal-processor:bandpass-calibration-generation, and
configured at runtime via the --qa-kafka-topic CLI argument. A single flow is
shared across all visibility beams. Solutions are published as a GainTable
(time axis squeezed, gains as complex64) with visibility_beam_id, scan_type_id,
time, and solution_number as dataset attributes.
QA Metrics
RCal also publishes calibration QA metrics to Tango. These are under development and will change as we learn more about the system and work with the commissioning team. The current metrics are:
visibility_chisq: the weighted residual visibility variance, with weights taken from the visibility dataset weight array. These weights capture relative variations in inverse noise variance that arise from time and frequency averaging, but they are not tied to an absolute noise level. As a result, the chisq value at the convergence limit will not be one; it will instead be related to the average sample variance level. This will be improved in coming PIs.
bandpass_converged: set to True if calibration succeeds and the visibility_chisq value remains steady. This will also be improved in coming PIs.
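To illustrate why relative weights do not drive visibility_chisq towards one, here is a minimal sketch. It assumes the metric is the weighted mean of squared residual magnitudes, normalised by the sum of weights; the exact normalisation used by RCal is not stated above. The is_steady helper and its tolerance are hypothetical stand-ins for the "remains steady" test.

```python
import random


def visibility_chisq(residuals: list[complex], weights: list[float]) -> float:
    """Weighted mean of |residual|^2 (normalisation by sum of weights is assumed)."""
    total_weight = sum(weights)
    if total_weight == 0.0:
        raise ValueError("all weights are zero")
    return sum(w * abs(r) ** 2 for r, w in zip(residuals, weights)) / total_weight


def is_steady(previous: float, current: float, rel_tol: float = 0.01) -> bool:
    """Hypothetical steadiness check: relative change no larger than rel_tol."""
    return abs(current - previous) <= rel_tol * abs(previous)


rng = random.Random(42)
sigma = 3.0  # noise standard deviation per real/imaginary component
residuals = [
    complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(20000)
]
relative_weights = [1.0] * len(residuals)  # relative only, no absolute noise level

chisq = visibility_chisq(residuals, relative_weights)
# Rescaling every weight by a constant leaves the value unchanged, so the
# result tracks the sample variance (about 2 * sigma**2 here) instead of one.
rescaled = visibility_chisq(residuals, [10.0 * w for w in relative_weights])
```

Only weights expressed as absolute inverse noise variances, with a matching normalisation, would bring a converged solution close to one.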
H5Parm files
Calibration solutions can also be written to HDF5 files in the H5Parm format.
Solutions are appended to the files at the end of each RCal cycle. H5Parm output
is enabled with the output_h5 option. If output_h5=output.h5, the following
files will be created:
output.h5: contains the time-dependent bandpass calibration solutions.
output_combined.h5: contains bandpass matrices multiplied with the station beam matrices at beam centre.
output_final.h5: contains the final inverted solution matrices, including any CBF scaling.
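The naming pattern above can be reproduced with a small helper, for example to locate the files after a run. This is only a sketch of the pattern shown in the list; the function h5parm_outputs is hypothetical, and how RCal derives the names internally is not documented here.

```python
from pathlib import Path


def h5parm_outputs(output_h5: str) -> dict[str, Path]:
    """Return the three file paths implied by an output_h5 value."""
    base = Path(output_h5)
    return {
        "bandpass": base,
        "combined": base.with_name(f"{base.stem}_combined{base.suffix}"),
        "final": base.with_name(f"{base.stem}_final{base.suffix}"),
    }


paths = h5parm_outputs("output.h5")
```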
After opening, e.g. h5file=h5py.File("output.h5", "r"), the files have the
following form:
table = h5file["sol000"]["amplitude000"] # or h5file["sol000"]["phase000"]
table["time"][:] # 1D numpy array of times [MJDS]
table["ant"][:] # 1D numpy array of antenna names
table["freq"][:] # 1D numpy array of frequencies [Hz]
table["pol"][:] # 1D numpy array of polarisations corresponding to [J00, J01, J10, J11]
table["val"][:] # [ntime, nant, nfreq, npol] numpy array of Jones amplitudes or phases
table["weight"][:] # [ntime, nant, nfreq, npol] numpy array of weights
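Two conversions are often useful once the arrays are loaded: turning MJD-second times into calendar dates, and rebuilding complex Jones elements from the amplitude and phase tables. The sketch below uses only the standard library. It assumes phases are stored in radians and ignores leap seconds, and the helper names are illustrative. The time value is taken from the dataset example earlier in this section, on the assumption that it uses the same MJD-seconds convention.

```python
import cmath
from datetime import datetime, timedelta, timezone

# Modified Julian Date zero point: 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def mjds_to_datetime(mjd_seconds: float) -> datetime:
    """Convert MJD seconds to a UTC datetime, ignoring leap seconds."""
    return MJD_EPOCH + timedelta(seconds=mjd_seconds)


def jones_element(amplitude: float, phase: float) -> complex:
    """Rebuild a complex Jones element from amplitude and phase (radians assumed)."""
    return cmath.rect(amplitude, phase)


timestamp = mjds_to_datetime(5246337590.452263)
element = jones_element(2.0, cmath.pi / 2)  # approximately 2j
```

For whole arrays read with h5py, the same combination can be applied element-wise to the amplitude000 and phase000 values.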