Solar Drift Plots

Functions for analysis of antenna flagging history.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.add_dataframe_to_googlesheet(df_antennahist: DataFrame)

Add data to google sheet for df_antenna_flag_hist.

Parameters:

df_antennahist – a pandas dataframe with the antenna flagging history. First argument from running get_next_steps_for_tickets_flagged_antennas().

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.cleanup_data_for_write2googlesheet(df: DataFrame)

Clean up any numpy arrays or timestamps.

This allows the data to be serialised to JSON and then written to a Google Sheet.

Parameters:

df – the dataframe to clean up.

Returns:

the cleaned up dataframe.
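
As a rough illustration of why this cleanup is needed: the standard `json` module cannot serialise array-like objects or datetime values directly. The sketch below is not the package's implementation. `to_json_safe` is a hypothetical helper that converts a single cell value using only the standard library. It is duck-typed on `tolist()`, which numpy arrays also provide, and uses `array.array` as a stand-in so the example runs without third-party packages. The antenna name and values are made up.

```python
import json
from array import array
from datetime import datetime, timezone


def to_json_safe(value):
    """Convert one cell value into something json.dumps accepts (hypothetical helper)."""
    if isinstance(value, datetime):
        # Timestamps become ISO 8601 strings.
        return value.isoformat()
    if hasattr(value, "tolist"):
        # Array-like objects (numpy arrays, array.array) become plain lists.
        return value.tolist()
    return value


row = {
    "antenna": "sb01-01",  # made-up antenna name
    "gains": array("d", [0.9, 1.1]),
    "date": datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc),
}
clean_row = {key: to_json_safe(val) for key, val in row.items()}
payload = json.dumps(clean_row)
```

Applying a conversion like this to every cell of a dataframe is one way to make each row serialisable; the real function may take a different approach.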

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.get_bad_gain_antennas(gains)

Return antenna indices with gains that are too low or too high.

This function has been copied from the calibration.ipynb notebook in the ska-low repo.

Parameters:

gains – a vector with shape (num_antennas, 4) with the gains obtained from calibration solutions for polarizations for a single frequency channel.

Returns:

a numpy array of indices for antennas that have gains that are too high or too low in the XX or YY pols.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.get_data_coarsedelays_intercepts_overtime(station: str, CAL_SUMMARY_BASEDIR: str = '/home/jovyan/shared/calibration_solution_summaries', start_date: str = '', end_date: str = '')

Get the coarse delays of antennas over time (slope of phase/freq plot).

Likewise get the offsets (intercept of phase/freq plot). The data is stored separately for antennas that are very frequently flagged in calibrations. To obtain the coarse delays in nanoseconds, multiply the returned values by -1e3.

Parameters:
  • CAL_SUMMARY_BASEDIR – directory to find calibration solution summaries csv.

  • station – a string for the station you want to plot history for.

  • start_date – a string for a start date if you only want to look at flagging history after this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • end_date – a string for an end date if you only want to look at flagging history prior to this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

Returns:

a tuple with multiple arrays and lists, in this order:
  • a 2D array of shape (# dates, # antennas) containing coarse delays (for XX pol) for frequently flagged antennas

  • a 2D array of shape (# dates, # antennas) containing coarse delays (for YY pol) for frequently flagged antennas

  • a 2D array of shape (# dates, # antennas) containing coarse delays (for XX pol) for all other flagged antennas

  • a 2D array of shape (# dates, # antennas) containing coarse delays (for YY pol) for all other flagged antennas

  • a list with the dates corresponding to the coarse delays saved

  • a 1D array with the mean antenna coarse delay for XX pol

  • a 1D array with the mean antenna coarse delay for YY pol

  • a 1D array with the standard deviation of the antenna coarse delay for XX pol

  • a 1D array with the standard deviation of the antenna coarse delay for YY pol

  • a 1D array with the indices for antennas that are frequently flagged, so they can be mapped to antenna names

  • a 1D array with the indices for antennas that are not in the list of those frequently flagged, so they can be mapped to antenna names
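
The unit conversion described for this function (scaling by -1e3 to get nanoseconds) can be sketched as below. This is only an illustration with made-up slope values. `slopes_to_coarse_delays_ns` is a hypothetical helper, not part of the package, and it operates on a plain list rather than the 2D arrays the function returns.

```python
def slopes_to_coarse_delays_ns(slopes):
    """Scale phase/frequency slopes by -1e3 to get coarse delays in nanoseconds."""
    return [-1e3 * slope for slope in slopes]


# Made-up slope values for three antennas.
example_slopes = [0.002, -0.0005, 0.0]
delays_ns = slopes_to_coarse_delays_ns(example_slopes)
```

With numpy arrays the same scaling would be a single elementwise multiplication.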

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.get_flagged_antenna_reason_stations(start_date: str = '', end_date: str = '', CAL_SUMMARY_BASEDIR: str = '/home/jovyan/shared/calibration_solution_summaries', replace_basedir: bool = False, basedir_data: str = '')

Get details about reason for antenna flagging.

Parameters:
  • start_date – Earliest date to start looking for calibration summaries. Format is %Y-%m-%d

  • end_date – Latest date to look for calibration summaries. Format is %Y-%m-%d

  • replace_basedir – this replaces the base directory (as specified by the dir given as CAL_SUMMARY_BASEDIR) in the string for the directory read from the cal summary sols file. Useful for testing.

  • basedir_data – directory to use as base to look for data, only if replace_basedir = True

  • CAL_SUMMARY_BASEDIR – directory to look for calibration solution summaries.

Returns:

a dataframe with the reasons antennas were flagged over multiple stations and dates for analysis.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.get_history_flagging_separated_by_antennas(station: str, CAL_SUMMARY_BASEDIR: str = '/home/jovyan/shared/calibration_solution_summaries', start_date: str = '', end_date: str = '', replace_basedir: bool = False, basedir_data: str = '')

For a single station, get a dataframe with information about antenna flagging.

Shows reasons for antenna flagging for individual antennas and dates per row. Can be filtered to study behaviour of a single antenna over various dates.

Parameters:
  • station – the name of the station of interest

  • CAL_SUMMARY_BASEDIR – directory to look for calibration solution summaries.

  • start_date – Earliest date to start looking for calibration summaries. Format is %Y-%m-%d %H:%M:%S %z.

  • end_date – Latest date to look for calibration summaries. Format is %Y-%m-%d %H:%M:%S %z.

  • replace_basedir – this replaces the base directory (as specified by the dir given as CAL_SUMMARY_BASEDIR) in the string for the directory read from the cal summary sols file. Useful for testing.

  • basedir_data – directory to use as base to look for data, only if replace_basedir = True

Returns:

a dataframe with information about why antennas were flagged over multiple datetimes.

Raises:

ValueError – if the calibration solution summaries cannot be found in the given directory for CAL_SUMMARY_BASEDIR.
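
Several functions in this module take start_date and end_date strings in the format %Y-%m-%d %H:%M:%S %z. Below is a minimal sketch, using only the standard library, of building such a string and checking that it parses. The date is arbitrary and the variable names are illustrative.

```python
from datetime import datetime, timezone

DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # format quoted in the parameter descriptions

# Build a start_date string from a timezone-aware datetime (arbitrary example date).
start = datetime(2024, 5, 1, 0, 0, 0, tzinfo=timezone.utc)
start_date = start.strftime(DATE_FORMAT)

# Parsing the string back confirms it matches the expected format.
parsed = datetime.strptime(start_date, DATE_FORMAT)
```

Here `start_date` is the string "2024-05-01 00:00:00 +0000". Note that a few functions in this module document the shorter %Y-%m-%d format instead, so check the parameter description of the function being called.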

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.get_information_flagged_antennas(station: str, CAL_SUMMARY_DIR: str)

Get a dataframe that gives info about flagged antennas (one cal summary).

For example, it lists each antenna from a calibration solution for a single station, with a column for the delay measured from the phase-gain solution, a column stating whether the antenna is masked in the Tel Model, and a column stating whether the gain is significantly high or low across channels.

Parameters:
  • station – A string for the station name.

  • CAL_SUMMARY_DIR – The directory for the calibration solutions of interest.

Returns:

a dataframe with some indicators for why an antenna was flagged by calibration solutions.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.get_next_steps_for_tickets_flagged_antennas(df_antenna_flag_hist: DataFrame, df_latest_log_date: str = '')

Take dataframe about antenna flagging reasons and determine next steps.

Ultimately, this identifies which antennas need a ticket raised due to consistently poor behaviour.

Note that early access stations may appear, from the output of this function, to have antennas that are flagged many times. This may not be an accurate representation if there have not been many observations for those stations, in which case an antenna can show a flagging rate close to 100%.

This function should be passed the dataframe output from the function get_flagged_antenna_reason_stations(). It then checks whether flagged antennas have been repeatedly flagged due to high/low gain solutions or large residual delays, and returns a dataframe telling the user whether each antenna needs a ticket, should have its static delay updated in the Tel Model, or should be masked in the Tel Model. Antennas that are simply turned off in the Tel Model at present will be filtered out since they cannot be studied for issues. For antennas that don’t have a consistent history associated with flagging but are flagged due to a large residual delay, the code will check if these antennas have a large spread in their residual delays over time. It will also check if these antennas are associated with flagged antennas that share the same SMARTbox or TPM.

Parameters:
  • df_antenna_flag_hist – pandas DataFrame that contains information about reasons antennas were flagged. For any antenna names that are duplicated in this dataframe, it will filter to have only unique instances of any antenna names flagged for the same reason. If the antenna name appears more than once but the reason for flagging is different, checks will be applied to the antenna based on the multiple reasons available, then the final information about steps for the antenna will be simplified to a single row.

  • df_latest_log_date – a string with the datetime for the last time the Tel Model was updated; this ensures the history about antennas prior to any changes in the Tel Model are not used to make future changes. If left blank, all available history about the antenna performance is used. The format is %Y-%m-%d %H:%M:%S %z.

Returns:

a tuple consisting of:
  • a dataframe with useful flagging information and history for antennas

  • another dataframe that indicates how often antennas have been flagged (mostly) individually, with indicators on whether a Jira ticket should be raised

  • a dataframe that indicates how many times almost all the antennas associated with a smartbox have been flagged

  • a dataframe that indicates how many times almost all the antennas associated with a TPM have been flagged

  • a dataframe that indicates how many times 50% of all the antennas associated with a station have been flagged

Raises:

ValueError – if the dataframe passed to df_antenna_flag_hist does not have the right format (column names).

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.group_details_antenna_flagging(dfs: list = [])

Highlight/group flagged antennas that indicate smartbox/TPM/SPS/Station issues.

The user should pass the following list, returned from running get_next_steps_for_tickets_flagged_antennas, as the argument for dfs: [df_SBs, df_TPMs, df_SPSs, df_Sts]

Parameters:

dfs – list of dataframes that have information about the antennas that have been flagged from calibration solutions.

Returns:

a tuple of (dictionary, list). The dictionary holds dataframes that summarise the flagged status of smartboxes, TPMs, SPS subracks and stations, if the initial list contains dataframes to operate on; otherwise, an empty dataframe is returned. The list holds the names of flagged devices.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.make_multiple_plots_antennas_flagged(df_flagging: DataFrame, dfs_list: list = [])

Loop through dataframe with details on flagged antennas and plot.

The user should pass the following list, returned from running get_next_steps_for_tickets_flagged_antennas, as the argument for dfs_list: [df_SBs, df_TPMs, df_SPSs, df_Sts]

Parameters:
  • df_flagging – the dataframe with flagging info output from get_next_steps_for_tickets_flagged_antennas().

  • dfs_list – list of dataframes that have information about the antennas that have been flagged from calibration solutions.

Returns:

returns a summary dataframe containing useful info about highly flagged devices given a station SFT.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.multiple_operations(group: DataFrame)

Perform operations on groups in dataframe to get summary stats for antenna flagging.

Parameters:

group – the dataframe that should be operated on. This function is ideally called like df.groupby(['whatever you group by']).apply(multiple_operations), which passes each group of the dataframe 'df' to the function after a groupby() call.

Returns:

a pandas series with the operations applied to the dataframe.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.plot_antenna_map_flagged(station: str, flagged_ants_list: list[str], datetime_str: str)

Plot map of flagged antennas from calibration solutions.

Parameters:
  • station – the name of the station.

  • flagged_ants_list – list of flagged antennas from cal sols.

  • datetime_str – string with the datetime of the SFT.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.plot_barchart_antenna_flag_counts_history(station: str, CAL_SUMMARY_BASEDIR: str = '/home/jovyan/shared/calibration_solution_summaries', start_date: str = '', end_date: str = '', verbose: bool = True)

Plot a bar chart to display the number of times an antenna is flagged.

Antennas flagged more than 10% of the time in the observations found in calibration solution summaries will be included in the plot. If no antennas are flagged at least this frequently, no plot will be generated.

Parameters:
  • CAL_SUMMARY_BASEDIR – directory to find calibration solution summaries csv.

  • station – a string for the station you want to plot history for.

  • start_date – a string for a start date if you only want to look at flagging history after this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • end_date – a string for an end date if you only want to look at flagging history prior to this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • verbose – a boolean to print stats from summary_stats_antenna_flag_counts_history that is useful to the user if True. verbose = False suppresses this.
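
The 10% selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the package's code. `frequently_flagged` is a hypothetical helper, the input is assumed to be one list of flagged antenna names per observation, and the antenna names are placeholders.

```python
from collections import Counter


def frequently_flagged(flags_per_observation, fraction=0.1):
    """Return {antenna: count} for antennas flagged in more than `fraction` of observations."""
    n_obs = len(flags_per_observation)
    # set() guards against an antenna being listed twice in one observation.
    counts = Counter(ant for flagged in flags_per_observation for ant in set(flagged))
    return {ant: n for ant, n in counts.items() if n > fraction * n_obs}


# Made-up history: 20 observations, antenna names are placeholders.
history = [["ant-A"]] * 5 + [["ant-A", "ant-B"]] * 2 + [[]] * 13
result = frequently_flagged(history)
```

In this example "ant-A" is flagged in 7 of 20 observations and is kept, while "ant-B" is flagged in exactly 2 of 20 (10%, not more than 10%) and is dropped. An empty result corresponds to the case where no plot would be generated.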

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.plot_chart_antenna_flag_counts_history_over_time(station: str, CAL_SUMMARY_BASEDIR: str = '/home/jovyan/shared/calibration_solution_summaries', start_date: str = '', end_date: str = '', verbose: bool = True)

Plot a chart to display which antennas are being flagged over time.

Antennas flagged more than 10% of the time in the observations found in calibration solution summaries will be included in the plot. If no antennas are flagged at least this frequently, no plot will be generated.

Parameters:
  • CAL_SUMMARY_BASEDIR – directory to find calibration solution summaries csv.

  • station – a string for the station you want to plot history for.

  • start_date – a string for a start date if you only want to look at flagging history after this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • end_date – a string for an end date if you only want to look at flagging history prior to this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • verbose – a boolean to print stats from summary_stats_antenna_flag_counts_history that is useful to the user if True. verbose = False suppresses this.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.plot_coarsedelayantennas_overtime_with_rolling_average(station: str, CAL_SUMMARY_BASEDIR: str = '/home/jovyan/shared/calibration_solution_summaries', start_date: str = '', end_date: str = '', supress_plots: bool = False)

Plot the coarse delays over time individually for antennas in a station.

The plot will also show the mean coarse delay, the rolling average over 5 observations, and the mean coarse delay from the last 30 observations. This function produces a separate plot for every antenna in the station (256 plots), so it may be slow.

The shifts seen in the mean coarse delay may be used to update the Tel Model. The function also returns a dataframe with a suggested shift to the delay to update the Tel Model for each antenna.

The user should be careful about updating the Tel Model based on historical coarse delays from arbitrary dates, because each historical coarse delay reflects the delay measured at the time, relative to the delay in the Tel Model at that datetime. This suggests using only data from observations that occurred after the model was last updated.

Parameters:
  • CAL_SUMMARY_BASEDIR – directory to find calibration solution summaries csv.

  • station – a string for the station you want to plot history for.

  • start_date – a string for a start date if you only want to look at flagging history after this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • end_date – a string for an end date if you only want to look at flagging history prior to this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • supress_plots – a boolean which defaults to False. If set to True, then the plots will not be made for every antenna and only the dataframe summary will be returned.

Returns:

a dataframe that gives the suggested shift in the delay for the Tel Model, with a separate delay for X and Y polarizations.
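
The caution about only using observations taken after the last Tel Model update can be illustrated with a small filter. This sketch does not reproduce how the function derives its suggested shift. `mean_delay_since` is a hypothetical helper, the input is assumed to be (date string, delay) pairs for one antenna and polarization, and all values are made up.

```python
from datetime import datetime
from statistics import mean

DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def mean_delay_since(observations, last_update, last_n=30):
    """Mean delay over at most `last_n` observations taken after `last_update`."""
    cutoff = datetime.strptime(last_update, DATE_FORMAT)
    # Parse the dates once and sort chronologically.
    parsed = sorted((datetime.strptime(d, DATE_FORMAT), delay) for d, delay in observations)
    recent = [delay for when, delay in parsed if when > cutoff]
    if not recent:
        raise ValueError("no observations after the last Tel Model update")
    return mean(recent[-last_n:])


# Made-up (date, coarse delay in ns) pairs for one antenna and polarization.
obs = [
    ("2024-04-01 10:00:00 +0000", 5.0),
    ("2024-05-02 10:00:00 +0000", 1.0),
    ("2024-05-03 10:00:00 +0000", 3.0),
]
recent_mean = mean_delay_since(obs, "2024-05-01 00:00:00 +0000")
```

The April observation predates the assumed update and is excluded, so only the two May values contribute to the mean. In practice the same effect can be obtained by passing a suitable start_date to the function.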

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.plot_coarsedelays_dist_overtime(station: str, CAL_SUMMARY_BASEDIR: str = '/home/jovyan/shared/calibration_solution_summaries', start_date: str = '', end_date: str = '')

Plot a chart to display the coarse delays of antennas over time.

This shows the distribution across all antennas.

Antennas flagged more than 10% of the time in the observations found in calibration solution summaries will be highlighted to distinguish them from antennas that are not frequently flagged in calibration solution summaries.

Parameters:
  • CAL_SUMMARY_BASEDIR – directory to find calibration solution summaries csv.

  • station – a string for the station you want to plot history for.

  • start_date – a string for a start date if you only want to look at flagging history after this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • end_date – a string for an end date if you only want to look at flagging history prior to this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

Raises:

ValueError – if there is no data to plot at all.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.rolling_mean(time_series_data, weights=None, n=5)

Get the rolling mean of a vector with the last n observations.

The rolling mean is not computed for the first n observations; the first n elements of the output are identical to the input time_series_data.

Parameters:
  • time_series_data – the array for which to obtain the rolling mean

  • weights – a set of weights to down/upweight as desired

  • n – the number of observations prior to the ith observation to use for the rolling mean.

Returns:

an array with the rolling mean.
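
The behaviour described above can be sketched in plain Python. This is an unweighted illustration, not the package's implementation. `rolling_mean_sketch` is a hypothetical name, and it assumes the window for element i covers the n observations before i and excludes element i itself, which the description leaves open.

```python
def rolling_mean_sketch(values, n=5):
    """Unweighted rolling mean over the n observations before each element."""
    out = list(values)  # the first n elements stay identical to the input
    for i in range(n, len(values)):
        window = values[i - n:i]  # assumption: the ith value itself is excluded
        out[i] = sum(window) / n
    return out


smoothed = rolling_mean_sketch([1, 2, 3, 4, 5, 6, 7], n=3)
```

With n=3 the first three values pass through unchanged and the fourth output is the mean of the first three inputs. Supporting the weights parameter would mean replacing the plain average with a weighted one.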

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.summary_stats_antenna_flag_counts_history(station: str, CAL_SUMMARY_BASEDIR: str = '/home/jovyan/shared/calibration_solution_summaries', start_date: str = '', end_date: str = '', verbose: bool = True)

Summarise historical data for flagged antennas.

Prints some useful stats and returns a tuple.

Parameters:
  • CAL_SUMMARY_BASEDIR – directory to find calibration solution summaries csv.

  • station – a string for the station you want to summarise history for.

  • start_date – a string for a start date if you only want to look at flagging history after this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • end_date – a string for an end date if you only want to look at flagging history prior to this date. Format is %Y-%m-%d %H:%M:%S %z - default empty string means all dates with data found are used.

  • verbose – a boolean to print information that is useful to the user if True. verbose = False suppresses this.

Returns:

a tuple containing:
  • a dictionary of the SummaryCalSolutions within start_date and end_date for the station

  • a list with the names of antennas that have been flagged more than 10% of the time

  • a list with the counts of flags for the corresponding antennas

  • an integer for the number of counts required to be flagged >10% of the time

  • an integer for the number of antennas that have been flagged >50% of the time

  • the dates for all the SummaryCalSolutions found

Raises:

ValueError – if the path to the csv for calibration solutions doesn’t exist, or if there is no historical data in the calibration solution summaries.

ska_sci_ops_data_analysis.inspect_antenna_flagging_history.useful_flagging_stats_for_masternotebook(start_date: str = '', end_date: str = '', CAL_SUMMARY_BASEDIR: str = '/home/jovyan/shared/calibration_solution_summaries')

Get general details about antenna performance for operators.

The user should choose a date range to look at the antennas that have been flagged a lot from calibration solutions.

Parameters:
  • start_date – Earliest date to start looking for calibration summaries. Format is %Y-%m-%d

  • end_date – Latest date to look for calibration summaries. Format is %Y-%m-%d

  • CAL_SUMMARY_BASEDIR – directory to look for calibration solution summaries.