Memories and Arrays

This section uses the SKAO Tile Beamformer module as an example. This module is part of the Signal Processing System and is responsible for combining channelised signals from 8 antennas into a single phased beam. The Tile Beamformer uses memory to store and reorder the input channelised data frames, providing the spectral regions to the Beamformer. The XML2VHDL XML Register Map description is shown below.

External Memory

<node id="beamformer_fd">
    <node id="date_code" address="0x00" mask="0xFFFFFFFF" permission="r"  hw_rst="0x20210522" description="Compile Date yyyymmdd"/>
    <node id="control"   address="0x04" mask="0x0000000F" permission="rw" hw_rst="no"         description="General control" >
        <node id="reset"                mask="0x00000001" permission="rw" hw_rst="no"         description="General reset"/>
        <node id="load_delay_immediate" mask="0x00000002" permission="rw" hw_rst="no"         description="Load delay table, immediate"/>
        <node id="load_delay"           mask="0x00000004" permission="rw" hw_rst="no"         description="Load delay table, delayed"/>
    </node>
    <node id="load_time" address="0x08" mask="0xFFFFFFFF" permission="rw" hw_rst="no" description="Time for load delay, in frames"/>
    <node id="nof_chans" address="0x0c" mask="0x000001F8" permission="rw" hw_rst="no" description="N.of processed channels"/>
    <node id="tp_sel"    address="0x14" mask="0x000000FF" permission="rw" hw_rst="0"  description="Test point select"/>
    <node id="f2f_latency" address="0x18"  description="F2F latency">
        <node id="count"       mask="0x0000FFFF" permission="r" hw_rst="no" hw_permission="w" description="F2F latency"/>
        <node id="count_start" mask="0x00010000" permission="r" hw_rst="no" hw_permission="w" description="F2F latency count start"/>
        <node id="count_stop"  mask="0x00020000" permission="r" hw_rst="no" hw_permission="w" description="F2F latency count stop"/>
    </node>
    <node id="errors" address="0x1C" description="Error register">
        <node id="fifo_read"         mask="0x00000001" permission="r"  hw_rst="no"  hw_permission="w"               description="FIFO read when empty"/>
        <node id="fifo_write"        mask="0x00000002" permission="r"  hw_rst="no"  hw_permission="w"               description="FIFO written when full"/>
        <node id="tlast_not_aligned" mask="0x00000004" permission="r"  hw_rst="no"  hw_permission="w"               description="Not aligned tlast"/>
        <node id="errors_rst"        mask="0x00000008" permission="rw" hw_rst="0x0" hw_permission="w" hw_prio="bus" description="Reset Error register"/>
    </node>
    <node id="region_off" address="0x80"  mask="0xFFFFFFFF" permission="rw" hw_rst="no" size="16" hw_dp_ram="no" description="Region offset table"/>
    <node id="beam_index" address="0xc0"  mask="0xFFFFFFFF" permission="rw" hw_rst="no" size="16" hw_dp_ram="no" description="Beam index table"/>
    <node id="region_sel" address="0x100" mask="0xFFFFFFFF" permission="rw" hw_rst="no" size="64" hw_dp_ram="no" description="Region select table"/>
    <node id="delay"      address="0x200" array="8" array_offset="0x20" description="Delay table">
        <node id="beam"     address="0x0" array="8" array_offset="4" description="Beam delay">
            <node id="antenna"  address="0x0"   mask="0xFFFFFFFF" permission="w" hw_rst="no" description="Antenna delay"/>
        </node>
    </node>
</node>

The Tile Beamformer requires three external memories providing storage for the following data: region_off, beam_index, and region_sel.

It can be seen that each of these nodes has the following attributes set:

  • size="16" (or size="64" for region_sel): Setting this attribute to a value greater than 1 attaches the node to a dual-port RAM. By default, XML2VHDL generates its own RAM instance (see Internal Memory). To use an external memory instead of the default internal memory, also set:

  • hw_dp_ram="no": With this attribute set to "no", XML2VHDL generates an independent IPbus interface, which the end-user is expected to connect to an external memory component (see External Memory). This component MUST be supplied and managed outside of XML2VHDL.
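
As a quick check of the address space these memories occupy, the sketch below computes the byte range of each one. It assumes that size= counts 32-bit words (4 bytes each), which is consistent with the 64 Kbyte BRAM declared with size="16384" later in this section. The helper name memory_span is illustrative and is not part of XML2VHDL.

```python
BYTES_PER_WORD = 4  # assumption: each memory entry is one 32-bit word


def memory_span(base, size_words):
    """Return the first and last byte address occupied by a memory node."""
    return base, base + size_words * BYTES_PER_WORD - 1


# Values copied from the address= and size= attributes in the XML above.
MEMORIES = {
    "region_off": (0x80, 16),
    "beam_index": (0xC0, 16),
    "region_sel": (0x100, 64),
}

if __name__ == "__main__":
    for name, (base, size) in MEMORIES.items():
        first, last = memory_span(base, size)
        print(f"{name}: 0x{first:03X} to 0x{last:03X}")
```

Under this assumption the three memories sit back to back from 0x080 to 0x1FF, directly below the delay array at 0x200.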

External Memory Wrapper

    --! @brief AXI4 lite slave for slow control
    axi4lite_beamformer_fd_inst : entity beamf_fd_lib.axi4lite_beamformer_fd
        port map(
            axi4lite_aclk                     => axi4lite_aclk,
            axi4lite_aresetn                  => axi4lite_aresetn,
            axi4lite_mosi                     => axi4lite_mosi,
            axi4lite_miso                     => axi4lite_miso,
            ipb_beamformer_fd_region_off_miso => s_region_off_miso,
            ipb_beamformer_fd_region_off_mosi => s_region_off_mosi,
            ipb_beamformer_fd_beam_index_miso => s_beam_index_miso,
            ipb_beamformer_fd_beam_index_mosi => s_beam_index_mosi,
            ipb_beamformer_fd_region_sel_miso => s_region_sel_miso,
            ipb_beamformer_fd_region_sel_mosi => s_region_sel_mosi,
            axi4lite_beamformer_fd_out_we     => s_beamformer_fd_out_we,
            axi4lite_beamformer_fd_out        => s_beamformer_fd_out,
            axi4lite_beamformer_fd_in_we      => s_beamformer_fd_in_we,
            axi4lite_beamformer_fd_in         => s_beamformer_fd_in
        );
    
    --! @brief Beam index, Region selection and region offset table
    ces_beamformer_table_bank_inst : entity beamf_fd_lib.ces_beamformer_table_bank
        generic map(
            g_region_off_tab_addr_w => g_region_off_tab_addr_w,
            g_region_off_tab_data_w => g_region_off_tab_data_w,
            g_region_sel_tab_addr_w => g_region_sel_tab_addr_w,
            g_region_sel_tab_data_w => C_REGION_SEL_TAB_DATA_W,
            g_beam_idx_tab_addr_w   => g_beam_idx_tab_addr_w,
            g_beam_idx_tab_data_w   => g_beam_idx_tab_data_w
        )
        port map(
            axi4lite_aclk                       => axi4lite_aclk,
            main_clk_i                          => axi4s_clk_out_i,
            arstn_i                             => s_rstn,
            ipb_beamformer_fd_region_off_miso_o => s_region_off_miso,
            ipb_beamformer_fd_region_off_mosi_i => s_region_off_mosi,
            ipb_beamformer_fd_beam_index_miso_o => s_beam_index_miso,
            ipb_beamformer_fd_beam_index_mosi_i => s_beam_index_mosi,
            ipb_beamformer_fd_region_sel_miso_o => s_region_sel_miso,
            ipb_beamformer_fd_region_sel_mosi_i => s_region_sel_mosi,
            region_off_table_rden_i             => s_region_off_table_rden,
            region_off_table_addr_i             => s_region_off_table_addr,
            region_off_table_data_o             => s_region_off_table_data,
            region_sel_table_rden_i             => s_region_sel_table_rden,
            region_sel_table_addr_i             => s_region_sel_table_addr,
            region_sel_table_data_o             => s_region_sel_table_data,
            beam_index_table_rden_i             => s_beam_index_table_rden,
            beam_index_table_addr_i             => s_beam_index_table_addr,
            beam_index_table_data_o             => s_beam_index_table_data
        );

The code above shows a snippet of the VHDL wrapper file for the Tile Beamformer module used to instantiate and link the generated XML2VHDL output with the end-user host design.

  • axi4lite_beamformer_fd_inst: The generated VHDL. In addition to the usual AXI4-Lite interface and the expected axi4lite_beamformer_fd_* record structures (see Register Maps), there are additional IPbus interfaces named ipb_beamformer_fd_*. These result from the XML node size= attribute being set to a value greater than 1 together with hw_dp_ram="no". One IPbus interface (a mosi/miso pair) is created for each of the three external memories described above. Using these interfaces, this generated VHDL component is connected to:

  • ces_beamformer_table_bank_inst: The VHDL wrapper file, containing each of the three external memories and the associated logic to write to them. This allows each external memory to be targeted in the address region configured in the original XML description.

    --! @brief Region selection table address management
    s_region_sel_addr <= ipb_beamformer_fd_region_sel_mosi_i.addr(s_region_sel_addr'length+2-1 downto 2);
    s_region_sel_data <= ipb_beamformer_fd_region_sel_mosi_i.wdat(g_region_sel_tab_data_w-1 downto 0);
    s_region_sel_wreq <= ipb_beamformer_fd_region_sel_mosi_i.wreq;
    ipb_beamformer_fd_region_sel_miso_o.wack <= '1';

    --! @brief Region selection table implemented using distributed RAM
    inst_dram_region_sel_table: entity common_mem_lib.common_ram_sdp
       generic map(
            g_wr_dat_width    => g_region_sel_tab_data_w,
            g_wr_adr_width    => g_region_sel_tab_addr_w,
            g_rd_dat_width    => g_region_sel_tab_data_w,
            g_rd_latency      => C_REGION_SEL_T_LATENCY,
            g_fpga_family     => c_technology_fpga_family,
            g_fpga_vendor     => c_technology_fpga_vendor,
            g_implementation  => "distributed",
            g_partition_width => 0
       )
       port map(
            wr_clk           => axi4lite_aclk,
            rd_clk           => main_clk_i,
            wr_en            => s_region_sel_wreq,
            rd_en            => region_sel_table_rden_i,
            wr_adr           => s_region_sel_addr,
            rd_adr           => region_sel_table_addr_i,
            wr_dat           => s_region_sel_data,
            rd_dat           => region_sel_table_data_o
        );

The "region_sel" external memory connects to a common_ram_sdp component. This is a distributed simple dual-port RAM (one write port and one read port), created using a Vivado instantiation template. The VHDL code snippet above shows how the IPbus record interface can be expanded to connect to the ports of the common_ram_sdp.
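
The slice addr(s_region_sel_addr'length+2-1 downto 2) in the snippet drops the two least significant bits of the byte address, turning it into a word index into the table. A minimal Python model of that slice is sketched below. It assumes a 6-bit table address (64 entries, matching size="64"); the actual value of g_region_sel_tab_addr_w is not shown in this section, and the function name is illustrative.

```python
def table_index(byte_addr, addr_width=6):
    """Model of addr(addr_width+2-1 downto 2).

    Drops the two byte-offset bits, then keeps addr_width bits.
    """
    return (byte_addr >> 2) & ((1 << addr_width) - 1)


if __name__ == "__main__":
    # Byte offsets of the first three entries and of the last entry.
    for offset in (0x00, 0x04, 0x08, 0xFC):
        print(f"offset 0x{offset:02X} -> entry {table_index(offset)}")
```

Because the memory base address 0x100 is aligned to the 256-byte table size, the same index results whether the IPbus address carries the full register address or only the offset within the memory.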

This end-user created external memory does not have a read acknowledge. Instead, it has a write acknowledge, which is implemented through the statement:

ipb_beamformer_fd_region_sel_miso_o.wack <= '1';

The expanded signals are used to write into the common_ram_sdp memory. The write port is connected to the Bus-side AXI4-Lite interface, and the read port is connected to the downstream Logic-side, which reads the table during normal operation of the Tile Beamformer.

Caution

It is not possible to read this memory from the AXI4-Lite interface, because there is no read-side connectivity on the Bus-side. If software attempts a read access via the AXI4-Lite Bus-side interface, there WILL NOT be a corresponding read acknowledge, so the read transaction fails and results in an error.

It is the responsibility of the Software to prevent reads to avoid this error condition.

Arrays

<node id="beamformer_fd">
    <node id="date_code" address="0x00" mask="0xFFFFFFFF" permission="r"  hw_rst="0x20210522" description="Compile Date yyyymmdd"/>
    <node id="control"   address="0x04" mask="0x0000000F" permission="rw" hw_rst="no"         description="General control" >
        <node id="reset"                mask="0x00000001" permission="rw" hw_rst="no"         description="General reset"/>
        <node id="load_delay_immediate" mask="0x00000002" permission="rw" hw_rst="no"         description="Load delay table, immediate"/>
        <node id="load_delay"           mask="0x00000004" permission="rw" hw_rst="no"         description="Load delay table, delayed"/>
    </node>
    <node id="load_time" address="0x08" mask="0xFFFFFFFF" permission="rw" hw_rst="no" description="Time for load delay, in frames"/>
    <node id="nof_chans" address="0x0c" mask="0x000001F8" permission="rw" hw_rst="no" description="N.of processed channels"/>
    <node id="tp_sel"    address="0x14" mask="0x000000FF" permission="rw" hw_rst="0"  description="Test point select"/>
    <node id="f2f_latency" address="0x18"  description="F2F latency">
        <node id="count"       mask="0x0000FFFF" permission="r" hw_rst="no" hw_permission="w" description="F2F latency"/>
        <node id="count_start" mask="0x00010000" permission="r" hw_rst="no" hw_permission="w" description="F2F latency count start"/>
        <node id="count_stop"  mask="0x00020000" permission="r" hw_rst="no" hw_permission="w" description="F2F latency count stop"/>
    </node>
    <node id="errors" address="0x1C" description="Error register">
        <node id="fifo_read"         mask="0x00000001" permission="r"  hw_rst="no"  hw_permission="w"               description="FIFO read when empty"/>
        <node id="fifo_write"        mask="0x00000002" permission="r"  hw_rst="no"  hw_permission="w"               description="FIFO written when full"/>
        <node id="tlast_not_aligned" mask="0x00000004" permission="r"  hw_rst="no"  hw_permission="w"               description="Not aligned tlast"/>
        <node id="errors_rst"        mask="0x00000008" permission="rw" hw_rst="0x0" hw_permission="w" hw_prio="bus" description="Reset Error register"/>
    </node>
    <node id="region_off" address="0x80"  mask="0xFFFFFFFF" permission="rw" hw_rst="no" size="16" hw_dp_ram="no" description="Region offset table"/>
    <node id="beam_index" address="0xc0"  mask="0xFFFFFFFF" permission="rw" hw_rst="no" size="16" hw_dp_ram="no" description="Beam index table"/>
    <node id="region_sel" address="0x100" mask="0xFFFFFFFF" permission="rw" hw_rst="no" size="64" hw_dp_ram="no" description="Region select table"/>
    <node id="delay"      address="0x200" array="8" array_offset="0x20" description="Delay table">
        <node id="beam"     address="0x0" array="8" array_offset="4" description="Beam delay">
            <node id="antenna"  address="0x0"   mask="0xFFFFFFFF" permission="w" hw_rst="no" description="Antenna delay"/>
        </node>
    </node>
</node>

The final node description is the antenna-beam-delay table. It is generated using two nested nodes, each with array="8", and with array_offset="0x20" and array_offset="4" respectively.

The total size of the arrays, in terms of 32-bit registers, is calculated by:

\[beam_{ARRAY} \times delay_{ARRAY} = 8 \times 8 = 64\]

Unlike the external memories, array nodes can be written and read from the Bus-side, subject to the permission attribute of each node. However, they are not suitable for storing large amounts of data because they can consume significant FPGA resources.
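
The address of an individual array element follows from the address= and array_offset= attributes at each level. The sketch below assumes the element address is the parent base plus each index multiplied by its array_offset; this is inferred from the XML attributes rather than taken from a documented XML2VHDL formula, and the function name is illustrative.

```python
DELAY_BASE = 0x200    # address= of the "delay" node
DELAY_STRIDE = 0x20   # array_offset= of the "delay" node
BEAM_STRIDE = 0x4     # array_offset= of the "beam" node
ARRAY_LEN = 8         # array="8" on both nodes


def antenna_delay_address(delay_idx, beam_idx):
    """Byte address of delay[delay_idx].beam[beam_idx].antenna."""
    if not (0 <= delay_idx < ARRAY_LEN and 0 <= beam_idx < ARRAY_LEN):
        raise IndexError("array index out of range")
    return DELAY_BASE + delay_idx * DELAY_STRIDE + beam_idx * BEAM_STRIDE


if __name__ == "__main__":
    print(hex(antenna_delay_address(0, 0)))  # first element
    print(hex(antenna_delay_address(7, 7)))  # last element
    print(ARRAY_LEN * ARRAY_LEN)             # total number of registers
```

With these values the 64 registers occupy 0x200 to 0x2FF, and the outer stride of 0x20 bytes is exactly filled by the 8 inner registers of 4 bytes each.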

Caution

Using arrays can also lead to issues, as the address space they occupy could conflict with other locations, causing an error when running the XML2VHDL scripts. Proceed with care; for large contiguous address regions, internal memory is the recommended alternative.
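
One way to spot such a conflict before running the scripts is to compare the byte ranges claimed by each node. The following is a standalone sketch and not part of XML2VHDL: the ranges are derived from the XML above assuming 32-bit entries, and the function name find_overlaps is hypothetical.

```python
from itertools import combinations

# (first byte, last byte) per node, assuming 32-bit (4-byte) entries.
REGIONS = {
    "region_off": (0x080, 0x0BF),
    "beam_index": (0x0C0, 0x0FF),
    "region_sel": (0x100, 0x1FF),
    "delay": (0x200, 0x2FF),
}


def find_overlaps(regions):
    """Return pairs of region names whose byte ranges intersect."""
    clashes = []
    for (a, (a_lo, a_hi)), (b, (b_lo, b_hi)) in combinations(regions.items(), 2):
        if a_lo <= b_hi and b_lo <= a_hi:
            clashes.append((a, b))
    return clashes


if __name__ == "__main__":
    print(find_overlaps(REGIONS))
    # Hypothetical mistake: the delay array placed at 0x1C0 instead of 0x200.
    moved = dict(REGIONS, delay=(0x1C0, 0x2BF))
    print(find_overlaps(moved))
```

The second call reports a clash between region_sel and delay, which is the kind of layout the caution above warns about.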

Internal & External Memory

Using SKAO as an example, both external and internal memories are declared and initialised using XML2VHDL. In this case, the BRAMs are described via the Interconnect shown below.

<node id="tpm_test" address="0x00000000" hw_type="ic">
    <node id="io"        address="0x00000000" link="board"                   hw_type="transparent_ic"/>
    <node id="bram64k"   address="0x01000000" link="bram64k_output.xml"/>
    <node id="bram2k"    address="0x01100000" link="bram2k_output.xml"/>
    <node id="dsp"       address="0x02000000" link="tpm_test_dsp_output.xml" hw_type="transparent_ic"/>
</node>

The two nodes of note are:

  • bram64k: connects a 64 Kbyte BRAM. This is an internal memory, used to store the compressed top-level XML output file (using the HEX file generated by XML2VHDL). This allows the FPGA to self-describe its own address-space.

    The downstream Register Map, named bram64k.xml, is linked to this Interconnect node and describes an internal memory (see Internal Memory):

    <node id="bram64k">
        <node id="size64kbyte"  address="0x0000"  mask="0xFFFFFFFF" permission="rw" hw_rst="no" size="16384" hw_dp_ram="yes"
              hw_dp_ram_init_file="BOARD_XML_HEX_FILE" hw_dp_ram_init_file_format="hex" hw_dp_ram_init_file_check="no"
              description="64 Kbyte BRAM."/>
    </node>
    
  • bram2k: connects a 2 Kbyte BRAM. This is an external memory, which contains metadata about the current firmware build, including:

    • the compile date;

    • version; and

    • the name of the machine that compiled the FPGA Bitstream.

    The downstream Register Map, named bram2k.xml, is linked to this Interconnect node and describes an external memory (see External Memory):

    <node id="bram2k">
        <node id="size2kbyte" address="0x0" mask="0xFFFFFFFF" permission="rw" hw_rst="no" size="512" description="2 Kbyte BRAM."/>
    </node>
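
The absolute address window of each BRAM can be worked out from the Interconnect base address and the Register Map size. The sketch below assumes that size= counts 32-bit words and that the absolute address is the Interconnect node address plus the Register Map node address; the function name bram_window is illustrative.

```python
BYTES_PER_WORD = 4  # assumption: size= counts 32-bit words


def bram_window(ic_base, node_offset, size_words):
    """Return the first and last absolute byte address of a BRAM."""
    first = ic_base + node_offset
    return first, first + size_words * BYTES_PER_WORD - 1


if __name__ == "__main__":
    # Base addresses from the Interconnect, sizes from the Register Maps.
    for name, base, size in (("bram64k", 0x01000000, 16384),
                             ("bram2k", 0x01100000, 512)):
        first, last = bram_window(base, 0x0, size)
        print(f"{name}: 0x{first:08X} to 0x{last:08X} ({size * BYTES_PER_WORD} bytes)")
```

The byte counts come out as 65536 and 2048, matching the 64 Kbyte and 2 Kbyte descriptions of the two nodes.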
    

Attention

<node id="tpm_test" address="0x00000000" hw_type="ic">
    <node id="io"        address="0x00000000" link="board"                   hw_type="transparent_ic"/>
    <node id="bram64k"   address="0x01000000" link="bram64k_output.xml"/>
    <node id="bram2k"    address="0x01100000" link="bram2k_output.xml"/>
    <node id="dsp"       address="0x02000000" link="tpm_test_dsp_output.xml" hw_type="transparent_ic"/>
</node>

It can be seen that hw_type="transparent_ic" is set for the id="io" and id="dsp" nodes. Setting this node attribute specifies that the downstream Register Map or Interconnect is placed on the same hierarchical level as the parent node, as if the transparent node were not present.

In this example, when referencing registers within id="io" or id="dsp", the parent node (the Transparent Interconnect) can be omitted.
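
The effect on register naming can be illustrated with a small sketch. Everything here is hypothetical: the dotted path notation, the inclusion of the root node in the path, and the placement of beamformer_fd beneath dsp are assumptions made for illustration, not documented XML2VHDL behaviour.

```python
def register_path(nodes):
    """Join node ids into a dotted path, skipping transparent interconnects.

    nodes is a list of (node_id, is_transparent) pairs from root to leaf.
    """
    return ".".join(node_id for node_id, transparent in nodes if not transparent)


if __name__ == "__main__":
    # Hypothetical hierarchy: tpm_test -> dsp (transparent) -> beamformer_fd -> control
    hierarchy = [
        ("tpm_test", False),
        ("dsp", True),
        ("beamformer_fd", False),
        ("control", False),
    ]
    print(register_path(hierarchy))
```

In this sketch the dsp level does not appear in the resulting path, which mirrors the statement that the transparent parent node can be omitted.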

    --***********************************************************************
    -- AXI4LITE ROOT IC: This IC drives the IO section, DSP section and BRAM
    --***********************************************************************
    axi4lite_root_ic_inst: entity tpm_test_lib.axi4lite_tpm_test_ic
        port map(
            axi4lite_aclk     => c2c_mm_clk,
            axi4lite_aresetn  => c2c_mm_rstn_sync,
            axi4lite_mosi     => axi4lite_root_mosi,
            axi4lite_mosi_arr => axi4lite_mosi_arr,
            axi4lite_miso_arr => axi4lite_miso_arr,
            axi4lite_miso     => axi4lite_root_miso
        );
    
    --***********************************************************************
    -- AXI4LITE BRAM: This BRAM contains the XML, do not remove
    --***********************************************************************
    axi4lite_bram64k_inst: entity tpm_test_lib.axi4lite_bram64k
        port map (
            axi4lite_aclk           => c2c_mm_clk,
            axi4lite_aresetn        => c2c_mm_rstn_sync,
            axi4lite_mosi           => axi4lite_mosi_arr(axi4lite_mmap_get_id(id_bram64k)),
            axi4lite_miso           => axi4lite_miso_arr(axi4lite_mmap_get_id(id_bram64k)),
            bram64k_size64kbyte_clk => c2c_mm_clk
        );
    
    --***********************************************************************
    -- AXI4LITE BRAM: This BRAM contains the extended info, do not remove
    --***********************************************************************
    extended_info_bram_inst : entity tpm_test_lib.extended_info_bram
        port map (
            axi4lite_aclk    => c2c_mm_clk,
            axi4lite_aresetn => c2c_mm_rstn_sync,
            axi4lite_mosi    => axi4lite_mosi_arr(axi4lite_mmap_get_id(id_bram2k)),
            axi4lite_miso    => axi4lite_miso_arr(axi4lite_mmap_get_id(id_bram2k))
        );

As id="bram2k" generates an external memory, additional logic is required to connect it to FPGA primitives. In this example a RAMB18E2 (see Links) is used, as the contents of this BRAM can be initialised using post-write-bitstream hooks in Vivado.