RTL Kernels

In the Vitis application acceleration development flow, RTL IP from the Vivado® Design Suite can be packaged as XO files and linked into an FPGA executable (.xclbin), provided the IP adheres to the Vivado IP packaging guidelines and to the requirements of the Vitis compiler for linking the system.

As explained in Kernel Properties, RTL kernels can be user-managed kernels that do not adhere to XRT requirements for execution control, but rather implement any number of possible control schemes specified by existing RTL designs. Alternatively, RTL kernels can adhere to the requirements of the ap_ctrl_chain or ap_ctrl_hs control protocols needed for XRT-managed kernels.

The following sections describe the kernel interface requirements for the Vitis compiler to link kernels into a system. These requirements are common to software controllable and non-software controlled kernels. The control requirements for XRT-managed kernels are also described, as well as any additional requirements. Finally, the development flow is described to help you package RTL IP in the Vivado® Design Suite as RTL kernels for use in the Vitis environment.

Requirements of an RTL Kernel

To be integrated into the Vitis tool flow, an RTL module must minimally meet the requirements enumerated in Kernel Interface Requirements. The need to meet the kernel interface requirements applies to both XRT-managed and user-managed kernels.

In addition, XRT-managed kernels must satisfy the requirements described in Control Requirements for XRT-Managed Kernels to be executed and profiled by XRT.

User-managed kernels must have the signal interfaces needed by the Vitis compiler to allow it to link the kernels to other kernels and to the target platform, but do not need to adhere to the strict execution protocol of XRT. In this way, existing RTL IP can be more rapidly and simply integrated into the Vitis environment.

It might be necessary to revise your RTL module to meet the kernel requirements outlined in the following sections.

Kernel Interface Requirements

To enable the Vitis compiler to connect kernels into the target platform, an RTL kernel must adhere to the requirements described in Kernel Properties. The various interface requirements are summarized in the following table.

IMPORTANT: In some cases, the port names must be defined exactly as shown.
Table 1. RTL Kernel Interface and Port Requirements
Port or Interface Description Comment
Clock One or more clock inputs.
  • At least one clock is required for the kernel (see note 1 below the table).
  • Can be named anything, but must be packaged with a bus interface.
    IMPORTANT: All ports in the RTL IP must be associated with an interface when packaging the RTL for use in the Vitis environment. If this is not the case, an error similar to the following occurs:
    ERROR: UNDEF When packaging for Vitis, pins that are not part of an interface are not supported
Reset Primary active-Low reset input port
  • Optional port.
  • Can be named anything, but must be associated with a Clock signal through the ASSOCIATED_RESET property on the Clock.
  • This signal should be internally pipelined to improve timing.
  • The signal is driven by a synchronous reset in the associated Clock domain.
interrupt Active-High interrupt.
  • Optional port.
  • When used, the name must be exactly as shown.
s_axi_control One (and only one) AXI4-Lite slave control interface
  • Generally required. The s_axilite interface can be omitted in some cases that use AXI4-Stream interfaces, and it is not required for non-software controlled kernels.
  • When used, the name must be exactly as shown, and is case-sensitive.
AXI4_Memory Mapped Interface (m_axi) AXI4 memory mapped interfaces for global memory access
  • Optional port.
  • All AXI4 memory mapped interfaces must have 64-bit addresses (32 bits on Zynq-7000 devices).
  • The RTL kernel developer is responsible for partitioning global memory spaces. Each partition in the global memory becomes a kernel argument. The memory offset for each partition must be provided by the SW applications to the kernel through a register in the AXI4-Lite interface.
  • AXI4 memory mapped interfaces must not use WRAP or FIXED burst types and must not use narrow (sub-size) bursts. This means that AxSIZE should match the width of the AXI data bus.
  • Any user logic or RTL code that does not conform to the requirements above must be wrapped or bridged to satisfy these requirements.
AXI4_STREAM (axis) AXI4-Stream interfaces for one-way data transfers between kernels or between the host application and kernels.
  • Optional port.
  • Cannot be used with bi-directional ports.
  • Use the STREAM interface template in the Vivado Design Suite.
  • Refer to AXI4-Stream Interfaces in Vitis High-Level Synthesis User Guide (UG1399) for additional information on interface requirements.
  1. The clock requirements listed here are for newer platform shells which include fixed clocks as discussed in Managing Clock Frequencies. RTL kernels for use on legacy platforms support two clocks named ap_clk and ap_clk_2 specifically, and two optional resets named ap_rst_n and ap_rst_n_2.
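
The burst rules for m_axi interfaces in the table can be expressed as a small check. The following Python sketch is illustrative only: the function name check_axi_burst and its arguments are hypothetical, and it assumes the standard AXI encodings (AxBURST 0b00 = FIXED, 0b01 = INCR, 0b10 = WRAP, and 2**AxSIZE bytes per beat).

```python
# Standard AXI4 AxBURST encodings.
BURST_FIXED, BURST_INCR, BURST_WRAP = 0b00, 0b01, 0b10


def check_axi_burst(data_width_bits, axsize, axburst):
    """Return a list of rule violations for one AXI4 address-channel request.

    Hypothetical helper: mirrors the m_axi rules in Table 1 (no WRAP or FIXED
    bursts, no narrow bursts).
    """
    problems = []
    if axburst != BURST_INCR:
        problems.append("burst type must be INCR (WRAP and FIXED are not allowed)")
    bytes_per_beat = 1 << axsize          # AxSIZE encodes 2**AxSIZE bytes per beat
    if bytes_per_beat * 8 != data_width_bits:
        problems.append(
            f"narrow burst: AxSIZE gives {bytes_per_beat * 8} bits, "
            f"bus is {data_width_bits} bits"
        )
    return problems


# A 512-bit bus needs AxSIZE = 6 (64 bytes per beat).
print(check_axi_burst(512, 6, BURST_INCR))   # no violations
print(check_axi_burst(512, 2, BURST_WRAP))   # two violations
```

A check like this could sit in a test bench monitor or a trace post-processor; it does not replace wrapping or bridging non-conforming logic.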

Control Requirements for XRT-Managed Kernels

IMPORTANT: User-managed kernels do not require the control registers and signals described below, but they can implement a control structure using registers in an s_axilite interface as discussed in Creating User-Managed RTL Kernels. If your RTL module implements a different control structure, you can either define it as a user_managed kernel or adapt it to conform to the XRT-managed requirements described here.

The following table outlines the required register map for an XRT-managed kernel to be used within the Vitis tools and XRT. The control register is required by kernels that specify ap_ctrl_hs and ap_ctrl_chain control protocols as described in Execution Modes. Kernels that implement ap_ctrl_none and user_managed control protocols do not require the control registers described below.

TIP: The interrupt related registers are only required for designs that implement interrupts.

All user-defined registers must begin at location 0x10; locations below this are reserved. These include registers for kernel arguments such as scalar values and address offsets passed to memory mapped interfaces.

Table 2. Register Address Map
Offset Name Description
0x0 Control Controls and provides kernel status.
0x4 Global Interrupt Enable Used to enable interrupt to the host.
0x8 IP Interrupt Enable Used to control which IP generated signals are used to generate an interrupt.
0xC IP Interrupt Status Provides interrupt status.
0x10 Kernel arguments This would include scalars and global memory arguments for example.
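
The reserved area below 0x10 can be checked mechanically before a register map is committed to RTL. The Python sketch below is a minimal example under stated assumptions: check_user_registers is a hypothetical helper, the argument names and offsets are made up, and the 32-bit alignment rule is an assumption of this sketch rather than something stated above.

```python
# Reserved offsets from the register address map.
CONTROL, GIE, IP_IER, IP_ISR = 0x0, 0x4, 0x8, 0xC
FIRST_USER_OFFSET = 0x10


def check_user_registers(registers):
    """Check a list of (name, offset, size_bits) user register definitions.

    Returns a list of problems: registers placed in the reserved area,
    offsets that are not 32-bit aligned (an assumption of this sketch),
    and overlapping registers.
    """
    problems = []
    previous_name, previous_end = None, 0
    for name, offset, size_bits in sorted(registers, key=lambda r: r[1]):
        if offset < FIRST_USER_OFFSET:
            problems.append(f"{name}: offset {offset:#x} is in the reserved area")
        if offset % 4:
            problems.append(f"{name}: offset {offset:#x} is not 32-bit aligned")
        if previous_name is not None and offset < previous_end:
            problems.append(f"{name}: overlaps {previous_name}")
        previous_name, previous_end = name, offset + size_bits // 8
    return problems


good = [("scalar00", 0x10, 32), ("A", 0x18, 64), ("B", 0x20, 64)]
bad = [("scalar00", 0x0C, 32), ("A", 0x10, 64), ("B", 0x14, 64)]
print(check_user_registers(good))   # no problems
print(check_user_registers(bad))    # reserved-area and overlap problems
```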

The following table shows the control signals that are accessed through the control register (offset 0x0). Which of these signals are used depends on the kernel execution mode, ap_ctrl_hs or ap_ctrl_chain.

The available signals are used by the different control protocols as explained in Supported Kernel Execution Models in the XRT documentation. For example, for the sequential execution mode ap_ctrl_hs, the host typically writes 0x00000001 to the control register at offset 0x0, which sets bit 0 (ap_start) and clears bits 1 and 2, and then polls the register until ap_done reads as 1.
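
The start-and-poll sequence for ap_ctrl_hs can be sketched in Python against a software stand-in for the control register. Everything here is an assumption made for illustration: MockKernel, run_once, and the fixed completion delay are hypothetical, and real host code would go through XRT rather than a mock.

```python
AP_START, AP_DONE, AP_IDLE = 1 << 0, 1 << 1, 1 << 2
CONTROL_OFFSET = 0x0


class MockKernel:
    """Software stand-in for a kernel's control register (not real hardware).

    Assumes the kernel finishes a fixed number of register reads after
    ap_start is written.
    """

    def __init__(self, reads_until_done=3):
        self.reads_until_done = reads_until_done
        self.control = AP_IDLE
        self.remaining = 0

    def write(self, offset, value):
        if offset == CONTROL_OFFSET and value & AP_START:
            self.control = AP_START          # running: start set, idle cleared
            self.remaining = self.reads_until_done

    def read(self, offset):
        if offset != CONTROL_OFFSET:
            return 0
        if self.control & AP_START:
            self.remaining -= 1
            if self.remaining <= 0:
                self.control = AP_DONE | AP_IDLE   # ap_start cleared on done
        value = self.control
        self.control &= ~AP_DONE                   # ap_done is cleared on read
        return value


def run_once(kernel, max_polls=100):
    """Start the kernel and poll ap_done; return the number of polls used."""
    kernel.write(CONTROL_OFFSET, 0x00000001)       # set ap_start only
    for polls in range(1, max_polls + 1):
        if kernel.read(CONTROL_OFFSET) & AP_DONE:
            return polls
    raise TimeoutError("kernel did not assert ap_done")


print(run_once(MockKernel(reads_until_done=3)))
```

Because ap_done is cleared on read, the polling loop must act on the value it just read rather than reading the register a second time.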

Table 3. Control Register Signals
Bit Name Description
0 ap_start Asserted when the kernel can start processing data. Cleared on handshake with ap_done being asserted.
1 ap_done Asserted when the kernel has completed operation. Cleared on read.
2 ap_idle Asserted when the kernel is idle.
3 ap_ready Asserted by the kernel when it is ready to accept new data.
4 ap_continue Asserted by XRT to allow the kernel to keep running.
7 auto_restart Used to enable automatic kernel restart as described in Streaming Data in User-Managed Never-Ending Kernels.
31:5 Reserved Reserved (except bit 7, auto_restart, listed above)
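
When debugging, it can help to turn a raw control register value into signal names. The following Python sketch uses the bit positions from Table 3; the decode_control helper is hypothetical.

```python
# Bit positions of the control register signals (offset 0x0).
CONTROL_BITS = {
    0: "ap_start",
    1: "ap_done",
    2: "ap_idle",
    3: "ap_ready",
    4: "ap_continue",
    7: "auto_restart",
}


def decode_control(value):
    """Return the names of the control signals set in a 32-bit register value."""
    return [name for bit, name in CONTROL_BITS.items() if value & (1 << bit)]


print(decode_control(0x00000001))  # ['ap_start']
print(decode_control(0x00000006))  # ['ap_done', 'ap_idle']
print(decode_control(0x00000081))  # ['ap_start', 'auto_restart']
```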

The following interrupt related registers are only required if the kernel has an interrupt.

Table 4. Global Interrupt Enable (0x4)
Bit Name Description
0 Global Interrupt Enable When asserted, along with the IP Interrupt Enable bit, the interrupt is enabled.
31:1 Reserved Reserved
Table 5. IP Interrupt Enable (0x8)
Bit Name Description
0 Interrupt Enable When asserted, along with the Global Interrupt Enable bit, the interrupt is enabled.
31:1 Reserved Reserved
Table 6. IP Interrupt Status (0xC)
Bit Name Description
0 Interrupt Status Toggles on write. Writing a 1 clears a pending interrupt.
31:1 Reserved Reserved

Interrupt

XRT-managed RTL kernels can optionally have an interrupt port containing a single interrupt. The port must be named interrupt and must be active-High. The interrupt is enabled when both the global interrupt enable (GIE) and interrupt enable register (IER) bits are asserted in the Control Register block.

By default, the IER uses the internal ap_done signal to trigger an interrupt. Further, the interrupt is cleared only when writing a 1 to bit-0 of the IP Interrupt Status Register.

This logic should be reflected in the Verilog code for the RTL kernel, and also in the associated component.xml and kernel.xml files. The kernel.xml file is stored inside the kernel.xo file and is generated automatically when using the package_xo command or RTL Kernel Wizard.
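
The interrupt behavior described above can be modeled in a few lines of Python before it is written in Verilog. This is a behavioral sketch only: the InterruptModel class is hypothetical, and it assumes the status bit is latched on ap_done only while the IER bit is set, which your RTL might implement differently.

```python
class InterruptModel:
    """Behavioral model of the interrupt registers (illustrative only)."""

    def __init__(self):
        self.gie = 0   # Global Interrupt Enable, offset 0x4, bit 0
        self.ier = 0   # IP Interrupt Enable, offset 0x8, bit 0
        self.isr = 0   # IP Interrupt Status, offset 0xC, bit 0

    def on_ap_done(self):
        """Record an ap_done event; it is latched only if the IER bit is set."""
        if self.ier:
            self.isr = 1

    def write_isr(self, value):
        """Writing 1 to bit 0 toggles the status bit, clearing a pending one."""
        self.isr ^= value & 1

    @property
    def interrupt(self):
        """Level of the active-High interrupt port."""
        return self.gie & self.isr


model = InterruptModel()
model.gie = 1
model.ier = 1
model.on_ap_done()
print(model.interrupt)   # interrupt pending
model.write_isr(1)
print(model.interrupt)   # cleared by writing 1 to bit 0
```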

Creating User-Managed RTL Kernels

If your RTL IP does not satisfy the AXI interface requirements for the Vitis compiler as outlined in Kernel Interface Requirements, you must modify the IP to implement the required interfaces. However, if your RTL IP does not satisfy the XRT control protocols of ap_ctrl_hs or ap_ctrl_chain, you can define it as a user-managed kernel rather than having to rewrite your IP.

A user-managed kernel does not need to satisfy the control requirements of XRT, and can implement any of a variety of execution mechanisms. User-managed kernels are meant to let you take advantage of the system building capabilities of the Vitis compiler, while letting your kernel implement your own control scheme. There is no prescribed method of starting or stopping, or otherwise controlling your kernel. This is largely up to you, and the specific requirements of your application or system. Some of the available control schemes include:

  • Accessing registers through an s_axilite control interface, similar to the method used by XRT though open to your own implementation
  • Accessing the hardware through software drivers, such as UIO drivers, implemented in your host application
  • Triggering the start or stop response of your kernel from a signal provided by a separate component, or from another kernel
  • Providing a data-driven approach, as described in Streaming Data in User-Managed Never-Ending Kernels
IMPORTANT: One limitation of implementing a control register in an s_axilite interface for a user-managed kernel is that the control register cannot be named CTRL. That name is specifically reserved for XRT-managed kernels, and returns a Critical Warning when found on a user-managed kernel, or ap_ctrl_none kernel.
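
The naming restriction on CTRL is easy to check in a script that prepares a register list for packaging. The Python sketch below is hypothetical throughout (check_register_names is not a Vitis or Vivado API), and it assumes an exact, case-sensitive match on the name CTRL.

```python
XRT_MANAGED = {"ap_ctrl_hs", "ap_ctrl_chain"}
NOT_XRT_MANAGED = {"user_managed", "ap_ctrl_none"}


def check_register_names(control_protocol, register_names):
    """Return warnings for register names that clash with the chosen protocol."""
    if control_protocol not in XRT_MANAGED | NOT_XRT_MANAGED:
        raise ValueError(f"unknown control protocol: {control_protocol}")
    warnings = []
    # CTRL is reserved for XRT-managed kernels (exact match assumed here).
    if control_protocol in NOT_XRT_MANAGED and "CTRL" in register_names:
        warnings.append(
            f"CTRL is reserved for XRT-managed kernels, not {control_protocol}"
        )
    return warnings


print(check_register_names("user_managed", ["CTRL", "A"]))      # one warning
print(check_register_names("user_managed", ["USER_CTRL", "A"]))  # no warnings
print(check_register_names("ap_ctrl_hs", ["CTRL", "A"]))         # no warnings
```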

RTL Kernel Development Flow

This section explains the process of creating RTL kernels using the Package IP feature inside the Vivado Design Suite. The Package IP command provides a Package for Vitis option which greatly simplifies packaging an existing RTL IP as a Xilinx Object (XO) file for use in the Vitis environment.

The packaged XO file is a container encapsulating the Vivado IP object (including source files) and associated kernel XML file. Using the Vitis compiler, the XO file can be combined with other kernels, and linked with the target platform and built for hardware or hardware emulation flows.

The Package for Vitis feature provides DRCs to check the completeness of the packaged IP prior to generating the XO file, and also automates the package_xo command to simplify the production of the packaged RTL kernel.

Package the RTL Code as a Vitis XO

IMPORTANT: The RTL IP should first be thoroughly verified with traditional RTL verification methods before being packaged as a kernel.

As discussed in Kernel Interface Requirements, the RTL kernel must be packaged with the following required interfaces:

  • The AXI4-Lite interface name must be packaged as S_AXI_CONTROL, but the underlying AXI ports can be named differently.
  • Any memory-mapped AXI4 interfaces must be packaged as AXI4 master endpoints with 64-bit address support.
    Note: Xilinx strongly recommends that AXI4 interfaces be packaged with the AXI metadata HAS_BURST=0 and SUPPORTS_NARROW_BURST=0. These properties can be set in an IP-level bd.tcl file. They indicate that the WRAP and FIXED burst types are not used and that narrow (sub-size) bursts are not used.
  • You can also implement the AXI4-Stream interface.
  • At least one clock is required for the kernel, though it can support multiple clocks.
    • Each clock must have an associated Bus Interface identifying it as a clock.
    • Each clock can have an optional active-Low reset, specified by the ASSOCIATED_RESET property on the clock.
    • A clock must be associated with each AXI4-Lite, AXI4, and AXI4-Stream interface on the kernel.
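
The clock and reset rules in the list above lend themselves to a quick self-check on a description of the kernel's interfaces. The Python sketch below is a hypothetical pre-packaging checklist, not part of the Vivado DRCs; the interface and clock names in the example are placeholders.

```python
def check_kernel_interfaces(clocks, resets, axi_interfaces):
    """Check packaging rules for a kernel description.

    clocks:         set of clock names
    resets:         dict mapping reset name -> associated clock name
    axi_interfaces: dict mapping AXI interface name -> associated clock (or None)
    """
    problems = []
    if not clocks:
        problems.append("at least one clock is required")
    for reset, clock in resets.items():
        if clock not in clocks:
            problems.append(f"reset {reset} is not associated with a known clock")
    for interface, clock in axi_interfaces.items():
        if clock not in clocks:
            problems.append(f"interface {interface} has no associated clock")
    return problems


print(check_kernel_interfaces(
    clocks={"ap_clk"},
    resets={"ap_rst_n": "ap_clk"},
    axi_interfaces={"s_axi_control": "ap_clk", "m00_axi": "ap_clk", "m01_axi": None},
))   # reports m01_axi as missing a clock association
```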

To package the IP, use the following steps:

  1. Create and package a new IP.
    1. From a Vivado project, with your RTL source files added, select Tools > Create and Package New IP.
    2. Select Package your current project, and click Next.
    3. Specify the location for your packaged IP. You can select the default location, or choose a different location.
    4. Review the Summary page and click Finish to open the Package IP window.

    The Package IP window opens to display the Identification page. For details on working with the IP packager in the Vivado tool, refer to the Vivado Design Suite User Guide: Creating and Packaging Custom IP (UG1118).

  2. Select Compatibility under the Packaging Steps. This displays the dialog box as shown in the following figure.

    1. Select the Package for Vitis check box to enable the process of packaging the RTL IP as an XO for use in the Vitis environment.
    2. Select the Control Protocol for the RTL Kernel. This determines the control mechanism used to operate the kernel. The choices are:
      • user_managed: Defines a SW-controllable kernel, that is user-managed rather than XRT-managed. This is the preferred option. Refer to Creating User-Managed RTL Kernels for additional information.
      • ap_ctrl_hs: This is the default, and specifies the simple sequential execution model for XRT-managed kernels as described in SW-Controllable Kernels.
      • ap_ctrl_chain: Specifies a pipelined execution model for XRT-managed kernels.
      • ap_ctrl_none: Indicates no control protocol as described in Non-Software Controlled Kernels.
    3. Check to ensure that both Package for IPI and Ignore Freq_Hz are enabled as well.

    Enabling these check boxes enables design rule checks (DRC) that the ipx::check_integrity command runs prior to packaging the IP and generating the XO. The DRCs include checks for required signals as described in Requirements of an RTL Kernel, and checks for control protocols and registers for XRT-managed kernels. As shown in the figure above, any issues are reported to the Package IP tool as they are encountered.

  3. Associate the clock to the AXI interfaces.

    In the Ports and Interfaces step of the Package IP window, you can associate the primary kernel clock with the AXI4 interfaces, and with the reset signal if needed.

    1. Right-click an AXI4 interface, and select Associate Clocks.

      This opens the Associate Clocks dialog box which lists any identified clock signals.

    2. Select the appropriate clock and click OK to associate it with the interface.
    3. Repeat this step to associate a clock signal with each of the AXI interfaces.
  4. Click the Addressing and Memory step to add control registers and offsets.

    XRT-managed kernels using the ap_ctrl_hs or ap_ctrl_chain control protocol require control registers as discussed in Control Requirements for XRT-Managed Kernels. The following table shows a list of the required registers.

    TIP: While ap_ctrl_none and user_managed control protocols do not require control registers, they can still use them if an s_axilite interface is included as part of the RTL design. In this case, the specific registers can differ from the table below, but the process of assigning names, offsets, and widths is the same.
    Table 7. Address Map
    Register Name Description Address Offset Size
    CTRL Control Signals. 0x000 32
    IMPORTANT: The CTRL register and <kernel_args> are required on all kernels. The interrupt related registers are only required for designs with interrupts.
    GIER Global Interrupt Enable Register. Used to enable interrupt to the host. 0x004 32
    IP_IER IP Interrupt Enable Register. Used to control which IP generated signals are used to generate an interrupt. 0x008 32
    IP_ISR IP Interrupt Status Register. Provides interrupt status. 0x00C 32
    <kernel_args> This includes a separate entry for each kernel argument as needed on the software function interface. All user-defined registers must begin at location 0x10; locations below this are reserved. 0x010 32/64

    Scalar arguments are 32 bits wide. m_axi and axis interfaces are 64 bits wide.

    1. To create the address map described in the table, right-click in the Address Blocks and select the Add Register command.

      This opens the Add Register dialog box in which you can enter one of the register names from the table above.

    2. Repeat as needed to add all required registers.
      This creates a Registers table in the Addressing and Memory section. You can edit the table to add the Description, Address Offset, and Size to each register. The Registers table should look similar to the following example.

      TIP: The Tcl commands for each step of this process are written to the Tcl Console. You can use this fact to execute the process, and then use the Tcl transcript to create scripts to automate the process for future iterations.
    3. Finally, select the register for each of the pointer arguments from your table, right-click and select the Add Register Parameter command. Enter the name ASSOCIATED_BUSIF into the dialog box that opens, and click OK.

      This lets you define an association between the register and the AXI4 Interface. In the value field of the added parameter, enter the name of the m_axi interface assigned to the specific argument you are defining. In the example above, the argument A uses the m00_axi interface, and the argument B uses the m01_axi interface.

  5. At this point you should be ready to package your IP.
    1. Select the Review and Package section of the Package IP window, review the Summary and After Packaging sections, and make whatever changes are needed.
      IMPORTANT: You must enable the generation of an IP archive file. If the After Packaging section indicates An archive will not be generated., you must select the Edit packaging settings link and enable the Create archive of IP setting.
    2. When you are ready, click Package IP.

      The Vivado tool packages your kernel IP, automatically runs the package_xo command as needed to produce the XO file, and opens a dialog box to inform you of success.

      The generated XO file for the RTL kernel can be used by the Vitis compiler during the linking process to connect to other HLS or RTL kernels, and for linking with the target platform to complete the system. Refer to Building and Running the Application for more information.

    3. If your RTL kernel has some custom features that are not standard for the package_xo command that is run automatically, you can run the command manually to regenerate the XO file and kernel with custom settings. Refer to package_xo Command for details of the command. Some specific reasons why you may need to manually run the package_xo command include:
      • Specify a different IP directory or XO path
      • Output a copy of the kernel.xml file to edit and reuse later
      • Include a C-model using the package_xo -kernel_files option to enable software emulation for your RTL kernel
  6. Optional: Test the Packaged IP.

    To test if the RTL kernel is packaged correctly for the IP integrator, try to instantiate the packaged kernel IP into a block design in the IP integrator. For information on the tool, refer to Vivado Design Suite User Guide: Designing IP Subsystems Using IP Integrator (UG994).

    The kernel IP should show the various interfaces described above. Examine the IP in the canvas view. The properties of the AXI interface can be viewed by selecting the interface on the canvas. Then in the Block Interface Properties window, select the Properties tab and expand the CONFIG table entry. If an interface is to be read-only or write-only, the unused AXI channels can be removed and the READ_WRITE_MODE is set to read-only or write-only.

  7. Optional: Configure Design Constraints.

    If the RTL kernel has design constraints (.xdc) which refer to elements of the static region of the platform, such as clocks, then the constraint file needs to be marked as late processing order to ensure RTL kernel constraints are correctly applied.

    There are two methods to mark constraints for late processing:

    1. If the constraints are given in a .ttcl file, add <: setFileProcessingOrder "late" :> to the .ttcl preamble section of the file as follows:
      <: set ComponentName [getComponentNameString] :>
      <: setOutputDirectory "./" :>
      <: setFileName $ComponentName :>
      <: setFileExtension ".xdc" :>
      <: setFileProcessingOrder "late" :>
      
    2. If constraints are defined in an .xdc file, then add the following four lines starting at <spirit:define> in the component.xml. The four lines in the component.xml need to be next to the area where the .xdc file is called. In the following example, my_ip_constraint.xdc file is being called with the subsequent late processing order defined.
      <spirit:file>
              <spirit:name>ttcl/my_ip_constraint.xdc</spirit:name>
              <spirit:userFileType>ttcl</spirit:userFileType>
              <spirit:userFileType>USED_IN_implementation</spirit:userFileType>
              <spirit:userFileType>USED_IN_synthesis</spirit:userFileType>
              <spirit:define>
                   <spirit:name>processing_order</spirit:name>
                   <spirit:value>late</spirit:value>
              </spirit:define>
      </spirit:file>
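
Returning to step 4 of the packaging procedure, the address map in Table 7 can be drafted in a script before it is entered in the Addressing and Memory page. The Python sketch below is one possible way to do that: build_address_map is a hypothetical helper, the argument names are placeholders, and packing arguments from 0x10 with natural alignment is an assumption of this sketch, not a tool requirement.

```python
def build_address_map(arguments, with_interrupt=True):
    """Return rows of (name, offset, size_bits, associated_busif).

    arguments: list of (name, interface) tuples, where interface is None for a
    32-bit scalar or the m_axi interface name for a 64-bit pointer argument.
    Offsets are packed from 0x10 with natural alignment; that packing is an
    assumption of this sketch, not a tool requirement.
    """
    rows = [("CTRL", 0x000, 32, None)]
    if with_interrupt:
        rows += [("GIER", 0x004, 32, None),
                 ("IP_IER", 0x008, 32, None),
                 ("IP_ISR", 0x00C, 32, None)]
    offset = 0x010
    for name, interface in arguments:
        size_bits = 32 if interface is None else 64
        size_bytes = size_bits // 8
        offset = (offset + size_bytes - 1) // size_bytes * size_bytes  # align up
        rows.append((name, offset, size_bits, interface))
        offset += size_bytes
    return rows


# Pointer arguments carry the interface name used for ASSOCIATED_BUSIF.
for name, offset, size, busif in build_address_map(
        [("scalar00", None), ("A", "m00_axi"), ("B", "m01_axi")]):
    print(f"{name:<10} {offset:#05x} {size:>2} {busif or ''}")
```

The last column corresponds to the value you would enter for the ASSOCIATED_BUSIF register parameter on each pointer argument.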

Design Recommendations for RTL Kernels

While the RTL Kernel Wizard assists in packaging RTL designs for use within the Vitis core development kit, the underlying RTL kernels should be designed with recommendations from the UltraFast Design Methodology Guide for Xilinx FPGAs and SoCs (UG949).

In addition to adhering to the interface and packaging requirements, the kernels should be designed with the following performance goals in mind:

Memory Performance Optimizations for AXI4 Interface

The AXI4 interfaces typically connect to DDR memory controllers in the platform.

Note: For optimal frequency and resource usage, it is recommended that one interface is used per memory controller.

For best performance from the memory controller, the following is the recommended AXI interface behavior:

  • Use an AXI data width that matches the native memory controller AXI data width, typically 512 bits.
  • Do not use WRAP, FIXED, or sub-sized bursts.
  • Use burst transfers that are as large as possible (up to the 4 KB AXI4 protocol limit).
  • Avoid use of deasserted write strobes. Deasserted write strobes can cause error-correction code (ECC) logic in the DDR memory controller to perform read-modify-write operations.
  • Use pipelined AXI transactions.
  • Avoid using threads if an AXI interface is only connected to one DDR controller.
  • Avoid generating write address commands if the kernel does not have the ability to deliver the full write transaction (non-blocking write requests).
  • Avoid generating read address commands if the kernel does not have the capacity to accept all the read data without back pressure (non-blocking read requests).
  • If read-only or write-only interfaces are desired, the ports of the unused channels can be commented out in the top-level RTL file before the project is packaged into a kernel.
  • Using multiple threads can cause larger resource requirements in the infrastructure IP between the kernel and the memory controllers.
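
To make the burst-size recommendation concrete, the Python sketch below splits a transfer into the largest INCR bursts that stay within the 4 KB limit. It is a minimal model, not kernel code: split_into_bursts is a hypothetical helper, the 64-byte beat corresponds to a 512-bit data bus, and the 256-beat cap is the AXI4 INCR burst length limit. It rejects unaligned transfers so that every beat is full width with all write strobes asserted.

```python
def split_into_bursts(start_addr, num_bytes, bytes_per_beat=64, max_beats=256):
    """Split a transfer into INCR bursts of (address, beats).

    Assumes start_addr and num_bytes are multiples of bytes_per_beat so every
    beat is a full-width beat with all write strobes asserted. Bursts never
    cross a 4 KB boundary and never exceed max_beats beats.
    """
    if start_addr % bytes_per_beat or num_bytes % bytes_per_beat:
        raise ValueError("transfer must be aligned to the data bus width")
    bursts = []
    addr, remaining = start_addr, num_bytes
    while remaining > 0:
        to_boundary = 4096 - (addr % 4096)         # bytes left before 4 KB boundary
        chunk = min(remaining, to_boundary, max_beats * bytes_per_beat)
        bursts.append((addr, chunk // bytes_per_beat))
        addr += chunk
        remaining -= chunk
    return bursts


for addr, beats in split_into_bursts(0x1F80, 8192):
    print(f"{addr:#x}: {beats} beats")
```

Starting a buffer on a 4 KB boundary avoids the short leading burst that this example produces.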

Quality of Results Considerations

The following recommendations help improve results for timing and area:

  • Pipeline all reset inputs and internally distribute resets avoiding high fanout nets.
  • Reset only essential control logic flip-flops.
  • Consider registering input and output signals to the extent possible.
  • Understand the size of the kernel relative to the capacity of the target platforms to ensure fit, especially if multiple kernels will be instantiated.
  • Recognize platforms that use stacked silicon interconnect (SSI) technology. These devices have multiple dies, and any logic that must cross between them should use flip-flop to flip-flop timing paths.

Debug and Verification Considerations

  • RTL kernels should be verified in their own test bench using advanced verification techniques including verification components, randomization, and protocol checkers. The AXI Verification IP (VIP) is available in the Vivado IP catalog and can help with the verification of AXI interfaces. The RTL kernel example designs contain an AXI VIP-based test bench with sample stimulus files.
  • Hardware emulation should be used to test the host code software integration or to view the interaction between multiple kernels.
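
As an illustration of what a protocol checker looks for, the Python sketch below scans a recorded AXI4-Stream trace for one well-known handshake rule: once TVALID is asserted, it must stay asserted with stable TDATA until TREADY completes the transfer. This is a simplified stand-in, not the AXI VIP; check_axis_trace and the trace format are hypothetical.

```python
def check_axis_trace(trace):
    """Check a per-cycle list of (tvalid, tready, tdata) samples.

    Returns the cycle numbers where a stalled transfer (tvalid high, tready
    low on the previous cycle) was dropped or had its data changed.
    """
    violations = []
    pending = None                      # tdata of a transfer still waiting for tready
    for cycle, (tvalid, tready, tdata) in enumerate(trace):
        if pending is not None and (not tvalid or tdata != pending):
            violations.append(cycle)
        pending = tdata if (tvalid and not tready) else None
    return violations


good = [(1, 0, 0xA), (1, 1, 0xA), (0, 1, 0x0), (1, 1, 0xB)]
bad = [(1, 0, 0xA), (0, 0, 0xA), (1, 0, 0xB), (1, 1, 0xC)]
print(check_axis_trace(good))   # []
print(check_axis_trace(bad))    # [1, 3]
```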