RTL Kernels
In the Vitis application acceleration development flow, RTL IP from the Vivado® Design Suite can be packaged as XO files that can be linked into an FPGA executable (.xclbin), as long as they adhere to Vivado IP Packaging guidelines, and requirements of the Vitis compiler for linking the system.
As explained in Kernel Properties, RTL kernels can be user-managed kernels that do not adhere to XRT requirements for execution control, but rather implement any number of possible control schemes specified by existing RTL designs. Alternatively, RTL kernels can adhere to the requirements of the ap_ctrl_chain or ap_ctrl_hs control protocols needed for XRT-managed kernels.
The following sections describe the kernel interface requirements for the Vitis compiler to link kernels into a system. These requirements are common to software controllable and non-software controlled kernels. The control requirements for XRT-managed kernels are also described, as well as any additional requirements. Finally, the development flow is described to help you package RTL IP in the Vivado® Design Suite as RTL kernels for use in the Vitis environment.
Requirements of an RTL Kernel
To be integrated into the Vitis tool flow, an RTL module must minimally meet the requirements enumerated in Kernel Interface Requirements. The need to meet the kernel interface requirements applies to both XRT-managed and user-managed kernels.
In addition, XRT-managed kernels must satisfy the requirements described in Control Requirements for XRT-Managed Kernels to be executed and profiled by XRT.
User-managed kernels must have the signal interfaces needed by the Vitis compiler to allow it to link the kernels to other kernels and to the target platform, but do not need to adhere to the strict execution protocol of XRT. In this way, existing RTL IP can be more rapidly and simply integrated into the Vitis environment.
It might be necessary to revise your RTL module to meet the kernel requirements outlined in the following sections.
Kernel Interface Requirements
To enable the Vitis compiler to connect kernels into the target platform, an RTL kernel must adhere to the requirements described in Kernel Properties. The various interface requirements are summarized in the following table.
Port or Interface | Description | Comment |
---|---|---|
Clock | One or more clock inputs. | |
Reset | Primary active-Low reset input port. | |
interrupt | Active-High interrupt. | |
s_axi_control | One (and only one) AXI4-Lite slave control interface. | |
AXI4_Memory Mapped Interface (m_axi) | AXI4 memory mapped interfaces for global memory access. | |
AXI4_STREAM (axis) | AXI4-Stream interfaces for one-way data transfers between kernels or between the host application and kernels. | |
Control Requirements for XRT-Managed Kernels
s_axilite interface as discussed in Creating User-Managed RTL Kernels. If your RTL module implements a different control structure, you can either define it as a user_managed kernel or adapt it to conform to the XRT-managed requirements described here.
The following table outlines the required register map for an XRT-managed kernel to be used within the Vitis tools and XRT. The control register is required by kernels that specify the ap_ctrl_hs and ap_ctrl_chain control protocols as described in Execution Modes. Kernels that implement the ap_ctrl_none and user_managed control protocols do not require the control registers described below.
All user-defined registers must begin at location 0x10; locations below this are reserved. These include registers for kernel arguments such as scalar values and address offsets passed to memory mapped interfaces.
Offset | Name | Description |
---|---|---|
0x0 | Control | Controls and provides kernel status. |
0x4 | Global Interrupt Enable | Used to enable interrupt to the host. |
0x8 | IP Interrupt Enable | Used to control which IP generated signals are used to generate an interrupt. |
0xC | IP Interrupt Status | Provides interrupt status. |
0x10 | Kernel arguments | This would include scalars and global memory arguments for example. |
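The register map above can be sketched in a few lines of Python. This is a minimal illustration, not generated tool output: the argument names (scalar00, A, B) are hypothetical, and the sketch assumes arguments are packed back to back starting at 0x10, whereas a real kernel is free to leave gaps between its argument registers.

```python
# Register offsets taken from the table above.
CTRL, GIER, IP_IER, IP_ISR = 0x00, 0x04, 0x08, 0x0C
ARGS_BASE = 0x10  # first location available to user-defined registers

# Control protocols that require the control register at offset 0x0.
XRT_MANAGED_PROTOCOLS = {"ap_ctrl_hs", "ap_ctrl_chain"}


def requires_control_register(protocol):
    """True for protocols that need the control register described above."""
    return protocol in XRT_MANAGED_PROTOCOLS


def layout_arguments(args):
    """Assign an offset to each (name, size_in_bits) pair, starting at 0x10.

    Assumption: arguments are packed contiguously on 4-byte boundaries.
    """
    offsets = {}
    offset = ARGS_BASE
    for name, bits in args:
        if bits not in (32, 64):
            raise ValueError(f"{name}: expected a 32-bit or 64-bit argument")
        offsets[name] = offset
        offset += bits // 8
    return offsets


# Hypothetical kernel: one 32-bit scalar and two 64-bit memory pointers.
layout = layout_arguments([("scalar00", 32), ("A", 64), ("B", 64)])
for name, off in layout.items():
    print(f"{name}: 0x{off:03X}")
```

No argument offset can fall below 0x10 because the layout always starts at ARGS_BASE.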
The following table shows the control signals that are accessed through the control register (offset 0x0). The control register and its signals are determined by the kernel execution mode, ap_ctrl_hs or ap_ctrl_chain.
The available signals are used by the different control protocols as explained in Supported Kernel Execution Models in the XRT documentation. For example, for the sequential execution mode ap_ctrl_hs, the host typically writes 0x00000001 to the control register at offset 0x0, which sets bit 0 (ap_start) and clears bits 1 and 2, and then polls the register until ap_done reads as 1.
Bit | Name | Description |
---|---|---|
0 | ap_start | Asserted when the kernel can start processing data. Cleared on handshake with ap_done being asserted. |
1 | ap_done | Asserted when the kernel has completed operation. Cleared on read. |
2 | ap_idle | Asserted when the kernel is idle. |
3 | ap_ready | Asserted by the kernel when it is ready to accept new data. |
4 | ap_continue | Asserted by XRT to allow the kernel to keep running. |
7 | auto_restart | Used to enable automatic kernel restart as described in Streaming Data in User-Managed Never-Ending Kernels. |
6:5, 31:8 | Reserved | Reserved |
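The ap_ctrl_hs sequence described above (write ap_start, then poll for ap_done) can be sketched against a software stand-in for the control register. The FakeKernel class is purely hypothetical: it models only the bit behavior listed in the table, and it "finishes" after a fixed number of reads instead of doing real work. A real host application would go through XRT rather than touch the register directly.

```python
# Bit masks for the control register at offset 0x0, from the table above.
AP_START = 1 << 0
AP_DONE = 1 << 1
AP_IDLE = 1 << 2


class FakeKernel:
    """Hypothetical software model of an ap_ctrl_hs control register."""

    def __init__(self, busy_reads=3):
        self.ctrl = AP_IDLE
        self._busy_reads = busy_reads
        self._remaining = 0

    def write_ctrl(self, value):
        # Writing ap_start to an idle kernel starts it and clears ap_idle.
        if value & AP_START and self.ctrl & AP_IDLE:
            self.ctrl = AP_START
            self._remaining = self._busy_reads

    def read_ctrl(self):
        if self.ctrl & AP_START:
            self._remaining -= 1
            if self._remaining <= 0:
                # ap_start clears on the handshake with ap_done.
                self.ctrl = AP_DONE | AP_IDLE
        value = self.ctrl
        self.ctrl &= ~AP_DONE  # ap_done is cleared on read
        return value


def run_once(kernel, max_polls=100):
    """Start the kernel and poll until ap_done; return the number of polls."""
    kernel.write_ctrl(0x00000001)
    for polls in range(1, max_polls + 1):
        if kernel.read_ctrl() & AP_DONE:
            return polls
    raise TimeoutError("kernel did not assert ap_done")


kernel = FakeKernel(busy_reads=3)
print("polls until ap_done:", run_once(kernel))
```

Because ap_done clears on read, the host has to act on the first read that reports it; a second read returns only ap_idle.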
The following interrupt-related registers are only required if the kernel has an interrupt.
Global Interrupt Enable register (offset 0x4):
Bit | Name | Description |
---|---|---|
0 | Global Interrupt Enable | When asserted, along with the IP Interrupt Enable bit, the interrupt is enabled. |
31:1 | Reserved | Reserved |
IP Interrupt Enable register (offset 0x8):
Bit | Name | Description |
---|---|---|
0 | Interrupt Enable | When asserted, along with the Global Interrupt Enable bit, the interrupt is enabled. |
31:1 | Reserved | Reserved |
IP Interrupt Status register (offset 0xC):
Bit | Name | Description |
---|---|---|
0 | Interrupt Status | Toggle on write. |
31:1 | Reserved | Reserved |
Interrupt
XRT-managed RTL kernels can optionally have an interrupt port containing a single interrupt. The port must be named interrupt and be active-High. It is enabled when both the global interrupt enable (GIE) and interrupt enable register (IER) bits are asserted in the Control Register block.
By default, the IER uses the internal ap_done signal to trigger an interrupt. Further, the interrupt is cleared only when writing a 1 to bit 0 of the IP Interrupt Status Register.
This logic should be reflected in the Verilog code for the RTL kernel, and also in the associated component.xml and kernel.xml files. The kernel.xml file is stored inside the kernel.xo file and is generated automatically when using the package_xo command or RTL Kernel Wizard.
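The interrupt behavior described in this section can be summarized with a small behavioral model. This is a sketch under assumptions, written in Python rather than Verilog: the class and method names are hypothetical, and it assumes the status bit is latched on ap_done only while the IP interrupt enable bit is set.

```python
class InterruptRegs:
    """Hypothetical behavioral model of the GIE, IER, and ISR registers."""

    def __init__(self):
        self.gie = 0  # Global Interrupt Enable, bit 0 at offset 0x4
        self.ier = 0  # IP Interrupt Enable, bit 0 at offset 0x8
        self.isr = 0  # IP Interrupt Status, bit 0 at offset 0xC

    def write(self, offset, value):
        bit0 = value & 1
        if offset == 0x4:
            self.gie = bit0
        elif offset == 0x8:
            self.ier = bit0
        elif offset == 0xC:
            # Toggle on write: writing 1 to a set status bit clears it.
            self.isr ^= bit0
        else:
            raise ValueError(f"not an interrupt register: 0x{offset:X}")

    def on_ap_done(self):
        # Assumption: ap_done latches the status bit only when IER is set.
        if self.ier:
            self.isr = 1

    @property
    def interrupt(self):
        """Level of the active-High interrupt port."""
        return self.gie & self.isr


regs = InterruptRegs()
regs.write(0x4, 1)   # enable the global interrupt
regs.write(0x8, 1)   # enable the ap_done interrupt source
regs.on_ap_done()
print("interrupt after ap_done:", regs.interrupt)
regs.write(0xC, 1)   # host acknowledges by writing 1 to the status bit
print("interrupt after clear:", regs.interrupt)
```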
Creating User-Managed RTL Kernels
If your RTL IP does not satisfy the AXI interface requirements for the Vitis compiler as outlined in Kernel Interface Requirements, you must modify the IP to implement the required interfaces. However, if your RTL IP does not satisfy the XRT control protocols of ap_ctrl_hs or ap_ctrl_chain, you can define it as a user-managed kernel rather than having to rewrite your IP.
A user-managed kernel does not need to satisfy the control requirements of XRT, and can implement any of a variety of execution mechanisms. User-managed kernels are meant to let you take advantage of the system building capabilities of the Vitis compiler, while letting your kernel implement your own control scheme. There is no prescribed method of starting or stopping, or otherwise controlling your kernel. This is largely up to you, and the specific requirements of your application or system. Some of the available control schemes include:
- Accessing registers through an s_axilite control interface, similar to the method used by XRT though open to your own implementation
- Accessing the hardware through software drivers, such as UIO drivers, implemented in your host application
- Triggering the start or stop response of your kernel from a signal provided by a separate component, or from another kernel
- Providing a data-driven approach, as described in Streaming Data in User-Managed Never-Ending Kernels
A restriction on the s_axilite interface for a user-managed kernel is that the control register cannot be named CTRL. That name is specifically reserved for XRT-managed kernels, and returns a Critical Warning when found on a user-managed kernel or an ap_ctrl_none kernel.
RTL Kernel Development Flow
This section explains the process of creating RTL kernels using the Package IP feature inside the Vivado Design Suite. The Package IP command provides a Package for Vitis option which greatly simplifies packaging an existing RTL IP as a Xilinx Object (XO) file for use in the Vitis environment.
The packaged XO file is a container encapsulating the Vivado IP object (including source files) and associated kernel XML file. Using the Vitis compiler, the XO file can be combined with other kernels, and linked with the target platform and built for hardware or hardware emulation flows.
The Package for Vitis feature provides DRCs to check the completeness of the packaged IP prior to generating the XO file, and also automates the package_xo command to simplify the production of the packaged RTL kernel.
Package the RTL Code as a Vitis XO
As discussed in Kernel Interface Requirements, the RTL kernel must be packaged with the following required interfaces:
- The AXI4-Lite interface name must be packaged as S_AXI_CONTROL, but the underlying AXI ports can be named differently.
- Any memory-mapped AXI4 interfaces must be packaged as AXI4 master endpoints with 64-bit address support.
Note: Xilinx strongly recommends that AXI4 interfaces be packaged with AXI meta data HAS_BURST=0 and SUPPORTS_NARROW_BURST=0. These properties can be set in an IP-level bd.tcl file. This indicates wrap and fixed burst type is not used, and narrow (sub-size burst) is not used.
- You can also implement the AXI4-Stream interface.
- At least one clock is required for the kernel, though it can support multiple clocks.
- Each clock must have an associated Bus Interface identifying it as a clock.
- Each clock can have an optional active-Low reset, specified by the ASSOCIATED_RESET property on the clock.
- A clock must be associated with each AXI4-Lite, AXI4, and AXI4-Stream interface on the kernel.
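The interface requirements listed above can be expressed as a simple pre-packaging checklist. The following Python sketch is illustrative only and is not the DRC logic used by the Vivado tool: the dictionary layout, field names, and example kernels are hypothetical, and the name comparison is assumed to be case-insensitive.

```python
def check_kernel_interfaces(kernel):
    """Return a list of problems found in a kernel description.

    `kernel` is a hypothetical dict with "clocks" (list of clock names) and
    "interfaces" (list of dicts with "name", "type", "clock" and, for m_axi
    interfaces, "addr_width").
    """
    problems = []
    clocks = set(kernel.get("clocks", []))
    if not clocks:
        problems.append("kernel needs at least one clock")

    interfaces = kernel.get("interfaces", [])
    lite = [i for i in interfaces if i["type"] == "axilite"]
    if len(lite) > 1:
        problems.append("only one AXI4-Lite interface is allowed")
    for intf in lite:
        # Assumption: the interface name is compared case-insensitively.
        if intf["name"].upper() != "S_AXI_CONTROL":
            problems.append(f"{intf['name']}: must be named S_AXI_CONTROL")

    for intf in interfaces:
        if intf["type"] == "m_axi" and intf.get("addr_width") != 64:
            problems.append(f"{intf['name']}: needs 64-bit address support")
        if intf.get("clock") not in clocks:
            problems.append(f"{intf['name']}: no associated clock")
    return problems


# Hypothetical kernel descriptions used only to exercise the checks.
good = {
    "clocks": ["ap_clk"],
    "interfaces": [
        {"name": "s_axi_control", "type": "axilite", "clock": "ap_clk"},
        {"name": "m00_axi", "type": "m_axi", "addr_width": 64, "clock": "ap_clk"},
        {"name": "in_axis", "type": "axis", "clock": "ap_clk"},
    ],
}
bad = {
    "clocks": ["ap_clk"],
    "interfaces": [
        {"name": "ctrl", "type": "axilite", "clock": "ap_clk"},
        {"name": "m00_axi", "type": "m_axi", "addr_width": 32, "clock": None},
    ],
}

for problem in check_kernel_interfaces(bad):
    print(problem)
```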
To package the IP, use the following steps:
- Create and package a new IP.
- From a Vivado project, with your RTL source files added, select .
- Select Package your current project, and click Next.
- Specify the location for your packaged IP. You can select the default location, or choose a different location.
- Review the Summary page and click Finish to open the Package IP window.
The Package IP window opens to display the Identification page. For details on working with the IP packager in the Vivado tool, refer to the Vivado Design Suite User Guide: Creating and Packaging Custom IP (UG1118).
- Select Compatibility under the Packaging Steps.
This displays the dialog box as shown in the following figure.
- Select the Package for Vitis check box to enable the process of packaging the RTL IP as an XO for use in the Vitis environment.
- Select the Control Protocol for the RTL Kernel. This determines the control mechanism used to operate the kernel. The choices are:
  - user_managed: Defines a SW-controllable kernel that is user-managed rather than XRT-managed. This is the preferred option. Refer to Creating User-Managed RTL Kernels for additional information.
  - ap_ctrl_hs: This is the default, and specifies the simple sequential execution model for XRT-managed kernels as described in SW-Controllable Kernels.
  - ap_ctrl_chain: Specifies a pipelined execution model for XRT-managed kernels.
  - ap_ctrl_none: Indicates no control protocol as described in Non-Software Controlled Kernels.
- Check to ensure that both Package for IPI and Ignore Freq_Hz are enabled as well.
Enabling these check boxes enables design rule checks (DRC) that the ipx::check_integrity command runs prior to packaging the IP and generating the XO. The DRCs include checks for required signals as described in Requirements of an RTL Kernel, and checks for control protocols and registers for XRT-managed kernels. As shown in the figure above, any issues are reported to the Package IP tool as they are encountered.
- Associate the clock to the AXI interfaces.
In the Ports and Interfaces step of the Package IP window, you can associate the primary kernel clock with the AXI4 interfaces, and with the reset signal if needed.
- Right-click an AXI4 interface, and select Associate Clocks.
This opens the Associate Clocks dialog box which lists any identified clock signals.
- Select the appropriate clock and click OK to associate it with the interface.
- Repeat this step to associate a clock signal with each of the AXI interfaces.
- Click the Addressing and Memory step to add control registers and offsets.
XRT-managed kernels using the ap_ctrl_hs or ap_ctrl_chain control protocol require control registers as discussed in Control Requirements for XRT-Managed Kernels. The following table shows a list of the required registers.
TIP: While ap_ctrl_none and user_managed control protocols do not require control registers, they can still use them if an s_axilite interface is included as part of the RTL design. In this case, the specific registers can differ from the table below, but the process of assigning names, offsets, and widths is the same.
Table 7. Address Map
Register Name | Description | Address Offset | Size |
---|---|---|---|
CTRL | Control Signals. IMPORTANT: The CTRL register and <kernel_args> are required on all kernels. The interrupt related registers are only required for designs with interrupts. | 0x000 | 32 |
GIER | Global Interrupt Enable Register. Used to enable interrupt to the host. | 0x004 | 32 |
IP_IER | IP Interrupt Enable Register. Used to control which IP generated signals are used to generate an interrupt. | 0x008 | 32 |
IP_ISR | IP Interrupt Status Register. Provides interrupt status. | 0x00C | 32 |
<kernel_args> | This includes a separate entry for each kernel argument as needed on the software function interface. All user-defined registers must begin at location 0x10; locations below this are reserved. | 0x010 | 32/64. Scalar arguments are 32 bits wide. m_axi and axis interfaces are 64 bits wide. |
- To create the address map described in the table, right-click in the Address Blocks and select the Add Register command.
This opens the Add Register dialog box in which you can enter one of the register names from the table above.
- Repeat as needed to add all required registers.
This creates a Registers table in the Addressing and Memory section. You can edit the table to add the Description, Address Offset, and Size to each register. The Registers table should look similar to the following example.
TIP: The Tcl commands for each step of this process are written to the Tcl Console. You can run the process once interactively, and then use the Tcl transcript to create scripts that automate it for future iterations.
- Finally, select the register for each of the pointer arguments from your table, right-click and select the Add Register Parameter command. Enter the name ASSOCIATED_BUSIF into the dialog box that opens, and click OK.
This lets you define an association between the register and the AXI4 Interface. In the value field of the added parameter, enter the name of the m_axi interface assigned to the specific argument you are defining. In the example above, the argument A uses the m00_axi interface, and the argument B uses the m01_axi interface.
- At this point you should be ready to package your IP.
- Select the Review and Package section of the Package IP window, review the Summary and After Packaging sections, and make whatever changes are needed.
IMPORTANT: You must enable the generation of an IP archive file. If the After Packaging section indicates An archive will not be generated., you must select the Edit packaging settings link and enable the Create archive of IP setting.
- When you are ready, click Package IP.
The Vivado tool packages your kernel IP, automatically runs the package_xo command as needed to produce the XO file, and opens a dialog box to inform you of success.
The generated XO file for the RTL kernel can be used by the Vitis compiler during the linking process to connect to other HLS or RTL kernels, and for linking with the target platform to complete the system. Refer to Building and Running the Application for more information.
- If your RTL kernel has some custom features that are not standard for the package_xo command that is run automatically, you can run the command manually to regenerate the XO file and kernel with custom settings. Refer to package_xo Command for details of the command. Some specific reasons why you may need to manually run the package_xo command include:
  - Specify a different IP directory or XO path
  - Output a copy of the kernel.xml file to edit and reuse later
  - Include a C-model using the package_xo -kernel_files option to enable software emulation for your RTL kernel
- Optional: Test the Packaged IP.
To test if the RTL kernel is packaged correctly for the IP integrator, try to instantiate the packaged kernel IP into a block design in the IP integrator. For information on the tool, refer to Vivado Design Suite User Guide: Designing IP Subsystems Using IP Integrator (UG994).
The kernel IP should show the various interfaces described above. Examine the IP in the canvas view. The properties of the AXI interface can be viewed by selecting the interface on the canvas. Then in the Block Interface Properties window, select the Properties tab and expand the CONFIG table entry. If an interface is to be read-only or write-only, the unused AXI channels can be removed and the READ_WRITE_MODE is set to read-only or write-only.
- Optional: Configure Design Constraints.
If the RTL kernel has design constraints (.xdc) which refer to elements of the static region of the platform, such as clocks, then the constraint file needs to be marked as late processing order to ensure RTL kernel constraints are correctly applied.
There are two methods to mark constraints for late processing:
- If the constraints are given in a .ttcl file, add <: setFileProcessingOrder "late" :> to the .ttcl preamble section of the file as follows:
    <: set ComponentName [getComponentNameString] :>
    <: setOutputDirectory "./" :>
    <: setFileName $ComponentName :>
    <: setFileExtension ".xdc" :>
    <: setFileProcessingOrder "late" :>
- If constraints are defined in an .xdc file, then add the following four lines starting at <spirit:define> in the component.xml. The four lines in the component.xml need to be next to the area where the .xdc file is called. In the following example, my_ip_constraint.xdc file is being called with the subsequent late processing order defined.
    <spirit:file>
      <spirit:name>ttcl/my_ip_constraint.xdc</spirit:name>
      <spirit:userFileType>ttcl</spirit:userFileType>
      <spirit:userFileType>USED_IN_implementation</spirit:userFileType>
      <spirit:userFileType>USED_IN_synthesis</spirit:userFileType>
      <spirit:define>
        <spirit:name>processing_order</spirit:name>
        <spirit:value>late</spirit:value>
      </spirit:define>
    </spirit:file>
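For the component.xml method, the four-line processing_order entry can also be added by a script instead of by hand. The sketch below uses Python's standard xml.etree.ElementTree module on an in-memory fragment. It is an illustration under assumptions: the helper names are hypothetical, and the namespace URI is assumed to be the IP-XACT 1685-2009 SPIRIT namespace, so check it against the header of your own component.xml before relying on it.

```python
import xml.etree.ElementTree as ET

# Assumption: component.xml declares the IP-XACT 1685-2009 SPIRIT namespace.
SPIRIT_NS = "http://www.spiritconsortium.org/XMLSchema/SPIRIT/1685-2009"
ET.register_namespace("spirit", SPIRIT_NS)


def q(tag):
    """Qualify a tag name with the SPIRIT namespace."""
    return f"{{{SPIRIT_NS}}}{tag}"


def mark_late(file_elem):
    """Ensure a <spirit:file> element carries processing_order = late."""
    for define in file_elem.findall(q("define")):
        if define.findtext(q("name")) == "processing_order":
            value = define.find(q("value"))
            if value is None:
                value = ET.SubElement(define, q("value"))
            value.text = "late"
            return file_elem
    # No existing entry: append the four-line <spirit:define> block.
    define = ET.SubElement(file_elem, q("define"))
    ET.SubElement(define, q("name")).text = "processing_order"
    ET.SubElement(define, q("value")).text = "late"
    return file_elem


# Stand-alone fragment modeled on the example above.
FRAGMENT = f"""<spirit:file xmlns:spirit="{SPIRIT_NS}">
  <spirit:name>ttcl/my_ip_constraint.xdc</spirit:name>
  <spirit:userFileType>USED_IN_implementation</spirit:userFileType>
  <spirit:userFileType>USED_IN_synthesis</spirit:userFileType>
</spirit:file>"""

elem = mark_late(ET.fromstring(FRAGMENT))
print(ET.tostring(elem, encoding="unicode"))
```

Calling mark_late a second time on the same element leaves a single processing_order entry, so the script is safe to rerun.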
Design Recommendations for RTL Kernels
While the RTL Kernel Wizard assists in packaging RTL designs for use within the Vitis core development kit, the underlying RTL kernels should be designed with recommendations from the UltraFast Design Methodology Guide for Xilinx FPGAs and SoCs (UG949).
In addition to adhering to the interface and packaging requirements, the kernels should be designed with the following performance goals in mind:
Memory Performance Optimizations for AXI4 Interface
The AXI4 interfaces typically connect to DDR memory controllers in the platform.
For best performance from the memory controller, the following is the recommended AXI interface behavior:
- Use an AXI data width that matches the native memory controller AXI data width, typically 512-bits.
- Do not use WRAP, FIXED, or sub-sized bursts.
- Use burst transfers that are as large as possible (up to the 4 KB AXI4 protocol limit).
- Avoid use of deasserted write strobes. Deasserted write strobes can cause error-correction code (ECC) logic in the DDR memory controller to perform read-modify-write operations.
- Use pipelined AXI transactions.
- Avoid using threads if an AXI interface is only connected to one DDR controller.
- Avoid generating write address commands if the kernel does not have the ability to deliver the full write transaction (non-blocking write requests).
- Avoid generating read address commands if the kernel does not have the capacity to accept all the read data without back pressure (non-blocking read requests).
- If read-only or write-only interfaces are desired, the ports of the unused channels can be commented out in the top-level RTL file before the project is packaged into a kernel.
- Using multiple threads can cause larger resource requirements in the infrastructure IP between the kernel and the memory controllers.
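The burst recommendations above can be illustrated with a short calculation. The following Python sketch splits a transfer into full-width INCR bursts that never cross a 4 KB address boundary. It is a model for reasoning about burst sizing, not RTL: the function name and defaults are hypothetical, and it assumes a 512-bit (64-byte) data width and the AXI4 limit of 256 beats per INCR burst.

```python
def split_into_bursts(addr, nbytes, beat_bytes=64, max_beats=256, boundary=4096):
    """Split a transfer into (start_address, beats) INCR bursts.

    Each burst uses full-width beats and stops at the next 4 KB boundary.
    """
    if addr % beat_bytes or nbytes % beat_bytes:
        # Unaligned transfers would need sub-sized beats or deasserted
        # write strobes, which the recommendations above advise against.
        raise ValueError("address and length must be multiples of the beat size")
    bursts = []
    while nbytes > 0:
        to_boundary = boundary - (addr % boundary)
        size = min(nbytes, to_boundary, max_beats * beat_bytes)
        bursts.append((addr, size // beat_bytes))
        addr += size
        nbytes -= size
    return bursts


# Hypothetical 8 KB transfer that starts 64 bytes below a 4 KB boundary.
for start, beats in split_into_bursts(0x0FC0, 8192):
    print(f"addr=0x{start:05X} beats={beats}")
```

With a 64-byte beat, a 4 KB window holds 64 beats, so the boundary rather than the 256-beat limit bounds the burst length in this configuration.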
Quality of Results Considerations
The following recommendations help improve results for timing and area:
- Pipeline all reset inputs and internally distribute resets avoiding high fanout nets.
- Reset only essential control logic flip-flops.
- Consider registering input and output signals to the extent possible.
- Understand the size of the kernel relative to the capacity of the target platforms to ensure fit, especially if multiple kernels will be instantiated.
- Recognize platforms that use stacked silicon interconnect (SSI) technology. These devices have multiple dies, and any logic that must cross between them should use flip-flop to flip-flop timing paths.
Debug and Verification Considerations
- RTL kernels should be verified in their own test bench using advanced verification techniques including verification components, randomization, and protocol checkers. The AXI Verification IP (VIP) is available in the Vivado IP catalog and can help with the verification of AXI interfaces. The RTL kernel example designs contain an AXI VIP-based test bench with sample stimulus files.
- Hardware emulation should be used to test the host code software integration or to view the interaction between multiple kernels.