Compiling an AI Engine Graph Application

This chapter describes all the command line options passed to the AI Engine compiler (aiecompiler). The compiler takes the code for the data flow graph and the code for the individual kernels, and produces an image that can be run on various AI Engine target platforms such as simulators, emulators, and AI Engine devices. The AI Engine compiler statically compiles the graph, mapping and placing the kernels onto the AI Engines.

TIP: Unless otherwise specified, all the input file paths are with respect to the current directory, and all the output file paths are with respect to the Work directory.

The AI Engine graph and kernels can be compiled individually, or as a standalone application to run in the AI Engine processor array through emulation or hardware. The graph and kernels can also be used as part of a larger system design that incorporates the AI Engine graph with an ELF application running on the embedded processor of the Versal™ device and kernels running in the programmable logic (PL) of the device. The AI Engine compiler is used to compile the graph and kernels, whether in a standalone configuration or as part of a larger system.

As shown in Using the Vitis IDE, the Vitis™ IDE can be used to create and manage project build settings, and run the AI Engine compiler. Alternatively, you can build the project from the command line as discussed in Integrating the Application Using the Vitis Tools Flow, or in a script or Makefile. Either approach lets you perform simulation or emulation to verify the graph application or the integrated system design, debug the design in an interactive debug environment, and build your design to run and deploy on hardware. Whichever method you choose to work with the tools, start by setting up the environment.

Setting Up the Vitis Tool Environment

The AI Engine tools are delivered and installed as part of the Vitis unified software platform. Therefore, when preparing to run the AI Engine tools, such as the AI Engine compiler and AI Engine simulator, you must set up the Vitis tools. The Vitis unified software platform includes two elements that must be installed and configured, along with a valid Vitis tools license, to work together properly.

  • Vitis tools and AI Engine tools
  • A target Vitis platform such as the xilinx_vck190_base_202110_1 platform used for AI Engine applications

For more information, see Vitis Unified Software Platform Documentation: Application Acceleration Development (UG1393).

When the elements of the Vitis software platform are installed, set up the environment to run in a specific command shell by running the following scripts.

#setup XILINX_VITIS and XILINX_HLS variables
source <Vitis_install_path>/Vitis/<version>/settings64.csh
TIP: settings64.sh and setup.sh scripts are also provided in the same directory.

Finally, define the location of the available target platforms to use with the Vitis IDE and AI Engine tools using the following environment variable:

setenv PLATFORM_REPO_PATHS <path to platforms>
TIP: The PLATFORM_REPO_PATHS environment variable points to directories containing platform files (XPFM). This lets you specify platforms using just the folder name for the platform.

You can validate the installation and setup of the tools by using one of the following commands.

which vitis
which aiecompiler

You can validate the platform installation by using the following command.

platforminfo --list

Inputs

The AI Engine compiler takes inputs in several forms and produces executable applications for running on an AI Engine device. The command line for running the AI Engine compiler is as follows:

aiecompiler [options] <Input File>

where:

  • <Input File> specifies the data flow graph code that defines the main() application for the AI Engine graph. The input flow graph is specified using a data flow graph language. Refer to Creating a Data Flow Graph (Including Kernels) for a description of the data flow graph.

An example AI Engine compiler command:

aiecompiler --verbose --pl-freq=100 --workdir=./myWork --platform=xilinx_vck190_202110_1.xpfm \
--include="./" --include="./src" --include="./src/kernels" --include="./data" --include="${XILINX_HLS}/include"  \
./src/graph.cpp

Some additional input options for the command line can include the following:

  • --constraints=<jsonfile> to specify constraints such as location or placement bounding box.
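A constraints file is a JSON document. The following sketch is purely illustrative: the key names (NodeConstraints, tileConstraint) and the kernel instance name are assumptions made for this example, not a statement of the tool's actual schema, which is described in the AI Engine documentation:

```json
{
  "NodeConstraints": {
    "interpolator": {
      "tileConstraint": { "column": 25, "row": 0 }
    }
  }
}
```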

Outputs

By default, the AI Engine compiler writes all outputs to a directory called Work, which is a sub-directory of the current directory where the tool was launched, and to a file libadf.a created in that same launch directory. The libadf.a file is used as an input to the Vitis compiler. The type of output and contents of the output directory depend on the --target specified, as described in AI Engine Compiler Options. For more information about the Vitis compiler, see Vitis Compiler Command in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416).

TIP: You can specify a different output directory using the --workdir option.
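Downstream, the generated libadf.a is passed to the Vitis compiler (v++) during the system link step. The following command line is a hedged sketch only; the platform name, PL kernel object, and output name are placeholders, and the exact options for your flow are described in the Vitis compiler documentation:

```shell
# Illustrative only: link a PL kernel and the AI Engine graph (libadf.a)
# into a device image. Names below are placeholders.
v++ --link --target hw --platform xilinx_vck190_base_202110_1 \
    pl_kernel.xo libadf.a -o my_system.xsa
```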

The structure and contents of the ./Work directory are described in the following table.

Table 1. Work Directory Structure
Directory/Files Description
./Work/
<name>.aiecompile_summary A generated file that can be opened in Vitis analyzer to see a compilation summary.
config/scsim_config.json A JSON script that specifies options to the SystemC simulator. It includes AI Engine array tile geometry, input/output file specifications, and their connections to the stream switches.
arch/
logical_arch_aie.larch This is a JSON file describing the hardware requirements of the AI Engine application.
aieshim_constraints.json If present, this JSON file represents the user-defined physical interface constraints between AI Engine array and programmable logic provided through the AI Engine application.
aieshim_solution.aiesol This is a JSON file describing the mapping from logical to physical channels crossing the interface between the AI Engine array and the programmable logic.
cfgraph.xml This is an XML file describing the hardware requirements of the AI Engine application. This is used by the Vitis tools flow.
aie/
Makefile A Makefile to compile code for all AI Engines.
<n>_<m>/ These are individual AI Engine compilation directories.
Release/ Synopsys release directory for the AI Engine including ELF file.
<n>_<m>.lst Microcode of the kernel at <n>_<m>.
<n>_<m>.map Shows the memory mapping of the kernel at <n>_<m>. It also includes the memory size, width, and offset.
scripts/ Synopsys compiler project and linker scripts.
src/ Source files for the processor including kernels and main.
ps/c_rts/ Directory containing C-based run-time control for modeling PS interaction.
aie_control.cpp This is the generated AI Engine control code implementing the init, run, and end graph APIs for the specific graph objects present in the program. This file is linked with the application main to create a PS thread for the simulator and bare metal.
aie_control_xrt.cpp This is the generated AI Engine control code implementing the init, run, and end graph APIs for the specific graph objects present in the program. This file is linked with the application main to create a PS thread for the Linux application.
systemC/ Directory containing SystemC models for PS main.
Makefile A Makefile to compile all PS SystemC models.
generated-source/ SystemC wrappers for PS main.
generated-objects/ Compiled shared libraries for PS main.
ps/cdo/ Directory containing generator code for graph configuration and initialization in configuration data object (CDO) format. This is used during SystemC-RTL simulation and during actual hardware execution.
Makefile A Makefile to compile the graph CDO.
generateAIEConfig A bash script for building the graph CDO.
generated-sources/ C++ program to generate CDO.
generated-objects/ Compiled program to generate CDO.
pthread/
PthreadSim.c A source-to-source translation of the input data flow graph into a C program implemented using pthreads.
sim.out The GCC compiled binary for PthreadSim.c.
reports/
<graph>_mapping_analysis_report.txt Mapping report describing allocation of kernels to AI Engines and window buffers to AI Engine memory groups.
<graph>.png A bitmap file showing the kernel graph connectivity and partitioning over AI Engines.
<graph>.xpe An XML file describing the estimated power profile of the graph based on hardware resources used. This file is used with the Xilinx® Power Estimator (XPE) tool.
sync_buffer_address.json Shows kernel sync buffer addresses with local and global addresses.
lock_allocation_report.json Describes the ports and what locks and buffers are associated with the kernels.
dma_lock_report.json Shows DMA locks for inputs/outputs to the AI Engine as well as the kernel(s) they connect to with buffer information.
temp/ This directory contains some temporary files generated by the AI Engine compiler that can be useful in debugging. In addition, the CF graph .o file is also created here by default.

AI Engine Compiler Options

Table 2. AI Engine Options
Option Name Description
--constraints=<string> Constraints (location, bounding box, etc.) can be specified using a JSON file. This option lets you specify one or more constraint files.
--heapsize=<int> Heap size (in bytes) used by each AI Engine.

The stack, heap, and sync buffer (32 bytes, which includes the graph run iteration number) together are allocated within 32768 bytes of data memory. The default heap size is 1024 bytes. Before changing the heap size, ensure that the sum of the stack, heap, and sync buffer sizes does not exceed 32768 bytes.

The heap is used for allocating any remaining file-scoped data that is not explicitly connected in the user graph.

--stacksize=<int> Stack size (in bytes) used by each AI Engine.

The stack, heap, and sync buffer (32 bytes) together are allocated within 32768 bytes of data memory. The default stack size is 1024 bytes. Before changing the stack size, ensure that the sum of the stack, heap, and sync buffer sizes does not exceed 32768 bytes.

The stack is used per the standard compiler calling convention, including stack-allocated local variables and register spilling.

--pl-freq=<value> Specifies the interface frequency (in MHz) for all PLIOs. The default frequency is a quarter of the AI Engine frequency, and the maximum supported frequency is half of the AI Engine frequency. A frequency specific to each interface can also be provided in the graph.
--pl-register-threshold=<value> Specifies the frequency (in MHz) threshold for registered AI Engine-PL crossings. The default threshold is one-eighth of the AI Engine frequency, depending on the specific device speed grade.
Note: Values above a quarter of the AI Engine array frequency are ignored, and a quarter is used instead.
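The heap and stack budget described above (stack + heap + 32-byte sync buffer within 32768 bytes of data memory) can be sketched as a quick host-side check. The helper name is illustrative; the constants come from this chapter:

```cpp
#include <cstdint>

// Per-AI Engine data memory budget for stack + heap + sync buffer,
// as described in the --heapsize/--stacksize option descriptions.
constexpr std::int32_t kBudgetBytes = 32768;
constexpr std::int32_t kSyncBufferBytes = 32; // fixed sync buffer size

// Returns true if the requested --stacksize/--heapsize values fit the budget.
bool fits_in_data_memory(std::int32_t stack_bytes, std::int32_t heap_bytes) {
    return stack_bytes + heap_bytes + kSyncBufferBytes <= kBudgetBytes;
}
```

For example, the 1024-byte defaults fit easily, while 16384 bytes each for stack and heap would overflow the budget by 32 bytes.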
Table 3. CDO Options
Option Name Description
--enable-ecc-scrubbing Enable ECC Scrubbing on all the AI Engines used. This option enables ECC Scrubbing when generating the AI Engine ELF CDO. (One performance counter per core is used.) ECC Scrubbing is turned on (true) by default.
Table 4. Compiler Debug Options
Option Name Description
--kernel-linting Perform consistency checking between graphs and kernels. The default is false.
--log-level=<int> Log level for verbose logging (0: no logging, 5: all debug messages). The default level is 1.
Note: The default level with --verbose is 5.
--verbose Verbose output of the AI Engine compiler emits compiler messages at various stages of compilation. These debug and tracing logs provide useful messages regarding the compilation process.
Table 5. Execution Target Options
Option Name Description
--target=<hw|x86sim> The AI Engine compiler supports several build targets (default: hw):
  • The hw target produces a libadf.a for use in the hardware device on a target platform.
  • The x86sim target compiles the code for use in the x86 simulator as described in x86 Functional Simulator.
Table 6. File Options
Option Name Description
--include=<string> This option can be used to include additional directories in the include path for the compiler front-end processing.

Specify one or more include directories.

--output=<string> Specifies an output.json file that is produced by the front end for an input data flow graph file. The output file is passed to the back-end for mapping and code generation of the AI Engine device. This is ignored for other types of input.
--platform=<string>

This is a path to a Vitis platform file that defines the hardware and software components available when doing a hardware design and its RTL co-simulation.

--workdir=<string>

By default, the compiler writes all outputs to a sub-directory of the current directory, called Work. Use this option to specify a different output directory.

Table 7. Generic Options
Option Name Description
--help List the available AI Engine compiler options, sorted in the groups listed here.
--help-list Display an alphabetic list of AI Engine compiler options.
--version Display the version of the AI Engine compiler.
Table 8. Miscellaneous Options
Option Name Description
--no-init This option disables initialization of window buffers in AI Engine data memory. This option enables faster loading of the binary images into the SystemC-RTL co-simulation framework.
TIP: This does not affect the statically initialized lookup tables.
--nodot-graph By default, the AI Engine compiler produces .dot and .png files to visualize the user-specified graph and its partitioning onto the AI Engines. This option can be used to eliminate the dot graph output.
Table 9. Module Specific Options
Option Name Description
--Xchess=<string> Can be used to pass kernel specific options to the CHESS compiler that is used to compile code for each AI Engine.

The option string is specified as <kernel-function>:<optionid>=<value>. This option string is included during compilation of generated source files on the AI Engine where the specified kernel function is mapped.

--Xelfgen=<string> Can be used to pass additional command-line options to the ELF generation phase of the compiler, which is currently run as a make command to build all AI Engine ELF files.

For example, to limit the number of parallel compilations to four, write --Xelfgen="-j4".

Note: If during compilation you see bad_alloc errors in the log, or if the Vitis IDE crashes, this could be due to insufficient memory on your workstation. A possible workaround (other than increasing the available memory on your machine) is to limit the parallelism used by the compiler during the code generation phase. This can be specified in the GUI as the compiler CodeGen option -j1 or -j2, or on the command line as --Xelfgen=-j1 or --Xelfgen=-j2.
--Xmapper=<string> Can be used to pass additional command-line options to the mapper phase of the compiler. For example:
--Xmapper=DisableFloorplanning

These are options to try when the design is either failing to converge in the mapping or routing phase, or when you are trying to achieve better performance via reduction in memory bank conflict.

See the Mapper and Router Options for a list and description of options.

--Xpreproc=<string> Pass a general option to the PREPROCESSOR phase for all source code compilations (AIE/PS/PL/x86sim). For example:
--Xpreproc=-D<var>=<value>
--Xpslinker=<string> Pass a general option to the PS LINKER phase. For example:
--Xpslinker=-L<libpath> -l<libname>
--Xrouter=<string> Pass a general option to the ROUTER phase. For example:
--Xrouter=enableSplitAsBroadcast
Note: In subsequent compilations of the AI Engine graph, only AI Engine kernels that have been modified are recompiled; unmodified kernels are not recompiled.
Table 10. Optimization Options
Option Name Description
--xlopt=<int> Enable a combination of kernel optimizations based on the opt level (allowed values are 0 to 2, default is 1).
  • xlopt=1
    • Automatic computation of heap size: Uses kernel analysis to automatically compute the heap requirements for each AI Engine, so you do not need to specify the heap size.
    • Guidance: Guidance is provided to allow the mapper to optimally allocate large global arrays thus minimizing memory conflicts.
  • xlopt=2
    • Automatic inline: Automatically inlines functions if it is practical and possible to do so, even if the functions are not declared as __inline or inline.
    • Pragma insertion: Inserts pragmas into kernel code. Apply --Xxloptstr="-annotate-pragma" together with --xlopt=2 to enable pragma insertion.
      Note: Compiler optimization (xlopt > 0) reduces debug visibility.
--Xxloptstr=<string> Option string to enable/disable optimizations in xlopt level 2.
  • -xlinline-threshold=T: set the automatic inlining threshold to T (default T = 5000)
  • -annotate-pragma: automatic insertion of loop unrolling, pipelining, and flattening pragmas (default = false)

Mapper and Router Options

Table 11. Mapper Options
Options Description
DisableFloorplanning This option disables the auto-floorplanning phase in the mapper. It is useful for heavily constrained designs where you want to guide the mapping phase using location constraints.
BufferOptLevel[1-9] These options can be used to improve throughput by reducing memory bank conflicts. At higher BufferOptLevels, the mapper tries to reduce the number of buffers mapped into the same memory bank, thereby reducing the probability of bank conflicts affecting overall performance. Higher BufferOptLevels can increase the size of the overall mapped region and, in a few cases, can fail to find a solution. The default is BufferOptLevel0.
disableSeparateTraceSolve The default trace behavior forces the AI Engine mapper to keep all PLIOs/GMIOs in their original design locations when using the trace debug feature. However, if the original solution did not leave any room for trace GMIOs, no solution is possible unless the design PLIOs are moved. Use this option in that case.
Note: You can reuse the previous design placement in your next compilation, which significantly reduces the mapper run time. When the compiler runs, it generates a placement constraints file, graph_aie_mapped.aiecst, in the Work/temp directory. Xilinx recommends that you save Work/temp/graph_aie_mapped.aiecst if you want to use it in subsequent compilations, because the Work folder is regenerated for every compilation. This constraints file can be specified on the command line for the next iteration.
aiecompiler --constraints Work/temp/graph_aie_mapped.aiecst src/graph.cpp
TIP: The mapper is not aware of the 16K program memory per core limitation. One workaround is to change the run-time usage specification to map kernels to different cores.
Table 12. Router Options
Options Description
enableSplitAsBroadcast This option treats all split nets from a split node as a single net, with 100% usage, broadcasting to multiple points. The broadcast net does not share resources with any other packet-switched net in the design. Use this option when throughput degradation is observed due to interference on packet-switched nets after a split node.
dmaFIFOsInFreeBankOnly This option ensures DMA FIFOs are only inserted into memory banks that have no other buffers mapped. This option can be used when memory stalls are observed due to DMA FIFO buffers being accessed at the same time as some other design buffer placed in the same bank.
disableSSFifoSharing Disables the ability of the router to share stream switch FIFOs among two or more terminals of a net. This option should only be used when there are not enough stream switch FIFOs in the device to give each terminal its own individual FIFO(s).

Viewing Compilation Results in the Vitis Analyzer

After the compilation of the AI Engine graph, the AI Engine compiler writes a summary of compilation results called <graph-file-name>.aiecompile_summary to view in the Vitis analyzer. The summary contains a collection of reports and diagrams reflecting the state of the AI Engine application implemented in the compiled build. The summary is written to the working directory of the AI Engine compiler as specified by the --workdir option, which defaults to ./Work.

To open the AI Engine compiler summary, use the following command:

vitis_analyzer ./Work/graph.aiecompile_summary

The Vitis analyzer opens displaying the Summary page of the report. The Report Navigator view lists the different reports that are available in the Summary. For a complete understanding of the Vitis analyzer, see Using the Vitis Analyzer in the Application Acceleration Development flow of the Vitis Unified Software Platform Documentation (UG1416).

The listed reports include:

Summary
This is the top-level of the report, and reports the details of the build, such as date, tool version, a link to the graph, and the command-line used to create the build.
Kernel Guidance
Shows a variety of messages to provide guidance on kernel optimization.
Graph
Provides a flow diagram of the AI Engine graph that shows the data flow through the various kernels. You can zoom into and pan the graph display as needed. At the bottom of the Reports view, a table summarizes the graph with information related to kernels, buffers, ports, and nets. Clicking on objects in the graph diagram highlights the selected object in the tables. (See Graph and Array Details).
Array
Provides a graphical representation of the AI Engine processor array on the Versal device. The graph kernels and connections are placed within the context of the array. You can zoom into and select elements in the array diagram. Choosing objects in the array also highlights the object chosen in the tables at the bottom of the Reports view.
Note: The Graph and Array reports share the same tables. Selecting an item in either view also selects it in the other. For example, selecting a net in the Graph view also selects it in the Array view.
Constraints
Shows all constraints used within the graph.
Mapping Analysis
Displays the text report graph_mapping_analysis_report.txt. Reports the block mapping, port mapping, and memory bank mapping of the graph to the device resources.
DMA Analysis
Displays the text report DMA_report.txt, providing a summary of DMA accesses from the graph.
Lock Allocation
Displays the text report Lock_report.txt, listing DMA locks on port instances.
Core Compilation
Shows the single kernel compilation log file.

The following figure shows the graph.aiecompile_summary report open in the Vitis analyzer, with the Array diagram displayed, an AI Engine kernel selected in the diagram and the table views, and the source code for the kernel displayed in the Source Code view.

Figure 1: Vitis Analyzer Graph Summary


Graph and Array Details

The graph and array views in the Vitis analyzer include several tables highlighting the details of the graph and kernels. The following sections describe each table and the information available in its columns.

Kernels

The Kernels table shows detailed information about the kernels used by the ADF graph. For example, the following figure shows two kernels, interpolator and classify. The following example code shows the fir_27t_sym_hb_2i and classifier kernel functions being instantiated as kernels in the graph.

interpolator = kernel::create(fir_27t_sym_hb_2i);
classify = kernel::create(classifier);
Figure 2: Kernels Table


Table 13. Column Description
Column Description
Graph Instance Shows a hierarchical view of the design graph along with the sub-graphs and kernels.
ID Unique ID given to the kernel from aiecompiler.
Kernel The kernel function name. This does not need to match the instance name used in the graph class. For example, fir_27t_sym_hb_2i is the function name, instantiated as interpolator as seen in the preceding code.
Runs on Where the kernel runs. This is always AI Engine.
Source The kernel source file. Clicking this file name opens up the source file of the kernel.
Column The column in the AI Engine where the kernel is mapped.
Row The row in the AI Engine where the kernel is mapped.
Schedule The order in which kernels execute if mapped to the same tile (same Column, Row). A value of 0 means no schedule is set.
Runtime Ratio The run-time ratio set in the graph by using runtime<ratio>(<kernel>) = n constraint.
Graph Source The source file (graph.h) with line number where the kernel is instantiated. Clicking on the link opens up the source file at the line number.

Programmable Logic (PL)

The PL table, as shown in the following figure, provides detailed information about the PLIO connections to the ADF graph. For example, in this figure, there are four PLIO objects associated with the graph. The name of the PLIO connection, the width of the PLIO data connection, and the simulation test bench file associated with each PLIO connection are shown in the example.

PLIO *in0 = new PLIO("DataIn1", adf::plio_32_bits,"data/input.txt");
PLIO *ai_to_pl = new PLIO("clip_in",adf::plio_32_bits, "data/output.txt"); 
PLIO *pl_to_ai = new PLIO("clip_out", adf::plio_32_bits,"data/input2.txt"); 
PLIO *out0 = new PLIO("DataOut1",adf::plio_32_bits, "data/output2.txt");
Figure 3: PL Table


Table 14. Column Description
Column Description
Name The port name of a PLIO connection and whether it is an input or output.
Data Width The data width of the PLIO connection defined in the constructor. The width can be 32, 64, or 128 bits.
Frequency (MHz) The frequency (in MHz) defined (optionally) in the PLIO constructor for the PLIO connection.
Buffers The number of buffers used in a PLIO connection. If a PLIO port is connected to a window port of an AI Engine kernel, two buffers are used, signifying a ping-pong buffer. A connection from a PLIO port to a stream port of an AI Engine kernel does not consume any buffers.
Connected Ports The number of ports the PLIO is connected to. The PLIO data can be multicast to multiple destinations in the AI Engine. For more information, see Multicast Support.
Column The interface column used by the PLIO, which is assigned by the aiecompiler. The values could be in the 0-49 range.
Channel The channel within the interface column used by the PLIO.
Packet ID The packet switching feature allows you to send packets of data to/from multiple destinations. These packets of data can be sent from/to the PL to/from the AI Engine. This column displays the ID of the packets used when packet switching is used. For more information see Explicit Packet Switching.

Buffers

The Buffers table contains information related to the buffers that are mapped to the ADF graph. Typically, buffers are used in window connections.
Note: The use of buf# and buf#d means it is a ping-pong buffer.
Figure 4: Buffers Table
Table 15. Column Description
Column Description
Name The name of the connection where the buffer is allocated.
ID The unique ID given to the buffer by the AI Engine compiler.
Type The type of buffer being used. This can either be Memory or Stream. The connection to a window uses a ping-pong buffer, and connection to a stream might use a DMA buffer.
Net The net with which the buffer is associated.
Column The column location of the tile where the buffer is mapped by the compiler.
Row The row location of the tile where the buffer is mapped by the compiler.
Bank The bank of the tile where the buffer is mapped. The banks are: 0,1,2, or 3.
Offset The address offset of the buffer within the bank.
Size The size of the buffer in bytes.
Lock ID The unique lock ID for the buffer, if one is placed in the bank.
Lock Name The unique name of the lock associated with the buffer. This can be used to debug a lock stall on a buffer.

Ports

The Ports table contains all the ports of the design, which can be GMIO ports, PLIO ports, and input, inout, and output ports on a kernel.

Note: FileIO is displayed as PLIO in the Vitis analyzer.
Figure 5: Ports Table
Table 16. Column Description
Column Details
Name The port name of the input, inout, output ports on a kernel, GMIO, or PLIO ports.
ID The unique ID the AI Engine compiler designates the port.
Type Port type. PLIO ports can be Stream or Packet Switching; GMIO ports are Global Memory; kernel function ports can be Memory or Stream.
Direction Port direction. Can be: IN, OUT, INOUT.
Data Type The type definition of the port for kernels. For example, input_window<int16>*, input_stream<int16>*.
Buffers The number of buffers instantiated for the connection. For a streaming connection, no buffers are used. For a window connection, a ping-pong buffer is used.
Connected Ports The number of ports the specific port is connected to. Ports can multicast to more than one port. For more information see Multicast Support.

Nets

The Nets table shows details of the net connections made between AI Engine kernels, or between an AI Engine kernel and the PLIO/GMIO ports. For example, the following snippet of the graph.cpp file shows the connect constraint used to create the stream and window connections between the AI Engine kernels in the graph, or to the PLIO/GMIO ports.

connect< window >(in, interpolator.in[0]);
connect< window, stream >(interpolator.out[0], clip_in);
connect< stream >(clip_out, classify.in[0]);
connect< window >(classify.out[0], out);
Figure 6: Nets Table


Table 17. Column Description
Column Description
Name The name of the net that is internally generated.
Variable The name of the net connection (which can optionally be specified in the connect constraint). <unnamed>.net# signifies that the connect<> has no unique name as part of the connect constraint in the graph.
Source Graph Node The source node of the graph connection, which can be an AI Engine kernel, PLIO, or GMIO node.
Source Port The source port of the graph connection, which can be an AI Engine kernel, PLIO, or GMIO port.
Source ID The unique ID the AI Engine compiler designates the source port.
Destination Graph Node The destination node of the graph connection, which can be an AI Engine kernel, PLIO, or GMIO node.
Destination Port The destination port of the graph connection, which can be an AI Engine kernel, PLIO, or GMIO port.
Destination ID The unique ID that the AI Engine compiler designates the destination port.
Latency (Cycles) The minimum cycle count needed to transfer data from the source node to destination node.
FIFO Depth The FIFO memory allocated through routing resources in the net. This includes buffers configured as DMA FIFOs, stream switch ports, and stream switch FIFOs. The unit for FIFO depth is 32-bit words.
FIFO Depth Constraint This reflects the FIFO depth constraint provided in the design.
Buffers The number of buffers used by the net connection.
Switch Count The number of switches traversed by the net connection.
Switch FIFOs The number of stream switch FIFOs used by the net connection.

Tiles

The Tiles table shows all the tiles that have mapped kernels and buffers in the ADF graph. For example, in this design there are five tiles used, where two of them contain kernels (Tile [25,0], and Tile [25,4]), and three of them have buffers mapped (Tile[24,0], Tile[24,4], Tile[25,5]).

Figure 7: Tiles Table


Table 18. Column Description
Column Description
Tile The tile ID.
Column The column location of the tile.
Row The row location of the tile.
Kernels The number of kernels that are mapped to the tile.
Buffers The number of buffers mapped to the tile. This includes buffers on nets and buffers inside the kernel.

AI Engine Compiler Guidance

After the AI Engine compiler completes the compilation of an AI Engine design, it analyzes the design and provides guidance on how to improve it based on AI Engine rules and software best practices. Some issues might be corrected automatically by the AI Engine compiler. The guidance file lists all findings with their severity, category and tile number, details, any correction applied by the AI Engine compiler, and suggested resolutions.

The guidance file, guidance.html, is located in the Work/reports directory. Use a web browser to review this guidance file. It is recommended that you update your design per the design guidance report before running the design in the simulator or on hardware.

The following are examples of the AI Engine compiler-generated guidance available in Work/reports/guidance.html.

  • Variables are referenced before being initialized.

  • Global variables are initialized locally from a kernel implementation.

  • Array of data is not 128 bits (16 bytes) aligned.

  • Use of __restrict qualifier causes undefined behavior. This undefined behavior is exhibited running on hardware only.

    Note: --xlopt=1 or higher is required for the AI Engine compiler to generate the appropriate guidance file when compiling a design.
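One of the findings above concerns arrays that are not 128-bit (16-byte) aligned. A data array can be forced onto a 16-byte boundary with the standard C++ alignas specifier. This is a generic sketch, not output of the guidance report; the array and helper names are illustrative:

```cpp
#include <cstdint>

// A 16-byte (128-bit) aligned coefficient array, as the guidance recommends.
alignas(16) static std::int16_t coeffs[32];

// Returns true if a pointer sits on a 16-byte boundary.
bool is_16_byte_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```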