Detailed Kernel Trace
The SDAccel™ Development Environment can generate a detailed kernel trace view when running hardware emulation. The view shows emulation results at the system level, the compute unit (CU) level, and the function level, including data transfers between the kernel and global memory and data flow through inter-kernel and intra-kernel pipes. These details help developers locate performance bottlenecks, from the system level down to individual function calls, and optimize their applications.
Below is a snapshot of the detailed kernel trace view from running hardware emulation of the median filter example.
The detailed kernel trace view is organized hierarchically for easy navigation. The hierarchy tree and a description of each entry follow:
Device xilinx:adm-pcie-ku3:2ddr:3.3: target device name. This device has two memory channels.
  Binary Container binary_container_1: binary container name
    Kernel Data Transfer: data transfers for all kernels
      m_axi: data transfers on memory channel 0
        Read
        Write
      m_axi1: data transfers on memory channel 1
        Read
        Write
    Kernel <kernel> <local_size>: kernel name in the binary container
      Compute Unit <cu1>: compute unit name
        CU Stalls: stall information on the compute unit
        Data Transfers: data transfers on the compute unit
          Read
          Write
        User Functions: user functions in the CU
          <func1>: user function name
            Function Stalls: stall information on the function
            Function I/O: data transfers on the function
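For readers who want to reason about the hierarchy outside the GUI, the tree above can be modeled as a small nested data structure. The sketch below is illustrative only: `TraceNode`, `walk`, and `read_write` are hypothetical names, not part of any SDAccel API, and the parent/child nesting is inferred from the entry descriptions rather than read from an actual trace file.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class TraceNode:
    """One row of the trace hierarchy (hypothetical model, not an SDAccel API)."""
    label: str
    children: List["TraceNode"] = field(default_factory=list)


def walk(node: TraceNode, depth: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (depth, label) pairs in top-down display order."""
    yield depth, node.label
    for child in node.children:
        yield from walk(child, depth + 1)


def read_write() -> List[TraceNode]:
    """Return a fresh Read/Write leaf pair for a data-transfer row."""
    return [TraceNode("Read"), TraceNode("Write")]


# Nesting assumed from the descriptions of each entry in the tree.
trace_tree = TraceNode("Device", [
    TraceNode("Binary Container", [
        TraceNode("Kernel Data Transfer", [
            TraceNode("m_axi", read_write()),
            TraceNode("m_axi1", read_write()),
        ]),
        TraceNode("Kernel", [
            TraceNode("Compute Unit", [
                TraceNode("CU Stalls"),
                TraceNode("Data Transfers", read_write()),
                TraceNode("User Functions", [
                    TraceNode("<func1>", [
                        TraceNode("Function Stalls"),
                        TraceNode("Function I/O"),
                    ]),
                ]),
            ]),
        ]),
    ]),
])

rows = list(walk(trace_tree))
for depth, label in rows:
    # Indent each row by its depth to mirror the tree layout.
    print("  " * depth + label)
```

Walking the structure depth-first reproduces the rows in the same top-down order as the tree, which makes it easy to see which stall and data-transfer rows belong to which compute unit or function.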