Detailed Kernel Trace

The SDAccel™ Development Environment can generate a detailed kernel trace view when running hardware emulation. The view shows the emulation results at the system level, the compute unit (CU) level, and the function level. The details include data transfers between the kernel and global memory, as well as data flow through inter-kernel and intra-kernel pipes. Together they help developers locate performance bottlenecks, from the system level down to individual function calls, and optimize their applications.

Below is a snapshot of the detailed kernel trace view from running hardware emulation of the median filter example.

The detailed kernel trace view is organized hierarchically for easy navigation. The hierarchy tree, with a description of each entry, is shown below:

Device xilinx:adm-pcie-ku3:2ddr:3.3: target device name. This device has two memory channels. 
  Binary Container binary_container_1: binary container name
    Kernel Data Transfer: data transfers for all kernels
      m_axi: data transfers on memory channel 0
        Read
        Write
      m_axi1: data transfers on memory channel 1
        Read
        Write
    Kernel <kernel> <local_size>: kernel name in the binary container 
      Compute Unit: <cu1>: compute unit name
        CU Stalls: stall information on the compute unit
        Data Transfers: data transfers on the compute unit
          Read
          Write
        User Functions: user functions in the CU
          <func1>: user function name 
            Function Stalls: stall information on the function 
            Function I/O: data transfers on the function
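
The nesting above can also be written down as a small data structure, which may help when reasoning about where a given stall or transfer row sits. The following is only an illustrative sketch: TraceNode, build_example_tree, walk, and stall_rows are hypothetical names made up for this example and are not part of SDAccel or of any trace file format. The placeholders <kernel>, <cu1>, and <func1> are kept exactly as they appear in the tree.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class TraceNode:
    """One row of the trace hierarchy (hypothetical model, not a tool API)."""
    label: str
    children: List["TraceNode"] = field(default_factory=list)


def read_write() -> List[TraceNode]:
    # Each data-transfer row in the tree above splits into Read and Write.
    return [TraceNode("Read"), TraceNode("Write")]


def build_example_tree() -> TraceNode:
    # Mirrors the documented hierarchy; names in <...> are placeholders.
    return TraceNode("Device xilinx:adm-pcie-ku3:2ddr:3.3", [
        TraceNode("Binary Container binary_container_1", [
            TraceNode("Kernel Data Transfer", [
                TraceNode("m_axi", read_write()),
                TraceNode("m_axi1", read_write()),
            ]),
            TraceNode("Kernel <kernel> <local_size>", [
                TraceNode("Compute Unit: <cu1>", [
                    TraceNode("CU Stalls"),
                    TraceNode("Data Transfers", read_write()),
                    TraceNode("User Functions", [
                        TraceNode("<func1>", [
                            TraceNode("Function Stalls"),
                            TraceNode("Function I/O"),
                        ]),
                    ]),
                ]),
            ]),
        ]),
    ])


def walk(node: TraceNode, depth: int = 0) -> Iterator[Tuple[int, str]]:
    # Depth-first traversal in display order, yielding (depth, label).
    yield depth, node.label
    for child in node.children:
        yield from walk(child, depth + 1)


def stall_rows(root: TraceNode) -> List[str]:
    # Collect the rows that carry stall information.
    return [label for _, label in walk(root) if label.endswith("Stalls")]


if __name__ == "__main__":
    # Print the tree with two spaces of indentation per level.
    for depth, label in walk(build_example_tree()):
        print("  " * depth + label)
```

Walking the structure depth-first reproduces the rows in the same top-to-bottom order as the tree, and stall_rows picks out the two levels (compute unit and function) at which stall information is reported.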