Streaming Data Transfers

Streaming Data Transfers between Kernels (K2K)

The Vitis™ core development kit also supports streaming data transfer between two kernels. Consider the situation where one kernel performs part of a computation, and a second kernel completes the operation after receiving the output data from the first kernel. With kernel-to-kernel streaming support, data moves directly from one kernel to the other without being written back to global memory and read out again. Avoiding this round trip through global memory can improve performance significantly.

Host Coding Guidelines

The kernel ports involved in kernel-to-kernel streaming do not need to be set up with clSetKernelArg in the host code. These ports are defined within the kernel itself and are not addressed by the host program. All kernel arguments not involved in the streaming connection should still be set up using clSetKernelArg as described in Setting Kernel Arguments.

Streaming Kernel Coding Guidelines

In a kernel, a streaming interface that sends data directly to, or receives data directly from, the streaming interface of another kernel is defined by hls::stream with the ap_axiu<D,0,0,0> data type, where D is the data width in bits. The ap_axiu<D,0,0,0> data type requires the use of the ap_axi_sdata.h header file.

IMPORTANT: Host-to-kernel and kernel-to-host streaming requires the use of the qdma_axis data type. Both the ap_axiu and qdma_axis data types are defined inside the ap_axi_sdata.h header file that is distributed with the Vitis software platform installation.

The following example shows the streaming interfaces of the producer and consumer kernels.

#include "ap_axi_sdata.h"  // ap_axiu
#include "hls_stream.h"    // hls::stream

// Producer kernel - provides output as a data stream
// The example kernel code does not show any other inputs or outputs.

void kernel1 (.... , hls::stream<ap_axiu<32, 0, 0, 0> >& stream_out) {
      
  for(int i = 0; i < ...; i++) {
    int a = ...... ;         // Internally generated data
    ap_axiu<32, 0, 0, 0> v;  // temporary storage for ap_axiu
    v.data = a;              // Writing the data
    stream_out.write(v);         // Writing to the output stream.
  }
}
 
// Consumer kernel - reads data stream as input
// The example kernel code does not show any other inputs or outputs.

void kernel2 (hls::stream<ap_axiu<32, 0, 0, 0> >& stream_in, .... ) {
 
  for(int i = 0; i < ....; i++) {
    ap_axiu<32, 0, 0, 0> v = stream_in.read(); // Reading the input stream
    int a = v.data; // Extract the data
          
    // Do further processing
  }
}

Because the ports are declared with the hls::stream data type, the Vitis HLS tool infers axis interfaces for them. The following INTERFACE pragmas are shown for illustration only; they do not need to be added to the code.

#pragma HLS INTERFACE axis port=stream_out
#pragma HLS INTERFACE axis port=stream_in

TIP: These example kernels show the definition of the streaming input/output ports in the kernel signature, and the handling of the input/output stream in the kernel code. The connection of kernel1 to kernel2 must be defined during the kernel linking process as described in Specifying Streaming Connections between Compute Units.
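
As a rough illustration of that linking step, the connection is typically written in a configuration file passed to the v++ linker with the --config option. The fragment below is a sketch under assumptions: the compute unit names kernel1_1 and kernel2_1 assume the default naming of one compute unit per kernel, and the port names are taken from the example signatures above. Adjust both to match the actual design.

```ini
# Hypothetical linker configuration fragment (file name and CU names are assumptions)
[connectivity]
# <producer CU>.<output port>:<consumer CU>.<input port>
stream_connect=kernel1_1.stream_out:kernel2_1.stream_in
```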

For more information on mapping streaming connections, refer to Building and Running the Application.
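
Before building for hardware, it can help to reason about the producer/consumer hand-off in plain C++. The sketch below is not Vitis HLS code: AxiWord and FifoStream are hypothetical stand-ins for ap_axiu<32, 0, 0, 0> and hls::stream that model only the read/write behavior, and the data written by the producer (i * i) is an arbitrary placeholder for the internally generated data.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Hypothetical stand-in for ap_axiu<32, 0, 0, 0>: only data and last are modeled.
struct AxiWord {
    uint32_t data;
    bool last;
};

// Hypothetical stand-in for hls::stream<T>: a simple FIFO with read() and write().
template <typename T>
class FifoStream {
public:
    void write(const T& v) { q_.push(v); }
    T read() {
        assert(!q_.empty() && "read from an empty stream");
        T v = q_.front();
        q_.pop();
        return v;
    }
    bool empty() const { return q_.empty(); }
private:
    std::queue<T> q_;
};

// Producer: writes `count` words to the output stream.
void producer(int count, FifoStream<AxiWord>& stream_out) {
    for (int i = 0; i < count; i++) {
        AxiWord v;
        v.data = static_cast<uint32_t>(i * i);  // placeholder data
        v.last = (i == count - 1);              // mark the final word
        stream_out.write(v);
    }
}

// Consumer: reads `count` words from the input stream and returns their sum.
uint64_t consumer(FifoStream<AxiWord>& stream_in, int count) {
    uint64_t sum = 0;
    for (int i = 0; i < count; i++) {
        AxiWord v = stream_in.read();
        sum += v.data;
    }
    return sum;
}

// Connects producer and consumer through one FIFO, with no intermediate buffer.
uint64_t run_pipeline(int count) {
    FifoStream<AxiWord> link;
    producer(count, link);
    return consumer(link, count);
}
```

Calling run_pipeline(4) pushes the words 0, 1, 4, and 9 through the FIFO and returns their sum. On hardware the two kernels run concurrently and the FIFO is the AXI4-Stream link created at link time; in this model they simply run one after the other.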

Free-Running Kernel

The Vitis core development kit provides support for one or more free-running kernels. Free-running kernels have no control signal ports, and cannot be started or stopped. The absence of control signals gives the free-running kernel the following characteristics:

  • The free-running kernel has no memory input or output port, and therefore it interacts with the host or other kernels only through streams. The other kernels can be regular kernels or other free-running kernels.
  • When the FPGA is programmed by the binary container (xclbin), the free-running kernel starts running on the FPGA, and therefore it does not need the clEnqueueTask command from the host code.
  • The kernel works on the stream data as soon as it starts receiving from the host or other kernels, and it stalls when the data is not available.
  • The free-running kernel needs a special interface pragma ap_ctrl_none inside the kernel body.

Host Coding for Free-Running Kernels

If the free-running kernel interacts with the host, the host code should manage the stream operations using clCreateStream/clReadStream/clWriteStream as discussed in Host Coding Guidelines. Because the free-running kernel has no other types of inputs or outputs, such as memory ports or control ports, there is no need to call clSetKernelArg. The clEnqueueTask command is not used either, because the kernel works on the stream data as soon as it starts receiving from the host or other kernels, and it stalls when the data is not available.

Coding Guidelines for Free-Running Kernels

As mentioned previously, the free-running kernel only contains hls::stream inputs and outputs. The recommended coding guidelines include:

  • Use hls::stream<ap_axiu<D,0,0,0> > if the port is interacting with a stream port of another kernel.
  • Use hls::stream<qdma_axis<D,0,0,0> > if the port is interacting with the host.
  • Using the hls::stream data type for the function parameter causes Vitis HLS to infer an AXI4-Stream port (axis) for the interface.
  • The free-running kernel must also specify the following special INTERFACE pragma:
    #pragma HLS interface ap_ctrl_none port=return

TIP: ap_ctrl_none means that the kernel has no control interface, so typically no s_axilite interface is generated. However, the presence of either scalar arguments or m_axi interfaces requires the use of an s_axilite interface.

The following code example shows a free-running kernel with one input and one output communicating with another kernel. The while(1) loop structure contains the substance of the kernel code, which repeats as long as the kernel runs.

void kernel_top(hls::stream<ap_axiu<32, 0, 0, 0> >& input, 
   hls::stream<ap_axiu<32, 0, 0, 0> >& output) {
#pragma HLS interface ap_ctrl_none port=return  // Special pragma for free-running kernel
 
#pragma HLS DATAFLOW // The kernel is using DATAFLOW optimization
  while(1) {
    ...
  }
}

TIP: The example shows the definition of the streaming input/output ports in a free-running kernel. However, the streaming connection from the free-running kernel to or from another kernel must be defined during the kernel linking process as described in Specifying Streaming Connections between Compute Units.
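
To make the behavior of the free-running loop more concrete, the following plain C++ sketch models one possible loop body in software. It is an illustration under assumptions, not Vitis HLS code: Word32 is a hypothetical stand-in for ap_axiu<32, 0, 0, 0>, std::queue stands in for hls::stream, and the "add one to each word" operation is an arbitrary placeholder for the kernel's real processing. Unlike the hardware while(1) loop, which stalls when no data is available, the model returns when the input is drained or a word marked last has been forwarded, so that it can terminate in a software test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical stand-in for ap_axiu<32, 0, 0, 0>: only data and last are modeled.
struct Word32 {
    uint32_t data;
    bool last;
};

// Software model of a free-running loop body: forwards each word with data + 1.
// Returns the number of words processed.
int free_running_model(std::queue<Word32>& input, std::queue<Word32>& output) {
    int processed = 0;
    while (!input.empty()) {      // hardware would stall here instead of exiting
        Word32 v = input.front();
        input.pop();
        v.data += 1;              // placeholder processing
        output.push(v);
        processed++;
        if (v.last) break;        // lets the software model terminate
    }
    return processed;
}

// Helper: feeds `values` through the model, marking the final word as last,
// and collects the data fields of the output words.
std::vector<uint32_t> run_free_running_model(const std::vector<uint32_t>& values) {
    std::queue<Word32> in, out;
    for (std::size_t i = 0; i < values.size(); i++) {
        Word32 w{values[i], i + 1 == values.size()};
        in.push(w);
    }
    free_running_model(in, out);
    assert(out.size() == values.size());  // every input word is forwarded once
    std::vector<uint32_t> result;
    while (!out.empty()) {
        result.push_back(out.front().data);
        out.pop();
    }
    return result;
}
```

Note that no start or stop call appears anywhere in the model: the function only reacts to data arriving on its input, which mirrors how a free-running kernel begins working as soon as stream data is available.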