OpenCL Kernels
The following OpenCL kernel discussion builds on the information provided in the C/C++ Kernels topic. The same programming techniques for accelerating the performance of a kernel apply to both C/C++ and OpenCL kernels. However, an OpenCL kernel uses the __attribute__ syntax in place of pragmas. For details of the available attributes, refer to OpenCL Attributes.
The following code examples show some of the elements of an OpenCL kernel for the Vitis application acceleration development flow. This is not intended to be a primer on OpenCL or kernel development, but merely to highlight some of the key differences between OpenCL and C/C++ kernels.
Kernel Signature
In C/C++ kernels, the kernel is identified on the Vitis compiler command line using the v++ --kernel option. However, in OpenCL code, the __kernel keyword identifies a kernel in the code. You can have multiple kernels defined in a single .cl file, and the Vitis compiler will compile all of the kernels, unless you specify the --kernel option to identify which kernel to compile.
__kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void apply_watermark(__global const TYPE * __restrict input,
                     __global TYPE * __restrict output, int width, int height)
{
    ...
}
The complete code for this kernel, apply_watermark, can be found in the Global Memory Two Banks (CL) example in the Vitis Accel Examples GitHub repository.
In the example above, the apply_watermark kernel has two pointer arguments, input and output, and two scalar int arguments, width and height.
In C/C++ kernels, these arguments would need to be identified with HLS INTERFACE pragmas. In an OpenCL kernel, however, the Vitis compiler and Vitis HLS recognize the kernel arguments and compile them as needed: pointer arguments into m_axi interfaces, and scalar arguments into s_axilite interfaces.
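To make the mapping above concrete, here is a minimal sketch of a kernel with two pointer arguments and two scalar arguments. The kernel name scale_copy and its arguments are hypothetical and do not come from the Vitis examples, and the interface comments only restate the mapping described above; they are not compiler output. The guard at the top is also an assumption added for illustration: it lets the same file be built as plain C for a quick functional check, while an OpenCL compiler, which predefines __OPENCL_VERSION__, skips it.

```c
/* Host-side shim (illustrative assumption): allows a plain C build for a
   functional check. An OpenCL compiler predefines __OPENCL_VERSION__ and
   therefore skips this block. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __attribute__(x)
#endif

/* Hypothetical kernel. No interface pragmas are written: per the text
   above, pointer arguments are expected to become m_axi interfaces and
   scalar arguments s_axilite interfaces. */
__kernel __attribute__((reqd_work_group_size(1, 1, 1)))
void scale_copy(__global const int * __restrict input,  /* pointer: m_axi */
                __global int * __restrict output,       /* pointer: m_axi */
                int count,                               /* scalar: s_axilite */
                int factor)                              /* scalar: s_axilite */
{
    /* Multiply each input element by the scalar factor. */
    for (int i = 0; i < count; i++) {
        output[i] = input[i] * factor;
    }
}
```

The equivalent C/C++ kernel would carry one HLS INTERFACE pragma per argument; here the argument types alone are assumed to drive the interface choice.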
Kernel Optimizations
Because the kernel runs in programmable logic on the target platform, optimizing your task for that environment is an important element of application design. Most of the optimization techniques discussed in C/C++ Kernels can be applied to OpenCL kernels. Instead of applying the HLS pragmas used for C/C++ kernels, you use the __attribute__ keyword described in OpenCL Attributes. The following is an example:
// Process the whole image
__attribute__((xcl_pipeline_loop))
image_traverse: for (uint idx = 0, x = 0, y = 0; idx < size; ++idx, x += DATA_SIZE)
{
    ...
}
The example above specifies that the for loop, image_traverse, should be pipelined to improve the performance of the kernel. The target initiation interval (II) in this case is 1. For more information, refer to xcl_pipeline_loop.
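As a fuller illustration of the pipelining attribute, the sketch below places xcl_pipeline_loop on a loop inside a complete kernel. The kernel name brighten, its arguments, and the clamp value of 255 are hypothetical choices for this example only. The guard at the top is an assumption that allows a plain C build for a functional check; an OpenCL compiler skips it.

```c
/* Host-side shim (illustrative assumption): allows a plain C build for a
   functional check. An OpenCL compiler predefines __OPENCL_VERSION__ and
   therefore skips this block. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __attribute__(x)
#endif

/* Hypothetical kernel: adds an offset to every value and clamps at 255. */
__kernel __attribute__((reqd_work_group_size(1, 1, 1)))
void brighten(__global const int * __restrict input,
              __global int * __restrict output, int size, int offset)
{
    /* Request pipelining of this loop, using the same attribute form as
       the image_traverse example above. Each iteration is independent of
       the others, which is the easy case for a pipeline. */
    __attribute__((xcl_pipeline_loop))
    for (int idx = 0; idx < size; ++idx) {
        int v = input[idx] + offset;
        output[idx] = (v > 255) ? 255 : v;
    }
}
```

Keeping the loop body free of dependencies between iterations is a design choice made here so that nothing in the code itself stands in the way of the requested pipeline; whether the tool meets the target still depends on the memory accesses and the platform.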
In the following code example, the watermark loop uses the opencl_unroll_hint attribute to let the Vitis compiler unroll the loop, reducing latency and improving performance. However, in this case the __attribute__ is only a suggestion that the compiler can ignore if needed. For details, refer to opencl_unroll_hint.
// Unroll the loop below to process all 16 pixels concurrently
__attribute__((opencl_unroll_hint))
watermark: for (int i = 0; i < DATA_SIZE; i++)
{
    ...
}
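The unroll hint can be sketched in a complete kernel as follows. The kernel name add_mark, its arguments, and the DATA_SIZE value of 16 are assumptions for illustration (16 matches the pixel count mentioned in the comment above, but the real definition is not shown in this topic). The guard at the top is likewise an assumption that allows a plain C build for a functional check; an OpenCL compiler skips it.

```c
/* Host-side shim (illustrative assumption): allows a plain C build for a
   functional check. An OpenCL compiler predefines __OPENCL_VERSION__ and
   therefore skips this block. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __attribute__(x)
#endif

#define DATA_SIZE 16  /* assumed block size for this sketch */

/* Hypothetical kernel: adds a scalar mark to one block of DATA_SIZE values. */
__kernel __attribute__((reqd_work_group_size(1, 1, 1)))
void add_mark(__global const int * __restrict input,
              __global int * __restrict output, int mark)
{
    int pixels[DATA_SIZE];

    /* Copy one block from global memory into local storage. */
    for (int i = 0; i < DATA_SIZE; i++) {
        pixels[i] = input[i];
    }

    /* Fixed trip count and independent iterations: a candidate for full
       unrolling. The attribute is a hint that the compiler may ignore. */
    __attribute__((opencl_unroll_hint))
    for (int i = 0; i < DATA_SIZE; i++) {
        pixels[i] = pixels[i] + mark;
    }

    /* Write the processed block back to global memory. */
    for (int i = 0; i < DATA_SIZE; i++) {
        output[i] = pixels[i];
    }
}
```

The hinted loop works on a local array with a compile-time bound, so all of its iterations can in principle run concurrently. OpenCL C 2.0 also defines the form opencl_unroll_hint(n) to request a partial unroll by a factor of n.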
For more information, review the OpenCL Attributes topics to see what specific optimizations are supported for OpenCL kernels, and review the C/C++ Kernels content to see how these optimizations can be applied in your kernel design.