Using C/C++ Kernels

The kernel for matrix multiplication can be expressed in C/C++ code that can be synthesized by the Vivado® HLS tool. For kernels captured in this way, the SDAccel™ Environment supports all of the optimization techniques available in Vivado HLS. The one requirement is that a kernel expressed this way must comply with a specific function signature style.

IMPORTANT!: Global variables and printf() are not supported in HLS C/C++ kernels.

#include <string.h>  // memcpy

void mmult(int *a, int *b, int *output)
{
#pragma HLS INTERFACE m_axi port=a offset=slave bundle=gmem
#pragma HLS INTERFACE m_axi port=b offset=slave bundle=gmem
#pragma HLS INTERFACE m_axi port=output offset=slave bundle=gmem
#pragma HLS INTERFACE s_axilite port=a bundle=control
#pragma HLS INTERFACE s_axilite port=b bundle=control
#pragma HLS INTERFACE s_axilite port=output bundle=control
#pragma HLS INTERFACE s_axilite port=return bundle=control

  const int rank = 16;
  int running = 0;
  // Local arrays hold the two 16x16 operands and the result.
  int bufa[256];
  int bufb[256];
  int bufc[256];
  // Copy both input matrices into the local arrays.
  memcpy(bufa, (int *) a, 256*4);
  memcpy(bufb, (int *) b, 256*4);

  for (unsigned int c=0;c<rank;c++){
    for (unsigned int r=0;r<rank;r++){
      running=0;
      // Dot product of row r of a with column c of b.
      for (int index=0; index<rank; index++) {
        #pragma HLS pipeline
        int aIndex = r*rank + index;
        int bIndex = index*rank + c;
        running += bufa[aIndex] * bufb[bIndex];
      }
      bufc[r*rank + c] = running;
    }
  }

  // Write the result matrix back through the output pointer.
  memcpy((int *) output, bufc, 256*4);
  return;
}

The preceding code example is the matrix multiplication kernel expressed in C/C++ for Vivado HLS. The first thing to notice about this code is the function signature.

void mmult(int *a, int *b, int *output)

This function signature is almost identical to the signature of the kernel expressed in OpenCL C. Keep in mind that, by default, kernels captured in C/C++ for HLS carry no inherent assumptions about the physical interfaces used to transport the function parameter data. HLS uses pragmas embedded in the code to tell the compiler which physical interface to generate for each function port. For the function to be treated as a valid OpenCL kernel, the ports of the C/C++ function must be mapped to the memory and control interfaces using HLS pragmas.
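
Because the interface choices live in pragmas, the function body remains ordinary C/C++, and a standard C++ compiler skips the HLS pragmas it does not recognize (possibly with a warning). The following is a minimal sketch of how the kernel's arithmetic could be checked in software under that assumption. The helper count_mmult_mismatches is hypothetical and is not part of the SDAccel flow; the kernel copy is condensed from the listing above.

```cpp
#include <cstring>

// Condensed copy of the mmult kernel shown earlier in this section.
// A standard C++ compiler does not act on the HLS pragmas.
void mmult(int *a, int *b, int *output)
{
#pragma HLS INTERFACE m_axi port=a offset=slave bundle=gmem
#pragma HLS INTERFACE m_axi port=b offset=slave bundle=gmem
#pragma HLS INTERFACE m_axi port=output offset=slave bundle=gmem
#pragma HLS INTERFACE s_axilite port=a bundle=control
#pragma HLS INTERFACE s_axilite port=b bundle=control
#pragma HLS INTERFACE s_axilite port=output bundle=control
#pragma HLS INTERFACE s_axilite port=return bundle=control
  const int rank = 16;
  int bufa[256];
  int bufb[256];
  int bufc[256];
  std::memcpy(bufa, a, sizeof(bufa));
  std::memcpy(bufb, b, sizeof(bufb));
  for (int c = 0; c < rank; c++) {
    for (int r = 0; r < rank; r++) {
      int running = 0;
      for (int index = 0; index < rank; index++) {
#pragma HLS pipeline
        running += bufa[r * rank + index] * bufb[index * rank + c];
      }
      bufc[r * rank + c] = running;
    }
  }
  std::memcpy(output, bufc, sizeof(bufc));
}

// Hypothetical software check: runs mmult on small test matrices and counts
// the elements that differ from a plain triple-loop reference.
int count_mmult_mismatches()
{
  const int rank = 16;
  int a[256], b[256], out[256];
  for (int i = 0; i < 256; i++) {
    a[i] = i % 7;
    b[i] = (i % 5) - 2;
    out[i] = 0;
  }
  mmult(a, b, out);

  int mismatches = 0;
  for (int r = 0; r < rank; r++) {
    for (int c = 0; c < rank; c++) {
      int expected = 0;
      for (int k = 0; k < rank; k++) {
        expected += a[r * rank + k] * b[k * rank + c];
      }
      if (out[r * rank + c] != expected) mismatches++;
    }
  }
  return mismatches;
}
```

Calling count_mmult_mismatches() from a small main() and checking that it returns 0 gives a quick functional check before synthesis. It says nothing about the generated interfaces or timing.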

The memory interface is specified with the following pragma:

#pragma HLS INTERFACE m_axi port=<variable name> offset=slave bundle=<interface name>

A separate AXI4 master interface is created for each unique bundle name. The compiler generates the interface name according to the following rules, illustrated here for a function argument called arg_name.

On platforms version 4.x or earlier, the interface name is M_AXI_ARG_NAME: arg_name is converted to uppercase, irrespective of its original capitalization, and prefixed with M_AXI_.

IMPORTANT!: On current platforms (version 5 or later), the interface name is m_axi_arg_name: arg_name must be lower case and is prefixed with m_axi_.

This interface name is required for some advanced options to instruct the tools to connect the interface to a specific platform DDR memory interface.
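
The two naming rules can be illustrated with a small helper. This is only a sketch: axi_master_interface_name is a hypothetical function, not a tool API, and lower-casing the argument for version 5 or later is an assumption made for illustration, since the rule above expects the name to be lower case already.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not a tool API): builds the AXI4 master interface
// name for an argument name, following the two rules described above.
std::string axi_master_interface_name(const std::string &arg_name,
                                      int platform_major_version)
{
  std::string name;
  if (platform_major_version <= 4) {
    // Version 4.x or earlier: uppercase the name and prefix with M_AXI_.
    name = "M_AXI_";
    for (unsigned char ch : arg_name) {
      name += static_cast<char>(std::toupper(ch));
    }
  } else {
    // Version 5 or later: prefix with m_axi_. Lower-casing here is an
    // assumption; the rule expects a lower-case name to begin with.
    name = "m_axi_";
    for (unsigned char ch : arg_name) {
      name += static_cast<char>(std::tolower(ch));
    }
  }
  return name;
}
```

For example, axi_master_interface_name("arg_name", 4) yields M_AXI_ARG_NAME, and the same argument with version 5 yields m_axi_arg_name.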

The control interface is specified with the following pragma:

#pragma HLS INTERFACE s_axilite port=<variable name> bundle=<interface name>

Detailed information on how these pragmas are used is available in the SDx Pragma Reference Guide (UG1253).

When a kernel is defined in C++, use extern "C" { ... } around the functions targeted to be kernels. The use of extern "C" instructs the compiler/linker to use the C naming and calling conventions.
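
A minimal sketch of the extern "C" wrapping is shown below. The kernel body (adding 1 to each of 16 elements) is made up for illustration, and the name my_c_kernel is used only as a placeholder; the pragmas follow the same pattern as the mmult listing.

```cpp
// Everything inside the extern "C" block gets C linkage, so the symbol
// name is not mangled and matches the kernel name given to the tools.
extern "C" {

// Illustrative kernel: writes in[i] + 1 to out[i] for 16 elements.
void my_c_kernel(int *in, int *out)
{
#pragma HLS INTERFACE m_axi port=in offset=slave bundle=gmem
#pragma HLS INTERFACE m_axi port=out offset=slave bundle=gmem
#pragma HLS INTERFACE s_axilite port=in bundle=control
#pragma HLS INTERFACE s_axilite port=out bundle=control
#pragma HLS INTERFACE s_axilite port=return bundle=control
  for (int i = 0; i < 16; i++) {
    out[i] = in[i] + 1;
  }
}

}  // extern "C"
```

Helper functions that are not kernels can stay outside the extern "C" block and keep normal C++ linkage.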

When using structs, the total size of the struct in bytes should be a power of two. Because the maximum bit width of the underlying interface is 512 bits (64 bytes), the recommended struct sizes are 4, 8, 16, 32, or 64 bytes.
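
One way to catch a struct that drifts away from these sizes is a compile-time check. The sketch below is an illustration only: the struct pixel_t and the helper is_power_of_two are hypothetical names, and the fields use a single fixed-width type so that the size is the same wherever the struct is compiled.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct: four 32-bit fields, 16 bytes in total.
struct pixel_t {
  std::uint32_t r;
  std::uint32_t g;
  std::uint32_t b;
  std::uint32_t a;
};

// True when n is a nonzero power of two.
constexpr bool is_power_of_two(std::size_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}

// Compilation fails if the struct size leaves the recommended set
// of 4, 8, 16, 32, or 64 bytes.
static_assert(is_power_of_two(sizeof(pixel_t)),
              "struct size should be a power of two");
static_assert(sizeof(pixel_t) >= 4 && sizeof(pixel_t) <= 64,
              "struct size should be between 4 and 64 bytes");
```

Placing the same static_assert lines in a header shared by host and kernel code makes both sides fail to build if the layout changes.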

IMPORTANT!: To reduce the risk of misalignment between the host code and the kernel code it is recommended that the struct elements use types of the same size.

TIP: C++ arbitrary precision data types can be used for global memory pointers on a kernel. They are not supported for scalar kernel inputs that are passed by value.

When using the command line flow, the kernel name is passed to xocc using the option --kernel, for example:

xocc .. --kernel my_c_kernel

For more information about xocc command options, see Compiling Your OpenCL Kernel Using the Xilinx OpenCL Compiler (xocc).