work_group_size_hint
Description
The work-group size in the OpenCL standard defines the size of the ND range space that can be handled by a single invocation of a kernel compute unit. When OpenCL kernels are submitted for execution on an OpenCL device, they execute within an index space, called an ND range, which can have 1, 2, or 3 dimensions. See "OpenCL Execution Model" in SDAccel Environment Optimization Guide (UG1207) for more information.
OpenCL kernel functions are executed exactly one time for each point in the ND range index space. This unit of work for each point in the ND range is called a work-item. Unlike for loops in C, where loop iterations are executed sequentially and in order, the OpenCL runtime and device are free to execute work-items in parallel and in any order.
Work-items are organized into work-groups, which are the unit of work scheduled onto compute units. The optional work_group_size_hint attribute is part of the OpenCL Language Specification, and is a hint to the compiler that indicates the work-group size value most likely to be specified by the local_work_size argument to clEnqueueNDRangeKernel. This allows the compiler to optimize the generated code according to the expected value.
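The relationship between work-items and work-groups can be sketched in plain C for a one-dimensional ND range. This is only an illustration of the index arithmetic, not part of the OpenCL API: the helper names (wg_count, wg_group_id, wg_local_id) are hypothetical, and the sketch assumes a zero global offset and a global size that is a multiple of the local size.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work-groups in a 1-D ND range.
 * Assumption: global_size is a multiple of local_size. */
size_t wg_count(size_t global_size, size_t local_size)
{
    assert(local_size > 0 && global_size % local_size == 0);
    return global_size / local_size;
}

/* Index of the work-group that owns a work-item
 * (the value a kernel would see from get_group_id(0)). */
size_t wg_group_id(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    return global_id / local_size;
}

/* Position of the work-item inside its work-group
 * (the value a kernel would see from get_local_id(0)). */
size_t wg_local_id(size_t global_id, size_t local_size)
{
    assert(local_size > 0);
    return global_id % local_size;
}
```

For example, with a global size of 64 and a local size of 16 there are 4 work-groups, and work-item 37 is the item at local position 5 of work-group 2. The local size used here plays the role of the value that work_group_size_hint tells the compiler to expect.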
Using the reqd_work_group_size attribute instead of work_group_size_hint is highly recommended, because it can be used for performance optimization during the generation of the custom logic for a kernel.
Syntax
__attribute__((work_group_size_hint(X, Y, Z)))
- X, Y, Z: Specifies the expected work-group size of the kernel. Each value gives the size of the work-group in one dimension of the (up to three-dimensional) ND range.
Examples
The following example is a hint to the compiler that the kernel will most likely be executed with a work-group size of 1:
__attribute__((work_group_size_hint(1, 1, 1)))
__kernel void
...
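To make the effect of a work-group size concrete, the following plain C sketch computes how many work-items one work-group holds and how many work-groups cover a three-dimensional ND range. It is a minimal illustration under stated assumptions, not OpenCL API code: the helper names (wg_items_per_group, wg_groups_in_range) are hypothetical, and each global dimension is assumed to be a multiple of the matching work-group dimension.

```c
#include <assert.h>
#include <stddef.h>

/* Work-items per work-group: the product of the X, Y, and Z sizes. */
size_t wg_items_per_group(const size_t wg[3])
{
    return wg[0] * wg[1] * wg[2];
}

/* Work-groups needed to cover a 3-D ND range.
 * Assumption: every global dimension is a multiple of the
 * corresponding work-group dimension. */
size_t wg_groups_in_range(const size_t global[3], const size_t wg[3])
{
    size_t groups = 1;
    for (int d = 0; d < 3; ++d) {
        assert(wg[d] > 0 && global[d] % wg[d] == 0);
        groups *= global[d] / wg[d];
    }
    return groups;
}
```

If the kernel is launched with the hinted size of (1, 1, 1), each work-group holds a single work-item, so an ND range of 4 x 2 x 1 work-items is split into 8 work-groups.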
See Also
- SDAccel Environment Optimization Guide (UG1207)
- https://www.khronos.org/
- The OpenCL C Specification