OpenCL Built-In Functions Support in the SDAccel Environment
The OpenCL™ C programming language provides a rich set of built-in functions for scalar and vector operations. Many of these functions have names similar to those in common C libraries, but they accept both scalar and vector argument types. The SDAccel™ development environment is compliant with the OpenCL 1.0 embedded profile. The following tables describe the built-in functions in the OpenCL 1.0 embedded profile and give their support status in the SDAccel environment.
Work-Item Functions
Function | Description | Supported |
---|---|---|
get_global_size | Number of global work items | Yes |
get_global_id | Global work item ID value | Yes |
get_local_size | Number of local work items | Yes |
get_local_id | Local work item ID | Yes |
get_num_groups | Number of work groups | Yes |
get_group_id | Work group ID | Yes |
get_work_dim | Number of dimensions in use | Yes |
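
The relationship between the work-item functions above can be sketched in plain C. This is only an illustrative host-side model, not SDAccel or OpenCL API code: the `ndrange_dim` type and the `model_*` names are hypothetical, and the sketch assumes a single dimension, no global offset, and a local size that divides the global size evenly.

```c
#include <stddef.h>

/* Hypothetical model of one NDRange dimension (not an OpenCL type). */
typedef struct {
    size_t global_size; /* what get_global_size(d) would return */
    size_t local_size;  /* what get_local_size(d) would return  */
} ndrange_dim;

/* Models get_num_groups(d), assuming local_size divides global_size. */
static size_t model_num_groups(ndrange_dim r) {
    return r.global_size / r.local_size;
}

/* Models get_global_id(d) in terms of get_group_id(d) and
 * get_local_id(d), assuming no global offset. */
static size_t model_global_id(ndrange_dim r, size_t group_id, size_t local_id) {
    return group_id * r.local_size + local_id;
}
```

For example, with a global size of 1024 and a local size of 64, this model gives 16 work groups, and the work item with local ID 5 in group 3 has global ID 197.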
Math Functions
Function | Description | Supported |
---|---|---|
acos | Arc Cosine function | Yes |
acosh | Inverse Hyperbolic Cosine function | Yes |
acospi | Computes acos(x) / pi | Yes |
asin | Arc Sine function | Yes |
asinh | Inverse Hyperbolic Sine function | Yes |
asinpi | Computes asin(x) / pi | Yes |
atan | Arc Tangent function | Yes |
atan2(y, x) | Arc Tangent of y / x | Yes |
atanh | Hyperbolic Arc Tangent function | Yes |
atanpi | Computes atan (x) / pi | Yes |
atan2pi | Computes atan2 (y, x) / pi | Yes |
cbrt | Compute cube-root | Yes |
ceil | Round to integral value using the round to positive infinity rounding mode. | Yes |
copysign(x, y) | Returns x with its sign changed to match the sign of y. | Yes |
cos | Cosine function | Yes |
cosh | Hyperbolic Cosine function | Yes |
cospi | Computes cos (x * pi) | Yes |
erf | The error function encountered in integrating the normal distribution | Yes |
erfc | Complementary Error function | Yes |
exp | Base-e exponential of x | Yes |
exp2 | Exponential base 2 function | Yes |
exp10 | Exponential base 10 function | Yes |
expm1 | exp(x) - 1.0 | Yes |
fabs | Absolute value of a floating-point number | Yes |
fdim(x, y) | x - y if x > y, +0 if x is less than or equal to y. | Yes |
floor | Round to integral value using the round to negative infinity rounding mode. | Yes |
fma(a, b, c) | Returns the correctly rounded floating-point representation of the sum of c with the infinitely precise product of a and b. Rounding of intermediate products shall not occur. Edge case behavior is per the IEEE 754-2008 standard. | Yes |
fmax(x, y), fmax(x, float y) | Returns y if x is less than y; otherwise returns x. If one argument is a NaN, fmax() returns the other argument. If both arguments are NaNs, fmax() returns a NaN. | Yes |
fmin(x, y), fmin(x, float y) | Returns y if y is less than x; otherwise returns x. If one argument is a NaN, fmin() returns the other argument. If both arguments are NaNs, fmin() returns a NaN. | Yes |
fmod | Modulus. Returns x - y * trunc (x/y) | Yes |
fract | Returns fmin( x - floor(x), 0x1.fffffep-1f ). floor(x) is returned in iptr | Yes |
frexp | Extract mantissa and exponent from x. For each component, the mantissa returned is a float with magnitude in the interval [1/2, 1) or 0. Each component of x equals mantissa returned * 2^exp. | Yes |
hypot | Computes the value of the square root of x^2 + y^2 without undue overflow or underflow. | Yes |
ilogb | Returns the exponent as an integer value. | Yes |
ldexp | Multiply x by 2 to the power n. | Yes |
lgamma | Returns the natural logarithm of the absolute value of the gamma function. The sign of the gamma function is returned in the signp argument of lgamma_r. | Yes |
lgamma_r | Returns the natural logarithm of the absolute value of the gamma function. The sign of the gamma function is returned in the signp argument of lgamma_r. | Yes |
log | Computes natural logarithm. | Yes |
log2 | Computes a base 2 logarithm | Yes |
log10 | Computes a base 10 logarithm | Yes |
log1p | Computes log_e(1.0 + x) | Yes |
logb | Computes the exponent of x, which is the integral part of log_r |x|. | Yes |
mad | Approximates a * b + c. Whether or how the product of a * b is rounded and how supernormal or subnormal intermediate products are handled is not defined. mad is intended to be used where speed is preferred over accuracy. | Yes |
modf | Decompose a floating-point number. The modf function breaks the argument x into integral and fractional parts, each of which has the same sign as the argument. It stores the integral part in the object pointed to by iptr. | Yes |
nan | Returns a quiet NaN. The nancode may be placed in the significand of the resulting NaN. | Yes |
nextafter | Next representable floating-point value following x in the direction of y | Yes |
pow | Computes x to the power of y | Yes |
pown | Computes x to the power of y, where y is an integer. | Yes |
powr | Computes x to the power of y, where x is greater than or equal to 0. | Yes |
remainder | Computes the value r such that r = x - n*y, where n is the integer nearest the exact value of x/y. If there are two integers closest to x/y, n shall be the even one. If r is zero, it is given the same sign as x. | Yes |
remquo | Floating point remainder and quotient function. | Yes |
rint | Round to integral value (using round to nearest even rounding mode) in floating-point format. | Yes |
rootn | Compute x to the power 1/y. | Yes |
round | Return the integral value nearest to x rounding halfway cases away from zero, regardless of the current rounding direction. | Yes |
rsqrt | Inverse Square Root | Yes |
sin | Computes the sine | Yes |
sincos | Computes sine and cosine of x. The computed sine is the return value and computed cosine is returned in cosval. | Yes |
sinh | Computes the hyperbolic sine | Yes |
sinpi | Computes sin (pi * x). | Yes |
sqrt | Computes square root. | Yes |
tan | Computes the tangent. | Yes |
tanh | Computes hyperbolic tangent. | Yes |
tanpi | Computes tan(pi * x). | Yes |
tgamma | Computes the gamma. | Yes |
trunc | Round to integral value using the round to zero rounding mode. | Yes |
half_cos | Computes cosine. x must be in the range -2^16 ... +2^16. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_divide | Computes x / y. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_exp | Computes the base-e exponential of x. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_exp2 | Computes the base-2 exponential of x. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_exp10 | Computes the base-10 exponential of x. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_log | Computes the natural logarithm. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_log10 | Computes the base 10 logarithm. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_log2 | Computes the base 2 logarithm. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_powr | Computes x to the power of y, where x is greater than or equal to 0. | Yes |
half_recip | Computes the reciprocal. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_rsqrt | Computes the inverse square root. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_sin | Computes sine. x must be in the range -2^16 ... +2^16. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_sqrt | Computes the square root. Implemented with a minimum of 10 bits of accuracy. | Yes |
half_tan | Computes the tangent. Implemented with a minimum of 10 bits of accuracy. | Yes |
native_cos | Computes cosine over an implementation-defined range. The maximum error is implementation-defined. | Yes |
native_divide | Computes x / y over an implementation-defined range. The maximum error is implementation-defined. | Yes |
native_exp | Computes the base-e exponential of x over an implementation-defined range. The maximum error is implementation-defined. | Yes |
native_exp2 | Computes the base-2 exponential of x over an implementation-defined range. The maximum error is implementation-defined. | No |
native_exp10 | Computes the base-10 exponential of x over an implementation-defined range. The maximum error is implementation-defined. | No |
native_log | Computes natural logarithm over an implementation-defined range. The maximum error is implementation-defined. | Yes |
native_log10 | Computes a base 10 logarithm over an implementation-defined range. The maximum error is implementation-defined. | No |
native_log2 | Computes a base 2 logarithm over an implementation-defined range. | No |
native_powr | Computes x to the power of y, where x is greater than or equal to 0. The range of x and y are implementation-defined. The maximum error is implementation-defined. | No |
native_recip | Computes reciprocal over an implementation-defined range. The maximum error is implementation-defined. | No |
native_rsqrt | Computes inverse square root over an implementation-defined range. The maximum error is implementation-defined. | No |
native_sin | Computes sine over an implementation-defined range. The maximum error is implementation-defined. | Yes |
native_sqrt | Computes square root over an implementation-defined range. The maximum error is implementation-defined. | No |
native_tan | Computes tangent over an implementation-defined range. The maximum error is implementation-defined. | Yes |
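
The formulas given for fmod and fract in the table above can be checked with a small plain-C model. This is a minimal sketch rather than the SDAccel implementation: the `model_*` names are hypothetical, truncation is done with an integer cast (so it is only valid while the value fits in a `long`), and NaN and infinity handling is deliberately left out.

```c
/* Truncate toward zero without libm; valid only while v fits in a long. */
static float model_trunc(float v) {
    return (float)(long)v;
}

/* Round toward negative infinity, built on the truncation above. */
static float model_floor(float v) {
    float t = model_trunc(v);
    return (t > v) ? t - 1.0f : t;
}

/* fmod as described in the table: x - y * trunc(x / y). */
static float model_fmod(float x, float y) {
    return x - y * model_trunc(x / y);
}

/* fract as described in the table: fmin(x - floor(x), 0x1.fffffep-1f),
 * with floor(x) stored through iptr. */
static float model_fract(float x, float *iptr) {
    const float below_one = 0x1.fffffep-1f; /* largest float below 1.0 */
    float fl = model_floor(x);
    float fr = x - fl;
    *iptr = fl;
    return (fr < below_one) ? fr : below_one;
}
```

Under these assumptions, `model_fmod(-5.5f, 2.0f)` keeps the sign of x and yields -1.5, while `model_fract(-0.25f, &ip)` yields 0.75 and stores -1.0 in `ip`.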
Integer Functions
Function | Description | Supported |
---|---|---|
abs | |x| | Yes |
abs_diff | |x-y| without modulo overflow | Yes |
add_sat | x+y and saturate result | Yes |
hadd | (x+y) >> 1 without modulo overflow | Yes |
rhadd | (x+y+1) >> 1. The intermediate sum does not modulo overflow. | Yes |
clz | Number of leading 0-bits in x | Yes |
mad_hi | mul_hi(a,b)+c | Yes |
mad24 | (Fast integer function.) Multiply two 24-bit integer values and add the 32-bit result to a 32-bit integer | Yes |
mad_sat | a*b+c and saturate the result | Yes |
max | The greater of x or y | Yes |
min | The lesser of x or y | Yes |
mul_hi | High half of the product of x and y | Yes |
mul24 | (Fast integer function.) Multiply 24-bit integer values a and b | Yes |
rotate | result[indx] = v[indx] << i[indx], where bits shifted off the left side are shifted back in from the right | Yes |
sub_sat | x - y and saturate the result | Yes |
upsample | result[i] = ((gentype)hi[i] << 8|16|32) | lo[i] | Yes |
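
Several of the integer functions above are defined in terms of avoiding overflow or wrapping, which is easier to see in code. The following plain-C sketch models the 32-bit scalar case only; the `model_*` names are hypothetical and the functions are illustrations of the described semantics, not the SDAccel implementations.

```c
#include <stdint.h>

/* add_sat for int: x + y clamped to the int32_t range instead of wrapping. */
static int32_t model_add_sat(int32_t x, int32_t y) {
    int64_t s = (int64_t)x + (int64_t)y;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

/* hadd for uint: (x + y) >> 1 without overflowing the intermediate sum. */
static uint32_t model_hadd(uint32_t x, uint32_t y) {
    return (x >> 1) + (y >> 1) + (x & y & 1u);
}

/* rhadd for uint: (x + y + 1) >> 1 without overflowing the intermediate sum. */
static uint32_t model_rhadd(uint32_t x, uint32_t y) {
    return (x >> 1) + (y >> 1) + ((x | y) & 1u);
}

/* mul_hi for uint: upper 32 bits of the full 64-bit product. */
static uint32_t model_mul_hi(uint32_t x, uint32_t y) {
    return (uint32_t)(((uint64_t)x * y) >> 32);
}

/* rotate for uint: left shift where bits leaving the top re-enter at the bottom. */
static uint32_t model_rotate(uint32_t v, uint32_t n) {
    n &= 31u;
    return (v << n) | (v >> ((32u - n) & 31u));
}

/* upsample from ushort to uint: ((uint)hi << 16) | lo. */
static uint32_t model_upsample(uint16_t hi, uint16_t lo) {
    return ((uint32_t)hi << 16) | lo;
}
```

Note how `model_hadd` and `model_rhadd` halve each operand before adding, which is one common way to get the averaged result without needing a wider intermediate type.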
Common Functions
Function | Description | Supported |
---|---|---|
clamp | Clamp x to range given by min, max | Yes |
degrees | radians to degrees | Yes |
max | Maximum of x and y | Yes |
min | Minimum of x and y | Yes |
mix | Linear blend of x and y | Yes |
radians | degrees to radians | Yes |
sign | Sign of x | Yes |
smoothstep | Step and interpolate | Yes |
step | 0.0 if x < edge, else 1.0 | Yes |
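
The common functions above are short enough to model directly. The sketch below is a scalar, plain-C illustration with hypothetical `model_*` names; it assumes the usual OpenCL definitions (mix as x + (y - x) * a, smoothstep as a clamped Hermite interpolation) and ignores NaN inputs.

```c
/* clamp: limit x to the range [lo, hi], assuming lo <= hi. */
static float model_clamp(float x, float lo, float hi) {
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/* mix: linear blend of x and y with weight a. */
static float model_mix(float x, float y, float a) {
    return x + (y - x) * a;
}

/* step: 0.0 if x < edge, else 1.0. */
static float model_step(float edge, float x) {
    return (x < edge) ? 0.0f : 1.0f;
}

/* smoothstep: 0 below edge0, 1 above edge1, smooth Hermite curve between. */
static float model_smoothstep(float edge0, float edge1, float x) {
    float t = model_clamp((x - edge0) / (edge1 - edge0), 0.0f, 1.0f);
    return t * t * (3.0f - 2.0f * t);
}
```

With these definitions, `model_smoothstep(0.0f, 1.0f, 0.5f)` returns 0.5, and any x at or beyond edge1 returns 1.0.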
Geometric Functions
Function | Description | Supported |
---|---|---|
cross | Cross product | Yes |
dot | Dot product (float, double, and half data types only) | Yes |
distance | Vector distance | Yes |
length | Vector length | Yes |
normalize | Normalized vector of length 1 | Yes |
fast_distance | Vector distance | Yes |
fast_length | Vector length | Yes |
fast_normalize | Normalized vector of length 1 | Yes |
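
As a reminder of what the dot and cross products in the table compute, here is a minimal plain-C sketch for three-component vectors. The `vec3` type and `model_*` names are hypothetical stand-ins for the OpenCL vector types and built-ins, not part of any SDAccel API.

```c
/* Hypothetical stand-in for a three-component float vector. */
typedef struct { float x, y, z; } vec3;

/* dot: sum of component-wise products. */
static float model_dot(vec3 a, vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* cross: vector perpendicular to both a and b (right-hand rule). */
static vec3 model_cross(vec3 a, vec3 b) {
    vec3 r = {
        a.y * b.z - a.z * b.y,
        a.z * b.x - a.x * b.z,
        a.x * b.y - a.y * b.x
    };
    return r;
}
```

In this model, the cross product of the unit x and y vectors is the unit z vector, and their dot product is 0.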
Relational Functions
Function | Description | Supported |
---|---|---|
isequal | Compare of x == y. | Yes |
isnotequal | Compare of x != y. | Yes |
isgreater | Compare of x > y. | Yes |
isgreaterequal | Compare of x >= y. | Yes |
isless | Compare of x < y. | Yes |
islessequal | Compare of x <= y. | Yes |
islessgreater | Compare of (x < y) || (x > y). | Yes |
isfinite | Test for finite value. | Yes |
isinf | Test for positive or negative infinity. | Yes |
isnan | Test for a NaN. | Yes |
isnormal | Test for a normal value. | Yes |
isordered | Test if arguments are ordered. | Yes |
isunordered | Test if arguments are unordered. | Yes |
signbit | Test for sign bit. | Yes |
any | 1 if MSB in any component of x is set; else 0. | Yes |
all | 1 if MSB in all components of x is set; else 0. | Yes |
bitselect | Each bit of the result is the corresponding bit of a if the corresponding bit of c is 0; otherwise it is the corresponding bit of b. | Yes |
select | For each component of a vector type, result[i] = (MSB of c[i] is set) ? b[i] : a[i]. For a scalar type, result = c ? b : a. | Yes |
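
The MSB-based rules for any, all, bitselect, and the vector form of select can be modeled in plain C as follows. This is an illustrative sketch only: the `model_*` names are hypothetical, vectors are represented as arrays of `int32_t`, and a set MSB is detected by checking for a negative value.

```c
#include <stdint.h>

/* bitselect: take each bit from a where c has 0, from b where c has 1. */
static uint32_t model_bitselect(uint32_t a, uint32_t b, uint32_t c) {
    return (a & ~c) | (b & c);
}

/* One component of vector select: the MSB of c (the sign bit) picks b. */
static int32_t model_select_component(int32_t a, int32_t b, int32_t c) {
    return (c < 0) ? b : a;
}

/* any: 1 if the MSB of any component is set, else 0. */
static int model_any(const int32_t *x, int n) {
    for (int i = 0; i < n; ++i)
        if (x[i] < 0) return 1;
    return 0;
}

/* all: 1 if the MSB of every component is set, else 0. */
static int model_all(const int32_t *x, int n) {
    for (int i = 0; i < n; ++i)
        if (x[i] >= 0) return 0;
    return 1;
}
```

One consequence worth noting is that, in the vector form, a condition component of 1 selects a rather than b because its MSB is clear. Vector relational functions return -1 (all bits set) for true, so their results work directly as the condition argument.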
Vector Data Load and Store Functions
Function | Description | Supported |
---|---|---|
vloadn | Read vectors from a pointer to memory. | Yes |
vstoren | Write a vector to a pointer to memory. | Yes |
vload_half | Read a half float from a pointer to memory. | Yes |
vload_halfn | Read a half float vector from a pointer to memory. | Yes |
vstore_half | Convert float to half and write to a pointer to memory. | Yes |
vstore_halfn | Convert float vector to half vector and write to a pointer to memory. | Yes |
vloada_halfn | Read half float vector from a pointer to memory. | Yes |
vstorea_halfn | Convert float vector to half vector and write to a pointer to memory. | Yes |
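
The addressing used by vloadn and vstoren is worth spelling out: the offset argument counts whole vectors, so element i of the vector lives at p[offset * n + i]. The plain-C sketch below models this for n = 4 with float data; the `model_float4` type and `model_*` names are hypothetical and are not OpenCL types or SDAccel functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for a four-component float vector. */
typedef struct { float s[4]; } model_float4;

/* Models vload4(offset, p): read 4 floats starting at p + offset * 4. */
static model_float4 model_vload4(size_t offset, const float *p) {
    model_float4 v;
    for (int i = 0; i < 4; ++i)
        v.s[i] = p[offset * 4 + i];
    return v;
}

/* Models vstore4(v, offset, p): write 4 floats starting at p + offset * 4. */
static void model_vstore4(model_float4 v, size_t offset, float *p) {
    for (int i = 0; i < 4; ++i)
        p[offset * 4 + i] = v.s[i];
}
```

For an 8-element buffer holding 0 through 7, loading at offset 1 returns the elements 4 through 7, and storing that vector at offset 0 overwrites the first four elements.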
Synchronization Functions
Function | Description | Supported |
---|---|---|
barrier | All work-items in a work-group executing the kernel on a processor must execute this function before any are allowed to continue execution beyond the barrier. | Yes |
Explicit Memory Fence Functions
Function | Description | Supported |
---|---|---|
mem_fence | Orders loads and stores of a work-item executing a kernel | Yes |
read_mem_fence | Read memory barrier that orders only loads | Yes |
write_mem_fence | Write memory barrier that orders only stores | No |
Async Copies from Global to Local Memory, Local to Global Memory Functions
Function | Description | Supported |
---|---|---|
async_work_group_copy | Performs an asynchronous copy between global and local memory. Must be encountered by all work-items in a work-group executing the kernel with the same argument values; otherwise the results are undefined. | Yes |
wait_group_events | Wait for events that identify the async_work_group_copy operations to complete. | Yes |
prefetch | Prefetch bytes into the global cache. | No |
PIPE Functions
IMPORTANT: OpenCL pipes must be declared in all lowercase; for example:
pipe int infifo __attribute__((xcl_reqd_pipe_depth(16))); // Cannot be 'pipe int inFifo'
pipe int outfifo __attribute__((xcl_reqd_pipe_depth(16))); // Cannot be 'pipe int outFifo'
Function | Description | Supported |
---|---|---|
read_pipe | Read packet from pipe | Yes |
write_pipe | Write packet to pipe | Yes |
reserve_read_pipe | Reserve entries for reading from pipe | No |
reserve_write_pipe | Reserve entries for writing to pipe | No |
commit_read_pipe | Indicates that all reads associated with a reservation are completed | No |
commit_write_pipe | Indicates that all writes associated with a reservation are completed | No |
is_valid_reserve_id | Test for a valid reservation ID | No |
work_group_reserve_read_pipe | Reserve entries for reading from pipe | No |
work_group_reserve_write_pipe | Reserve entries for writing to pipe | No |
work_group_commit_read_pipe | Indicates that all reads associated with a reservation are completed | No |
work_group_commit_write_pipe | Indicates that all writes associated with a reservation are completed | No |
get_pipe_num_packets | Returns the number of available entries in the pipe | Yes |
get_pipe_max_packets | Returns the maximum number of packets specified when pipe was created | Yes |
Pipe Functions enabled by the cl_khr_subgroups extension
Function | Description | Supported |
---|---|---|
sub_group_reserve_read_pipe | Reserve entries for reading from a pipe | No |
sub_group_reserve_write_pipe | Reserve entries for writing to a pipe | No |
sub_group_commit_read_pipe | Indicates that all reads associated with a reservation are completed | No |
sub_group_commit_write_pipe | Indicates that all writes associated with a reservation are completed | No |
OpenCL 2.0 Image Objects
Function | Description | Supported |
---|---|---|
clCreateImage | Create an image object for a 1D image, 1D image buffer, 1D image array, 2D image, 2D image array or 3D image. | Yes |
clGetSupportedImageFormats | Get the list of image formats supported by an OpenCL implementation when the Context, Image type (1D, 2D, or 3D image, 1D image buffer, 1D or 2D image array) and Image object allocation information of the image memory object is specified. | Yes |
clEnqueueReadImage | Enqueue commands to read from an image or image array object to host memory. | Yes |
clEnqueueWriteImage | Enqueue commands to write to an image or image array object from host memory. | Yes |
clEnqueueFillImage | Enqueues a command to fill an image object with a specified color. | No |
clEnqueueCopyImageToBuffer | Enqueues a command to copy an image object to a buffer object. | No |
clEnqueueMapImage | Enqueues a command to map a region in an image object into the host address space and returns a pointer to this mapped region. | No |
clGetImageInfo | Obtain information specific to an image object created with clCreateImage. To get information that is common to all memory objects, use the clGetMemObjectInfo function. | Yes |