OpenCL Built-In Functions Support in the SDAccel Environment

The OpenCL™ C programming language provides a rich set of built-in functions for scalar and vector operations. Many of these functions share their names with functions in the common C libraries, but they accept both scalar and vector argument types. The SDAccel™ development environment is OpenCL 1.0 embedded profile compliant. The following tables describe the built-in functions in the OpenCL 1.0 embedded profile and give their support status in the SDAccel environment.

Work-Item Functions

Function Description Supported
get_global_size Number of global work items Yes
get_global_id Global work item ID value Yes
get_local_size Number of local work items Yes
get_local_id Local work item ID Yes
get_num_groups Number of work groups Yes
get_group_id Work group ID Yes
get_work_dim Number of dimensions in use Yes
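
The work-item queries in the table above are related by a simple per-dimension identity. The following plain C sketch models that identity on the host; it is not kernel code. It assumes a zero global offset and a global size that is a multiple of the local size. The type and function names (ndrange_dim, model_num_groups, model_global_id) are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical host-side model of one dimension of an NDRange.
   In a real kernel these values come from get_global_size(d)
   and get_local_size(d). */
typedef struct {
    size_t global_size; /* models get_global_size(d) */
    size_t local_size;  /* models get_local_size(d)  */
} ndrange_dim;

/* Models get_num_groups(d).
   Assumes global_size is a multiple of local_size. */
size_t model_num_groups(ndrange_dim r) {
    return r.global_size / r.local_size;
}

/* Models get_global_id(d) for the work-item whose get_group_id(d) is
   group_id and whose get_local_id(d) is local_id (zero global offset). */
size_t model_global_id(ndrange_dim r, size_t group_id, size_t local_id) {
    return group_id * r.local_size + local_id;
}
```

For example, with a global size of 64 and a local size of 16, the work-item with local ID 5 in work group 2 has global ID 2 * 16 + 5 = 37.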

Math Functions

Function Description Supported
acos Arc Cosine function Yes
acosh Inverse Hyperbolic Cosine function Yes
acospi Computes acos(x) / pi Yes
asin Arc Sine function Yes
asinh Inverse Hyperbolic Sine function Yes
asinpi Computes asin(x) / pi Yes
atan Arc Tangent function Yes
atan2(y, x) Arc Tangent of y / x Yes
atanh Hyperbolic Arc Tangent function Yes
atanpi Computes atan (x) / pi Yes
atan2pi Computes atan2 (y, x) / pi Yes
cbrt Compute cube-root Yes
ceil Round to integral value using the round to positive infinity rounding mode. Yes
copysign(x, y) Returns x with its sign changed to match the sign of y. Yes
cos Cosine function Yes
cosh Hyperbolic Cosine function Yes
cospi Computes cos (x * pi) Yes
erf The error function encountered in integrating the normal distribution Yes
erfc Complementary Error function Yes
exp Base-e exponential of x Yes
exp2 Exponential base 2 function Yes
exp10 Exponential base 10 function Yes
expm1 exp(x) - 1.0 Yes
fabs Absolute value of a floating-point number Yes
fdim(x, y) x - y if x > y, +0 if x is less than or equal to y. Yes
floor Round to integral value using the round to negative infinity rounding mode. Yes
fma(a, b, c) Returns the correctly rounded floating-point representation of the sum of c with the infinitely precise product of a and b. Rounding of intermediate products shall not occur. Edge case behavior is per the IEEE 754-2008 standard. Yes
fmax(x, y), fmax(x, float y) Returns y if x is less than y, otherwise it returns x. If one argument is a NaN, fmax() returns the other argument. If both arguments are NaNs, fmax() returns a NaN. Yes
fmin(x, y), fmin(x, float y) Returns y if y is less than x, otherwise it returns x. If one argument is a NaN, fmin() returns the other argument. If both arguments are NaNs, fmin() returns a NaN. Yes
fmod Modulus. Returns x - y * trunc (x/y) Yes
fract Returns fmin( x - floor(x), 0x1.fffffep-1f ). floor(x) is returned in iptr Yes
frexp Extract mantissa and exponent from x. For each component the mantissa returned is a float with magnitude in the interval [1/2, 1) or 0. Each component of x equals mantissa returned * 2^exp. Yes
hypot Computes the value of the square root of x^2 + y^2 without undue overflow or underflow. Yes
ilogb Returns the exponent as an integer value. Yes
ldexp Multiply x by 2 to the power n. Yes
lgamma Returns the natural logarithm of the absolute value of the gamma function. The sign of the gamma function is returned in the signp argument of lgamma_r. Yes
lgamma_r Returns the natural logarithm of the absolute value of the gamma function. The sign of the gamma function is returned in the signp argument of lgamma_r. Yes
log Computes natural logarithm. Yes
log2 Computes a base 2 logarithm Yes
log10 Computes a base 10 logarithm Yes
log1p Computes log_e(1.0 + x) Yes
logb Computes the exponent of x, which is the integral part of log_r |x|. Yes
mad Approximates a * b + c. Whether or how the product of a * b is rounded and how supernormal or subnormal intermediate products are handled is not defined. mad is intended to be used where speed is preferred over accuracy. Yes
modf Decompose a floating-point number. The modf function breaks the argument x into integral and fractional parts, each of which has the same sign as the argument. It stores the integral part in the object pointed to by iptr. Yes
nan Returns a quiet NaN. The nancode may be placed in the significand of the resulting NaN. Yes
nextafter Next representable floating-point value following x in the direction of y Yes
pow Computes x to the power of y Yes
pown Computes x to the power of y, where y is an integer. Yes
powr Computes x to the power of y, where x is greater than or equal to 0. Yes
remainder Computes the value r such that r = x - n*y, where n is the integer nearest the exact value of x/y. If there are two integers closest to x/y, n shall be the even one. If r is zero, it is given the same sign as x. Yes
remquo Floating point remainder and quotient function. Yes
rint Round to integral value (using round to nearest even rounding mode) in floating-point format. Yes
rootn Compute x to the power 1/y. Yes
round Return the integral value nearest to x rounding halfway cases away from zero, regardless of the current rounding direction. Yes
rsqrt Inverse Square Root Yes
sin Computes the sine Yes
sincos Computes sine and cosine of x. The computed sine is the return value and computed cosine is returned in cosval. Yes
sinh Computes the hyperbolic sine Yes
sinpi Computes sin (pi * x). Yes
sqrt Computes square root. Yes
tan Computes the tangent. Yes
tanh Computes hyperbolic tangent. Yes
tanpi Computes tan(pi * x). Yes
tgamma Computes the gamma. Yes
trunc Round to integral value using the round to zero rounding mode. Yes
half_cos Computes cosine. x must be in the range -2^16 ... +2^16. Implemented with a minimum of 10 bits of accuracy. Yes
half_divide Computes x / y. Implemented with a minimum of 10 bits of accuracy. Yes
half_exp Computes the base-e exponential of x. Implemented with a minimum of 10 bits of accuracy. Yes
half_exp2 Computes the base-2 exponential of x. Implemented with a minimum of 10 bits of accuracy. Yes
half_exp10 Computes the base-10 exponential of x. Implemented with a minimum of 10 bits of accuracy. Yes
half_log Computes the natural logarithm. Implemented with a minimum of 10 bits of accuracy. Yes
half_log10 Computes the base-10 logarithm. Implemented with a minimum of 10 bits of accuracy. Yes
half_log2 Computes the base-2 logarithm. Implemented with a minimum of 10 bits of accuracy. Yes
half_powr Computes x to the power of y, where x is greater than or equal to 0. Yes
half_recip Computes the reciprocal. Implemented with a minimum of 10 bits of accuracy. Yes
half_rsqrt Computes the inverse square root. Implemented with a minimum of 10 bits of accuracy. Yes
half_sin Computes sine. x must be in the range -2^16 ... +2^16. Implemented with a minimum of 10 bits of accuracy. Yes
half_sqrt Computes the square root. Implemented with a minimum of 10 bits of accuracy. Yes
half_tan Computes the tangent. Implemented with a minimum of 10 bits of accuracy. Yes
native_cos Computes cosine over an implementation-defined range. The maximum error is implementation-defined. Yes
native_divide Computes x / y over an implementation-defined range. The maximum error is implementation-defined. Yes
native_exp Computes the base-e exponential of x over an implementation-defined range. The maximum error is implementation-defined. Yes
native_exp2 Computes the base-2 exponential of x over an implementation-defined range. The maximum error is implementation-defined. No
native_exp10 Computes the base-10 exponential of x over an implementation-defined range. The maximum error is implementation-defined. No
native_log Computes natural logarithm over an implementation-defined range. The maximum error is implementation-defined. Yes
native_log10 Computes a base-10 logarithm over an implementation-defined range. The maximum error is implementation-defined. No
native_log2 Computes a base-2 logarithm over an implementation-defined range. The maximum error is implementation-defined. No
native_powr Computes x to the power of y, where x is greater than or equal to 0. The range of x and y are implementation-defined. The maximum error is implementation-defined. No
native_recip Computes reciprocal over an implementation-defined range. The maximum error is implementation-defined. No
native_rsqrt Computes inverse square root over an implementation-defined range. The maximum error is implementation-defined. No
native_sin Computes sine over an implementation-defined range. The maximum error is implementation-defined. Yes
native_sqrt Computes square root over an implementation-defined range. The maximum error is implementation-defined. No
native_tan Computes tangent over an implementation-defined range. The maximum error is implementation-defined. Yes
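
The NaN rules given for fmax and fmin in the table above are easy to misread, so a small reference model may help. The following is a minimal host-side C sketch of the scalar float case, written only from the table descriptions; it is not the SDAccel implementation, and the names model_fmax and model_fmin are hypothetical.

```c
#include <math.h> /* used only for the NAN macro by callers; no libm calls */

/* Reference model of fmax(x, y) for scalar floats.
   A NaN compares unequal to itself, so (v != v) detects NaN. */
float model_fmax(float x, float y) {
    if (x != x) return y;     /* x is NaN: return the other argument */
    if (y != y) return x;     /* y is NaN: return the other argument */
    return (x < y) ? y : x;   /* y if x is less than y, otherwise x  */
}

/* Reference model of fmin(x, y) for scalar floats. */
float model_fmin(float x, float y) {
    if (x != x) return y;     /* x is NaN: return the other argument */
    if (y != y) return x;     /* y is NaN: return the other argument */
    return (y < x) ? y : x;   /* y if y is less than x, otherwise x  */
}
```

When both arguments are NaN, the first test returns y, which is itself a NaN, matching the last rule in each description. Note that this sketch relies on IEEE comparison semantics and would not behave as intended if compiled with options that assume NaNs never occur.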

Integer Functions

Function Description Supported
abs |x| Yes
abs_diff |x-y| without modulo overflow Yes
add_sat x+y and saturate result Yes
hadd (x+y) >> 1 without modulo overflow Yes
rhadd (x+y+1) >> 1. The intermediate sum does not modulo overflow. Yes
clz Number of leading 0-bits in x Yes
mad_hi mul_hi(a,b)+c Yes
mad24 (Fast integer function.) Multiply two 24-bit integers, then add the 32-bit result to a 32-bit integer Yes
mad_sat a*b+c and saturate the result Yes
max The greater of x or y Yes
min The lesser of x or y Yes
mul_hi High half of the product of x and y Yes
mul24 (Fast integer function.) Multiply 24-bit integer values a and b Yes
rotate result[indx] = v[indx] << i[indx], as a left rotate: bits shifted off the left side are shifted back in from the right Yes
sub_sat x - y and saturate the result Yes
upsample result[i] = ((gentype)hi[i] << 8|16|32) | lo[i] Yes
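
Several of the integer built-ins above are defined in terms of results that must not wrap around. The following plain C sketch shows one way to model those semantics for 32-bit scalars on the host. It is an illustration written from the table descriptions, not the SDAccel implementation, and the model_ function names are hypothetical.

```c
#include <stdint.h>

/* hadd: (x + y) >> 1 without overflowing the 32-bit intermediate sum. */
uint32_t model_hadd(uint32_t x, uint32_t y) {
    return (x & y) + ((x ^ y) >> 1);
}

/* rhadd: (x + y + 1) >> 1 without overflowing the intermediate sum. */
uint32_t model_rhadd(uint32_t x, uint32_t y) {
    return (x | y) - ((x ^ y) >> 1);
}

/* add_sat: x + y, saturated to the int32_t range instead of wrapping. */
int32_t model_add_sat(int32_t x, int32_t y) {
    int64_t s = (int64_t)x + (int64_t)y; /* widen so the sum cannot overflow */
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

/* mul_hi: high 32 bits of the 64-bit product of x and y. */
uint32_t model_mul_hi(uint32_t x, uint32_t y) {
    return (uint32_t)(((uint64_t)x * (uint64_t)y) >> 32);
}

/* rotate: left rotate of v by n bits (n taken modulo 32). */
uint32_t model_rotate(uint32_t v, uint32_t n) {
    n &= 31u;
    return (v << n) | (v >> ((32u - n) & 31u));
}
```

The masking in model_rotate keeps both shift counts below 32, which avoids undefined behavior in C when n is 0.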

Common Functions

Function Description Supported
clamp Clamp x to range given by min, max Yes
degrees radians to degrees Yes
max Maximum of x and y Yes
min Minimum of x and y Yes
mix Linear blend of x and y Yes
radians degrees to radians Yes
sign Sign of x Yes
smoothstep Step and interpolate Yes
step 0.0 if x < edge, else 1.0 Yes
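
The short descriptions of clamp, mix, step, and smoothstep above can be made concrete with a scalar reference model. The following C sketch assumes the usual OpenCL argument order, (x, min, max) for clamp, (x, y, a) for mix, (edge, x) for step, and (edge0, edge1, x) for smoothstep, and it ignores NaN inputs. The model_ names are hypothetical.

```c
/* clamp: limit x to the range [lo, hi]. */
float model_clamp(float x, float lo, float hi) {
    return (x < lo) ? lo : (x > hi) ? hi : x;
}

/* mix: linear blend of x and y; a = 0 gives x, a = 1 gives y. */
float model_mix(float x, float y, float a) {
    return x + (y - x) * a;
}

/* step: 0.0 if x < edge, else 1.0. */
float model_step(float edge, float x) {
    return (x < edge) ? 0.0f : 1.0f;
}

/* smoothstep: 0 below edge0, 1 above edge1, and a smooth Hermite
   interpolation t*t*(3 - 2t) in between. */
float model_smoothstep(float edge0, float edge1, float x) {
    float t = model_clamp((x - edge0) / (edge1 - edge0), 0.0f, 1.0f);
    return t * t * (3.0f - 2.0f * t);
}
```

For instance, model_smoothstep(0, 1, 0.5) evaluates t = 0.5 and returns 0.25 * 2 = 0.5, while any x at or above edge1 returns 1.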

Geometric Functions

Function Description Supported
cross Cross product Yes
dot Dot product (float, double, and half data types only) Yes
distance Vector distance Yes
length Vector length Yes
normalize Normalize a vector to length 1 Yes
fast_distance Vector distance Yes
fast_length Vector length Yes
fast_normalize Normalize a vector to length 1 Yes
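
As a reminder of what dot and cross compute, the following host-side C sketch models both for three-component float vectors. The vec3 type and the model_ function names are hypothetical stand-ins for the OpenCL float3 type and built-ins; this is an illustration, not SDAccel code.

```c
/* Hypothetical stand-in for a three-component OpenCL float vector. */
typedef struct {
    float x, y, z;
} vec3;

/* dot: sum of the component-wise products. */
float model_dot(vec3 a, vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* cross: vector perpendicular to both a and b (right-hand rule). */
vec3 model_cross(vec3 a, vec3 b) {
    vec3 r;
    r.x = a.y * b.z - a.z * b.y;
    r.y = a.z * b.x - a.x * b.z;
    r.z = a.x * b.y - a.y * b.x;
    return r;
}
```

The cross product of the x and y unit vectors is the z unit vector, and their dot product is 0, which makes a quick sanity check for any implementation.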

Relational Functions

Function Description Supported
isequal Compare of x == y. Yes
isnotequal Compare of x != y. Yes
isgreater Compare of x > y. Yes
isgreaterequal Compare of x >= y. Yes
isless Compare of x < y. Yes
islessequal Compare of x <= y. Yes
islessgreater Compare of (x < y) || (x > y). Yes
isfinite Test for finite value. Yes
isinf Test for positive or negative infinity. Yes
isnan Test for a NaN. Yes
isnormal Test for a normal value. Yes
isordered Test if arguments are ordered. Yes
isunordered Test if arguments are unordered. Yes
signbit Test for sign bit. Yes
any 1 if MSB in any component of x is set; else 0. Yes
all 1 if MSB in all components of x is set; else 0. Yes
bitselect Each bit of result is the corresponding bit of a if the corresponding bit of c is 0; otherwise it is the corresponding bit of b. Yes
select For each component of a vector type, result[i] = (MSB of c[i] is set) ? b[i] : a[i]. For a scalar type, result = c ? b : a. Yes
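
The difference between bitselect and the two forms of select is subtle: bitselect works bit by bit, vector select tests only the most significant bit of each condition component, and scalar select tests the condition for nonzero. The C sketch below models the three cases for 32-bit values; the model_ names are hypothetical and the code is only an illustration of the descriptions above.

```c
#include <stdint.h>

/* bitselect: take each bit from b where c has a 1, from a where c has a 0. */
uint32_t model_bitselect(uint32_t a, uint32_t b, uint32_t c) {
    return (a & ~c) | (b & c);
}

/* select, one component of a vector: choose b when the MSB of c is set.
   For a signed 32-bit c, "MSB set" is the same as "c is negative". */
int32_t model_select_component(int32_t a, int32_t b, int32_t c) {
    return (c < 0) ? b : a;
}

/* select, scalar form: choose b when c is nonzero. */
int32_t model_select_scalar(int32_t a, int32_t b, int32_t c) {
    return c ? b : a;
}
```

A condition value of 1 therefore selects b in the scalar form but a in the vector form, because its most significant bit is clear.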

Vector Data Load and Store Functions

Function Description Supported
vloadn Read vectors from a pointer to memory. Yes
vstoren Write a vector to a pointer to memory. Yes
vload_half Read a half float from a pointer to memory. Yes
vload_halfn Read a half float vector from a pointer to memory. Yes
vstore_half Convert float to half and write to a pointer to memory. Yes
vstore_halfn Convert float vector to half vector and write to a pointer to memory. Yes
vloada_halfn Read half float vector from a pointer to memory. Yes
vstorea_halfn Convert float vector to half vector and write to a pointer to memory. Yes

Synchronization Functions

Function Description Supported
barrier All work-items in a work-group executing the kernel on a processor must execute this function before any are allowed to continue execution beyond the barrier. Yes

Explicit Memory Fence Functions

Function Description Supported
mem_fence Orders loads and stores of a work-item executing a kernel Yes
read_mem_fence Read memory barrier that orders only loads Yes
write_mem_fence Write memory barrier that orders only stores No

Async Copies from Global to Local Memory, Local to Global Memory Functions

Function Description Supported
async_work_group_copy Performs an asynchronous copy between global and local memory. Must be encountered by all work-items in a work-group executing the kernel with the same argument values; otherwise the results are undefined. Yes
wait_group_events Wait for events that identify the async_work_group_copy operations to complete. Yes
prefetch Prefetch bytes into the global cache. No

PIPE Functions

IMPORTANT!: OpenCL pipes must be declared in all lowercase; for example:
pipe int infifo __attribute__((xcl_reqd_pipe_depth(16)));  // Cannot be 'pipe int inFifo'
pipe int outfifo __attribute__((xcl_reqd_pipe_depth(16))); // Cannot be 'pipe int outFifo'

Function Description Supported
read_pipe Read packet from pipe Yes
write_pipe Write packet to pipe Yes
reserve_read_pipe Reserve entries for reading from pipe No
reserve_write_pipe Reserve entries for writing to pipe No
commit_read_pipe Indicates that all reads associated with a reservation are completed No
commit_write_pipe Indicates that all writes associated with a reservation are completed No
is_valid_reserve_id Test for a valid reservation ID No
work_group_reserve_read_pipe Reserve entries for reading from pipe No
work_group_reserve_write_pipe Reserve entries for writing to pipe No
work_group_commit_read_pipe Indicates that all reads associated with a reservation are completed No
work_group_commit_write_pipe Indicates that all writes associated with a reservation are completed No
get_pipe_num_packets Returns the number of available entries in the pipe Yes
get_pipe_max_packets Returns the maximum number of packets specified when pipe was created Yes
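
To make the behavior of the supported pipe functions easier to picture, the following host-side C sketch models a pipe of int packets as a fixed-depth FIFO. It is only an analogy and says nothing about how SDAccel implements pipes in hardware. It assumes the OpenCL 2.0 convention that read_pipe and write_pipe return 0 on success and a negative value on failure, and it uses a depth of 16 to match the declaration example above. The model_pipe type and model_ function names are hypothetical.

```c
#include <stddef.h>

#define MODEL_PIPE_DEPTH 16 /* assumed depth, matching the example declaration */

/* Hypothetical FIFO model of a pipe carrying int packets. */
typedef struct {
    int    data[MODEL_PIPE_DEPTH];
    size_t head;  /* index of the next packet to read */
    size_t count; /* number of packets currently held */
} model_pipe;

/* Models write_pipe: returns 0 on success, negative if the pipe is full. */
int model_write_pipe(model_pipe *p, const int *packet) {
    if (p->count == MODEL_PIPE_DEPTH) return -1;
    p->data[(p->head + p->count) % MODEL_PIPE_DEPTH] = *packet;
    p->count++;
    return 0;
}

/* Models read_pipe: returns 0 on success, negative if the pipe is empty. */
int model_read_pipe(model_pipe *p, int *packet) {
    if (p->count == 0) return -1;
    *packet = p->data[p->head];
    p->head = (p->head + 1) % MODEL_PIPE_DEPTH;
    p->count--;
    return 0;
}

/* Models get_pipe_num_packets: entries currently available in the pipe. */
size_t model_get_pipe_num_packets(const model_pipe *p) {
    return p->count;
}

/* Models get_pipe_max_packets: capacity fixed when the pipe was created. */
size_t model_get_pipe_max_packets(const model_pipe *p) {
    (void)p;
    return MODEL_PIPE_DEPTH;
}
```

A zero-initialized model_pipe starts empty, so a read fails until a write has succeeded, and the seventeenth consecutive write fails because the pipe is full.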

Pipe Functions enabled by the cl_khr_subgroups extension

Function Description Supported
sub_group_reserve_read_pipe Reserve entries for reading from a pipe No
sub_group_reserve_write_pipe Reserve entries for writing to a pipe No
sub_group_commit_read_pipe Indicates that all reads associated with a reservation are completed No
sub_group_commit_write_pipe Indicates that all writes associated with a reservation are completed No

OpenCL 2.0 Image Objects

Table 1. OpenCL 2.0 Image Options
Function Description Supported
clCreateImage Create an image object for a 1D image, 1D image buffer, 1D image array, 2D image, 2D image array or 3D image. Yes
clGetSupportedImageFormats Get the list of image formats supported by an OpenCL implementation when the context, the image type (1D, 2D, or 3D image, 1D image buffer, 1D or 2D image array), and the allocation information of the image memory object are specified. Yes
clEnqueueReadImage Enqueue commands to read from an image or image array object to host memory. Yes
clEnqueueWriteImage Enqueue commands to write to an image or image array object from host memory. Yes
clEnqueueFillImage Enqueues a command to fill an image object with a specified color. No
clEnqueueCopyImageToBuffer Enqueues a command to copy an image object to a buffer object. No
clEnqueueMapImage Enqueues a command to map a region in an image object into the host address space and returns a pointer to this mapped region. No
clGetImageInfo Obtain information specific to an image object created with clCreateImage. To get information that is common to all memory objects, use the clGetMemObjectInfo function. Yes