Casting and Datatype Conversion
Casting intrinsic functions (as_[Type]()) allow casting between vector types or
scalar types of the same size. Casting also works on accumulator vector types.
Generally, using the smallest data type possible reduces register spillage and
improves performance. For example, if a 48-bit accumulator (acc48) meets the
design requirements, it is preferable to a larger 80-bit accumulator (acc80).
The acc80 vector data type occupies two neighboring 48-bit lanes.
Standard C casts can also be used and work identically in almost all cases, as shown in the following example.
v8int16 iv;
v4cint16 cv=as_v4cint16(iv);          // casting intrinsic
v4cint16 cv2=*(v4cint16*)&iv;         // equivalent standard C cast
v8acc80 cas_iv;
v8cacc48 cas_cv=as_v8cacc48(cas_iv);  // cast between accumulator vector types
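The same-size cast above can be illustrated off-target with a portable C++ sketch. This is only a model, not the intrinsic: the types Vec8Int16, Vec4CInt16, and CInt16 and the function as_vec4cint16_model are hypothetical stand-ins for v8int16, v4cint16, and as_v4cint16(), and the real/imaginary ordering shown is an assumption of this model. It shows that a cast relabels the same 128 bits without converting any element.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for v8int16 and v4cint16; both are 128 bits wide.
struct CInt16 { std::int16_t real; std::int16_t imag; };
using Vec8Int16  = std::array<std::int16_t, 8>;
using Vec4CInt16 = std::array<CInt16, 4>;

// A cast is only meaningful between types of the same size.
static_assert(sizeof(Vec8Int16) == sizeof(Vec4CInt16),
              "source and destination must have the same size");

// Reinterpret the bits of an int16 vector as four complex int16 values.
// No element is converted; only the type that labels the bits changes.
inline Vec4CInt16 as_vec4cint16_model(const Vec8Int16& iv) {
    Vec4CInt16 cv;
    std::memcpy(cv.data(), iv.data(), sizeof(cv));
    return cv;
}
```

In this model, consecutive pairs of int16 elements become the real and imaginary parts of one complex element. std::memcpy is used instead of a pointer cast because it is the well-defined way to reinterpret bits in standard C++.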
There is built-in hardware support for floating-point to fixed-point
(float2fix()) and fixed-point to floating-point (fix2float()) conversions. For
example, the fixed-point square root, inverse square root, and inverse are
implemented with floating-point precision, and the fix2float() and float2fix()
conversions are used before and after the function.
The scalar engine is used in this example because the square root and inverse functions are not vectorizable. This can be verified by looking at the function prototype's input data types:
float _sqrtf(float a) //scalar float operation
int sqrt(int a,...) //scalar integer operation
Note that the input data types are scalar types (int) and not vector types (vint).
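The convert, compute, convert-back pattern described above for the fixed-point square root can be sketched in portable C++. This is a model under stated assumptions, not the hardware implementation: fix2float_model, float2fix_model, and sqrt_fixed_model are hypothetical names, and the shift parameter (the number of fractional bits) is an assumption, since the prototypes above elide their remaining arguments.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Model of fixed-point to float conversion: value = n * 2^(-shift).
inline float fix2float_model(std::int32_t n, int shift) {
    return std::ldexp(static_cast<float>(n), -shift);
}

// Model of float to fixed-point conversion: n = round(f * 2^shift).
// Saturation and the rounding mode of the real intrinsic are not modeled.
inline std::int32_t float2fix_model(float f, int shift) {
    return static_cast<std::int32_t>(std::lround(std::ldexp(f, shift)));
}

// Fixed-point square root computed with floating-point precision:
// convert in, take the root, convert back using the same shift.
inline std::int32_t sqrt_fixed_model(std::int32_t n, int shift) {
    assert(n >= 0 && shift >= 0 && shift < 31);
    const float x = fix2float_model(n, shift);
    return float2fix_model(std::sqrt(x), shift);
}
```

For example, with 8 fractional bits the value 4.0 is stored as 1024, and sqrt_fixed_model(1024, 8) yields 512, which represents 2.0.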
The conversion functions (fix2float, float2fix) can be handled by either the
vector or the scalar engine, depending on the function called. Note the
difference in return type and argument types:
float fix2float(int n,...) //Uses the scalar engine
v8float fix2float(v8int32 ivec,...) //Uses the vector engine
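The way one function name resolves to a scalar or a vector version from its argument type can be sketched with ordinary C++ overloading. The names below are hypothetical: Vec8Int32 and Vec8Float stand in for v8int32 and v8float, to_float_model stands in for fix2float, and the shift parameter is an assumption. The loop only models the per-lane result; it says nothing about how the vector engine executes the conversion.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>

using Vec8Int32 = std::array<std::int32_t, 8>;  // stand-in for v8int32
using Vec8Float = std::array<float, 8>;         // stand-in for v8float

// Scalar overload: one fixed-point value in, one float out.
inline float to_float_model(std::int32_t n, int shift) {
    return std::ldexp(static_cast<float>(n), -shift);
}

// Vector overload: same name, selected because the argument is a vector.
inline Vec8Float to_float_model(const Vec8Int32& ivec, int shift) {
    Vec8Float out{};
    for (std::size_t i = 0; i < ivec.size(); ++i) {
        out[i] = to_float_model(ivec[i], shift);  // resolves to the scalar overload
    }
    return out;
}
```

The call site looks the same in both cases; the compiler picks the overload from the argument type, which is why checking the prototype shows which engine a call targets.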
For float2fix, there are two implementations, float2fix_safe (default) and
float2fix_fast, with the float2fix_safe implementation offering a stricter
data type check.
You can define the macro FLOAT2FIX_FAST to make float2fix choose the
float2fix_fast implementation.