pragma SDS async
Description
The ASYNC pragma must be paired with the WAIT pragma to support manual control of the hardware function synchronization.
The ASYNC pragma is specified immediately preceding a call to a hardware function, directing the compiler not to automatically generate the wait based on data flow analysis. The WAIT pragma must be inserted at an appropriate point in the program to direct the CPU to wait until the associated ASYNC function call with the same ID has completed.
In the presence of an ASYNC pragma, the SDSoC system compiler does not generate an sds_wait() in the stub function for the associated call. The program must contain the matching sds_wait(ID) or #pragma SDS wait(ID) at an appropriate point to synchronize the controlling thread running on the CPU with the hardware function thread. An advantage of using #pragma SDS wait(ID) over the sds_wait(ID) function call is that the source code can then be compiled by compilers other than the SDSoC compiler, such as gcc, which do not interpret either the ASYNC or the WAIT pragma.
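The portability point can be illustrated with a minimal sketch. The function names vadd and sum_of_vadd are hypothetical and not part of SDSoC, and the sketch does not address whether vadd would need additional data pragmas to be a valid hardware function. Built with a compiler that does not interpret SDS pragmas, both pragma lines are treated as unknown pragmas (possibly with a warning), and vadd() runs as an ordinary blocking call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a hardware function (name and signature are
// illustrative only).
void vadd(const int *a, const int *b, int *out, std::size_t n) {
    for (std::size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

// Starts vadd(), then synchronizes before reading the result.
// A compiler that does not interpret SDS pragmas skips both pragma lines,
// so the call simply completes before the next statement runs.
int sum_of_vadd(const int *a, const int *b, int *out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    #pragma SDS async(1)
    vadd(a, b, out, n);
    // Work that does not touch `out` could overlap with vadd() here.
    #pragma SDS wait(1)
    int sum = 0;
    for (std::size_t i = 0; i < n; i++) {
        sum += out[i];
    }
    return sum;
}
```

Because the synchronization is expressed only through pragmas, the same source needs no SDSoC-specific header or function to build on the host.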
Syntax
#pragma SDS async(<ID>)
...
#pragma SDS wait(<ID>)
<ID>: A user-defined ID for the ASYNC/WAIT pair, specified as a compile-time unsigned integer constant.
Example 1
The following code snippet shows an example of using these pragmas with different IDs:
{
    #pragma SDS async(1)
    mmult(A, B, C);
    #pragma SDS async(2)
    mmult(D, E, F);
    ...
    #pragma SDS wait(1)
    #pragma SDS wait(2)
}
The program running on the CPU first transfers A and B to the mmult hardware, and the call returns immediately. The program then transfers D and E to the mmult hardware, and that call also returns immediately. When execution later reaches #pragma SDS wait(1), the program waits for the output C to be ready. When execution later reaches #pragma SDS wait(2), it waits for the output F to be ready.
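The same two-call pattern can also be written with sds_wait(ID) function calls in place of the WAIT pragmas. The sketch below makes several assumptions that this section does not state: that sds_wait() is declared in a header named sds_lib.h, that the SDSoC compilers define the macro __SDSCC__, and that a no-op fallback is an acceptable host substitute. The 2x2 software mmult is a hypothetical stand-in for the hardware function.

```cpp
#include <cassert>

#ifdef __SDSCC__
#include "sds_lib.h"  // assumed to declare sds_wait() under SDSoC
#else
// Host-only fallback (an assumption for this sketch): without SDSoC every
// call below is an ordinary blocking call, so there is nothing to wait for.
static inline void sds_wait(unsigned int) {}
#endif

const int N = 2;  // illustrative matrix dimension

// Hypothetical stand-in for the mmult hardware function: out = a * b,
// with all matrices stored row-major in flat arrays.
void mmult(const int a[N * N], const int b[N * N], int out[N * N]) {
    for (int r = 0; r < N; r++) {
        for (int c = 0; c < N; c++) {
            int acc = 0;
            for (int k = 0; k < N; k++) {
                acc += a[r * N + k] * b[k * N + c];
            }
            out[r * N + c] = acc;
        }
    }
}

// Issues two asynchronous calls with different IDs, then waits on each ID.
void two_products(const int *A, const int *B, int *C,
                  const int *D, const int *E, int *F) {
    #pragma SDS async(1)
    mmult(A, B, C);
    #pragma SDS async(2)
    mmult(D, E, F);
    // CPU work that reads neither C nor F could run here.
    sds_wait(1);  // C may be read after this point
    sds_wait(2);  // F may be read after this point
}
```

Unlike the pragma-only form, this version needs the fallback definition (or an equivalent) to build with a compiler other than the SDSoC compiler, which is the trade-off described in the Description section.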
Example 2
The following code snippet shows an example of using these pragmas with the same ID to pipeline the data transfer and accelerator execution:
for (int i = 0; i < pipeline_depth; i++) {
    #pragma SDS async(1)
    mmult_accel(A[i%NUM_MAT], B[i%NUM_MAT], C[i%NUM_MAT]);
}
for (int i = pipeline_depth; i < NUM_TESTS-pipeline_depth; i++) {
    #pragma SDS wait(1)
    #pragma SDS async(1)
    mmult_accel(A[i%NUM_MAT], B[i%NUM_MAT], C[i%NUM_MAT]);
}
for (int i = 0; i < pipeline_depth; i++) {
    #pragma SDS wait(1)
}
In the above example, the first loop ramps up the pipeline with a depth of pipeline_depth, the second loop executes the pipeline, and the third loop ramps down the pipeline. The hardware buffer depth (pragma SDS data buffer_depth) should be set to the same value as pipeline_depth. The goal of this pipeline is to transfer data to the accelerator for the next execution while the current execution is still in progress. Refer to "Increasing System Parallelism and Concurrency" in the SDSoC Environment User Guide (UG1027) for more information.
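To see why the buffer depth should match pipeline_depth, the three loops above can be replayed against a small host-only model. The model is an assumption made for illustration, not a description of the SDSoC runtime: it treats calls that share one ID as a first-in, first-out queue in which each wait retires the oldest outstanding call. The names AsyncIdModel and run_pipeline are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Host-only model of a single ASYNC/WAIT ID (illustrative assumption):
// calls queue up in issue order and each wait retires the oldest one.
struct AsyncIdModel {
    std::queue<int> in_flight;      // indices of calls issued but not yet waited on
    std::size_t max_in_flight = 0;  // high-water mark of outstanding calls
    int retired = 0;                // number of calls completed by a wait

    void async_call(int i) {
        in_flight.push(i);
        if (in_flight.size() > max_in_flight) {
            max_in_flight = in_flight.size();
        }
    }

    int wait() {
        assert(!in_flight.empty());  // a wait needs a matching async call
        int i = in_flight.front();
        in_flight.pop();
        retired++;
        return i;
    }
};

// Replays the ramp-up, steady-state, and ramp-down loops of Example 2.
AsyncIdModel run_pipeline(int num_tests, int pipeline_depth) {
    AsyncIdModel m;
    for (int i = 0; i < pipeline_depth; i++) {
        m.async_call(i);  // ramp up: fill the pipeline
    }
    for (int i = pipeline_depth; i < num_tests - pipeline_depth; i++) {
        m.wait();         // retire the oldest call ...
        m.async_call(i);  // ... then issue the next one
    }
    for (int i = 0; i < pipeline_depth; i++) {
        m.wait();         // ramp down: drain the pipeline
    }
    return m;
}
```

In this model, run_pipeline(10, 2) never has more than two calls outstanding and ends with none, so every async call is matched by a wait. With the loop bounds as written, it issues NUM_TESTS - pipeline_depth calls in total.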