pragma HLS dataflow
Description
The DATAFLOW pragma enables task-level pipelining, allowing functions and loops to overlap in their operation, increasing the concurrency of the RTL implementation and the overall throughput of the design.
All operations are performed sequentially in a C description. In the absence of any directives that limit resources (such as pragma HLS allocation), Vivado HLS seeks to minimize latency and improve concurrency. However, data dependencies can limit this. For example, functions or loops that access arrays must finish all read/write accesses to the arrays before they complete. This prevents the next function or loop that consumes the data from starting operation. The DATAFLOW optimization enables the operations in a function or loop to start operation before the previous function or loop completes all its operations.
Figure: DATAFLOW Pragma
When the DATAFLOW pragma is specified, Vivado HLS analyzes the dataflow between sequential functions or loops and creates channels (based on ping-pong RAMs or FIFOs) that allow consumer functions or loops to start operation before the producer functions or loops have completed. This allows functions or loops to operate in parallel, which decreases latency and improves the throughput of the RTL.
If no initiation interval (number of cycles between the start of one function or loop and the next) is specified, Vivado HLS attempts to minimize the initiation interval and start operation as soon as data is available.
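The producer/consumer arrangement described above can be sketched as follows. This is a minimal illustration, not code from the guide: the function names (load_task, scale_task, store_task, dataflow_top) and the size N are hypothetical. Each intermediate array has exactly one producer and one consumer, so each is a candidate for a channel between tasks. A standard C++ compiler ignores the pragma and runs the tasks sequentially, which produces the same results.

```cpp
// Hypothetical array length, chosen only for this illustration.
const int N = 8;

// Task 1: copies the input into the first intermediate array.
void load_task(const int in[N], int tmp1[N]) {
    for (int i = 0; i < N; ++i) {
        tmp1[i] = in[i];
    }
}

// Task 2: consumes tmp1 and produces tmp2.
void scale_task(const int tmp1[N], int tmp2[N]) {
    for (int i = 0; i < N; ++i) {
        tmp2[i] = tmp1[i] * 2;
    }
}

// Task 3: consumes tmp2 and writes the final output.
void store_task(const int tmp2[N], int out[N]) {
    for (int i = 0; i < N; ++i) {
        out[i] = tmp2[i] + 1;
    }
}

// Top-level region: data flows strictly load -> scale -> store.
void dataflow_top(const int in[N], int out[N]) {
#pragma HLS dataflow
    int tmp1[N];  // written only by load_task, read only by scale_task
    int tmp2[N];  // written only by scale_task, read only by store_task

    load_task(in, tmp1);
    scale_task(tmp1, tmp2);
    store_task(tmp2, out);
}
```

Calling dataflow_top with an input of 0 through 7 fills out with 2*i + 1 for each index i, whether or not the tasks overlap in hardware.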
The config_dataflow command specifies the default memory channel and FIFO depth used in dataflow optimization. Refer to the config_dataflow command in the Vivado Design Suite User Guide: High-Level Synthesis (UG902) for more information.
For the DATAFLOW optimization to work, the data must flow through the design from one task to the next. The following coding styles prevent Vivado HLS from performing the DATAFLOW optimization:
- Single-producer-consumer violations
- Bypassing tasks
- Feedback between tasks
- Conditional execution of tasks
- Loops with multiple exit conditions
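As an illustration of the first item in the list, one common way to remove a single-producer-consumer violation is to add a task that duplicates the data, so that every intermediate array again has exactly one writer and one reader. The sketch below assumes this approach; all names (produce, split, consume_a, consume_b, fan_out_top) and the size LEN are hypothetical.

```cpp
// Hypothetical array length, chosen only for this illustration.
const int LEN = 8;

// Single producer of temp1.
void produce(const int in[LEN], int scale, int temp1[LEN]) {
    for (int i = 0; i < LEN; ++i) {
        temp1[i] = in[i] * scale;
    }
}

// Sole consumer of temp1: duplicates it so each copy has one reader.
void split(const int temp1[LEN], int temp2[LEN], int temp3[LEN]) {
    for (int i = 0; i < LEN; ++i) {
        temp2[i] = temp1[i];
        temp3[i] = temp1[i];
    }
}

// Sole consumer of temp2.
void consume_a(const int temp2[LEN], int out1[LEN]) {
    for (int i = 0; i < LEN; ++i) {
        out1[i] = temp2[i] * 123;
    }
}

// Sole consumer of temp3.
void consume_b(const int temp3[LEN], int out2[LEN]) {
    for (int i = 0; i < LEN; ++i) {
        out2[i] = temp3[i] * 456;
    }
}

void fan_out_top(const int in[LEN], int scale, int out1[LEN], int out2[LEN]) {
#pragma HLS dataflow
    int temp1[LEN];
    int temp2[LEN];
    int temp3[LEN];

    produce(in, scale, temp1);
    // Without split, consume_a and consume_b would both read temp1,
    // giving that array two consumers.
    split(temp1, temp2, temp3);
    consume_a(temp2, out1);
    consume_b(temp3, out2);
}
```

The extra copy costs one more task, but it keeps every array on a single producer-to-consumer path.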
Finally, the DATAFLOW optimization has no hierarchical implementation. If a sub-function or loop contains additional tasks that might benefit from the DATAFLOW optimization, you must apply the optimization to that loop or sub-function, or inline the sub-function.
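The non-hierarchical behavior can be illustrated with a sketch in which the pragma is repeated inside a sub-function that has its own internal tasks. The names (add_one, times_two, minus_three, inner_stage, outer_top) and the size M are hypothetical and serve only to show where the pragmas would go.

```cpp
// Hypothetical array length, chosen only for this illustration.
const int M = 8;

void add_one(const int in[M], int out[M]) {
    for (int i = 0; i < M; ++i) {
        out[i] = in[i] + 1;
    }
}

void times_two(const int in[M], int out[M]) {
    for (int i = 0; i < M; ++i) {
        out[i] = in[i] * 2;
    }
}

void minus_three(const int in[M], int out[M]) {
    for (int i = 0; i < M; ++i) {
        out[i] = in[i] - 3;
    }
}

// Sub-function that contains two tasks of its own.
void inner_stage(const int in[M], int out[M]) {
    // The pragma in outer_top does not extend into this body,
    // so it is applied again here.
#pragma HLS dataflow
    int mid[M];
    add_one(in, mid);
    times_two(mid, out);
}

void outer_top(const int in[M], int out[M]) {
    // Covers only the tasks at this level: inner_stage and minus_three.
#pragma HLS dataflow
    int buf[M];
    inner_stage(in, buf);
    minus_three(buf, out);
}
```

The alternative mentioned above is to inline inner_stage, which would expose add_one and times_two as tasks at the level of outer_top.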
Syntax
Place the pragma in the C source within the boundaries of the region, function, or loop.
#pragma HLS dataflow
Example 1
Specifies DATAFLOW optimization within the loop wr_loop_j.
// Excerpt: i, tile, outFifo, and outx are declared in the enclosing code (not shown).
wr_loop_j: for (int j = 0; j < TILE_PER_ROW; ++j) {
#pragma HLS DATAFLOW
    // Task 1: fill the tile buffer from the FIFO.
    wr_buf_loop_m: for (int m = 0; m < TILE_HEIGHT; ++m) {
        wr_buf_loop_n: for (int n = 0; n < TILE_WIDTH; ++n) {
#pragma HLS PIPELINE
            // should burst TILE_WIDTH in WORD beat
            outFifo >> tile[m][n];
        }
    }
    // Task 2: write the buffered tile to the output array.
    wr_loop_m: for (int m = 0; m < TILE_HEIGHT; ++m) {
        wr_loop_n: for (int n = 0; n < TILE_WIDTH; ++n) {
#pragma HLS PIPELINE
            outx[TILE_HEIGHT*TILE_PER_ROW*TILE_WIDTH*i+TILE_PER_ROW*TILE_WIDTH*m+TILE_WIDTH*j+n] = tile[m][n];
        }
    }
}
See Also
- pragma HLS allocation
- Vivado Design Suite User Guide: High-Level Synthesis (UG902)
- xcl_dataflow
- SDAccel Environment Optimization Guide (UG1207)