One-Dimensional Read and Write

The code snippet below, from the Wide Memory Read/Write Example on the Xilinx On-boarding Example GitHub, shows the recommended coding style for automatically inferring burst reads for one-dimensional vectors. A local memory, v1_local, buffers the data from a single burst, and the entire input vector is read in multiple bursts. The choice of LOCAL_MEM_SIZE depends on the specific application and on the on-chip memory available on the target FPGA.

kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void vadd(
        const __global uint16 *in1, // Read-Only Vector 1
        const __global uint16 *in2, // Read-Only Vector 2
        __global uint16 *out,       // Output Result
        int size                    // Number of integer elements
        )
{
    local uint16 v1_local[LOCAL_MEM_SIZE];    // Local memory to store one burst of vector 1
    int size_in16 = (size-1) / VECTOR_SIZE + 1;    // Number of uint16 words, rounded up
    ...
    for(int i = 0; i < size_in16;  i += LOCAL_MEM_SIZE)
    {
    ...
        int chunk_size = LOCAL_MEM_SIZE;
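        // Boundary check: the last chunk may be shorter than LOCAL_MEM_SIZE
        if ((i + LOCAL_MEM_SIZE) > size_in16)
            chunk_size = size_in16 - i;

        // Pipelined copy loop over consecutive addresses
        v1_rd: __attribute__((xcl_pipeline_loop))
        for (int j = 0; j < chunk_size; j++) {
            v1_local[j] = in1[i + j];
        }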
    ...
    }
}
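
The chunking arithmetic in the kernel can be checked on the host before building for hardware. The sketch below is plain C, not OpenCL, and only models the loop bounds: it is not part of the Xilinx example. The values chosen for VECTOR_SIZE and LOCAL_MEM_SIZE and the helper names words_for, chunk_size_at and count_bursts are assumptions made for illustration.

```c
#include <assert.h>

/* Assumed values for illustration only; the kernel leaves both
   constants to the application. */
#define VECTOR_SIZE 16      /* assumed: one uint16 word packs 16 integers */
#define LOCAL_MEM_SIZE 128  /* assumed: local buffer depth in uint16 words */

/* Number of VECTOR_SIZE-wide words needed to hold `size` integers.
   Same ceiling-division expression as size_in16 in the kernel. */
static int words_for(int size)
{
    assert(size > 0);
    return (size - 1) / VECTOR_SIZE + 1;
}

/* Length, in words, of the chunk that starts at word index i.
   Mirrors the kernel's boundary check for the final, shorter chunk. */
static int chunk_size_at(int i, int size_in16)
{
    int chunk_size = LOCAL_MEM_SIZE;
    if ((i + LOCAL_MEM_SIZE) > size_in16)
        chunk_size = size_in16 - i;
    return chunk_size;
}

/* Walk the outer loop exactly as the kernel does, counting the chunks
   and recording the length of the last one in *last_chunk. */
static int count_bursts(int size, int *last_chunk)
{
    int size_in16 = words_for(size);
    int bursts = 0;
    for (int i = 0; i < size_in16; i += LOCAL_MEM_SIZE) {
        *last_chunk = chunk_size_at(i, size_in16);
        bursts++;
    }
    return bursts;
}
```

With these assumed constants, an input of 4100 integers occupies 257 words and is read in three chunks of 128, 128 and 1 words, which is the case the boundary check exists to handle. Whether each chunk maps to exactly one burst on the memory interface depends on the tool and the target platform.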

The Device Hardware Transaction View below shows that multiple read bursts are issued at kernel start and that all read data returns continuously after the memory read latency.