Global Memory

Global memory is the region of system memory that is accessible to both the OpenCL™ host and the device. The host is responsible for allocating and deallocating buffers in this memory space. The host and device hand control of the data stored in this memory back and forth. First, the host processor transfers data from the host memory space into the global memory space. When a kernel is launched to process the data, the host loses access rights to the buffer in global memory and the device takes over: it can read from and write to global memory until kernel execution is complete. When the operations associated with the kernel complete, the device returns control of the global memory buffer to the host processor. After it has regained control of a buffer, the host processor can read and write data in the buffer, transfer data back to the host memory, and deallocate the buffer.
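The handshake described above can be pictured as a small ownership state machine. The following is a minimal sketch in plain C, not OpenCL host code: the names global_buffer, buffer_owner, host_write, host_read, kernel_launch, and kernel_complete are hypothetical and exist only for illustration, and the "kernel" is a stand-in that increments each byte. The model refuses host access outright while the device owns the buffer, which is a simplification chosen to make the rule visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model types; none of these names exist in OpenCL. */
typedef enum { OWNER_HOST, OWNER_DEVICE } buffer_owner;

typedef struct {
    buffer_owner owner;      /* who currently controls the buffer */
    unsigned char data[64];  /* stand-in for a global memory buffer */
} global_buffer;

/* Host write: allowed only while the host controls the buffer.
   Returns 0 on success, -1 if the access is refused. */
static int host_write(global_buffer *buf, const unsigned char *src, size_t n) {
    if (buf->owner != OWNER_HOST || n > sizeof buf->data) return -1;
    memcpy(buf->data, src, n);
    return 0;
}

/* Host read: same ownership rule as host_write. */
static int host_read(const global_buffer *buf, unsigned char *dst, size_t n) {
    if (buf->owner != OWNER_HOST || n > sizeof buf->data) return -1;
    memcpy(dst, buf->data, n);
    return 0;
}

/* Kernel launch: control of the buffer passes from host to device. */
static int kernel_launch(global_buffer *buf) {
    if (buf->owner != OWNER_HOST) return -1;
    buf->owner = OWNER_DEVICE;
    return 0;
}

/* Kernel completion: the stand-in kernel increments the first n bytes,
   then the device returns control of the buffer to the host. */
static int kernel_complete(global_buffer *buf, size_t n) {
    if (buf->owner != OWNER_DEVICE || n > sizeof buf->data) return -1;
    for (size_t i = 0; i < n; i++) buf->data[i] += 1;
    buf->owner = OWNER_HOST;
    return 0;
}
```

In this model, a host_write or host_read issued between kernel_launch and kernel_complete returns -1, and the same call succeeds again once kernel_complete has handed the buffer back.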

You can use the clCreateSubDevices function to create a sub-device for each compute unit that is mapped to a specific DDR/global memory. In the Xilinx® implementation, each sub-device can have only one compute unit.

The following example shows how you can create 16 sub-devices:

	/* Assumes <CL/cl.h> and <stdio.h> are included and that device_id
	   is a valid cl_device_id obtained earlier in the host code. */
	const cl_uint num_sub_devices = 16;

	const cl_device_partition_property properties[3] = {
		CL_DEVICE_PARTITION_EQUALLY,
		1,	// One compute unit per sub-device
		0	// 0 terminates the property list
	};

	cl_device_id sub_device_ids[num_sub_devices];
	cl_uint num_devices_ret = 0;
	cl_int err = clCreateSubDevices(device_id, properties,
		num_sub_devices, sub_device_ids, &num_devices_ret);
	if (err != CL_SUCCESS) {
		fprintf(stderr, "Failed to create sub-devices (error %d)\n", err);
		return 1;
	}

Use the xocc command option --nk to instantiate multiple compute units of the kernel, one for each sub-device. The following shows how this is used in a Makefile.

	KERNEL_1_COMPILE_FLAGS = --nk subf:16
	KERNEL_1_CU_1 = subf_1
	KERNEL_1_CU_2 = subf_2
	KERNEL_1_CU_3 = subf_3
	KERNEL_1_CU_4 = subf_4
	KERNEL_1_CU_5 = subf_5
	KERNEL_1_CU_6 = subf_6
	KERNEL_1_CU_7 = subf_7
	KERNEL_1_CU_8 = subf_8
	KERNEL_1_CU_9 = subf_9
	KERNEL_1_CU_10 = subf_10
	KERNEL_1_CU_11 = subf_11
	KERNEL_1_CU_12 = subf_12
	KERNEL_1_CU_13 = subf_13
	KERNEL_1_CU_14 = subf_14
	KERNEL_1_CU_15 = subf_15
	KERNEL_1_CU_16 = subf_16
	XOS_1 += $(KERNEL_1_XO)
	XOCC_LINK_OPTS_1 += --nk $(KERNEL_1):16:$(KERNEL_1_CU_1).$(KERNEL_1_CU_2).$(KERNEL_1_CU_3).$(KERNEL_1_CU_4).$(KERNEL_1_CU_5).$(KERNEL_1_CU_6).$(KERNEL_1_CU_7).$(KERNEL_1_CU_8).$(KERNEL_1_CU_9).$(KERNEL_1_CU_10).$(KERNEL_1_CU_11).$(KERNEL_1_CU_12).$(KERNEL_1_CU_13).$(KERNEL_1_CU_14).$(KERNEL_1_CU_15).$(KERNEL_1_CU_16)
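
The value passed to --nk in the Makefile has the form kernel:count:cu_1.cu_2. ... .cu_N, with the compute unit names separated by periods. As an illustration of that layout, the following sketch builds the same string in C. The helper name build_nk_arg is hypothetical, and the code assumes the compute unit names follow the kernel_index pattern used in the Makefile (subf_1 through subf_16); it is not part of xocc or any Xilinx tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<kernel>:<count>:<kernel>_1.<kernel>_2. ... .<kernel>_<count>"
   into out. Returns the number of characters written (excluding the
   terminating NUL), or -1 if count < 1 or out is too small. */
static int build_nk_arg(const char *kernel, int count, char *out, size_t cap) {
    if (count < 1 || cap == 0) return -1;

    /* Prefix: kernel name and number of compute units. */
    int n = snprintf(out, cap, "%s:%d:", kernel, count);
    if (n < 0 || (size_t)n >= cap) return -1;
    size_t used = (size_t)n;

    /* Compute unit names, separated by periods. */
    for (int i = 1; i <= count; i++) {
        n = snprintf(out + used, cap - used, "%s%s_%d",
                     i > 1 ? "." : "", kernel, i);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Calling build_nk_arg("subf", 16, buf, sizeof buf) with a sufficiently large buf yields the text that follows --nk once the Makefile variables are expanded, assuming KERNEL_1 is set to subf.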