Single Out-of-Order Command Queue

The figure below shows an example with a single out-of-order command queue, CQ. The scheduler can dispatch commands from CQ in any order, so you must explicitly set up event dependencies and synchronization wherever the order of execution matters.

Figure: Example with Single Out-of-Order Command Queue

The following code snippet, from the Concurrent Kernel Execution Example in the Xilinx On-boarding Example GitHub, sets up a single out-of-order command queue and enqueues commands into it:
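// Create one queue that permits out-of-order execution (profiling enabled).
cl_command_queue ooo_queue = clCreateCommandQueue(
      world.context, world.device_id,
      CL_QUEUE_PROFILING_ENABLE | CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, &err);

// mscale has an empty wait list; its completion is reported in ooo_events[0].
clEnqueueNDRangeKernel(ooo_queue, kernel_mscale, 1, offset, global,
                       local, 0, nullptr, &ooo_events[0]);

// madd waits on ooo_events[0], so it starts only after mscale completes.
clEnqueueNDRangeKernel(ooo_queue, kernel_madd, 1, offset, global,
                       local, 1,
                       &ooo_events[0], // Event from previous call
                       &ooo_events[1]);

// mmult has an empty wait list, so it can run alongside mscale and madd.
clEnqueueNDRangeKernel(ooo_queue, kernel_mmult, 1, offset, global,
                       local, 0,
                       nullptr, // Does not depend on previous call
                       &ooo_events[2]);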
The Application Timeline view below shows the compute unit mmult_1 running in parallel with the compute units mscale_1 and madd_1. The same overlap is obtained with either method: multiple in-order queues or a single out-of-order queue.

Figure: Application Timeline View Showing mmult_1 Running with mscale_1 and madd_1
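
The effect of the event wait lists on the timeline can be illustrated with a small toy model. The sketch below is not OpenCL code and does not reflect how the runtime scheduler is implemented. It assumes that a command starts as soon as every event in its wait list has completed and that a free compute unit is always available. The names Command, Slot, earliest_schedule, and example_commands are hypothetical, and the durations are made-up values in arbitrary time units.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for one enqueued command (hypothetical, not an OpenCL type).
struct Command {
    std::string name;
    int duration;                      // made-up run time, arbitrary units
    std::vector<std::size_t> wait_on;  // indices of earlier commands it waits on
};

// Start and finish time of one command in the modeled timeline.
struct Slot {
    int start;
    int finish;
};

// Computes the earliest start and finish of each command, assuming a command
// starts as soon as all events in its wait list have completed and that a
// free compute unit is always available.
std::vector<Slot> earliest_schedule(const std::vector<Command>& cmds) {
    std::vector<Slot> slots;
    slots.reserve(cmds.size());
    for (std::size_t i = 0; i < cmds.size(); ++i) {
        int start = 0;
        for (std::size_t dep : cmds[i].wait_on) {
            assert(dep < i && "a command can only wait on an earlier enqueue");
            start = std::max(start, slots[dep].finish);
        }
        slots.push_back({start, start + cmds[i].duration});
    }
    return slots;
}

// The three enqueues from the snippet above, with assumed durations:
// madd waits on mscale (index 0); mscale and mmult have empty wait lists.
std::vector<Command> example_commands() {
    return {
        {"mscale", 4, {}},
        {"madd", 3, {0}},
        {"mmult", 6, {}},
    };
}
```

Calling earliest_schedule(example_commands()) with these assumed durations places mscale in the interval 0 to 4, madd in 4 to 7, and mmult in 0 to 6. In this model, mmult therefore overlaps both mscale and madd, which is the same pattern the timeline view shows.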