OpenCL memory flags


When allocating memory, you can choose between different access modes:

  • Read-only memory (CL_MEM_READ_ONLY)
  • Write-only memory (CL_MEM_WRITE_ONLY)
  • Read/write memory (CL_MEM_READ_WRITE)

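The access mode describes how kernels may use the buffer; it does not by itself select an address space. Buffer objects are allocated in the normal __global region. Whether a kernel sees the data in the __constant or the __global address space is decided by the qualifier on the kernel argument, and read-only buffers are the ones typically bound to __constant arguments.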
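The relationship between the three access modes can be sketched as plain bit tests. This is only an illustration: the numeric values mirror those in the Khronos cl.h header and are defined here so the snippet compiles without an OpenCL SDK, and access_mode_of and access_mode_examples are hypothetical helpers, not part of the OpenCL API. The sketch assumes two rules from the clCreateBuffer description: the access flags are mutually exclusive, and read/write is the default when none is given.

```c
#include <assert.h>
#include <stdint.h>

/* Values as in the Khronos <CL/cl.h> header; real code should include that
   header instead of redefining them. */
#ifndef CL_MEM_READ_WRITE
#define CL_MEM_READ_WRITE (1u << 0)
#define CL_MEM_WRITE_ONLY (1u << 1)
#define CL_MEM_READ_ONLY  (1u << 2)
#endif

/* Hypothetical helper (not an OpenCL function): returns the effective access
   flag contained in a flags bitfield, or 0 if more than one is set. */
uint64_t access_mode_of(uint64_t flags)
{
    const uint64_t mask =
        CL_MEM_READ_WRITE | CL_MEM_WRITE_ONLY | CL_MEM_READ_ONLY;
    uint64_t access = flags & mask;

    if (access == 0)
        return CL_MEM_READ_WRITE;   /* default when no access flag is given */
    if (access & (access - 1))
        return 0;                   /* two or more access bits set: invalid */
    return access;
}

/* Usage: a few checks that document the expected behaviour. */
void access_mode_examples(void)
{
    assert(access_mode_of(0) == CL_MEM_READ_WRITE);
    assert(access_mode_of(CL_MEM_READ_ONLY) == CL_MEM_READ_ONLY);
    assert(access_mode_of(CL_MEM_READ_ONLY | CL_MEM_WRITE_ONLY) == 0);
}
```

Bits outside the access mask are ignored by the helper, so it can be applied to a complete flags value that also carries placement flags.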

In addition to the access mode, you can define where the memory is allocated:

  • Not specified: The memory is allocated in device memory, as you would expect. The $source pointer should be null, since OpenCL rejects a host pointer that is not paired with CL_MEM_USE_HOST_PTR or CL_MEM_COPY_HOST_PTR.
  • CL_MEM_USE_HOST_PTR: Tells the implementation that the data already lives in system RAM at the $source address and that this memory should back the buffer instead of a new allocation. The data is then manipulated in place in RAM, although an implementation is allowed to cache it in device memory.
  • CL_MEM_COPY_HOST_PTR: Tells the implementation to copy all values at the given address into device memory or, when combined with CL_MEM_ALLOC_HOST_PTR, into a separate memory region in system RAM.
  • CL_MEM_ALLOC_HOST_PTR: Tells the implementation to allocate the buffer in host-accessible system RAM. If it is the only placement flag, the $source pointer should be null.
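The rules in the list above can be written down as a small pre-flight check. This is a sketch under assumptions: the $source pointer is taken to correspond to the host_ptr argument of clCreateBuffer, the flag values mirror those in the Khronos cl.h header, and host_ptr_args_ok and host_ptr_examples are hypothetical helpers rather than OpenCL functions. The check encodes that CL_MEM_USE_HOST_PTR cannot be combined with the other two placement flags, and that a host pointer has to be given exactly when CL_MEM_USE_HOST_PTR or CL_MEM_COPY_HOST_PTR is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as in the Khronos <CL/cl.h> header; real code should include that
   header instead of redefining them. */
#ifndef CL_MEM_USE_HOST_PTR
#define CL_MEM_USE_HOST_PTR   (1u << 3)
#define CL_MEM_ALLOC_HOST_PTR (1u << 4)
#define CL_MEM_COPY_HOST_PTR  (1u << 5)
#endif

/* Hypothetical check (not an OpenCL function): returns 1 if the placement
   flags and the host pointer form a combination clCreateBuffer accepts,
   0 otherwise. */
int host_ptr_args_ok(uint64_t flags, const void *host_ptr)
{
    int use   = (flags & CL_MEM_USE_HOST_PTR)   != 0;
    int alloc = (flags & CL_MEM_ALLOC_HOST_PTR) != 0;
    int copy  = (flags & CL_MEM_COPY_HOST_PTR)  != 0;

    /* USE_HOST_PTR excludes both ALLOC_HOST_PTR and COPY_HOST_PTR. */
    if (use && (alloc || copy))
        return 0;

    /* A host pointer is required for USE/COPY and forbidden otherwise. */
    if (use || copy)
        return host_ptr != NULL;
    return host_ptr == NULL;
}

/* Usage: one check per bullet in the list, plus two invalid combinations. */
void host_ptr_examples(void)
{
    float data[4] = {1.0f, 2.0f, 3.0f, 4.0f};

    assert(host_ptr_args_ok(0, NULL));                      /* not specified */
    assert(host_ptr_args_ok(CL_MEM_USE_HOST_PTR, data));
    assert(host_ptr_args_ok(CL_MEM_COPY_HOST_PTR, data));
    assert(host_ptr_args_ok(CL_MEM_COPY_HOST_PTR | CL_MEM_ALLOC_HOST_PTR, data));
    assert(host_ptr_args_ok(CL_MEM_ALLOC_HOST_PTR, NULL));

    assert(!host_ptr_args_ok(CL_MEM_USE_HOST_PTR | CL_MEM_ALLOC_HOST_PTR, data));
    assert(!host_ptr_args_ok(CL_MEM_COPY_HOST_PTR, NULL));
}
```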

Speed-wise, device global memory is the fastest to access from a kernel, but you pay for two explicit transfers: one to copy the input to the device and one to copy the result back. CL_MEM_USE_HOST_PTR is typically the slowest option, while CL_MEM_ALLOC_HOST_PTR usually offers higher speed.

With CL_MEM_USE_HOST_PTR, the device does exactly that: it uses your data in system RAM, which is ordinary pageable memory managed by the OS. Every transfer therefore has to go through the CPU to handle potential page faults. Once the data is resident, the CPU copies it into pinned memory and hands it to the DMA controller, which costs CPU clock cycles.

In contrast, CL_MEM_ALLOC_HOST_PTR typically allocates pinned (page-locked) memory in system RAM. This memory is excluded from paging and is therefore always resident. The device can then skip the CPU when accessing system RAM and use DMA to copy the data quickly. The exact behavior depends on the OpenCL implementation.
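As a compact summary of the three strategies compared above, the following sketch maps each one to its placement flag and to the calls that would typically move the data. The enum, the struct and describe_strategy are hypothetical names introduced for illustration; the flag values mirror the Khronos cl.h header, and the transfer calls named in the strings are common usage patterns, not something the flags enforce.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values as in the Khronos <CL/cl.h> header; real code should include that
   header instead of redefining them. */
#ifndef CL_MEM_USE_HOST_PTR
#define CL_MEM_USE_HOST_PTR   (1u << 3)
#define CL_MEM_ALLOC_HOST_PTR (1u << 4)
#endif

/* Hypothetical names for the three strategies discussed in the text. */
enum buffer_strategy { BUF_DEVICE, BUF_USE_HOST, BUF_ALLOC_HOST };

struct strategy_info {
    uint64_t placement_flags;   /* OR this with an access flag */
    int needs_host_ptr;         /* 1 if a non-null host pointer is required */
    const char *transfer_calls; /* typical way to move data (assumption) */
};

/* Hypothetical helper: describes one strategy; unknown values fall back to
   plain device memory. */
struct strategy_info describe_strategy(enum buffer_strategy s)
{
    struct strategy_info info = {
        0, 0, "clEnqueueWriteBuffer / clEnqueueReadBuffer"
    };

    switch (s) {
    case BUF_USE_HOST:
        /* Pageable application memory backs the buffer. */
        info.placement_flags = CL_MEM_USE_HOST_PTR;
        info.needs_host_ptr = 1;
        info.transfer_calls = "clEnqueueMapBuffer / clEnqueueUnmapMemObject";
        break;
    case BUF_ALLOC_HOST:
        /* The implementation allocates host memory, typically pinned. */
        info.placement_flags = CL_MEM_ALLOC_HOST_PTR;
        info.transfer_calls = "clEnqueueMapBuffer / clEnqueueUnmapMemObject";
        break;
    case BUF_DEVICE:
    default:
        /* Device memory: one explicit copy in, one explicit copy out. */
        break;
    }
    return info;
}

/* Usage: the device strategy needs no placement flag and no host pointer. */
void strategy_examples(void)
{
    assert(describe_strategy(BUF_DEVICE).placement_flags == 0);
    assert(describe_strategy(BUF_USE_HOST).needs_host_ptr == 1);
    assert(describe_strategy(BUF_ALLOC_HOST).needs_host_ptr == 0);
}
```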