In CUDA, one can use a texture unit to read from any location in global memory using ‘cudaBindTexture’ and ‘tex1Dfetch’. Is it possible to do the same in OpenCL?
I’ve been reading the specification, and it seems that OpenCL makes a distinction between buffer objects and image objects, and that a sampler can only read from image objects. Is there a way either to make an image object use the same memory as an already existing buffer object, or to have a kernel use a sampler to read from a buffer object? One solution would be to copy the buffer contents to the image using ‘clEnqueueCopyBufferToImage’, launch the kernel, and then copy the data back again, but I would like to avoid this seemingly pointless copying.
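To make the copy workaround concrete: a linear buffer would have to be laid out as a 2D image, so every array index in the kernel becomes an (x, y) coordinate. Below is a minimal sketch of that mapping in plain C, assuming one buffer element per texel and a row width capped by some device limit. The names `ImageDims`, `Texel`, `dims_for_buffer`, `texel_for_index`, and `padded_length` are hypothetical helpers, not part of OpenCL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper types; OpenCL itself does not define these. */
typedef struct { size_t width, height; } ImageDims;
typedef struct { size_t x, y; } Texel;

/* Smallest row-major 2D layout that holds n elements when a row may be
 * at most max_width texels wide (assumed: one element per texel). */
static ImageDims dims_for_buffer(size_t n, size_t max_width)
{
    ImageDims d = { 0, 0 };
    assert(max_width > 0);
    if (n == 0)
        return d;
    d.width  = n < max_width ? n : max_width;
    d.height = (n + d.width - 1) / d.width;   /* round up to whole rows */
    return d;
}

/* Coordinate at which linear index i lands in that layout. */
static Texel texel_for_index(size_t i, ImageDims d)
{
    Texel t;
    assert(d.width > 0);
    t.x = i % d.width;
    t.y = i / d.width;
    return t;
}

/* Number of elements a full width x height region covers. A copy of the
 * whole region reads this many elements, so the source buffer may need
 * padding when n is not a multiple of the width. */
static size_t padded_length(ImageDims d)
{
    return d.width * d.height;
}
```

For example, with 10 elements and a maximum width of 4, the layout is 4 x 3, index 9 maps to (1, 2), and the buffer would need to be padded to 12 elements before a single full-region copy. The width and height would presumably be what is passed as the region of ‘clEnqueueCopyBufferToImage’.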
The reason I want to read through a sampler is that I have a kernel implemented in both CUDA and OpenCL, and the CUDA version, which uses textures, performs much better than the OpenCL version, which uses normal array lookups from global memory. Changing the CUDA version to use normal array lookups brings its performance down to that of the OpenCL version. I assume the difference arises because I cannot properly coalesce the memory reads, yet the addresses are at least close enough to each other to benefit from the texture cache.