Storing pointers / structs on the device without the need for kernel parameters

Hello there,

let’s say I’ve got some struct “ProgramState” that I want to keep on the GPU. AFAIK the current way of doing this is to create a buffer on the host and pass it as a kernel parameter every time I want to execute part of my program on the device.

Is there any way to store these memory references on the device, so that I don’t have to pass them every time I execute a kernel?

The same goes for images that I create once on the host: I have to pass an image object to the kernel every time I want to do something with it. Is there some other way to let the device “remember” the image that was used last time?

Thanks,

Nils

Memory on the device and memory on the host are different. When you allocate memory on the device, the allocation call returns a pointer to device memory. You then pass this pointer as a parameter to the kernel call. This pointer is stored in constant memory (on NVIDIA devices) every time the kernel is invoked.

If you think that the entire buffer is sent over with each kernel invocation, then you are misunderstanding what is happening. Only the pointer is transferred. I don’t know why you would want to avoid passing a single pointer as an argument. As far as I know (and I am no expert), there is no way to access the pointer from the kernel without it being an argument.

If you have too many parameters in your function call, you can put them all into a struct or some other memory structure and pass a single pointer to it. This can be loaded into constant memory for speed. But constant memory is small, so if you want to pass a boat-load of pointers to your function, you can load them all into global memory from the host and pass a single pointer to that memory construct as a parameter.

Let me know if this doesn’t answer your question.

Sorry, but of course I know the difference between device and host, different memory spaces and all that.

The problem is that I have not only pointers but also image objects that I want to pass to kernels. They can’t be stored in a struct. It would be easier if the device could “remember” the images, so that I wouldn’t have to pass them as kernel arguments every time.

OpenCL has a more abstract concept of memory objects than C for CUDA. This becomes quite obvious if you consider that CL memory objects are created per context, not per device, and can be shared between multiple devices. Pointers, however, are device-specific: a pointer reflects only one device’s view of a shared memory object and would be completely meaningless on another device.

From what I understand, an OpenCL runtime is not even required to keep a memory object at the same address between kernel invocations. In principle, it could store the object on disk and restore it somewhere else before the next kernel execution.

For that reason, pointers exist only while a kernel is executing (so they are device-specific and valid only for a short time), and there is no support for storing pointers in global memory. You might somehow get it working by casting pointers through size_t to unsigned long long (and vice versa), but don’t be surprised if you don’t get what you expected.

Regards,
Markus