Possible pointer issue

Dear All

Suppose the following. I allocate a buffer on the GPU from the host, so I have a pointer to that GPU memory. I transfer an array of data from the host to that GPU buffer, and then I call a kernel. Can I perform pointer operations on the host with the GPU pointer? To be more specific, can I do the following?

kernel1<<<grid,thread>>>(GPUpointer+offset,…other parameters);

Can I do GPU pointer operations on the host side?


Luis Gonçalves

Why not try it? You can learn a lot that way.

GPUpointer+offset is ok.

*GPUpointer+value is not ok.

my_other_GPUpointer = GPUpointer is ok.

The fundamental rule is that you cannot dereference a device pointer in (ordinary) host code. Operations that don’t involve dereferencing device pointers should be OK in host code.
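A minimal sketch of that rule, in case it helps. The kernel scale_by_two and the helper scale_tail_and_read are made-up names for illustration, and the sketch assumes 0 <= offset < n. The host only computes GPUpointer + offset; when it needs a value from device memory it copies it back with cudaMemcpy instead of dereferencing.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: doubles each of the n elements starting at data.
__global__ void scale_by_two(double *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0;
}

// Hypothetical helper: uploads n doubles, doubles the elements from `offset`
// onward on the GPU, and returns the element at `offset` (-1.0 if the
// allocation fails). Assumes 0 <= offset < n.
double scale_tail_and_read(const double *host, int n, int offset)
{
    double *GPUpointer = nullptr;
    if (cudaMalloc(&GPUpointer, n * sizeof(double)) != cudaSuccess) return -1.0;
    cudaMemcpy(GPUpointer, host, n * sizeof(double), cudaMemcpyHostToDevice);

    // OK on the host: pointer arithmetic only, nothing is dereferenced.
    int count = n - offset;
    int threads = 128;
    int blocks = (count + threads - 1) / threads;
    scale_by_two<<<blocks, threads>>>(GPUpointer + offset, count);

    // Not OK on the host: double v = *(GPUpointer + offset);
    // Copy the value back instead. This cudaMemcpy runs on the default
    // stream, so it waits for the kernel launched above.
    double v = -1.0;
    cudaMemcpy(&v, GPUpointer + offset, sizeof(double), cudaMemcpyDeviceToHost);

    cudaFree(GPUpointer);
    return v;
}
```

Calling scale_tail_and_read(h, 8, 4) with h = {0, 1, 2, 3, 4, 5, 6, 7} should hand back element 4 after it has been doubled on the device.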

I was assuming that memory management on the GPU was somehow different from the CPU. When I add “+1” to a double pointer on the CPU, the code adds 8 (the size of a double). But I supposed that on the GPU it might add a different amount if the memory was organized another way.

Thanks for the answer.

Just a reminder: you need to be aware of the alignment and stride/pitch values, especially with cudaMallocPitch(), cudaMalloc3D(), etc.
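For the pitched case, a small sketch of the row address calculation. The helper name row_ptr is made up for illustration and float elements are an assumption. The point is that the pitch returned by cudaMallocPitch() is in bytes, so the row offset has to be applied to a char pointer before casting back to the element type.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: address of the first element of `row` in a pitched
// allocation of floats. `pitch` is the byte pitch reported by
// cudaMallocPitch(), so the offset is applied in bytes, not in elements.
inline float *row_ptr(void *base, std::size_t pitch, std::size_t row)
{
    return reinterpret_cast<float *>(static_cast<char *>(base) + row * pitch);
}
```

With base and pitch obtained from cudaMallocPitch(&base, &pitch, width * sizeof(float), height), row_ptr(base, pitch, r) + c is the address of element (r, c), and it can be computed on the host as long as it is not dereferenced there.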
I saw a considerable speedup in some cases once I started calling cudaMalloc() once for the whole memory requirement and computing the device pointers for the individual chunks myself, instead of calling cudaMalloc() for each chunk.
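A rough sketch of the single-allocation approach, under a few assumptions: the names align_up, Chunks and alloc_chunks are made up, the three element types are arbitrary, and the 256-byte chunk alignment is a conservative choice for illustration rather than a requirement taken from this thread.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Round x up to the next multiple of a (a must be a power of two).
inline std::size_t align_up(std::size_t x, std::size_t a)
{
    return (x + a - 1) & ~(a - 1);
}

// Hypothetical bundle of device pointers carved out of one allocation.
struct Chunks {
    char   *base; // the pointer returned by cudaMalloc; pass this to cudaFree
    double *a;
    float  *b;
    int    *c;
};

// Hypothetical helper: one cudaMalloc for three arrays. The chunk pointers
// are computed on the host with ordinary pointer arithmetic.
inline cudaError_t alloc_chunks(Chunks &out, std::size_t na, std::size_t nb, std::size_t nc)
{
    const std::size_t kAlign = 256; // assumed chunk alignment in bytes
    std::size_t off_a = 0;
    std::size_t off_b = align_up(off_a + na * sizeof(double), kAlign);
    std::size_t off_c = align_up(off_b + nb * sizeof(float), kAlign);
    std::size_t total = off_c + nc * sizeof(int);

    void *base = nullptr;
    cudaError_t err = cudaMalloc(&base, total);
    if (err != cudaSuccess) return err;

    out.base = static_cast<char *>(base);
    out.a = reinterpret_cast<double *>(out.base + off_a);
    out.b = reinterpret_cast<float *>(out.base + off_b);
    out.c = reinterpret_cast<int *>(out.base + off_c);
    return cudaSuccess;
}
```

Only out.base should go to cudaFree(); the chunk pointers are just addresses inside that one allocation.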

The amount added to a pointer via pointer arithmetic is a matter of compliance with the C/C++ language. It does not matter how memory is organized, or even if the pointer is valid. Adding an offset to a double pointer advances it by the offset multiplied by the implementation-dependent size of a double. As you point out, for all implementations on which CUDA is currently supported, the size of a double is 8 (bytes).
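If you want to check this yourself, here is a small sketch. The helper name device_pointer_step_bytes is made up; it measures, in host code, how many bytes a double pointer returned by cudaMalloc advances when k is added to it.

```cuda
#include <cstddef>
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical helper: byte distance between p and p + k for a double*
// returned by cudaMalloc, computed entirely in host code.
// Returns 0 if the allocation fails.
std::size_t device_pointer_step_bytes(std::size_t k)
{
    double *p = nullptr;
    if (cudaMalloc(&p, (k + 1) * sizeof(double)) != cudaSuccess) return 0;

    double *q = p + k; // host-side arithmetic; nothing is dereferenced

    std::size_t step = reinterpret_cast<std::uintptr_t>(q) -
                       reinterpret_cast<std::uintptr_t>(p);
    cudaFree(p);
    return step;
}
```

The result should equal k * sizeof(double), exactly as it would for a pointer returned by malloc.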

Furthermore, the host compiler, which is generating the underlying code to offset the pointer, has no knowledge of the difference between a host and device pointer. They will be handled identically.

Another question.

my_other_GPUpointer = GPUpointer is ok.

How do I declare (or allocate) “my_other_GPUpointer” on the host without allocating memory on the GPU? It is only meant to store values like “GPUpointer+offset” on the host.


Luis Gonçalves

double *myGPUpointer;

cudaMalloc(&myGPUpointer, my_size*sizeof(double));

double *my_other_GPUpointer = myGPUpointer + my_offset;

This is pretty much the same way you would do it in C if the cudaMalloc statement were replaced with an equivalent malloc statement. There’s no need to allocate anything if my_offset is less than my_size (i.e. if you intend my_other_GPUpointer to point to some region that has already been allocated for myGPUpointer).
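Putting the three lines above into a complete sketch: the helper name roundtrip_through_alias is made up, and it assumes my_offset < my_size. my_other_GPUpointer is an ordinary host variable that holds a device address; it can be passed to cudaMemcpy or to a kernel, but only the original pointer goes to cudaFree.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: writes `value` to element my_offset of a device array
// through a derived pointer, reads it back through the base pointer, and
// returns it (-1.0 if the allocation fails). Assumes my_offset < my_size.
double roundtrip_through_alias(std::size_t my_size, std::size_t my_offset, double value)
{
    double *myGPUpointer = nullptr;
    if (cudaMalloc(&myGPUpointer, my_size * sizeof(double)) != cudaSuccess) return -1.0;

    // Plain host variable holding a device address; no second allocation.
    double *my_other_GPUpointer = myGPUpointer + my_offset;

    double result = -1.0;
    cudaMemcpy(my_other_GPUpointer, &value, sizeof(double), cudaMemcpyHostToDevice);
    cudaMemcpy(&result, myGPUpointer + my_offset, sizeof(double), cudaMemcpyDeviceToHost);

    // Free the pointer that cudaMalloc returned, not the derived one.
    cudaFree(myGPUpointer);
    return result;
}
```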