Arrays on the GPU and cudaMemset()

Hello everyone.

I am wondering about using cudaMemset() on arrays that live on the GPU. I allocate some 3D arrays on the GPU and initialize their contents there. I have a loop that makes multiple calls to the GPU, and it uses five 3D arrays to store intermediate values in GMEM before writing some results back to the host. The first iteration goes through fine, and then I run cudaMemset() to set everything back to 0, but on later iterations of the loop the kernel crashes at the point where I access those 3D arrays again.

There could be some earlier problem in my code, before cudaMemset() is called, that causes this, but what I am wondering is: could calling cudaMemset() reset the addresses stored in my pointers?

I have:


double*   arr;
double**  arr2DPtr;
double*** arr3DPtr;


So, I do a bunch of stuff using arr3DPtr, and then later I need to wipe all of “arr” clean, so I call cudaMemset() on it. Again, the crash is probably caused by some other error I am making, but would calling cudaMemset() also mess up the addresses stored in arr2DPtr and arr3DPtr? Does that make sense?

If you define 2D and 3D arrays this way (as pointers to pointers to pointers), then yes: calling cudaMemset() on one of the pointer tables will zero out pointers you still need, and the next dereference will fall over. (Allocating all the fragments for arr3DPtr must be horrible!)

C-style dynamic multidimensional arrays in this form are terribly inefficient anyway (all those pointers live in global memory and have to be fetched before reading the data you want). It is much better to declare multidimensional arrays as a big 1D array, and do some index calculation to convert 3 indices to a 1D index when you need to read or write. Then it is easy to clear the contents of the array with cudaMemset().

As a side note, they’re pretty inefficient on CPUs too. A major reason I’m interested in teaching programming courses is to try to minimise the number of pointer-chasing monstrosities I’m asked to fix.