Allocating double pointer memory on the GPU

I have some questions about the correct use of cudaMalloc. Suppose I have a double pointer to float and allocate memory on the CPU using malloc, something like this:

float **array;

    array = (float**)malloc(sizeof(float*) * 4);

    for( i = 0; i < 4; i++)

    	array[i] = (float*)malloc(sizeof(float)*10);

    ...

    //Fill the array with some stuff

    ...

Now if I want to allocate memory on the GPU for that kind of array what should I do? I have something like this:

float **d_array;

    cudaMalloc((void**)d_array, sizeof(float*)*4);

    for( i = 0; i < 4; i++)

        cudaMalloc((void**)&d_array[i], sizeof(float)*10);

Finally, to copy from array to d_array and put it in GPU memory, should I do something like the following?

for( i = 0; i < 4; i++)

        cudaMemcpy(&d_array[i], array[i], sizeof(float)*10, cudaMemcpyHostToDevice);

This is part of my code, but when I compile it I get these warnings:

    pprRGB2CMYK.cu(69): warning: variable "d_array" is used before its value is set
    ./essai.cu(24): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(24): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(24): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(26): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(27): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(28): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(29): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(29): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(43): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(44): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(45): Warning: Cannot tell what pointer points to, assuming global memory space
    ./essai.cu(46): Warning: Cannot tell what pointer points to, assuming global memory space
    pprRGB2CMYK.cu(69): warning: variable "d_array" is used before its value is set

Any ideas?

Your device memory allocation for d_array should read

float **d_array;

    cudaMalloc((void**)&d_array, sizeof(float*)*4);

However, you would then also need to copy the contents of d_array itself to the device.

You’ll thus have to juggle host pointers to host pointers, host pointers to device pointers, and device pointers to device pointers, while making sure you never end up with device pointers to host pointers. You then have to copy the host array of device pointers into the device array of device pointers. And on top of all that, don’t forget to copy the actual data you are interested in, which means looping over all the rows and copying each host row to its corresponding device row.
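A minimal sketch of that juggling, assuming the 4×10 layout from the question (the helper array h_rows is my own name, not part of the original code):

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

int main(void)
{
    float **array;      /* host pointer to host pointers */
    float **d_array;    /* device pointer to device pointers */
    float  *h_rows[4];  /* host copies of the device row pointers */
    int i;

    array = (float**)malloc(sizeof(float*) * 4);
    for (i = 0; i < 4; i++)
        array[i] = (float*)malloc(sizeof(float) * 10);

    /* ... fill array with some stuff ... */

    /* one device allocation per row; keep the device pointers on the host */
    for (i = 0; i < 4; i++) {
        cudaMalloc((void**)&h_rows[i], sizeof(float) * 10);
        cudaMemcpy(h_rows[i], array[i], sizeof(float) * 10,
                   cudaMemcpyHostToDevice);
    }

    /* now copy the array of device pointers itself to the device */
    cudaMalloc((void**)&d_array, sizeof(float*) * 4);
    cudaMemcpy(d_array, h_rows, sizeof(float*) * 4,
               cudaMemcpyHostToDevice);

    /* d_array can now be dereferenced as a float** inside a kernel */
    return 0;
}
```

Note that the inner cudaMalloc calls must write into host memory (h_rows), never into d_array[i] directly, because d_array lives on the device and cannot be dereferenced on the host.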

In short, flatten your data structures and don’t use pointers to pointers.

Thanks, but I didn’t get the point. If I cudaMalloc my device array with

float **d_array;

    cudaMalloc((void**)&d_array, sizeof(float*)*4);

there will be an error when compiling, won’t there? Because cudaMalloc expects a double void pointer (void**), and passing something like &d_array is effectively a triple pointer, since d_array was declared as float** d_array.

The idea is to have a 2D matrix of floats created dynamically on the HOST, and then to allocate memory on the DEVICE for that 2D array.

A pointer to anything (including a pointer to a pointer) can be safely cast to a void pointer, so the triple pointer in the function argument can safely be cast to a double void pointer.

If all you want is a matrix, don’t use any pointers to pointers. Just flatten it to a vector by writing array[y*stride + x] instead of array[y][x]. That saves both a lot of trouble and many unnecessary memory accesses for the indirection pointers.
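A sketch of the flattened approach for the 4×10 case (the kernel and its scale factor are illustrative, not from the original code):

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

#define ROWS 4
#define COLS 10

__global__ void scale(float *m, int stride, float f)
{
    int x = threadIdx.x, y = threadIdx.y;
    m[y * stride + x] *= f;   /* array[y][x] in flattened form */
}

int main(void)
{
    float *array, *d_array;
    int i;

    /* one contiguous host allocation instead of pointers to pointers */
    array = (float*)malloc(sizeof(float) * ROWS * COLS);
    for (i = 0; i < ROWS * COLS; i++)
        array[i] = (float)i;

    /* a single cudaMalloc and a single cudaMemcpy move the whole matrix */
    cudaMalloc((void**)&d_array, sizeof(float) * ROWS * COLS);
    cudaMemcpy(d_array, array, sizeof(float) * ROWS * COLS,
               cudaMemcpyHostToDevice);

    scale<<<1, dim3(COLS, ROWS)>>>(d_array, COLS, 2.0f);

    cudaMemcpy(array, d_array, sizeof(float) * ROWS * COLS,
               cudaMemcpyDeviceToHost);
    cudaFree(d_array);
    free(array);
    return 0;
}
```

One allocation, one copy in each direction, and no pointer chasing on the device.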