double pointer allocation

Hi, I’m new to CUDA development and I have a basic question.

In the original C implementation I have a double integer pointer (int **) used to store N lists of values, each with a different length. How can I write a CUDA version of these lines?

[codebox]

int **descs;

descs = (int**) malloc(N * sizeof(int*));

for (int i = 0; i < N; ++i) {
    descs[i] = (int*) malloc(sizeof(int) * nd[i]);
}

[/codebox]

where nd[i] is set somewhere.

I tried this:

[codebox]

int **descs;

cudaMalloc((void **) &descs, N * sizeof(int*));

for (int i = 0; i < N; ++i) {
    cudaMalloc((void**) &descs[i], sizeof(int) * nd[i]);
}

[/codebox]

but I get a segfault :-(

Is there a way to do this, or can I only allocate 1D arrays?

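What you’re using is called a jagged array. Jagged arrays do work in CUDA, but the pointer list has to be filled in on the host and then copied to the device before use. The device cannot access a pointer list that still resides in host memory (the kernel would crash!), and the host cannot write directly into a pointer list that resides in device memory. Your cudaMalloc calls in the loop currently try to store the new device pointers through &descs[i], which is a device address that host code cannot dereference, hence the segfault.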

Overall, jagged arrays are a pain to set up in CUDA, and the overhead of many tiny memory allocations (and potentially copy operations) is considerable. Consider CUDA arrays (cudaArray) or a single 1D representation of the 2D array in memory, with the indices computed manually in the kernel.
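For anyone weighing the 1D alternative, here is a minimal sketch of a flat layout for lists of different lengths. It is only one way to do it, not something taken from the posts above: the names build_offsets, sum_lists and sum_lists_on_device are made up for illustration, and it assumes N > 0 and non-negative lengths in nd. All lists are stored back to back in one array, and an offsets array of N + 1 entries records where each list starts, so the device needs two allocations in total instead of N + 1.

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

// Hypothetical helper: fills offsets[0..n] with the running sum of nd[0..n-1].
// List i then occupies flat[offsets[i]] .. flat[offsets[i + 1] - 1].
// Returns the total number of elements.
static int build_offsets(const int *nd, int n, int *offsets)
{
    offsets[0] = 0;
    for (int i = 0; i < n; ++i)
        offsets[i + 1] = offsets[i] + nd[i];
    return offsets[n];
}

// One thread per list: sums list i using manually computed indices.
__global__ void sum_lists(const int *flat, const int *offsets, int n, int *sums)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int s = 0;
    for (int k = offsets[i]; k < offsets[i + 1]; ++k)
        s += flat[k];
    sums[i] = s;
}

// Hypothetical host wrapper: uploads the flat data and the offsets, runs the
// kernel, and downloads one sum per list. Returns the first error encountered.
static cudaError_t sum_lists_on_device(const int *flat, const int *offsets,
                                       int n, int *sums)
{
    int *d_flat = NULL, *d_offsets = NULL, *d_sums = NULL;
    size_t flat_bytes = (size_t)offsets[n] * sizeof(int);
    size_t off_bytes = (size_t)(n + 1) * sizeof(int);
    size_t sum_bytes = (size_t)n * sizeof(int);

    cudaError_t err = cudaMalloc((void **)&d_flat, flat_bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_offsets, off_bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_sums, sum_bytes);
    if (err == cudaSuccess)
        err = cudaMemcpy(d_flat, flat, flat_bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(d_offsets, offsets, off_bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 128;
        sum_lists<<<(n + threads - 1) / threads, threads>>>(d_flat, d_offsets, n, d_sums);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(sums, d_sums, sum_bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_flat);     // cudaFree(NULL) is a no-op
    cudaFree(d_offsets);
    cudaFree(d_sums);
    return err;
}
```

The trade-off is that growing a single list means rebuilding the flat array, so this layout fits best when the data is allocated once and then only read.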

Christian

The overhead shouldn’t be a problem: this array of arrays contains constants, is allocated at the beginning of the execution and is then only read.

What I don’t understand is how to do the second step correctly: allocate each 1D array and store its address in one slot of the previous array.

I solved the problem. What I hadn’t understood was the need for a supporting double pointer on the host to hold the device addresses.

I succeeded using code similar to this:

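[codebox]int **descs;    // pointer table in device memory
int **support;  // host-side copy of the table, holding device addresses

cudaMalloc((void **) &descs, N * sizeof(int*));
support = (int**) malloc(sizeof(int*) * N);

for (int i = 0; i < N; ++i) {
    // as posted; the first argument needs to be &support[i] (see the correction further down)
    cudaMalloc((void**) support[i], sizeof(int) * nd[i]);
}

// copy the filled-in table of device addresses from the host to the device
cudaMemcpy(descs, support, N * sizeof(int*), cudaMemcpyHostToDevice);[/codebox]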

I gave this topic five stars. In my case this is the only way to write clean code: I have to update a master vector with new elements that are only known at runtime. The crude way is to destroy the master matrix and rebuild it with the newly added elements, which is very wasteful. The better way is to keep the double pointer and use it to build the master vector.

Just one bug, though everyone has probably noticed it: the cudaMalloc call in the loop needs the address of support[i]:

cudaMalloc((void**) &support[i], sizeof(int) * nd[i]);

Someone please build a wiki page for this case. Nice trick.
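Since a write-up was requested, here is a minimal sketch that puts the pieces of this thread together. It is an illustration under assumptions, not the original poster's actual code: the names upload_jagged, free_jagged, sum_jagged and sum_rows are hypothetical, src stands for host lists that already hold the data, and N > 0 with non-negative lengths is assumed. The key points are the ones discussed above: every list is allocated through a host-side table (support), and the finished table is copied to the device in one cudaMemcpy.

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

// One thread per list: sums list i by following the device-side pointer table.
__global__ void sum_rows(int *const *descs, const int *nd, int n, int *sums)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int s = 0;
    for (int k = 0; k < nd[i]; ++k)
        s += descs[i][k];
    sums[i] = s;
}

// Frees every list, the device pointer table, and the host-side table.
static void free_jagged(int **support, int n, int **d_descs)
{
    if (support)
        for (int i = 0; i < n; ++i)
            cudaFree(support[i]);   // cudaFree(NULL) is a no-op
    cudaFree(d_descs);
    free(support);
}

// Copies host lists src[i] (length nd[i]) into a device-side jagged array.
// On success *d_descs is the device pointer table, and the returned host
// table holds the same device addresses so they can be freed later.
static int **upload_jagged(int *const *src, const int *nd, int n, int ***d_descs)
{
    *d_descs = NULL;
    int **support = (int **)calloc(n, sizeof(int *));
    if (!support) return NULL;

    cudaError_t err = cudaMalloc((void **)d_descs, n * sizeof(int *));
    for (int i = 0; i < n && err == cudaSuccess; ++i) {
        // cudaMalloc writes the device address into host memory: &support[i]
        err = cudaMalloc((void **)&support[i], nd[i] * sizeof(int));
        if (err == cudaSuccess)
            err = cudaMemcpy(support[i], src[i], nd[i] * sizeof(int),
                             cudaMemcpyHostToDevice);
    }
    // Only now copy the filled-in pointer table from the host to the device.
    if (err == cudaSuccess)
        err = cudaMemcpy(*d_descs, support, n * sizeof(int *),
                         cudaMemcpyHostToDevice);
    if (err != cudaSuccess) {
        free_jagged(support, n, *d_descs);
        *d_descs = NULL;
        return NULL;
    }
    return support;
}

// Hypothetical end-to-end check: uploads the lists, sums each one on the
// device, and copies the sums back. Returns the first error encountered.
static cudaError_t sum_jagged(int *const *src, const int *nd, int n, int *sums)
{
    int **d_descs = NULL, *d_nd = NULL, *d_sums = NULL;
    int **support = upload_jagged(src, nd, n, &d_descs);
    if (!support) return cudaErrorUnknown;

    cudaError_t err = cudaMalloc((void **)&d_nd, n * sizeof(int));
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_sums, n * sizeof(int));
    if (err == cudaSuccess)
        err = cudaMemcpy(d_nd, nd, n * sizeof(int), cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 128;
        sum_rows<<<(n + threads - 1) / threads, threads>>>(d_descs, d_nd, n, d_sums);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(sums, d_sums, n * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_nd);
    cudaFree(d_sums);
    free_jagged(support, n, d_descs);
    return err;
}
```

Keeping the host-side table around after the upload is a deliberate choice in this sketch: it is the only place the host can read the individual device addresses from when it is time to call cudaFree on each list.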