Need some help with cudaMemcpy/cudaMemcpy2D

Hello,

I have been struggling for a while with this. I have a C++ array that is allocated dynamically as follows:

[codebox]
float ** A = new float *[equations];
for (unsigned k = 0; k < equations; ++k)
{
    A[k] = new float[parameters];
}
[/codebox]

Now, what I want to do is transfer this to the device. However, I have been unsuccessful in doing so:

I tried cudaMemcpy and cudaMemcpy2D. With cudaMemcpy2D, I tried the following:

[codebox]
float *d_A = 0;
cudaMalloc((void**)d_A, equations * parameters * size);
cudaMemcpy2D(d_A, equations * sizeof(float), A, equations * sizeof(float),
             equations, parameters, cudaMemcpyHostToDevice);
[/codebox]

However, when I examine the copied values, they are rubbish.

Does anyone know what I should do to achieve this?

Thanks,

Luca

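An array of pointers is not the same thing as a two-dimensional array. cudaMemcpy2D expects the source to be one contiguous block of rows, but A is an array of host pointers to separately allocated rows, so the call copies pointer values and unrelated memory rather than your floats. Since you allocate each 1D row separately on the host, you have to do the same on the device: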

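[codebox]
// Host-side array holding one device pointer per row.
// (d_A itself lives on the host; copy it to the device as well
// if a kernel needs to index it as a float**.)
float **d_A = new float *[equations];
for (unsigned k = 0; k < equations; ++k)
{
    // Allocate row k on the device, then copy the matching host row into it.
    // Each row holds `parameters` floats.
    cudaMalloc((void **)&d_A[k], parameters * sizeof(float));
    cudaMemcpy(d_A[k], A[k], parameters * sizeof(float), cudaMemcpyHostToDevice);
}
[/codebox]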

It might be worth turning this into one big allocation holding equations*parameters floats, so that you can allocate (and later copy) everything in one go instead of row by row.
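As a rough sketch of that single-allocation idea, assuming row-major storage: the helper below (`flatten` is a made-up name, not part of CUDA) packs the separately allocated rows into one contiguous host buffer, so that element (i, j) ends up at index i * parameters + j. It is host-only C++ and only illustrates the layout.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not a CUDA API): copies an array of separately
// allocated rows into one contiguous, row-major buffer.
std::vector<float> flatten(float *const *A, std::size_t equations, std::size_t parameters)
{
    std::vector<float> flat(equations * parameters);
    for (std::size_t i = 0; i < equations; ++i)
    {
        for (std::size_t j = 0; j < parameters; ++j)
        {
            // Element (i, j) of the jagged array lands at i * parameters + j.
            flat[i * parameters + j] = A[i][j];
        }
    }
    return flat;
}
```

With a buffer like this, a single cudaMalloc of flat.size() * sizeof(float) bytes followed by a single cudaMemcpy from flat.data() should be enough to move the whole matrix, and a kernel would use the same i * parameters + j indexing.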

And it's definitely worth adding error checking.

Thanks. I ended up refactoring my code to use a single linear array.