cudaMemcpy: How to copy a smaller chunk of data from source to destination

I am writing an MD code in CUDA C and I have run into a problem with cudaMemcpy.

CHECK(cudaMemcpy(d_graphene, d_latticePointEvolution, nBytes, cudaMemcpyDeviceToDevice));
CHECK(cudaMemcpy(d_latticeVel, &d_latticePointEvolution + nBytes, nBytes, cudaMemcpyDeviceToDevice)); //Shifting source address by nBytes to reach the Vel section
CHECK(cudaMemcpy(d_latticeAccln, &d_latticePointEvolution + 2 * nBytes, nBytes, cudaMemcpyDeviceToDevice));//Shifting src address by 2*nBytes to reach the Accleration section
cudaDeviceSynchronize();

In the code snippet above, “d_latticePointEvolution” points to one large structure which holds position, velocity and acceleration. The structure is declared as follows.

typedef struct{
    double x[LEN];
    double y[LEN];
    double z[LEN];
    double Vx[LEN];
    double Vy[LEN];
    double Vz[LEN];
    double Ax[LEN];
    double Ay[LEN];
    double Az[LEN];
} LatticePointEvolution;

I want to copy the data stored in Vx, Vy, Vz to another device variable, “d_latticeVel”. To achieve this I am shifting the address “&d_latticePointEvolution” by “nBytes”, which is the combined size of the three arrays x, y, z in “d_latticePointEvolution”.

My code compiles, but when I run it I get the error below.

Error: graphene_main.cu:159, code: 11, reason: invalid argument
Error: graphene_main.cu:160, code: 11, reason: invalid argument

Can anyone please help with the correct way to do this? Is it possible to copy a specific piece of data (an array in my case) from a device structure variable (which has a bunch of arrays defined in it)?

In C or C++ programming, if d_latticePointEvolution is a pointer, and we want to offset that pointer by nBytes, we do:

d_latticePointEvolution + nBytes

not

&d_latticePointEvolution + nBytes

LatticePointEvolution *d_latticePointEvolution;

I had initially done what you are suggesting.

CHECK(cudaMemcpy(d_latticeVel, d_latticePointEvolution + nBytes, nBytes, cudaMemcpyDeviceToDevice)); //Shifting src address by nBytes to reach the Vel section NEED TO VERIFY THIS
CHECK(cudaMemcpy(d_latticeAccln, d_latticePointEvolution + 2 * nBytes, nBytes, cudaMemcpyDeviceToDevice));//Shifting src address by 2*nBytes to reach the Accleration section
cudaDeviceSynchronize();

But I am getting the same error:

Error: graphene_main.cu:159, code: 11, reason: invalid argument
Error: graphene_main.cu:160, code: 11, reason: invalid argument
CHECK(cudaMemcpy(d_graphene, d_latticePointEvolution, nBytes, cudaMemcpyDeviceToDevice));
cudaMemcpy(d_latticeVel, (d_latticePointEvolution + nBytes), nBytes, cudaMemcpyDeviceToDevice); //Shifting src address by nBytes to reach the Vel section NEED TO VERIFY THIS
cudaMemcpy(d_latticeAccln, (d_latticePointEvolution + 2 * nBytes), nBytes, cudaMemcpyDeviceToDevice);//Shifting src address by 2*nBytes to reach the Accleration section
cudaDeviceSynchronize();

I removed the CHECK error handling and it’s not giving me the error when I run the program. I will do the necessary logic checks and reply back if this has indeed solved the issue.

Thanks txbob for your help.

It won’t solve the issue. Don’t remove the error-checking.

d_latticePointEvolution is a pointer to a structure.

In C or C++ when we offset a pointer like this:

d_latticePointEvolution + nBytes

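the compiler knows that d_latticePointEvolution is a pointer to a structure, and it assumes that the numerical offset value is in units of that structure, not bytes. The address therefore advances by nBytes * sizeof(LatticePointEvolution) bytes, not by nBytes bytes.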

If you want to offset a particular (non-byte-type) pointer in units of bytes, you must first cast that pointer to a byte-type, such as char:

((char *)d_latticePointEvolution) + nBytes
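
To make the difference concrete, here is a minimal host-side sketch in plain C. It is only an illustration, not the original code: LEN is set to a small value, and the names storage, bytes_skipped_struct_ptr and bytes_skipped_char_ptr are hypothetical helpers introduced for this example.

```c
#include <assert.h>
#include <stddef.h>

#define LEN 4 /* assumption: tiny array length, for illustration only */

typedef struct {
    double x[LEN];  double y[LEN];  double z[LEN];
    double Vx[LEN]; double Vy[LEN]; double Vz[LEN];
    double Ax[LEN]; double Ay[LEN]; double Az[LEN];
} LatticePointEvolution;

/* Hypothetical backing storage so the pointer arithmetic stays in bounds. */
static LatticePointEvolution storage[2];

/* Bytes skipped by "p + n" when p is a LatticePointEvolution pointer:
   n whole structures. */
size_t bytes_skipped_struct_ptr(size_t n)
{
    LatticePointEvolution *p = storage;
    assert(n <= 2); /* one past the end of storage[] is the furthest legal pointer */
    return (size_t)((char *)(p + n) - (char *)p);
}

/* Bytes skipped by "(char *)p + n": exactly n bytes. */
size_t bytes_skipped_char_ptr(size_t n)
{
    LatticePointEvolution *p = storage;
    assert(n <= sizeof storage);
    return (size_t)(((char *)p + n) - (char *)p);
}
```

With LEN = 4 and 8-byte doubles, bytes_skipped_struct_ptr(1) comes out as sizeof(LatticePointEvolution), i.e. 288, whereas bytes_skipped_char_ptr(1) is 1.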

OK, I am seeing something interesting now. Based on your comment I first did this:

CHECK(cudaMemcpy(d_latticeVel, (((LatticePointEvolution *)d_latticePointEvolution) + nBytes), nBytes, cudaMemcpyDeviceToDevice));

But this ended up giving the same error that I reported earlier. When I replaced “LatticePointEvolution” with “char”, it compiles and runs correctly.

CHECK(cudaMemcpy(d_latticeVel, (((char *)d_latticePointEvolution) + nBytes), nBytes, cudaMemcpyDeviceToDevice));

Does the above cast also change the data type of my arrays, which is originally “double”, to “char”? I don’t understand why the “char” cast works here; this is totally new to me. Can you give some explanation or a reference I can look at? Again, thanks for the help.

No, it doesn’t change any of your data, nor does it change any behavior when you use the original pointer:

d_latticePointEvolution

at any later time.

This is basically a C programming concept: pointer arithmetic, i.e. how pointers are manipulated in C. There’s nothing CUDA-specific about this, and if you tried to do something similar in ordinary host code (with memcpy, for example) you’d run into the same issues.

If you want references, just google “offset pointers in C” and read the first few hits.
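
As a rough host-only analogue of the copy discussed in this thread, the sketch below uses plain memcpy with the same char cast. The LatticeVel type and the copy_velocity function are hypothetical names made up for the illustration (the thread does not show the type of d_latticeVel), and LEN is set to a small value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LEN 4 /* assumption: tiny array length, for illustration only */

typedef struct {
    double x[LEN];  double y[LEN];  double z[LEN];
    double Vx[LEN]; double Vy[LEN]; double Vz[LEN];
    double Ax[LEN]; double Ay[LEN]; double Az[LEN];
} LatticePointEvolution;

/* Hypothetical destination type holding only the velocity arrays. */
typedef struct {
    double Vx[LEN]; double Vy[LEN]; double Vz[LEN];
} LatticeVel;

/* Copy the Vx/Vy/Vz block out of src by offsetting the source in bytes. */
void copy_velocity(LatticeVel *dst, const LatticePointEvolution *src)
{
    /* nBytes is the size of three arrays, as in the original question. */
    size_t nBytes = 3 * LEN * sizeof(double);

    assert(dst != NULL && src != NULL);
    /* Cast to a byte pointer first, so "+ nBytes" really means bytes. */
    memcpy(dst, (const char *)src + nBytes, nBytes);
}
```

The cast only affects how the offset is computed; the doubles that arrive in dst are bit-for-bit the ones stored in src.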

Why do all the groundwork yourself? Instead of first figuring out what nBytes is, and then how to add that to a pointer of the wrong type, just write

&(d_latticePointEvolution->Vx)
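
A small host-side C sketch of why this works: the address of a member already sits at the right byte offset, so no manual arithmetic is needed. The helper names below are hypothetical, LEN is a small illustrative value, and offsetof from <stddef.h> is used only as a cross-check. In the CUDA case the same expression would be passed as the source argument of cudaMemcpy; taking a member's address this way should only compute an address, not read device memory, but treat that as an assumption to verify in your own code.

```c
#include <assert.h>
#include <stddef.h>

#define LEN 4 /* assumption: tiny array length, for illustration only */

typedef struct {
    double x[LEN];  double y[LEN];  double z[LEN];
    double Vx[LEN]; double Vy[LEN]; double Vz[LEN];
    double Ax[LEN]; double Ay[LEN]; double Az[LEN];
} LatticePointEvolution;

/* Byte offset of the velocity block, obtained from the member address. */
size_t velocity_offset(const LatticePointEvolution *p)
{
    assert(p != NULL);
    return (size_t)((const char *)&(p->Vx) - (const char *)p);
}

/* Byte offset of the acceleration block, obtained the same way. */
size_t acceleration_offset(const LatticePointEvolution *p)
{
    assert(p != NULL);
    return (size_t)((const char *)&(p->Ax) - (const char *)p);
}
```

Both helpers agree with offsetof(LatticePointEvolution, Vx) and offsetof(LatticePointEvolution, Ax), and with the hand-computed 3 * LEN * sizeof(double) and 6 * LEN * sizeof(double).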