Copying struct with multiple arrays of structs with cudaMemcpy

Suppose a struct X with some primitives and an array of Y structs:

typedef struct {
int a;
Y** y;
} X;

An instance X1 of X is initialized on the host and then copied to an instance X2 of X in device memory through cudaMemcpy.

This works fine for all the primitives in X (such as int a), but cudaMemcpy seems to flatten any double pointer into a single pointer, thus causing out of bounds exceptions wherever there’s an access to the struct arrays in X (such as y).

In this case am I supposed to use another memcpy function, such as cudaMemcpy2D or cudaMemcpyArrayToArray?

Suggestions are much appreciated. Thanks!

Have you tried type-casting it manually as a double pointer?

Like, Y** tmp = (Y**) X2->y;


I don’t see how that could solve the problem. The data type isn’t the issue here - the issue is how to transfer a complex data structure (non-contiguous) from host memory to device memory.

Besides what I’m already doing (a regular cudaMemcpy from host_X to dev_X), I’ve also tried a for loop that cudaMemcpy’s each host_Y to its dev_Y, which results in a crash.
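For reference, a plain cudaMemcpy of X copies the pointer value stored in y, which is a host address, so the device ends up dereferencing host memory. One common way around this is a deep copy: copy each Y to the device, build the pointer array from the resulting device addresses, and patch y in a staging copy of X before copying X itself. The sketch below is only an illustration under assumptions: the definition of Y, the element count n, and the names copy_x_to_device, sum_y and sum_on_device are all hypothetical, since the thread shows none of them.

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

// Hypothetical element type; the thread never shows Y's definition.
typedef struct { float v; } Y;

typedef struct {
    int a;
    Y** y;   // array of n pointers, each pointing to one Y
} X;

// Walks the nested structure on the device, using device pointers only.
__global__ void sum_y(const X* x, int n, float* out)
{
    float s = (float)x->a;
    for (int i = 0; i < n; ++i) s += x->y[i]->v;
    *out = s;
}

// Deep-copies host_x (n entries in host_x->y) to the device.
// Returns a device pointer to X, or NULL on failure.
// Cleanup of partial allocations on failure is omitted for brevity.
X* copy_x_to_device(const X* host_x, int n)
{
    // Host-side staging array that will hold *device* addresses.
    Y** staged = (Y**)malloc(n * sizeof(Y*));
    if (!staged) return NULL;

    cudaError_t err = cudaSuccess;
    for (int i = 0; i < n && err == cudaSuccess; ++i) {
        err = cudaMalloc((void**)&staged[i], sizeof(Y));
        if (err == cudaSuccess)
            err = cudaMemcpy(staged[i], host_x->y[i], sizeof(Y),
                             cudaMemcpyHostToDevice);
    }

    // Copy the array of device addresses into device memory.
    Y** dev_y = NULL;
    if (err == cudaSuccess)
        err = cudaMalloc((void**)&dev_y, n * sizeof(Y*));
    if (err == cudaSuccess)
        err = cudaMemcpy(dev_y, staged, n * sizeof(Y*),
                         cudaMemcpyHostToDevice);
    free(staged);

    // Patch a host copy of X so its y member holds the device address.
    X patched = *host_x;
    patched.y = dev_y;
    X* dev_x = NULL;
    if (err == cudaSuccess)
        err = cudaMalloc((void**)&dev_x, sizeof(X));
    if (err == cudaSuccess)
        err = cudaMemcpy(dev_x, &patched, sizeof(X), cudaMemcpyHostToDevice);

    return err == cudaSuccess ? dev_x : NULL;
}

// Returns a + sum of all y[i]->v as computed on the device, or -1 on failure.
// The device allocations made by copy_x_to_device are not freed here.
float sum_on_device(const X* host_x, int n)
{
    X* dev_x = copy_x_to_device(host_x, n);
    float* dev_out = NULL;
    float result = -1.0f;
    if (dev_x && cudaMalloc((void**)&dev_out, sizeof(float)) == cudaSuccess) {
        sum_y<<<1, 1>>>(dev_x, n, dev_out);
        if (cudaMemcpy(&result, dev_out, sizeof(float),
                       cudaMemcpyDeviceToHost) != cudaSuccess)
            result = -1.0f;
        cudaFree(dev_out);
    }
    return result;
}
```

A loop that passes host_X->y[i] or dev_X->y[i] directly as the destination of cudaMemcpy would likely crash, because the first is a host address and the second cannot be read from host code. Staging the device addresses in a host array, as above, avoids both.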

Well, can you actually manipulate your structure like a 1D array? If it works as a 1D array, then why even change it? Just change the way you iterate over it.
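A rough sketch of that 1D idea, in case it helps: if the Y elements can live in one contiguous block, a single cudaMemcpy moves all of them and no per-element pointer patching is needed. The names XFlat, scale_y and scale_on_device, the count field n, and the definition of Y are assumptions made for illustration; none of them come from the original struct.

```cuda
#include <cuda_runtime.h>

// Hypothetical element type; the thread never shows Y's definition.
typedef struct { float v; } Y;

// Flattened variant of X: one contiguous block instead of Y**.
typedef struct {
    int a;
    int n;   // number of elements in y
    Y*  y;   // device pointer to n contiguous Y elements
} XFlat;

// One thread per element; the struct itself is passed by value.
__global__ void scale_y(XFlat x, float k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < x.n) x.y[i].v *= k;
}

// Copies host_y to the device, scales every element by k, and copies
// the result back into host_y. Returns 0 on success, 1 on failure.
int scale_on_device(Y* host_y, int n, float k)
{
    XFlat x;
    x.a = 0;
    x.n = n;
    x.y = NULL;

    size_t bytes = (size_t)n * sizeof(Y);
    if (cudaMalloc((void**)&x.y, bytes) != cudaSuccess) return 1;

    // A single transfer covers the whole array.
    cudaError_t err = cudaMemcpy(x.y, host_y, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        scale_y<<<(n + 255) / 256, 256>>>(x, k);
        err = cudaGetLastError();   // catches launch errors
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(host_y, x.y, bytes, cudaMemcpyDeviceToHost);

    cudaFree(x.y);
    return err == cudaSuccess ? 0 : 1;
}
```

If the original data is logically two-dimensional, the same block can be indexed as y[row * cols + col], which is what "change the way you iterate over it" amounts to.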