Hello,
I am a beginner in CUDA. I am trying to copy a complex struct that contains pointers and arrays to the device, but I get an illegal memory access when the kernel touches the arrays. I am using cudaMalloc() and cudaMemcpy().
How can I correctly copy the struct to the device without errors?
Copying a struct of arrays to device memory is no different from copying independent arrays. You need to call cudaMalloc for each array that a struct member points to. Could you provide the code you’re trying to implement?
That looks correct to me. I suggest starting with a smaller example that just performs cudaMemcpy to confirm you’re doing everything correctly. You might find one at the NVIDIA Developer blog https://devblogs.nvidia.com/.
When you copy a struct that contains pointers to the GPU, you have to do a deep copy. Each pointer member has to be given a new value that points to newly allocated device memory holding a copy of the data. If a pointer to host memory is left in the struct, the kernel will fail with an illegal memory access as soon as it dereferences that pointer.
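A minimal sketch of such a deep copy may make the order of the steps clearer. The struct Particles, the kernel scale and the function deep_copy_demo are hypothetical names chosen for illustration, not the OP's code, and error handling is reduced to the bare minimum.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical struct for illustration: one scalar and one pointer member.
struct Particles {
    int    n;
    float *x;  // must hold a device address in the copy the kernel sees
};

__global__ void scale(Particles *p, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < p->n) p->x[i] *= factor;
}

// Returns true if the data survives the round trip with the expected values.
bool deep_copy_demo() {
    const int n = 256;
    std::vector<float> host_x(n, 1.0f);

    // 1. Allocate device storage for the array behind the pointer and copy the data.
    float *d_x = nullptr;
    if (cudaMalloc(&d_x, n * sizeof(float)) != cudaSuccess) return false;
    cudaMemcpy(d_x, host_x.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // 2. Fill a host-side staging struct whose pointer member is the device address.
    Particles staging;
    staging.n = n;
    staging.x = d_x;

    // 3. Allocate the struct itself on the device and copy the staging struct over.
    Particles *d_p = nullptr;
    if (cudaMalloc(&d_p, sizeof(Particles)) != cudaSuccess) {
        cudaFree(d_x);
        return false;
    }
    cudaMemcpy(d_p, &staging, sizeof(Particles), cudaMemcpyHostToDevice);

    scale<<<(n + 127) / 128, 128>>>(d_p, 2.0f);
    cudaError_t err = cudaDeviceSynchronize();

    // Copy the array back through the device pointer kept on the host.
    cudaMemcpy(host_x.data(), d_x, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_p);

    if (err != cudaSuccess) return false;
    for (float v : host_x)
        if (v != 2.0f) return false;
    return true;
}

int main() {
    std::printf("deep copy %s\n", deep_copy_demo() ? "ok" : "failed");
    return 0;
}
```

The important detail is that the host never dereferences staging.x, and the device never sees a host address. With several pointer members, steps 1 and 2 are repeated for each of them before the struct is copied.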
Alternatives to manual deep copies:
use index-oriented or value-oriented data structures instead of pointers
use managed memory (see mnicely’s link) or zero-copy host memory. Both can be slower than explicitly allocated global device memory.
use C++ data structures with defined ownership of member variables. Then you can either use member variable types that manage their own memory, or types that have a copy constructor and copy assignment operator that handle the deep copies.
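As a rough sketch of the managed-memory alternative from the list above: both the struct and the array it points to are allocated with cudaMallocManaged, so the same pointer values are valid on the host and on the device and no manual deep copy is needed. The names Particles, scale and managed_demo are again hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical struct for illustration.
struct Particles {
    int    n;
    float *x;
};

__global__ void scale(Particles *p, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < p->n) p->x[i] *= factor;
}

// Returns true if the kernel result is visible on the host as expected.
bool managed_demo() {
    const int n = 256;

    // The struct lives in managed memory, so host and device can both read it.
    Particles *p = nullptr;
    if (cudaMallocManaged(&p, sizeof(Particles)) != cudaSuccess) return false;
    p->n = n;
    p->x = nullptr;

    // The array behind the pointer is managed as well.
    if (cudaMallocManaged(&p->x, n * sizeof(float)) != cudaSuccess) {
        cudaFree(p);
        return false;
    }
    for (int i = 0; i < n; ++i) p->x[i] = 1.0f;

    scale<<<(n + 127) / 128, 128>>>(p, 2.0f);

    // Synchronize before the host touches managed memory again.
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);
    if (ok) {
        for (int i = 0; i < n; ++i)
            if (p->x[i] != 2.0f) ok = false;
    }

    cudaFree(p->x);
    cudaFree(p);
    return ok;
}

int main() {
    std::printf("managed memory %s\n", managed_demo() ? "ok" : "failed");
    return 0;
}
```

This trades the explicit copies for on-demand page migration, which is where the possible performance cost mentioned in the list comes from.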
All in all, the GPU is not well suited for data structures with pointers. A single index (as in the example of the OP) typically works well, but pointers to pointers are often a sign that the algorithm should be reformulated.
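To illustrate the index-based layout, here is a small sketch in which variable-length rows are stored in one flat array and described by offset and count values instead of per-row pointers. The struct Segment, the kernel sum_segments and the sample data are assumptions made up for this example. Because the struct contains only plain values, an ordinary cudaMemcpy is sufficient.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical descriptor: plain values only, so a shallow copy is a full copy.
struct Segment {
    int offset;  // index of the first element in the flat data array
    int count;   // number of elements in this segment
};

// One thread sums one segment of the flat array.
__global__ void sum_segments(const float *data, const Segment *segs,
                             int num_segs, float *out) {
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s < num_segs) {
        float acc = 0.0f;
        for (int k = 0; k < segs[s].count; ++k)
            acc += data[segs[s].offset + k];
        out[s] = acc;
    }
}

// Returns true if the per-segment sums match the values computed by hand.
bool index_layout_demo() {
    const int num_vals = 6, num_segs = 3;
    const float   h_data[num_vals] = {1, 2, 3, 4, 5, 6};
    const Segment h_segs[num_segs] = {{0, 2}, {2, 1}, {3, 3}};
    float h_out[num_segs] = {0, 0, 0};

    float *d_data = nullptr, *d_out = nullptr;
    Segment *d_segs = nullptr;
    if (cudaMalloc(&d_data, sizeof(h_data)) != cudaSuccess) return false;
    if (cudaMalloc(&d_segs, sizeof(h_segs)) != cudaSuccess) {
        cudaFree(d_data);
        return false;
    }
    if (cudaMalloc(&d_out, sizeof(h_out)) != cudaSuccess) {
        cudaFree(d_data);
        cudaFree(d_segs);
        return false;
    }

    // Flat copies only: no pointer inside the copied data needs fixing up.
    cudaMemcpy(d_data, h_data, sizeof(h_data), cudaMemcpyHostToDevice);
    cudaMemcpy(d_segs, h_segs, sizeof(h_segs), cudaMemcpyHostToDevice);

    sum_segments<<<1, 32>>>(d_data, d_segs, num_segs, d_out);
    cudaError_t err = cudaDeviceSynchronize();
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    cudaFree(d_data);
    cudaFree(d_segs);
    cudaFree(d_out);

    // Segments are {1,2}, {3}, {4,5,6}.
    return err == cudaSuccess &&
           h_out[0] == 3.0f && h_out[1] == 3.0f && h_out[2] == 15.0f;
}

int main() {
    std::printf("index layout %s\n", index_layout_demo() ? "ok" : "failed");
    return 0;
}
```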