Efficiently dereferencing an array of pointers

Hi all,

I’m rewriting part of a program in CUDA and I’m not sure how best to approach this problem:

Suppose I have an array which contains pointers to structs. I want to allocate an array on the GPU which contains the structs, not pointers, and then use cudaMemcpy to fill the array with the structs given by the pointers in the first array.

I could just loop through the elements and allocate one by one, but I fear that this might take longer than necessary, particularly if cudaMemcpy takes any amount of time to start up or something (I may have a LOT of potato chips).


You should definitely do only one large memory allocation on the GPU for the whole array of structs. For highest speed, you can also first copy the structs into one contiguous array of structs on the CPU (preferably in page-locked memory for faster transfer), so that a single cudaMemcpy moves everything at once.
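
Here is a minimal sketch of that approach, not a definitive implementation. The struct `Item` and the function `uploadGathered` are hypothetical names made up for illustration, since the original types are not shown. It assumes the struct is plain data with no pointers into host memory (those would not be valid on the device) and that there is at least one element.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical plain-data struct standing in for the real type.
struct Item {
    float value;
    int   id;
};

// Gathers n structs scattered behind host pointers into one page-locked
// staging buffer, then uploads them with one cudaMalloc and one cudaMemcpy.
// On success *d_out points to the device array; release it with cudaFree.
// Assumes n > 0.
cudaError_t uploadGathered(Item* const* ptrs, size_t n, Item** d_out)
{
    *d_out = nullptr;
    Item* h_staging = nullptr;
    Item* d_items = nullptr;
    const size_t bytes = n * sizeof(Item);

    // Page-locked (pinned) host memory for the contiguous copy.
    cudaError_t err = cudaMallocHost((void**)&h_staging, bytes);
    if (err != cudaSuccess) return err;

    // Dereference each pointer once on the CPU.
    for (size_t i = 0; i < n; ++i)
        h_staging[i] = *ptrs[i];

    // One device allocation and one transfer for the whole array.
    err = cudaMalloc((void**)&d_items, bytes);
    if (err == cudaSuccess) {
        err = cudaMemcpy(d_items, h_staging, bytes, cudaMemcpyHostToDevice);
        if (err != cudaSuccess) {
            cudaFree(d_items);
            d_items = nullptr;
        }
    }

    cudaFreeHost(h_staging);
    *d_out = d_items;
    return err;
}
```

The gather loop costs one extra pass over the data on the CPU, but it replaces many small transfers with a single large one, which is usually the better trade when there are a lot of elements.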