Unspecified host to device memory copy between kernel calls


My program calls the same device kernel around 9,000 times, and a host-to-device memory copy occurs automatically after each kernel finishes, even though I didn’t specify those copies. For reference, I’m not using unified memory.

This issue doesn’t happen if I declare the device variables in a module alongside the device kernel rather than in my host code. I’d like to keep the device variables declared in the host code because that version seems to use fewer registers.

I’d appreciate any insight regarding this problem. Thank you!


Hi Youn,

Assuming you’re using assumed-shape arrays as kernel arguments, it’s the array descriptors that are being copied. Besides putting the arrays in a module, another workaround is to pass them in as assumed-size arrays.
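
To illustrate the difference, here is a minimal CUDA Fortran sketch. The module, kernel, and variable names (kernels, scale_shape, scale_size, a_d, n) are made up for the example and are not from your code, and I’m assuming a simple 1-D real array. The first kernel takes an assumed-shape dummy, which needs a descriptor; the second takes an assumed-size dummy plus an explicit extent passed by value, so only the base address and scalars should need to go to the device at launch.

```fortran
module kernels
  use cudafor
  implicit none
contains
  ! Assumed-shape dummy: the array comes with a descriptor
  ! (bounds, strides) that has to be made available on the device.
  attributes(global) subroutine scale_shape(a, s)
    real, device :: a(:)
    real, value :: s
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= size(a)) a(i) = s * a(i)
  end subroutine scale_shape

  ! Assumed-size dummy: no descriptor, so the extent n is
  ! passed explicitly by value (hypothetical argument name).
  attributes(global) subroutine scale_size(a, n, s)
    real, device :: a(*)
    integer, value :: n
    real, value :: s
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    if (i <= n) a(i) = s * a(i)
  end subroutine scale_size
end module kernels

program main
  use cudafor
  use kernels
  implicit none
  integer, parameter :: n = 1024
  real :: a(n)
  real, device, allocatable :: a_d(:)   ! device array declared in host code

  a = 1.0
  allocate(a_d(n))
  a_d = a                               ! explicit host-to-device copy

  ! Launch the assumed-size version; every element is doubled.
  call scale_size<<<(n + 255) / 256, 256>>>(a_d, n, 2.0)

  a = a_d                               ! explicit device-to-host copy
  print *, a(1), a(n)
  deallocate(a_d)
end program main
```

Swapping the launch to scale_shape<<<...>>>(a_d, 2.0) and comparing the two runs in a profiler should show whether the extra copies per launch come from the descriptor in your case.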


Hi Mat,

Thank you very much! I changed the assumed-shape arrays to fixed-size arrays and the problem went away. There are no more host-to-device copies after each kernel call.