I’m trying to transfer data from device to host using cudaMemcpy.
The data is only 12 KB, but the copy takes about 50 ms.
Is this normal?
Transferring from host to device is much faster, though (4 KB in 230 µs).
Any idea what could cause this?
EDIT: Actually, no matter the size, it’s always 50 ms… that’s even weirder :|
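
A constant time regardless of size suggests the timer may be capturing something other than the copy. Kernel launches are asynchronous, and a cudaMemcpy issued afterwards waits for the preceding kernel to finish, so the 50 ms could be kernel run time rather than transfer time. That is only a guess about this case. Below is a minimal sketch of how to check it: synchronize first, then time the copy alone with CUDA events. The kernel `busyKernel`, its iteration count, and the buffer names are hypothetical stand-ins, not the actual code in question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for whatever kernel runs before the copy.
__global__ void busyKernel(float *data, int n, int iters)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k)
            v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

int main()
{
    const int n = 3 * 1024;                  // 3072 floats = 12 KB
    const size_t bytes = n * sizeof(float);

    float *h = (float *)malloc(bytes);
    float *d = nullptr;
    if (h == nullptr || cudaMalloc(&d, bytes) != cudaSuccess) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }
    cudaMemset(d, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // The launch returns to the host immediately; the kernel keeps running.
    busyKernel<<<(n + 255) / 256, 256>>>(d, n, 100000);

    // Wait for the kernel here, so the timed region below covers only the copy.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaEventRecord(start);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("device-to-host copy of %zu bytes: %.3f ms\n", bytes, ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    free(h);
    return 0;
}
```

If the copy time drops sharply once the synchronize is in place, the 50 ms belonged to the kernel. If it stays at 50 ms, the cause is elsewhere and this sketch does not explain it.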