I compiled my CUDA code in EmuRelease mode (device emulation).
I call cudaMemcpy to copy the device data to a host array.
That works fine, and the host data is correct.
But when I run the code in Release mode (on the GPU),
cudaMemcpy seems to have no effect:
my host array is not updated with the device data,
and all of the host values are zero.
What is wrong with my CUDA code?
Any help would be appreciated. Thanks!
Either you are requesting too many resources (e.g. too many threads per block),
or you are hitting the 5-second kernel timeout.
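To see which of these two cases could apply, the device limits can be queried at run time. The following is only a minimal sketch: it assumes device 0 is the GPU in use, and `blockSizeFits` is a hypothetical helper name introduced here for illustration, not something from the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if the requested block size is positive and
// does not exceed the device's per-block thread limit.
bool blockSizeFits(int threadsPerBlock, const cudaDeviceProp &prop)
{
    return threadsPerBlock > 0 && threadsPerBlock <= prop.maxThreadsPerBlock;
}

int main()
{
    cudaDeviceProp prop;
    // Assumption: device 0 is the GPU the program runs on.
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    // Per-block resource limits that a launch must respect.
    printf("max threads per block:   %d\n", prop.maxThreadsPerBlock);
    printf("registers per block:     %d\n", prop.regsPerBlock);
    printf("shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);

    // Non-zero when a run-time limit applies to kernels on this device
    // (typically because it also drives a display).
    printf("kernel run-time limit:   %s\n",
           prop.kernelExecTimeoutEnabled ? "yes" : "no");

    // Example check for a block size of 512 threads (arbitrary value).
    printf("512 threads per block fits: %s\n",
           blockSizeFits(512, prop) ? "yes" : "no");
    return 0;
}
```

Note that the thread count is only one of the limits; a kernel that uses many registers or a lot of shared memory can fail to launch even when the block size is below `maxThreadsPerBlock`.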
You should use CUT_CHECK_ERROR (and compile in debug mode) to check for errors. Posting your kernel code and the code where you call the kernel would also help, because with the information you have provided it is not easy to help you.
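If you would rather not depend on the cutil macros, the same kind of check can be done with plain runtime API calls. This is a rough sketch under assumptions: `fillKernel`, `checkCuda` and `runChecked` are hypothetical names standing in for your own code, and `cudaDeviceSynchronize` is the call in current toolkits (older toolkits used `cudaThreadSynchronize` for the same purpose).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel standing in for the real one: writes i into out[i].
__global__ void fillKernel(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

// Prints a message if err is not cudaSuccess; returns true when all is well.
static bool checkCuda(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Launches the kernel with the given block size, copies the result back,
// and returns the first error encountered (cudaSuccess if none).
cudaError_t runChecked(int threadsPerBlock, int *host, int n)
{
    int *dev = NULL;
    cudaError_t err = cudaMalloc((void **)&dev, n * sizeof(int));
    if (!checkCuda(err, "cudaMalloc")) return err;

    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    fillKernel<<<blocks, threadsPerBlock>>>(dev, n);

    // Launch errors (e.g. too many threads per block) show up here.
    err = cudaGetLastError();
    if (checkCuda(err, "kernel launch")) {
        // Execution errors (e.g. a timeout) show up after synchronizing.
        err = cudaDeviceSynchronize();
        checkCuda(err, "kernel execution");
    }
    if (err == cudaSuccess) {
        err = cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
        checkCuda(err, "cudaMemcpy");
    }
    cudaFree(dev);
    return err;
}

int main()
{
    const int n = 1024;
    static int host[n];

    // A modest block size that should launch on any CUDA device.
    if (runChecked(256, host, n) == cudaSuccess)
        printf("host[10] = %d\n", host[10]);

    // A deliberately oversized block: the launch is rejected, the error is
    // reported, and the host array is left untouched.
    runChecked(1 << 20, host, n);
    return 0;
}
```

The second call illustrates the symptom described in the question: when the launch itself fails, no kernel runs, so the host array never receives new data unless the error is checked explicitly.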