I have a very large Test.cu, written in C, and it only works when I put a printf("\n"); call before the <<<…>>> kernel launch.
But when I run it in Debug, the results are always correct.
I have to admit, my CUDA version is still 2.2.1. My feeling is that something is wrong with the memory, although the constant and shared memory look correct.