I am having problems with the SDK example simplePrintf.
I have a compute capability 1.3 GPU (GeForce GTX 285).
If I put a “for loop” around the kernel invocation in simplePrintf.cu and compile with “-arch=sm_13”, I only get output on stdout for the 1st and 128th invocations of the kernel.
When I print the value of the variable “magic” in cudaPrintfDisplay() in cuPrintf.cu, it is 51217, which is the value of CUPRINTF_SM11_MAGIC, after the 1st and 128th invocation of the kernel, but 0 otherwise.
Note: if I don’t compile with -arch=sm_13, magic==51216 always and everything prints, but I need “-arch=sm_13” for double precision support.
Here’s the for loop code:
int ii;
for( ii=0; ii<256; ii++ )
{
    testKernel<<<dimGrid, dimBlock>>>(10);
    cutilDeviceSynchronize();
    std::cout << "ii=" << ii << std::endl;
    cudaPrintfDisplay( stdout, true );
}
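To make the three observed cases easier to tell apart in the log, the value of magic could be labelled on each iteration. This is only a minimal sketch: describeMagic and the two constant names are hypothetical helpers, not part of cuPrintf.cu, and the label for 51216 assumes that value is the SM10 counterpart of CUPRINTF_SM11_MAGIC.

```cpp
#include <cassert>
#include <string>

// Values as reported in this post. The names are hypothetical stand-ins,
// not identifiers taken from cuPrintf.cu.
constexpr unsigned short kMagicSm10 = 51216; // assumed: SM10 variant (seen without -arch=sm_13)
constexpr unsigned short kMagicSm11 = 51217; // reported as CUPRINTF_SM11_MAGIC

// Map a header magic value to a short label for logging.
std::string describeMagic(unsigned short magic)
{
    if (magic == kMagicSm10) return "SM10 header";
    if (magic == kMagicSm11) return "SM11 header";
    if (magic == 0)          return "empty (zero) header";
    return "unknown header";
}
```

Printing describeMagic(magic) next to ii inside the loop would show at a glance which invocations come back with a zero header.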
I must be missing something really obvious.
Thanks.
Chris