Unspecified driver error using device printf on Fermi?

I’m trying to get device printf working on Fermi, without much success. I’m compiling a variation of a previous example found in this forum (a printf with no arguments fails as well):

#include <stdio.h>

__global__ void helloCUDA(float f)
{
	printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}

int main()
{
	helloCUDA<<<1, 5>>>(1.2345f);

	cudaError_t error = cudaGetLastError();
	if (error != cudaSuccess)	printf("FAILURE: %s\n", cudaGetErrorString(error));

	return 0;
}

Compiling like this:

nvcc-3.2 main.cu -o main -arch=sm_21

(running 260.19 driver and x86-64 Linux)

I’ve tried this with a Tesla C2050 and a Quadro 4000 but still get an unspecified driver error. Anyone have any ideas?

It appears the issue is compiling for 64bit, as passing the -m32 flag to nvcc seems to work as expected.

Is device printf restricted to 32-bit Linux only? I can’t find anything in the documentation that indicates this.

I’ll try this out with the newer 270 driver and Cuda 4.0 rc2…

Tesla C2050 is sm_20, not sm_21. What happens if you compile for sm_20? I use device-side printf() on a C2050 under 64-bit Linux all the time, so this should definitely work.
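
If there is any doubt about which -arch value matches a card, the runtime can report it. This is only a minimal sketch using cudaGetDeviceCount and cudaGetDeviceProperties; the output format is my own choice, and nothing here is specific to the C2050. Device printf needs compute capability 2.0 or higher.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

int main()
{
	int count = 0;
	cudaError_t error = cudaGetDeviceCount(&count);
	if (error != cudaSuccess) {
		fprintf(stderr, "FAILURE: %s\n", cudaGetErrorString(error));
		return 1;
	}

	for (int dev = 0; dev < count; ++dev) {
		cudaDeviceProp prop;
		error = cudaGetDeviceProperties(&prop, dev);
		if (error != cudaSuccess) {
			fprintf(stderr, "FAILURE (device %d): %s\n", dev, cudaGetErrorString(error));
			return 1;
		}
		// major.minor is the compute capability, e.g. 2.0 corresponds to sm_20.
		printf("Device %d: %s, compute capability %d.%d (sm_%d%d)\n",
		       dev, prop.name, prop.major, prop.minor, prop.major, prop.minor);
	}

	return 0;
}
```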

Just tried that to no avail. The PTX comes out with an sm_20 header regardless, so I guessed it was checking compatibility at compile time anyway.

The latest 270.40 driver with Cuda 4.0rc2 works fine for 32bit and 64bit however. It’s just 260.19 and Cuda 3.2 that have issues with 64bit.
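
Since the behaviour depends on the driver/toolkit combination, it may help to confirm what the program actually sees at run time. The sketch below is an illustration only: versionMajor and versionMinor are hypothetical helpers I made up, relying on the documented encoding 1000 * major + 10 * minor. Note that cudaDriverGetVersion reports the latest CUDA version the installed driver supports, not the driver package number (260.19, 270.40).

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical helpers: decode the 1000 * major + 10 * minor encoding.
static int versionMajor(int v) { return v / 1000; }
static int versionMinor(int v) { return (v % 1000) / 10; }

int main()
{
	int driverVersion = 0;
	int runtimeVersion = 0;

	cudaError_t error = cudaDriverGetVersion(&driverVersion);
	if (error != cudaSuccess) {
		fprintf(stderr, "FAILURE: %s\n", cudaGetErrorString(error));
		return 1;
	}

	error = cudaRuntimeGetVersion(&runtimeVersion);
	if (error != cudaSuccess) {
		fprintf(stderr, "FAILURE: %s\n", cudaGetErrorString(error));
		return 1;
	}

	// CUDA version supported by the driver vs. the runtime the binary was built against.
	printf("Driver supports CUDA %d.%d, runtime is CUDA %d.%d\n",
	       versionMajor(driverVersion), versionMinor(driverVersion),
	       versionMajor(runtimeVersion), versionMinor(runtimeVersion));

	return 0;
}
```

If the runtime version is newer than what the driver supports, failures at launch would not be surprising, although that does not appear to be the situation described here.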

I’ve tried rebooting on the same driver as well, but just can’t make it work with that combination. I’m happy to move to the newer driver and use CUDA 4.0 as my debugging version, but I’d be interested to know whether anyone else has encountered this issue.

I’ve been getting the same error off and on: sometimes when I use printf I get “unknown error”. I just solved this issue in one project by compiling with the GPU debug flag -G0 on… Why should this be necessary?
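
One way to narrow down an intermittent “unknown error” is to check the launch and the execution separately. The following is a sketch only, not a fix: check is a hypothetical helper, and the kernel is the helloCUDA example from the first post. cudaGetLastError right after the launch only catches launch problems; errors raised while the kernel runs show up at the next synchronization, which is also when the device printf buffer gets flushed.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

__global__ void helloCUDA(float f)
{
	printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}

// Hypothetical helper: report a failure together with the stage it came from.
static int check(cudaError_t error, const char *stage)
{
	if (error != cudaSuccess) {
		fprintf(stderr, "FAILURE (%s): %s\n", stage, cudaGetErrorString(error));
		return 1;
	}
	return 0;
}

int main()
{
	helloCUDA<<<1, 5>>>(1.2345f);

	// Errors in the launch itself (bad configuration, no kernel image for this device).
	if (check(cudaGetLastError(), "launch")) return 1;

	// Errors raised while the kernel runs; this also flushes the device printf buffer.
	// On CUDA 3.2 the equivalent call is cudaThreadSynchronize().
	if (check(cudaDeviceSynchronize(), "execution")) return 1;

	return 0;
}
```

Knowing which of the two stages reports the error should at least say whether the kernel was rejected up front or failed while running.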