printf statements from CUDA's __global__ and __device__ functions

I noticed that when I compile a CUDA file with nvcc in deviceemu mode and link it with a Fortran compiler (either ifort or gfortran), the printf statements in __global__ or __device__ functions produce no output.

The same statements work when the whole code is in C and I use only nvcc to build it. Is this the expected behavior? If so, please suggest how I could print from these functions. I also posted this on the CUDA on Linux forum but did not receive any response, so I am posting here.

The system I am using is as follows:

OS: CentOS 5.3
Processor: Intel Core 2 X9650
Main memory: 4 GB
CUDA: version 2.2
gfortran: version 4.1.2
ifort: version 11

Thanks in advance,

Of course they don’t work when you’re not compiling in emulation mode. How is the GPU going to execute a printf? You should surround the print statements in device code with #ifdef __DEVICE_EMULATION__ guards, so they are compiled only when nvcc is run with -deviceemu.
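A minimal sketch of such a guard, for illustration only. The kernel name scale, the array size, and the launch configuration are made up for this example. It assumes the CUDA 2.x toolchain discussed in this thread, where nvcc -deviceemu defines __DEVICE_EMULATION__ and runs device code on the host CPU.

```cuda
#include <stdio.h>

// Hypothetical kernel: each thread doubles one element of the array.
__global__ void scale(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 2.0f;
#ifdef __DEVICE_EMULATION__
        // Compiled only in emulation mode, where the "device" thread is a
        // host thread and the ordinary C library printf is available.
        printf("thread %d: data = %f\n", i, data[i]);
#endif
    }
}

int main(void)
{
    const int n = 8;
    float host[n];
    for (int i = 0; i < n; ++i)
        host[i] = (float)i;

    float *dev = 0;
    cudaMalloc((void **)&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    scale<<<1, n>>>(dev, n);   // one block of n threads

    // The blocking copy back also waits for the kernel to finish.
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    printf("host[3] = %f\n", host[3]);
    return 0;
}
```

Built without -deviceemu, the guarded printf disappears from the kernel and only the host-side line is printed.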

Thanks YDD,

That is not the problem: nvcc is invoked in exactly the same way in both cases, since the file containing the CUDA functions is the same whether I link with the C or the Fortran compiler. I think I know the reason now. When linking with the Fortran compiler, I was linking against the cufft library and not cudart. Somehow cufft alone also resolves all the symbols and produces an executable, but printf does not work.

I noticed this in the makefile while trying to put together a very simple code for demonstration. Once I linked against the cudart library, printf worked with the Fortran compilers as well.

Maybe someone from NVIDIA can verify this.
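For anyone trying to reproduce this, here is a rough sketch of the CUDA side of such a setup. It is not my actual code: the names hello_kernel and run_hello_, the file names, and the build lines in the comments are assumptions for illustration. The trailing underscore follows the default gfortran naming for external procedures, and the argument is a pointer because Fortran passes by reference.

```cuda
#include <stdio.h>

// Hypothetical kernel that prints only when built with nvcc -deviceemu.
__global__ void hello_kernel(int tag)
{
#ifdef __DEVICE_EMULATION__
    printf("kernel: tag = %d, thread = %d\n", tag, (int)threadIdx.x);
#endif
}

// Hypothetical wrapper, called from Fortran as:  call run_hello(tag)
// extern "C" turns off C++ name mangling so the Fortran linker can find it.
extern "C" void run_hello_(int *tag)
{
    hello_kernel<<<1, 4>>>(*tag);
    cudaThreadSynchronize();   // CUDA 2.x call: wait for the kernel to finish
}

// Assumed build steps (paths and file names are placeholders):
//   nvcc -deviceemu -c hello.cu -o hello.o
//   gfortran main.f90 hello.o -L<cuda lib dir> -lcudart -lstdc++ -o hello
// The point from this thread is the explicit -lcudart on the link line,
// rather than relying on -lcufft alone to resolve the runtime symbols.
```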

Thanks again,



I’ve tried this on Linux (Fedora Core 9) but there is no output.

I’ve also tried writing to a file (using fprintf), because I thought there would be a zillion printfs, but there is no output, no errors, and no file. I’m building with make emu=1 dbg=1.

Are there any pointers to documentation or notes on how to use printf inside kernels?