How to use printf for Fermi

breezee · October 8, 2010, 7:35am

I see that printf can be used in kernel functions on Fermi cards. But when I compile the example from the programming guide, I get an error: identifier “printf” is undefined

__global__ void helloCUDA(float f)
{
    printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}

void main()
{
    helloCUDA<<<1, 5>>>(1.2345f);
    cudaThreadExit();
}

I compile the program using “nvcc -arch sm_20 test.cu -o test”

Is there any problem?

Jeroen · October 8, 2010, 7:48am

Add:
#include <stdio.h>

That should do the trick.

breezee · October 11, 2010, 6:22am

It doesn’t work. The error message changes to:

error: calling a host function from a device/global function is only allowed in device emulation mode
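
For readers hitting the same message: it most likely means the compiler only sees printf as an ordinary host function, so calling it from a kernel is rejected. The general rule can be shown with a minimal sketch, assuming a hypothetical helper named scale (not from this thread): a function must be marked __device__ (or __host__ __device__) before a kernel may call it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper for illustration only. The __host__ __device__
// qualifiers make it callable from both CPU code and GPU code.
// Without __device__, the call inside the kernel would not compile.
__host__ __device__ float scale(float x)
{
    return 2.0f * x;
}

__global__ void scaleKernel(float *out, float in)
{
    *out = scale(in);  // allowed: scale is compiled for the device too
}

int main()
{
    float *dOut = NULL;
    if (cudaMalloc((void **)&dOut, sizeof(float)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    scaleKernel<<<1, 1>>>(dOut, 1.5f);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    float hOut = 0.0f;
    cudaError_t err = cudaMemcpy(&hOut, dOut, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("host: %f, device: %f\n", scale(1.5f), hOut);
    return 0;
}
```

printf itself cannot be fixed this way from user code; device-side printf has to be supplied by the toolkit, which is why the CUDA version matters here.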

Jeroen · October 11, 2010, 7:13am

Which version of CUDA are you using?

avidday · October 11, 2010, 8:12am

I can confirm that it doesn’t work with either CUDA 3.1 or the current 3.2rc. I have gotten printf to work on Fermi in other codes before, and looking at the intermediate outputs from nvcc, this looks like an nvcc bug of some sort. Some magic combinations of code features get the printf definitions propagated into the correct phases of the compilation trajectory; other code (like this one) doesn’t.

breezee · October 11, 2010, 8:21am

I’m using CUDA 3.0.

I can’t believe it’s just a bug in nvcc. How can NVIDIA say they provide this feature when even an example in the guide doesn’t work?

And this seems too simple to be a bug.
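
Since the toolkit version turns out to matter, here is a minimal sketch for checking what is actually installed. It assumes that device-side printf needs compute capability 2.0 or higher and, as far as I know, CUDA 3.1 or newer; the 3.1 threshold is my assumption and worth confirming against the release notes. The two helper functions are hypothetical names introduced for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: Fermi is compute capability 2.x, so require major >= 2.
static bool capabilitySupportsPrintf(int major)
{
    return major >= 2;
}

// Hypothetical helper: CUDA encodes versions as 1000*major + 10*minor,
// so 3010 means 3.1. Treating 3.1 as the minimum is an assumption.
static bool toolkitSupportsPrintf(int version)
{
    return version >= 3010;
}

int main()
{
    int runtimeVersion = 0;
    cudaError_t err = cudaRuntimeGetVersion(&runtimeVersion);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaRuntimeGetVersion: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("CUDA runtime %d.%d\n",
           runtimeVersion / 1000, (runtimeVersion % 1000) / 10);

    cudaDeviceProp prop;
    err = cudaGetDeviceProperties(&prop, 0);  // device 0 only, for brevity
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Device 0: %s, compute capability %d.%d\n",
           prop.name, prop.major, prop.minor);

    if (!toolkitSupportsPrintf(runtimeVersion))
        printf("Runtime looks too old for device printf.\n");
    if (!capabilitySupportsPrintf(prop.major))
        printf("Device looks too old for device printf.\n");
    return 0;
}
```

Running nvcc --version reports the compiler version as well, which is the part that decides whether printf in a kernel compiles at all.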

apaehler · October 11, 2010, 11:56am

[codebox]#include <stdio.h>

__device__ void callme(int i, float f)
{
    printf("Hello device thread %d, f=%f\n", i, f);
}

__global__ void helloCUDA(float f)
{
    printf("Hello thread %d, f=%f\n", threadIdx.x, f);
    callme(threadIdx.x, f);
}

int main()
{
    helloCUDA<<<1, 5>>>(1.2345f);
    cudaThreadExit();
    return 0;
}

using GTX460 / 3.2RC

nvcc -arch sm_21 -o pfermi printfermi.cu

paehler@nvidia> pfermi
Hello thread 0, f=1.234500
Hello thread 1, f=1.234500
Hello thread 2, f=1.234500
Hello thread 3, f=1.234500
Hello thread 4, f=1.234500
Hello device thread 0, f=1.234500
Hello device thread 1, f=1.234500
Hello device thread 2, f=1.234500
Hello device thread 3, f=1.234500
Hello device thread 4, f=1.234500
[/codebox]
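
A variant of the same program with error checking, as a sketch rather than a definitive version. Device printf output is buffered and only appears on the host at a synchronization point, so the explicit synchronize call matters. This assumes a newer toolkit where cudaDeviceSynchronize is available (on the 3.x toolkits discussed here the equivalent would be cudaThreadSynchronize or cudaThreadExit). The checkCuda helper is a hypothetical name introduced for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report a CUDA error and tell the caller whether
// the call succeeded.
static bool checkCuda(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

__global__ void helloCUDA(float f)
{
    // Device-side printf; requires compiling for sm_20 or newer.
    printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}

int main()
{
    helloCUDA<<<1, 5>>>(1.2345f);

    // Launch problems (bad configuration, no usable device) show up here.
    if (!checkCuda(cudaGetLastError(), "kernel launch"))
        return 1;

    // Synchronizing waits for the kernel and flushes the printf buffer.
    if (!checkCuda(cudaDeviceSynchronize(), "cudaDeviceSynchronize"))
        return 1;

    return 0;
}
```

Without the synchronize call the program may exit before any device output is flushed, which can look like printf silently doing nothing.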
