in-kernel printf

Hey, I want to use in-kernel printf.
I have CUDA Toolkit 3.2 and VS 2008.
I have a GTX 460.
I build my app for compute capability 2.0.

When I put printf("helo"); in my kernel I get the following compiler error:

: error: identifier “printf” is undefined

Why? Do I have to include some special header to use printf?

Add #include <stdio.h> to your .cu file and it should work.

I compiled the code, but now when I launch my kernel the program crashes :( Why?

I build it in the Release config… should I use Debug for printf, or what?

It crashes at: program.exe!__tmainCRTStartup() Line 501 + 0xf bytes C

Below is the relevant part of some strange file, crtexe.c:

    /*
     * do C++ constructors (initializers) specific to this EXE
     */
    if (__native_startup_state == __initializing)
    {
        _initterm( __xc_a, __xc_z ); // <<< unhandled exception on this line
        __native_startup_state = __initialized;
    }

Okay, it works now. Something strange happened and I had to recompile the whole project…

Thanks a lot for the help :)

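__tmainCRTStartup is part of the Visual C++ runtime startup code rather than your own code; the _initterm call there runs C++ static initializers, which is also where the nvcc-generated host code registers your device code with the CUDA runtime. A failure like that would generally suggest something in your build system, toolchain or CUDA toolkit is badly broken, but I can't help with Windows or Visual Studio troubleshooting, sorry.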

You cannot use printf inside a kernel, but there is an implementation called cuPrintf.

Yes, you can. Ignore my comment.


I am having the same problem, and must be doing some boneheaded thing. I have compute capability 2.1, so I should be able to use printf. Here is my code:

#include <stdio.h>

__global__ void test_kernel() {
    printf("Can I do this?\n");
} // end test_kernel

int main(int argc, char *argv[]) {
    test_kernel<<<1,1>>>();
}

I compile it with

nvcc -o hello_world hello_world.cu

And encounter the compilation error:

hello_world.cu(4): error: calling a host function from a device/global function is not allowed

Any help would be appreciated.

Thanks

Pass an architecture selection option to nvcc to tell it you want to compile for a compute 2.1 device, something like

nvcc -arch=sm_21 -o hello_world hello_world.cu
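For anyone wondering why the flag matters: as far as I know, nvcc in toolkits of that era targeted compute capability 1.x by default, where device-side printf does not exist, so the compiler treated printf as a host-only function. A minimal sketch of guarding the call with the __CUDA_ARCH__ macro is below. The names guarded_kernel and run_guarded are made up for illustration, and on current toolkits (which no longer offer sm_2x targets) you would substitute your own device's architecture.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Hypothetical kernel: printf is only compiled in when the device code
// is being built for compute capability 2.0 or higher.
__global__ void guarded_kernel() {
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 200)
    printf("printf available, compiled for arch %d\n", __CUDA_ARCH__);
#endif
}

// Hypothetical helper: launch the kernel and wait for it to finish.
cudaError_t run_guarded() {
    guarded_kernel<<<1, 1>>>();
    cudaError_t err = cudaGetLastError();  // reports launch failures
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();        // also flushes device printf output
}

int main() {
    cudaError_t err = run_guarded();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

With the guard in place the file still compiles for an older target, it just prints nothing there.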

Yeah!

I had to add the line

cudaThreadSynchronize();

after the kernel call (otherwise the program exits before the printf output is flushed), but now it is working.

Thank you very much.
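Putting the pieces of this thread together, a complete version might look like the sketch below. It assumes a toolkit from CUDA 4.0 onward, where cudaDeviceSynchronize() replaces the older cudaThreadSynchronize() used above; on Toolkit 3.2 keep the old name. The helper launch_and_flush is a hypothetical name, not something from the CUDA API.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Kernel from the thread: device-side printf needs compute capability 2.0+.
__global__ void test_kernel() {
    printf("Can I do this? (thread %d)\n", threadIdx.x);
}

// Hypothetical helper: launch the kernel, then synchronize so the
// device printf buffer is flushed to stdout before the program exits.
cudaError_t launch_and_flush() {
    test_kernel<<<1, 1>>>();
    cudaError_t err = cudaGetLastError();  // reports launch failures
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();        // waits for the kernel and flushes printf
}

int main() {
    cudaError_t err = launch_and_flush();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

Compile it with an -arch option matching your device, as suggested earlier in the thread. Checking the returned error codes is optional, but it makes a wrong architecture setting show up as a message instead of silently missing output.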