I’m new to CUDA.
I added a printf() call in a global function to watch some variables.
The build fails with this error: calling a host function from a device/global function is only allowed in device emulation mode
I am building under the EmuDebug configuration. How can I set device emulation mode?
And how can I watch variables in a global/device function?
You might want to keep reading the CUDA programming guide until you finish Chapter 4. Then you will understand why you are getting an incorrect answer (hint: you are getting zeros because your kernel is never launching). The output to file is working correctly.
Kiran_CUDA: You cannot call your kernel function with pointers to host memory; the pointers must refer to device memory. First allocate memory on the device (using cudaMalloc), then copy the A and B arrays to it (using cudaMemcpy), then run the kernel with the pointers to the device memory, and finally copy the result back to the host.
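For reference, here is a rough sketch of that allocate / copy / launch / copy-back sequence. The kernel addKernel and the host helper addOnDevice are hypothetical names made up for this illustration, and the element-wise sum is only an assumption about what the kernel does. The launch is followed by cudaGetLastError(), so a kernel that never launches is reported instead of silently leaving zeros in the output.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (assumption): element-wise sum c[i] = a[i] + b[i].
__global__ void addKernel(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// Hypothetical host helper: returns true if every CUDA call succeeded.
bool addOnDevice(const std::vector<float>& a,
                 const std::vector<float>& b,
                 std::vector<float>& c)
{
    if (a.size() != b.size()) return false;
    const int n = static_cast<int>(a.size());
    c.assign(n, 0.0f);
    if (n == 0) return true;
    const size_t bytes = n * sizeof(float);

    // 1. Allocate device memory, 2. copy the inputs host -> device.
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    bool ok = cudaMalloc(&dA, bytes) == cudaSuccess
           && cudaMalloc(&dB, bytes) == cudaSuccess
           && cudaMalloc(&dC, bytes) == cudaSuccess
           && cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        // 3. Launch the kernel with DEVICE pointers only.
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        addKernel<<<blocks, threads>>>(dA, dB, dC, n);

        // 4. Check that the launch happened, then copy the result back.
        //    The blocking cudaMemcpy also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(c.data(), dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree accepts a null pointer, so this is safe after a partial failure.
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return ok;
}
```

Call addOnDevice(a, b, c) from ordinary host code with three std::vector<float> objects; if it returns false, one of the CUDA calls failed and c should not be trusted.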
In emulation, the GPU isn’t touched. Everything is done on the host CPU in host memory, with the emulation layer launching one CPU thread per GPU thread requested by the CUDA code. Threads are serviced sequentially and in order. This means that there is a range of potential problems that emulation cannot detect, like race conditions, coherency problems, and certain classes of improper device memory usage.
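To make the race-condition point concrete, here is a small sketch (all names are hypothetical and invented for this example). Every thread increments one shared counter. If threads really run one after another, as described above, the unsynchronized version would be expected to produce the full count, so the bug stays hidden. On a real GPU the threads run concurrently and the unsynchronized version typically loses updates, while the atomicAdd version does not.

```cuda
#include <cuda_runtime.h>

// Racy: an unsynchronized read-modify-write on one shared location.
__global__ void countRacy(int* counter)
{
    *counter = *counter + 1;
}

// Safe: atomicAdd serializes the updates in hardware.
__global__ void countAtomic(int* counter)
{
    atomicAdd(counter, 1);
}

// Hypothetical helper: launches blocks x threads threads and returns the
// final counter value, or -1 if any CUDA call failed.
int runCount(bool useAtomic, int blocks, int threads)
{
    int* dCounter = nullptr;
    if (cudaMalloc(&dCounter, sizeof(int)) != cudaSuccess) return -1;

    int result = -1;
    if (cudaMemset(dCounter, 0, sizeof(int)) == cudaSuccess) {
        if (useAtomic) countAtomic<<<blocks, threads>>>(dCounter);
        else           countRacy<<<blocks, threads>>>(dCounter);

        // The blocking copy waits for the kernel before reading the counter.
        if (cudaGetLastError() != cudaSuccess ||
            cudaMemcpy(&result, dCounter, sizeof(int),
                       cudaMemcpyDeviceToHost) != cudaSuccess) {
            result = -1;
        }
    }
    cudaFree(dCounter);
    return result;
}
```

With 64 blocks of 256 threads, runCount(true, 64, 256) should return 16384. The value returned by runCount(false, 64, 256) is not deterministic; all that can be said is that it lies between 1 and 16384.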