I'm new to CUDA.
I added a printf() call to a global function to watch some variables.
The build fails with: Error: calling a host function from a device/global function is only allowed in device emulation mode
I am building under the EmuDebug configuration. How do I enable device emulation mode?
How can I watch variables inside a global/device function?
//printf("%f", C[i]); // You cannot call a host function from inside a global/device function in "Debug" and "Release" modes.
}
void main()
{
    int i;
    float A[3];
    A[0]=1;
    A[1]=2;
    A[2]=3;
    float B[3];
    B[0]=1;
    B[1]=2;
    B[2]=3;
    float C[3];
    //kernel invocation
    vecAdd<<<1, 3>>>(A, B, C);
    getch();
}
I have the following questions:
1- When we run the program in emulation mode, do we actually take advantage of the NVIDIA graphics card (9600 GT, 512 MB) present in the PC?
2- How exactly can I see the values in C (which should be 2.0, 4.0 and 6.0)? You hinted that we can save the variable output to a file. In my case it would be
You might want to keep reading the CUDA programming guide until you finish Chapter 4. Then you will understand why you are getting an incorrect answer (hint: you are getting zeros because your kernel is never launching). The output to file is working correctly.
Kiran_CUDA: You cannot call your kernel function with pointers to host memory; the pointers must point to device memory. You have to allocate memory on the device first (using cudaMalloc), then copy the A and B arrays to it (using cudaMemcpy), then run the kernel with the device pointers, and finally copy the result back to the host.
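For anyone following along, here is a minimal sketch of those four steps (allocate, copy in, launch, copy out). The vecAdd body and the extra length parameter n are my assumptions, since the original kernel isn't shown in full, and addOnDevice is a made-up helper name, not anything from the CUDA API. Treat it as an illustration rather than the one right way to do it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed kernel body: element-wise add, one thread per element.
__global__ void vecAdd(const float* A, const float* B, float* C, int n)
{
    int i = threadIdx.x;
    if (i < n) C[i] = A[i] + B[i];
}

// Hypothetical helper: stages host arrays on the device, runs vecAdd,
// and copies the result back. Returns the first CUDA error encountered.
cudaError_t addOnDevice(const float* hA, const float* hB, float* hC, int n)
{
    size_t bytes = n * sizeof(float);
    float *dA = NULL, *dB = NULL, *dC = NULL;

    // 1. Allocate device memory.
    cudaError_t err = cudaMalloc((void**)&dA, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dB, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dC, bytes);

    // 2. Copy the inputs from host to device.
    if (err == cudaSuccess) err = cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    // 3. Launch with device pointers and check that the launch was accepted.
    if (err == cudaSuccess) {
        vecAdd<<<1, n>>>(dA, dB, dC, n);
        err = cudaGetLastError();
    }

    // 4. Copy the result back (this also waits for the kernel to finish).
    if (err == cudaSuccess) err = cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    // cudaFree on a NULL pointer is a no-op, so this is safe on any path.
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return err;
}

int main()
{
    float A[3] = {1, 2, 3};
    float B[3] = {1, 2, 3};
    float C[3] = {0, 0, 0};

    cudaError_t err = addOnDevice(A, B, C, 3);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // C is ordinary host memory again, so a host printf works here.
    printf("%f %f %f\n", C[0], C[1], C[2]);
    return 0;
}
```

Once the result is back in host memory you can print it or write it to a file like any other array, so you don't need printf inside the kernel just to see C.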
Well, I have one more question. In emulation mode, what role does the graphics card play (if one is installed)? Or is it completely detached from the whole process?
In emulation, the GPU isn't touched. Everything is done on the host CPU in host memory, with the emulation layer launching one CPU thread per GPU thread requested by the CUDA code. Threads are serviced sequentially and in order. This means that there is a range of potential problems that emulation cannot detect, like race conditions, coherency problems, and certain classes of improper device memory usage.
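To make the race-condition point concrete, here is a small sketch of the kind of bug that sequential, in-order execution would hide. The kernel names and the countWith helper are invented for this example. With threads run one after another, the unsynchronized version would count correctly; on a real GPU many threads read and write the counter at the same time, so updates can be lost.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Unsafe: every thread does an unsynchronized read-modify-write on *counter.
__global__ void racyIncrement(int* counter)
{
    *counter = *counter + 1;
}

// Safe: atomicAdd makes each increment indivisible.
__global__ void atomicIncrement(int* counter)
{
    atomicAdd(counter, 1);
}

// Hypothetical helper: runs 4 blocks of 256 threads (1024 increments requested)
// and returns the final counter value, or -1 if allocation fails.
int countWith(bool useAtomic)
{
    int* dCounter = NULL;
    int result = -1;
    if (cudaMalloc((void**)&dCounter, sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(dCounter, 0, sizeof(int));

    if (useAtomic) atomicIncrement<<<4, 256>>>(dCounter);
    else           racyIncrement<<<4, 256>>>(dCounter);

    // The copy waits for the kernel to finish before reading the counter.
    cudaMemcpy(&result, dCounter, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dCounter);
    return result;
}

int main()
{
    // The racy count depends on hardware timing and may be well below 1024.
    printf("racy:   %d\n", countWith(false));
    // The atomic count reflects all 1024 increments.
    printf("atomic: %d\n", countWith(true));
    return 0;
}
```

The takeaway is that a kernel passing in emulation only tells you the logic is right for one thread at a time; it still needs to be run on the actual device.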