How to run device code on CPU?


I am running experiments to measure the time difference between a CPU run and a GPU run. Although I have a separate piece of code written for the CPU, I would like to test with the same code if possible. Someone told me that the same .cu file can be executed on the CPU, and that we just need to change a flag when compiling with nvcc. Is that correct? I looked in the NVCC guide but did not find anything in particular.
However, the nvcc -c option converts a .cu file to a .c file. Does this new .c file now execute on the host?
Can someone please explain the use of -c, and suggest a solution if this is not it?


You can use emulation mode (the -deviceemu compiler flag) to run your code on the CPU, whether or not you have a CUDA-enabled GPU. This lets you use any PC as a development machine. The -c option, by contrast, only compiles the source to an object file without linking; it does not change where the code runs.

However, code developed and optimized for CUDA is likely to run slowly and inefficiently on a CPU, because it is not optimized for the CPU. Worse, the algorithms chosen for CUDA are often “unnatural” for a general-purpose CPU.

So running CUDA C code on the CPU tells you nothing meaningful about performance.
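If the goal is only to share source between the two targets for correctness checks (not for timing), one common idiom is to mark the per-element function as callable from both host and device. The sketch below is an illustration under assumptions: the names HOST_DEVICE, saxpy_element, saxpy_kernel and saxpy_cpu are hypothetical and not from this thread. It relies on nvcc defining __CUDACC__ when it compiles CUDA source, so the same file also builds with a plain C++ compiler, which is useful on toolkits where -deviceemu is not available.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Under nvcc, __CUDACC__ is defined and the function below is compiled for
// both host and device. A plain C++ compiler sees an empty macro instead.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical per-element operation shared by the CPU loop and the GPU kernel.
HOST_DEVICE inline float saxpy_element(float a, float x, float y) {
    return a * x + y;
}

#ifdef __CUDACC__
// GPU path: one thread per element. Only compiled when building with nvcc.
__global__ void saxpy_kernel(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = saxpy_element(a, x[i], y[i]);
}
#endif

// CPU path: a sequential loop over the same element function.
void saxpy_cpu(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        y[i] = saxpy_element(a, x[i], y[i]);
    }
}
```

Calling saxpy_cpu on the host and saxpy_kernel on the device with the same inputs lets you check that both paths agree. It does not make the CPU path a fair performance baseline.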

For a meaningful comparison, measure CUDA-optimized code on the GPU against CPU-optimized code (assembly, SSE, prefetching, etc.) on the CPU.
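As a rough sketch of the CPU half of such a measurement, the helper below takes the best of several timed runs using std::chrono::steady_clock. The names sum_squares_cpu and best_time_ms are hypothetical, and the loop is a stand-in for your own CPU-optimized routine. On the GPU side, the usual counterpart is to bracket the kernel launch with CUDA events (cudaEventRecord, cudaEventSynchronize, cudaEventElapsedTime) and to decide up front whether host-device transfers count toward the total.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical CPU baseline: a simple sequential sum of squares.
// Replace this with the CPU-optimized routine you actually want to measure.
float sum_squares_cpu(const std::vector<float>& v) {
    float acc = 0.0f;
    for (float x : v) acc += x * x;
    return acc;
}

// Runs fn `repeats` times and returns the fastest wall-clock time in
// milliseconds. Taking the minimum reduces noise from other processes.
template <typename Fn>
double best_time_ms(Fn&& fn, int repeats) {
    assert(repeats > 0);
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        auto t0 = std::chrono::steady_clock::now();
        fn();
        auto t1 = std::chrono::steady_clock::now();
        double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
        if (best < 0.0 || ms < best) best = ms;
    }
    return best;
}
```

When calling best_time_ms, store the result of the timed routine somewhere observable (for example a volatile variable) so the compiler cannot optimize the work away.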

Thanks for the insight!