I found a surprise: in VS2008, if you change the compiler for your .C or .CPP files from ‘C/C++ Compiler Tool’ to ‘CUDA Build Rule v2.1.0’, you get a big performance improvement in the compiled program. On my problem, around 25%. Note that you do NOT need to write any CUDA code to get it.
Does anyone know why? Is it caused by NVCC optimization, or am I missing some performance-related parameters in the ‘C/C++ Compiler Tool’?
Do you mean your CPU code runs faster? When compiling with ‘C/C++ Compiler Tool’, did you choose the “Release” mode in VS? Also, there are a couple of optimization options that you can tune in the Project properties (or preferences; I haven’t used VS in a while) to get better performance.
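A quick way to see which configuration a given file was really built with is to have the program report a few build-related macros. This is only a sketch: the helper name `describe_build` is made up for illustration, and it assumes the usual conventions (MSVC defines `_DEBUG` when a debug runtime is selected, VS Release configurations normally define `NDEBUG`, and nvcc defines `__CUDACC__` when it compiles CUDA source, which may not be the case for a plain .cpp file that is only forwarded to the host compiler).

```cpp
#include <string>

// Hypothetical helper: summarizes build-related macros visible at compile time.
// Each entry is 1 if the macro was defined for this translation unit, else 0.
std::string describe_build() {
    std::string s;
#ifdef _DEBUG
    s += "_DEBUG=1 ";      // MSVC debug runtime selected (/MDd, /MTd)
#else
    s += "_DEBUG=0 ";
#endif
#ifdef NDEBUG
    s += "NDEBUG=1 ";      // usually set by Release configurations; disables assert()
#else
    s += "NDEBUG=0 ";
#endif
#ifdef __CUDACC__
    s += "__CUDACC__=1 ";  // nvcc is compiling this file as CUDA source
#else
    s += "__CUDACC__=0 ";
#endif
#ifdef _MSC_VER
    s += "_MSC_VER=1";     // Microsoft compiler is doing the host compilation
#else
    s += "_MSC_VER=0";
#endif
    return s;
}
```

Printing the result of `describe_build()` from the same .cpp file under both build rules should show whether the two builds differ in debug/release settings rather than in anything CUDA-specific.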
According to the documentation, NVCC itself doesn’t compile your CPU code; it separates it from the GPU code and passes it on to your regular compiler.
It has to do with your configuration in VS and has no connection to CUDA. In general, if your code doesn’t get faster in “Release” mode, then something is wrong with your Release configuration. Just a friendly suggestion: you should look for your answer in the VS forums.
It gets faster in release mode mainly because VS turns on compiler optimizations and leaves out debug-only code and runtime checks (fewer instructions == faster program). The debugging symbols themselves don’t slow the program down. It should also be using less memory (the host part, not the CUDA part), since the extra bookkeeping of the debug runtime is gone.
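To check a claim like the 25% above, it helps to time the same workload built under each setting. The following is a minimal sketch, not the original poster’s code: `workload` and `time_workload` are hypothetical names, the loop is just a stand-in for a CPU-bound computation, and `std::clock` is used because it is available in VS2008 (note that it reports CPU time on some platforms and elapsed time on others).

```cpp
#include <cassert>
#include <ctime>

// Hypothetical CPU-bound workload: partial sum of 1/i^2 for i = 1..n.
double workload(int n) {
    double sum = 0.0;
    for (int i = 1; i <= n; ++i) {
        sum += 1.0 / (static_cast<double>(i) * i);
    }
    return sum;
}

// Runs the workload `repeats` times and returns the seconds reported by
// std::clock. The accumulated result is written to *result_out (if non-null)
// so the optimizer cannot discard the computation entirely.
double time_workload(int n, int repeats, double* result_out) {
    assert(n > 0 && repeats > 0);
    std::clock_t start = std::clock();
    double total = 0.0;
    for (int k = 0; k < repeats; ++k) {
        total += workload(n);
    }
    std::clock_t stop = std::clock();
    if (result_out) {
        *result_out = total;
    }
    return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}
```

Calling something like `time_workload(10000000, 10, &r)` and printing both the time and `r` in each build (Debug, Release, and the CUDA build rule) gives numbers that can be compared directly. If Release and the CUDA build rule come out about the same, the difference was in the project settings.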