Consider the following code (tested with CUDA 2.0 beta 2 on OpenSuSE 10.2):
int devices, dev;
CUDA_SAFE_CALL(cudaGetDeviceCount(&devices));
CUDA_SAFE_CALL(cudaSetDevice(1));    // returns no error
CUDA_SAFE_CALL(cudaGetDevice(&dev)); // dev is 1, yet the code runs on device 0
At least in debug mode, cudaSetDevice cannot take effect here, because by the time it is reached CUDA_SAFE_CALL has already called a function that forces code to execute on device 0.
That is annoying, but understandable and acceptable.
However, cudaSetDevice does not return an error, and cudaGetDevice still reports device 1, which makes this problem a real pain to track down!
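For comparison, here is a minimal sketch of the same sequence without CUDA_SAFE_CALL. It rests on an assumption I have not verified for this toolkit version: that the debug-mode macro makes an extra runtime call (such as a synchronize) after the wrapped call, and that this extra call is what binds the thread to device 0. CHECK_NO_SYNC is a hypothetical helper written for this example, not part of cutil; it only inspects the return code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of cutil): checks the return code only and
// makes no further runtime calls of its own.
#define CHECK_NO_SYNC(call)                                          \
    do {                                                             \
        cudaError_t err_ = (call);                                   \
        if (err_ != cudaSuccess) {                                   \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,       \
                    cudaGetErrorString(err_));                       \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

int main()
{
    int devices = 0, dev = -1;
    const int wanted = 1;  // same device index as in the snippet above

    CHECK_NO_SYNC(cudaGetDeviceCount(&devices));
    if (wanted >= devices) {
        fprintf(stderr, "device %d requested, but only %d device(s) present\n",
                wanted, devices);
        return EXIT_FAILURE;
    }

    // Select the device before any call that might bind the thread to one.
    CHECK_NO_SYNC(cudaSetDevice(wanted));
    CHECK_NO_SYNC(cudaGetDevice(&dev));
    printf("runtime reports device %d\n", dev);
    return 0;
}
```

As described above, the value reported by cudaGetDevice is not proof of where the code actually runs, so an independent check is still needed when trying this.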
Note: for a more reliable test, in case CUDA_SAFE_CALL is changed or debug mode is disabled, try the attached file.
(FYI, I determined which device the code runs on by watching the device temperatures; for my application that is a very clear indication.)
test.txt (321 Bytes)