cudaGetDevice unreliable

Consider the following code (tested with CUDA 2.0 beta 2 on OpenSuSE 10.2):




At least in debug mode, cudaSetDevice cannot take effect here, because CUDA_SAFE_CALL calls a function that already forces the code to execute on device 0.

While that is annoying, it is understandable and acceptable.

But cudaSetDevice does not return an error, and cudaGetDevice still returns 1, which makes this problem a real pain to find!

Note: for more reliable testing, in case CUDA_SAFE_CALL is changed or debug mode is disabled, try the attached file.

(FYI, I determined which device the code runs on by watching the device temperatures; for my application that is a very clear indication.)
test.txt (321 Bytes)

Remove the CUDA_SAFE_CALL calls. They are not useful at that stage anyway.
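
For illustration, here is a minimal sketch of device selection without the cutil macro. It is not the code from the original post or the attachment. The `check` helper and the device index 1 are assumptions made up for this example; `check` only inspects the return code and makes no further runtime calls of its own, so nothing touches a device before cudaSetDevice runs.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of cutil): reports a failed call and exits.
// Unlike a synchronizing wrapper, it makes no additional CUDA runtime calls
// on success.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

int main()
{
    const int wanted = 1;  // assumed device index, for illustration only

    // Select the device before any other runtime call in this thread.
    check(cudaSetDevice(wanted), "cudaSetDevice");

    // Query what the runtime reports. Per the report above, this value
    // alone may not prove where the code actually runs.
    int reported = -1;
    check(cudaGetDevice(&reported), "cudaGetDevice");
    printf("requested device %d, runtime reports device %d\n", wanted, reported);

    // First real work after the selection: a tiny allocation and free.
    void *p = NULL;
    check(cudaMalloc(&p, 1), "cudaMalloc");
    check(cudaFree(p), "cudaFree");

    return 0;
}
```

Since the return codes were reported as not flagging the problem, the printed device number should be treated as a hint only; an independent check such as the temperature observation mentioned above is still needed to confirm which device is in use.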

I know I can do that; I found that out myself. But I would not have wasted as much time debugging it if cudaSetDevice and cudaGetDevice did not behave in such an idiotic way, which in the case of cudaGetDevice is completely contrary to the documentation.

So consider this an official bug report/enhancement request to NVIDIA, as far as such a thing exists for ordinary users.