Question about GPU usage


I have a question about GPU selection when programming CUDA.

I am seeing some weird behavior with some CUDA code that I’m trying to debug (mostly not my code). In case it is relevant: I am primarily trying to sort out what I believe are some issues with concurrency in the code leading to deadlocks and incorrect results. In the interest of thoroughness I will also mention that this application is supposed to use multiple threads to do various calculations on the same GPU.

As part of my investigation, I ran nvprof like this:

nvprof --profile-child-processes --print-gpu-trace ./application

Looking at the trace, I see that nvprof believes everything is running on the system's single Titan Xp, as should be the case. The curious thing is that when I look at activity with nvidia-smi, I see that the process is using two GPUs: the Titan Xp and a GTX 1080 that is also in the system (devices 0 and 3, respectively). I should mention that the system has a total of 4 GPUs: one Titan Xp and three GTX 1080s.

As far as I know there is no selection of GPUs whatsoever in the code and this is corroborated by the nvprof trace. What do I make of the discrepancy I see with nvidia-smi’s output? Could this be related to the deadlocks/incorrect results?

Thanks for any clues!

If you believe that the application should use only one GPU, you can restrict it with the CUDA_VISIBLE_DEVICES environment variable. On a multi-GPU system, the CUDA runtime may perform some activity on other GPUs even if those GPUs are not actively involved in the computation.
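
A minimal sketch of that restriction, assuming the Titan Xp is device 0 in nvidia-smi's listing as described in the question. The helper name run_on_gpu is hypothetical. It also sets CUDA_DEVICE_ORDER=PCI_BUS_ID so that the CUDA runtime numbers devices in PCI bus order, which is the order nvidia-smi uses; by default the runtime orders devices fastest-first, so the indices might not otherwise match.

```shell
# Hypothetical helper: run a command with only one GPU visible to CUDA.
# $1 is the GPU index as shown by nvidia-smi; the rest is the command to run.
run_on_gpu() {
    gpu_index="$1"
    shift
    # PCI_BUS_ID makes CUDA's device numbering follow nvidia-smi's ordering.
    # The assignments apply only to the command being launched.
    CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES="$gpu_index" "$@"
}
```

For example, run_on_gpu 0 ./application would launch the application with only device 0 visible; inside the process that GPU then appears as CUDA device 0. If the second GPU no longer shows activity in nvidia-smi and the deadlocks persist, the extra device is probably unrelated to them.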