If I want to use multiple GPUs in parallel, do I have to use cutStartThread? Is there something magical or special about cutStartThread?
Does cudaSetDevice set the device for the entire process or just the current thread?
The reason I am asking is that cudaSetDevice seems to set the device for all of the threads in a process, not just the current thread. I’m wondering what I’m doing wrong.
Don’t use the cut* functions, ever. They’re not an officially supported part of CUDA, just helpers that make the SDK examples easier to read. Use your thread library of choice.
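For what it’s worth, here is a minimal sketch of the one-host-thread-per-GPU pattern. It assumes std::thread as the thread library (any other would do) and relies on the runtime API’s documented behavior that cudaSetDevice sets the current device for the calling host thread. The worker function and its printout are hypothetical names for illustration, not anything from the SDK.

```cuda
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical worker: binds the calling host thread to one GPU,
// then reads the setting back with cudaGetDevice to confirm it.
void worker(int device) {
    cudaError_t err = cudaSetDevice(device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaSetDevice(%d) failed: %s\n",
                     device, cudaGetErrorString(err));
        return;
    }

    int current = -1;
    err = cudaGetDevice(&current);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDevice failed: %s\n",
                     cudaGetErrorString(err));
        return;
    }
    std::printf("thread for device %d reports current device %d\n",
                device, current);

    // Allocations and kernel launches for this GPU would go here.
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::fprintf(stderr, "no CUDA devices found\n");
        return 1;
    }

    // One host thread per GPU; each thread selects its own device.
    std::vector<std::thread> threads;
    for (int d = 0; d < count; ++d) {
        threads.emplace_back(worker, d);
    }
    for (auto& t : threads) {
        t.join();
    }
    return 0;
}
```

If every thread reports the device it asked for, the setting is per thread on your setup. If they all report the same device, check that each thread calls cudaSetDevice itself rather than inheriting a call made once in main.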
OK, mystery solved. I Googled cudaGetDevice and all that came up were entries for cudaSetDevice, so I thought the function didn’t exist. It turns out that this is a feature of Google.