Dedicating cuFFT to a single GPU in a system with multiple GPUs


I’m using a system with multiple GPUs. I would like to dedicate a GPU to a process, and have that dedicated GPU not be the “root” GPU, i.e. not device 0. When I call cudaSetDevice() in a process to select a GPU other than device 0 and then call cufftExecR2C() in that same process, I get a CUFFT_EXEC_FAILED error. When I pass “0” to cudaSetDevice() in that process, though, the calls that execute the FFT work as expected.

Is there a step I’m missing when using cudaSetDevice() to map a single GPU to a single process on a system with multiple GPUs, and combining that with the cuFFT library?

As a quick follow-up, this seems to be unrelated to cuFFT itself. There seems to be something wrong with the stream handle I’m passing to the library: cuda-gdb is telling me the kernel launch returns error code 0x21, which is the code for an invalid resource handle.
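
For anyone hitting the same error: a stream is tied to the device that was current when it was created, and a cuFFT plan is likewise tied to the device that was current when the plan was made, so a stream or plan created before the cudaSetDevice() call (while device 0 was still current) is one plausible source of an invalid resource handle. Below is a minimal sketch of a call ordering that avoids that, assuming a single 1-D R2C transform. The helper name run_r2c_on_device and its return codes are hypothetical, made up for illustration, and this may not match what the original code was doing.

```cuda
#include <cuda_runtime.h>
#include <cufft.h>

// Hypothetical helper (not part of CUDA or cuFFT): runs one 1-D R2C FFT of
// length n on the given device. Returns 0 on success, or a nonzero step
// number identifying the first call that failed.
int run_r2c_on_device(int device, int n)
{
    // 1. Select the device BEFORE creating anything that is bound to a device.
    if (cudaSetDevice(device) != cudaSuccess) return 1;

    int rc = 0;
    cudaStream_t  stream = nullptr;
    cufftReal*    d_in   = nullptr;
    cufftComplex* d_out  = nullptr;
    cufftHandle   plan   = 0;
    bool          have_plan = false;

    // 2. Create the stream while the chosen device is current.
    if (cudaStreamCreate(&stream) != cudaSuccess) return 2;

    // 3. Allocate input (n reals) and output (n/2 + 1 complex) on that device.
    if (cudaMalloc((void**)&d_in, sizeof(cufftReal) * n) != cudaSuccess) rc = 3;
    if (!rc && cudaMalloc((void**)&d_out,
                          sizeof(cufftComplex) * (n / 2 + 1)) != cudaSuccess) rc = 4;
    if (!rc && cudaMemsetAsync(d_in, 0, sizeof(cufftReal) * n, stream)
                   != cudaSuccess) rc = 5;

    // 4. Create the plan on the same device, then attach the stream to it.
    if (!rc) {
        if (cufftPlan1d(&plan, n, CUFFT_R2C, 1) == CUFFT_SUCCESS) have_plan = true;
        else rc = 6;
    }
    if (!rc && cufftSetStream(plan, stream) != CUFFT_SUCCESS) rc = 7;

    // 5. Execute and wait for the stream to finish.
    if (!rc && cufftExecR2C(plan, d_in, d_out) != CUFFT_SUCCESS) rc = 8;
    if (!rc && cudaStreamSynchronize(stream) != cudaSuccess) rc = 9;

    // Clean up whatever was created, in reverse order.
    if (have_plan) cufftDestroy(plan);
    if (d_out) cudaFree(d_out);
    if (d_in)  cudaFree(d_in);
    cudaStreamDestroy(stream);
    return rc;
}
```

Calling run_r2c_on_device(1, 1024) from the process that should own device 1 would exercise the non-zero device path. If the stream is instead created earlier (for example in a static initializer or a library setup routine that runs before cudaSetDevice()), it belongs to whichever device was current at that moment, which is device 0 by default.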