cuFFT on multiple GPUs

Does anyone know how cuFFT handles multiple GPUs? Does it split the work across multiple cards internally, or is the target device determined by the cudaSetDevice() call of the thread that issues a cufftExec?

Oops - moving this over to the CUDA Programming forum