Since CUDA 11.2, associating streams with multi-GPU cuFFT plans is allowed. In my case, several GPUs compute in their own private streams, then a multi-GPU FFT runs, and then the computation continues. I would like to associate each GPU’s private stream with the cuFFT plan, so that when I call cufftXtExec the operation is enqueued in each GPU’s compute stream. But I do not know how, or whether it is even possible.
I tried calling cufftSetStream() with each GPU’s compute stream (in its own context), but it did not work. In addition, the documentation says that only one stream can be associated with a plan. So my question is: how is associating streams with multi-GPU plans supposed to work?
Thanks for any replies.
I don’t think it is possible to do this directly. CUDA has no way to declare a dependency from one stream to another directly; cross-stream dependencies have to be expressed through events.
What you can probably do is associate the cuFFT plan with a stream A, launch the multi-GPU FFT into stream A, and then record a CUDA event in stream A right after it with cudaEventRecord. Call it event E.
Then, in your “private” streams, put a cudaStreamWaitEvent call on E at the appropriate point. That will make each private stream wait until the multi-GPU FFT call is complete.
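The event pattern above could look roughly like the sketch below. It is not a tested solution, and several things in it are assumptions: two GPUs with IDs 0 and 1, a 1D C2C transform of 2^20 points, and the names streamA, priv, ready and fftDone. Stream A is created on the first GPU passed to cufftXtSetGPUs, which is what I understand cuFFT expects for multi-GPU plans; please check the documentation for your version. The sketch also adds the reverse dependency before the FFT (stream A waits on every private stream), which the description above does not spell out but the use case needs.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>
#include <cufft.h>
#include <cufftXt.h>

#define CHECK_CUDA(call) do { cudaError_t e_ = (call); if (e_ != cudaSuccess) { \
    fprintf(stderr, "CUDA error '%s' at line %d\n", cudaGetErrorString(e_), __LINE__); \
    exit(EXIT_FAILURE); } } while (0)
#define CHECK_CUFFT(call) do { cufftResult r_ = (call); if (r_ != CUFFT_SUCCESS) { \
    fprintf(stderr, "cuFFT error %d at line %d\n", (int)r_, __LINE__); \
    exit(EXIT_FAILURE); } } while (0)

int main() {
    const int nGpus = 2;            // assumption: two GPUs, IDs 0 and 1
    int gpus[nGpus] = {0, 1};
    const int n = 1 << 20;          // assumption: 1D C2C transform size

    // Multi-GPU plan.
    cufftHandle plan;
    size_t workSizes[nGpus];
    CHECK_CUFFT(cufftCreate(&plan));
    CHECK_CUFFT(cufftXtSetGPUs(plan, nGpus, gpus));
    CHECK_CUFFT(cufftMakePlan1d(plan, n, CUFFT_C2C, 1, workSizes));

    // Stream A and event E live on the first GPU of the plan (assumption, see above).
    cudaStream_t streamA;
    cudaEvent_t fftDone;            // "event E"
    CHECK_CUDA(cudaSetDevice(gpus[0]));
    CHECK_CUDA(cudaStreamCreate(&streamA));
    CHECK_CUDA(cudaEventCreateWithFlags(&fftDone, cudaEventDisableTiming));
    CHECK_CUFFT(cufftSetStream(plan, streamA));

    // One private compute stream and one "ready" event per GPU.
    cudaStream_t priv[nGpus];
    cudaEvent_t ready[nGpus];
    for (int i = 0; i < nGpus; ++i) {
        CHECK_CUDA(cudaSetDevice(gpus[i]));
        CHECK_CUDA(cudaStreamCreate(&priv[i]));
        CHECK_CUDA(cudaEventCreateWithFlags(&ready[i], cudaEventDisableTiming));
    }

    // Distributed input (cufftXtMemcpy is a blocking call here).
    std::vector<cufftComplex> host(n);   // zero-initialized
    host[0].x = 1.0f;
    cudaLibXtDesc *data = nullptr;
    CHECK_CUFFT(cufftXtMalloc(plan, &data, CUFFT_XT_FORMAT_INPLACE));
    CHECK_CUFFT(cufftXtMemcpy(plan, data, host.data(), CUFFT_COPY_HOST_TO_DEVICE));

    // 1. Pre-FFT work would be enqueued in priv[i] here; mark its end with an event.
    for (int i = 0; i < nGpus; ++i) {
        CHECK_CUDA(cudaSetDevice(gpus[i]));
        // preKernel<<<grid, block, 0, priv[i]>>>(...);   // hypothetical kernel
        CHECK_CUDA(cudaEventRecord(ready[i], priv[i]));
    }

    // 2. Stream A waits for every private stream (events may belong to other devices).
    CHECK_CUDA(cudaSetDevice(gpus[0]));
    for (int i = 0; i < nGpus; ++i)
        CHECK_CUDA(cudaStreamWaitEvent(streamA, ready[i], 0));

    // 3. Enqueue the multi-GPU FFT into stream A, then record event E after it.
    CHECK_CUFFT(cufftXtExecDescriptorC2C(plan, data, data, CUFFT_FORWARD));
    CHECK_CUDA(cudaEventRecord(fftDone, streamA));

    // 4. Each private stream waits on E before its post-FFT work.
    for (int i = 0; i < nGpus; ++i) {
        CHECK_CUDA(cudaSetDevice(gpus[i]));
        CHECK_CUDA(cudaStreamWaitEvent(priv[i], fftDone, 0));
        // postKernel<<<grid, block, 0, priv[i]>>>(...);  // hypothetical kernel
    }

    // Drain the private streams, then clean up.
    for (int i = 0; i < nGpus; ++i) {
        CHECK_CUDA(cudaSetDevice(gpus[i]));
        CHECK_CUDA(cudaStreamSynchronize(priv[i]));
        CHECK_CUDA(cudaStreamDestroy(priv[i]));
        CHECK_CUDA(cudaEventDestroy(ready[i]));
    }
    CHECK_CUFFT(cufftXtFree(data));
    CHECK_CUFFT(cufftDestroy(plan));
    CHECK_CUDA(cudaSetDevice(gpus[0]));
    CHECK_CUDA(cudaEventDestroy(fftDone));
    CHECK_CUDA(cudaStreamDestroy(streamA));
    printf("done\n");
    return 0;
}
```

Each event is recorded in a stream on its own device, while cudaStreamWaitEvent is allowed to wait on an event from a different device, so no host-side synchronization is needed between the compute phases. This relies on cuFFT ordering all of its per-GPU work with respect to stream A, which is worth verifying for the cuFFT version in use.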
Thank you for your reply. I will try it out and let you know whether it works. However, I would still like to ask whether a native solution for this can be expected, for example a cuFFT API call like:
cufftXtSetStream(cufftHandle handle, cudaStream_t *stream)