Number of 1D FFTs performed at the same time for a given size

Hello everyone!

I know I can use cuFFT's batch mode when I have n FFTs to perform on vectors of m elements each. How can I find the maximum number of FFTs of a given size that can run at the same time, so that all GPU threads are kept busy? Is there a simple way to determine that, or do I have to find it by trial and error?


Hi Oscar,

It depends on the FFT size and the GPU, as well as the cuFFT version and perhaps even the driver version. For these reasons I would recommend determining it empirically, either with a profiler or, preferably, by simply measuring execution time as the batch size increases. With the latter you will likely see a smoother curve of overall time vs. batch size than you anticipate. Also, “get all GPU threads working” is not the same as getting good performance: some very fast kernels rely on ILP and run at low occupancy, especially when on-chip resources are in demand.
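
As a rough illustration of the timing approach, here is a minimal sketch that sweeps the batch size for one fixed FFT length and prints the time per launch and per FFT. The FFT length, the largest batch, the repetition count and the single-precision complex-to-complex transform are all assumptions made for the example; substitute your own values and transform type.

```cuda
// Sketch: time batched 1D C2C FFTs of a fixed length for growing batch sizes.
// Build (assumed): nvcc batch_sweep.cu -lcufft -o batch_sweep
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

int main() {
    const int n = 1024;            // FFT length (assumed; use your own size)
    const int maxBatch = 1 << 14;  // largest batch tried (assumed)
    const int reps = 20;           // timed launches per batch size (assumed)

    // One buffer big enough for the largest batch, reused for every run.
    const size_t bytes = sizeof(cufftComplex) * (size_t)n * maxBatch;
    cufftComplex* data = nullptr;
    if (cudaMalloc(&data, bytes) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(data, 0, bytes);    // contents do not matter for timing

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    printf(" batch   ms_per_launch   us_per_fft\n");
    for (int batch = 1; batch <= maxBatch; batch *= 2) {
        int dims[1] = { n };
        cufftHandle plan;
        // NULL embed arrays select the default contiguous layout.
        if (cufftPlanMany(&plan, 1, dims, NULL, 1, n, NULL, 1, n,
                          CUFFT_C2C, batch) != CUFFT_SUCCESS) {
            fprintf(stderr, "plan creation failed for batch %d\n", batch);
            break;
        }

        // Warm-up launch so one-time setup cost is not included in the timing.
        cufftExecC2C(plan, data, data, CUFFT_FORWARD);
        cudaDeviceSynchronize();

        cudaEventRecord(start);
        for (int r = 0; r < reps; ++r)
            cufftExecC2C(plan, data, data, CUFFT_FORWARD);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        const float msPerLaunch = ms / reps;
        printf("%6d   %13.4f   %10.4f\n",
               batch, msPerLaunch, 1000.0f * msPerLaunch / batch);

        cufftDestroy(plan);
    }

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(data);
    return 0;
}
```

The batch size beyond which the time per FFT stops dropping is a reasonable working answer for that FFT length on that particular GPU and cuFFT version; the sweep should be repeated if any of those change.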

Hope this helps,