I think I cannot do this, but I wanted to confirm:
I want to call cufftExecC2C (or really any CUFFT routine) from within different
streams. I believe the CUFFT calls are not callable from within kernel
routines, and I think that means I am out of luck. I had a simple
kernel defined like this:
__global__ void
fftKernel(cufftHandle fftPlan, cufftComplex *d_fftArrayA, cufftComplex *d_fftArrayB)
{
    // now call our fft
    CUFFT_SAFE_CALL(cufftExecC2C(fftPlan, d_fftArrayA, d_fftArrayB, CUFFT_FORWARD));
}
I get nvcc compile errors for this; I believe it is unhappy about the attempt to call
the CUFFT routine from inside another kernel. I recall reading somewhere that
CUFFT calls are essentially kernel launches in and of themselves.
So: is there a way to call a CUFFT routine within a stream? Thoughts on this would
be appreciated, thanks.
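For reference, the plain host-side version of the same call (no streams), which does compile, looks roughly like this; NX is just a placeholder size and error checking is omitted:

#include <cufft.h>
#include <cuda_runtime.h>

#define NX 1024   // placeholder transform size

int main(void)
{
    cufftHandle fftPlan;
    cufftComplex *d_fftArrayA, *d_fftArrayB;

    cudaMalloc((void**)&d_fftArrayA, sizeof(cufftComplex) * NX);
    cudaMalloc((void**)&d_fftArrayB, sizeof(cufftComplex) * NX);

    cufftPlan1d(&fftPlan, NX, CUFFT_C2C, 1);                          // 1D complex-to-complex plan, batch of 1
    cufftExecC2C(fftPlan, d_fftArrayA, d_fftArrayB, CUFFT_FORWARD);   // host-side call that launches CUFFT's own kernels

    cufftDestroy(fftPlan);
    cudaFree(d_fftArrayA);
    cudaFree(d_fftArrayB);
    return 0;
}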
I am not sure it will work with streams; you may have to modify the source code. Also remember that in order to use async calls, the data needs to be in pinned memory.
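To make the pinned-memory point concrete, here is a rough sketch of what an async version would look like, assuming a CUFFT release that exposes cufftSetStream (check your cufft.h, older releases may not have it). NX is a placeholder size and error checking is omitted:

#include <cufft.h>
#include <cuda_runtime.h>

#define NX 1024   // placeholder transform size

int main(void)
{
    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Host buffer must be pinned (page-locked) for cudaMemcpyAsync to actually be asynchronous.
    cufftComplex *h_data;
    cudaMallocHost((void**)&h_data, sizeof(cufftComplex) * NX);

    cufftComplex *d_in, *d_out;
    cudaMalloc((void**)&d_in,  sizeof(cufftComplex) * NX);
    cudaMalloc((void**)&d_out, sizeof(cufftComplex) * NX);

    cufftHandle plan;
    cufftPlan1d(&plan, NX, CUFFT_C2C, 1);
    cufftSetStream(plan, stream);          // CUFFT's internal kernels then launch into this stream

    cudaMemcpyAsync(d_in, h_data, sizeof(cufftComplex) * NX,
                    cudaMemcpyHostToDevice, stream);
    cufftExecC2C(plan, d_in, d_out, CUFFT_FORWARD);   // still a host-side call, enqueued on 'stream'
    cudaStreamSynchronize(stream);

    cufftDestroy(plan);
    cudaFree(d_in);
    cudaFree(d_out);
    cudaFreeHost(h_data);
    cudaStreamDestroy(stream);
    return 0;
}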
Has anyone got cufft running in a stream yet? Also, when I run cufft through the profiler, I see kernel code with _mpsm and _mpgm extensions. I assume this has to do with shared-memory and global-memory access? I don’t see this in the CUFFT source code release… do we have the complete source code needed to get performance equivalent to the host-callable CUFFT routines?
I thought batched FFTs only change the grid dimensions in the CUFFT kernel code. Have you had success using batched FFTs and launching CUFFT on a stream?
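In case it helps, this is the kind of batched plan I mean; a rough sketch where NX and BATCH are placeholder sizes and error checking is omitted:

#include <cufft.h>
#include <cuda_runtime.h>

#define NX    1024   // placeholder: length of each signal
#define BATCH 8      // placeholder: number of signals transformed in one launch

int main(void)
{
    cufftComplex *d_data;
    cudaMalloc((void**)&d_data, sizeof(cufftComplex) * NX * BATCH);

    cufftHandle plan;
    cufftPlan1d(&plan, NX, CUFFT_C2C, BATCH);          // batch > 1 widens the grid of CUFFT's kernels
    cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD); // in-place transform of all BATCH signals

    cufftDestroy(plan);
    cudaFree(d_data);
    return 0;
}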
I have noticed that not all of the CUFFT code is provided to us, so it will take some time to get it working with streams if we have to modify the source. I have posted a request to NVIDIA to see whether they have any advice or (preferably) will make the entire CUFFT library source code available to us. No answer either way yet… :(