cuFFTDx performance does not match cuFFT performance

oshinover · August 10, 2021, 2:13pm

Hi!

I’m trying to improve performance by using the cuFFTDx library instead of cuFFT.
I created a matrix of 1024x1024 complex numbers and convolved each row with a complex vector (using an FFT, an element-wise vector multiplication, and an IFFT).
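For reference, the operation described above (FFT of each row, element-wise multiplication with the transformed vector, then IFFT) can be sketched on the host as below. This is only a minimal CPU sketch for checking GPU results, not the poster's code: the names `fft`, `convolve_rows`, and `convolve_direct` are hypothetical, double precision is an assumption, and the convolution is assumed to be circular with a vector as long as one row.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

using cplx = std::complex<double>;

// In-place iterative radix-2 FFT (power-of-two sizes only).
// inverse == true computes the unnormalized inverse transform.
void fft(std::vector<cplx>& a, bool inverse) {
    const std::size_t n = a.size();
    assert(n > 0 && (n & (n - 1)) == 0);
    // Bit-reversal permutation.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }
    const double pi = std::acos(-1.0);
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const double ang = (inverse ? 2.0 : -2.0) * pi / static_cast<double>(len);
        const cplx wlen(std::cos(ang), std::sin(ang));
        for (std::size_t i = 0; i < n; i += len) {
            cplx w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k) {
                const cplx u = a[i + k];
                const cplx v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
}

// Circular convolution of every row of a row-major matrix with `filter`,
// computed as IFFT(FFT(row) * FFT(filter)).
std::vector<cplx> convolve_rows(const std::vector<cplx>& matrix,
                                std::size_t rows, std::size_t cols,
                                const std::vector<cplx>& filter) {
    assert(matrix.size() == rows * cols && filter.size() == cols);
    std::vector<cplx> spectrum = filter;
    fft(spectrum, false);  // transform the filter once, reuse for all rows
    std::vector<cplx> out(matrix.size());
    std::vector<cplx> row(cols);
    const double scale = 1.0 / static_cast<double>(cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) row[c] = matrix[r * cols + c];
        fft(row, false);
        // The inverse above is unnormalized, so fold 1/cols into the product.
        for (std::size_t c = 0; c < cols; ++c) row[c] *= spectrum[c] * scale;
        fft(row, true);
        for (std::size_t c = 0; c < cols; ++c) out[r * cols + c] = row[c];
    }
    return out;
}

// Direct O(N^2) circular convolution of one row, useful as a cross-check.
std::vector<cplx> convolve_direct(const std::vector<cplx>& x,
                                  const std::vector<cplx>& h) {
    const std::size_t n = x.size();
    std::vector<cplx> y(n, cplx(0.0, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t m = 0; m < n; ++m)
            y[i] += x[m] * h[(i + n - m) % n];
    return y;
}
```

Comparing the output of either GPU path against `convolve_rows` on a small matrix is one way to confirm that both compute the same thing before timing them. The explicit `1/cols` factor is there because the inverse transform in this sketch is unnormalized.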

With the cuFFT library, I ran the FFT and IFFT using plans created by cufftPlanMany, plus a separate vector multiplication kernel.

With cuFFTDx, I implemented the whole convolution in a single kernel, so I expected better performance thanks to more efficient L1 cache usage.
I wrote the convolution kernel so that each block operates on a few rows of the matrix and performs the convolution on those rows. Thus, every SM runs the convolution on an amount of data that is smaller than the L1 cache. This way, L1 cache usage should be efficient and the execution time of the convolution should decrease.
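The "few rows per block" sizing can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes single-precision complex samples (8 bytes each, which the post does not state), and the helper names `row_bytes` and `rows_that_fit` are hypothetical. The on-chip budget is a parameter because the actual L1/shared memory size depends on the GPU, which is not given here.

```cpp
#include <complex>
#include <cstddef>

// Bytes occupied by one row of single-precision complex samples
// (assumption: std::complex<float>, 8 bytes per element).
constexpr std::size_t row_bytes(std::size_t cols) {
    return cols * sizeof(std::complex<float>);
}

// Number of whole rows that fit in a given on-chip budget.
// budget_bytes is a placeholder; query the real device for its limits.
constexpr std::size_t rows_that_fit(std::size_t budget_bytes, std::size_t cols) {
    return budget_bytes / row_bytes(cols);
}
```

Under these assumptions, one 1024-point row takes 8 KiB and the full 1024x1024 matrix takes 8 MiB, so a hypothetical 48 KiB budget would hold 6 rows at a time.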

It didn’t work, and I got better results with cuFFT. Any ideas?

mnicely · August 11, 2021, 2:58pm

You would need to post your code and hardware specs to get a performance evaluation.
Please note that cuFFTDx is still in early access (EA), so maximum performance isn’t guaranteed.
