I am writing a program that applies a 13x13 filter to an image on the CPU, and I want to speed it up by moving it to the GPU. One way to do this is FFT-based filtering with the cuFFT library. It provides Fourier transform functions such as cufftExecR2C, cufftExecC2C and cufftExecC2R, but both my image data and my filter kernel are real-valued.
So how can I perform a real-to-real filtering operation with cuFFT's FFT functions?
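
For reference, the CPU version is essentially a direct 2D filter. The following is only a minimal sketch of that kind of loop, not the exact code: it assumes a single-channel float image in row-major order, an odd kernel size, and zero padding at the borders. The name `filter2d` and its parameters are hypothetical.

```cpp
#include <vector>

// Hypothetical CPU baseline: direct 2D filtering of a single-channel image.
// Assumptions: row-major float data, odd ksize (13 here), zero padding outside
// the image. The kernel is applied without flipping (correlation); for a
// symmetric kernel this is identical to convolution.
std::vector<float> filter2d(const std::vector<float>& img, int width, int height,
                            const std::vector<float>& kernel, int ksize) {
    const int r = ksize / 2;  // kernel radius, 6 for a 13x13 kernel
    std::vector<float> out(img.size(), 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float acc = 0.0f;
            for (int ky = 0; ky < ksize; ++ky) {
                const int sy = y + ky - r;
                if (sy < 0 || sy >= height) continue;  // zero padding (rows)
                for (int kx = 0; kx < ksize; ++kx) {
                    const int sx = x + kx - r;
                    if (sx < 0 || sx >= width) continue;  // zero padding (cols)
                    acc += img[sy * width + sx] * kernel[ky * ksize + kx];
                }
            }
            out[y * width + x] = acc;
        }
    }
    return out;
}
```

It would be called as `filter2d(image, width, height, kernel, 13)`. Each output pixel costs 169 multiply-adds, which is the part I would like to move to the GPU.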