I have been trying to implement 2D convolution with CUFFT in an audio plug-in, but my kernel (impulse response)
needs to be much larger than the input data array (generally about 100-1000 times larger). I’ve found lots of tutorials, but they’re always using a small kernel and a much larger data input (e.g. the CUFFT convolution2d example project and other image-processing examples).
How should I go about implementing it for a large kernel and small input data?
What I’m thinking is this:
Since convolution is commutative ( x[n] * y[n] = y[n] * x[n] ),
I can probably just swap the arrays: instead of copying my host input data
to the example’s input data array, I copy it to the kernel array.
Then I copy the filter kernel to the variable used for the input data in the example,
and voilà!
Is this true for this type of convolution (circular 2d)?
I’ve got some doubts, because I’ve seen that padding in 2D is not quite the same process as in 1D convolution: there are variables like kernel_x and kernel_y used in the padding.
By the way…
I’ve seen in some relevant tutorials that the kernel_x variable should be equal to half the kernel width,
and kernel_y equal to half the kernel height.
BUT, in the convolutionCUFFT2d example if I remember correctly:
You shouldn’t even need to swap pointers (I think). The sample code determines the appropriate array size to use for the 2-D FFT. For example, it chooses a padded width that is some convenient number (for optimal execution of the FFT) >= DATA_WIDTH + KERNEL_WIDTH - 1. That number stays the same regardless of which array you pass as the data and which as the kernel.
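To make the size argument concrete, here is a rough CPU-only sketch (no CUFFT involved). Everything in it is made up for illustration: the `Image` struct, `paddedSize`, `zeroPad` and `circularConvolve` are hypothetical helpers, and rounding up to the next power of two is only an assumption about what a "convenient" FFT size looks like; the SDK sample may round differently. It pads both arrays to the same size and does the circular convolution directly, so you can check that swapping the two operands changes nothing.

```cpp
#include <cstddef>
#include <vector>

// Smallest power of two >= n (assumed rounding rule, for illustration only).
std::size_t nextPow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Padded extent along one axis: large enough to hold the full linear
// convolution, so the circular result has no wrap-around contamination.
std::size_t paddedSize(std::size_t dataSize, std::size_t kernelSize) {
    return nextPow2(dataSize + kernelSize - 1);
}

struct Image {
    std::size_t w, h;
    std::vector<float> px;  // row-major, w * h elements
};

// Copy src into the top-left corner of a zero-filled pw x ph array.
Image zeroPad(const Image& src, std::size_t pw, std::size_t ph) {
    Image out{pw, ph, std::vector<float>(pw * ph, 0.0f)};
    for (std::size_t y = 0; y < src.h; ++y)
        for (std::size_t x = 0; x < src.w; ++x)
            out.px[y * pw + x] = src.px[y * src.w + x];
    return out;
}

// Direct 2-D circular convolution of two arrays of identical size.
// This is what the FFT -> pointwise multiply -> inverse FFT pipeline computes.
Image circularConvolve(const Image& a, const Image& b) {
    Image out{a.w, a.h, std::vector<float>(a.w * a.h, 0.0f)};
    for (std::size_t y = 0; y < a.h; ++y)
        for (std::size_t x = 0; x < a.w; ++x) {
            float acc = 0.0f;
            for (std::size_t j = 0; j < a.h; ++j)
                for (std::size_t i = 0; i < a.w; ++i) {
                    // Indices wrap modulo the padded size.
                    std::size_t yy = (y + a.h - j) % a.h;
                    std::size_t xx = (x + a.w - i) % a.w;
                    acc += a.px[j * a.w + i] * b.px[yy * a.w + xx];
                }
            out.px[y * a.w + x] = acc;
        }
    return out;
}
```

With a 2x2 array and a 5x3 array, `paddedSize` gives 8x4 whichever one is called the data, and `circularConvolve` returns the same values in either operand order. The direct loop is far too slow for real sizes; it is only meant as a reference to compare against.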
KERNEL_X/Y just refers to the center point of your kernel, and you’d probably need something similar in 1-D convolution to set up your data properly. If you were dealing with a 1-D signal, this would affect the causality of your filter (i.e. put the center at one end and the output is based entirely on past inputs, whereas at the other end it is based entirely on future inputs).
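A small 1-D sketch of the center-point idea, in case it helps. The function `convolveCentered` and its `center` parameter are hypothetical names I made up here, playing the role that KERNEL_X plays in the sample; this is plain CPU code, not what the sample actually does internally.

```cpp
#include <cstddef>
#include <vector>

// y[n] = sum over k of kernel[k] * signal[n + center - k],
// treating samples outside the signal as zero.
// center = 0                 -> uses the current and past samples only
// center = kernel.size() - 1 -> uses the current and future samples only
std::vector<float> convolveCentered(const std::vector<float>& signal,
                                    const std::vector<float>& kernel,
                                    std::size_t center) {
    const std::ptrdiff_t len = static_cast<std::ptrdiff_t>(signal.size());
    std::vector<float> out(signal.size(), 0.0f);
    for (std::size_t n = 0; n < signal.size(); ++n) {
        for (std::size_t k = 0; k < kernel.size(); ++k) {
            // Signed index, because n + center - k can land before sample 0.
            std::ptrdiff_t idx = static_cast<std::ptrdiff_t>(n + center) -
                                 static_cast<std::ptrdiff_t>(k);
            if (idx >= 0 && idx < len)
                out[n] += kernel[k] * signal[static_cast<std::size_t>(idx)];
        }
    }
    return out;
}
```

Feed it a single impulse at sample 3 and the kernel {1, 2, 3}: with `center = 0` the response lands on samples 3, 4, 5 (after the impulse), while with `center = 2` it lands on samples 1, 2, 3 (before it). That shift is all the center point controls.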
It’s been a while since I’ve done any signal processing though…