The CUDA convolution code in the SDK currently has a limit on the size of the convolution kernel. Could someone tell me whether there is an easy way to remove this limitation? I would like to use the code for convolution with kernels of larger radius, such as 15 or 20, and am wondering whether that is possible with the existing code base from the SDK.
There are several convolution-related samples in the CUDA Samples.
Which one are you referring to?
Sorry about that. I meant the separable convolution example (convolutionSeparable).