How do I change the kernel radius in the SDK convolutionSeparable example?


I’m trying to run the convolutionSeparable example from the CUDA SDK 2.3, and I want to change KERNEL_RADIUS. The documentation shows results for kernel radii between 1 and 31, but I can only get it to run with a radius of 8.

I tried changing the value of KERNEL_RADIUS in convolutionSeparable_common.h to 16 or 32, but when I compile and run, I get the following assertion failure:


$ ./convolutionSeparable
3072 x 1536
Allocating and intializing host arrays…
Allocating and initializing CUDA arrays…
Running GPU convolution (10 identical iterations)…
convolutionSeparable: void convolutionColumnsGPU(float*, float*, int, int): Assertion `8 * 1 >= 16' failed.



I tried playing around with some of the other constants, but then the correctness test fails.
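
For what it's worth, the failed assertion reads to me like a requirement that some block dimension (8) times a number of halo steps (1) be at least KERNEL_RADIUS (16). Assuming that reading is right, here is a minimal C sketch of the arithmetic. The helper name min_halo_steps and its parameter names are my own, made up for illustration and not taken from the SDK source. Only the values 8, 1 and 16 come from the message above.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the SDK): returns the smallest whole
 * number of halo steps such that block_dim * steps >= kernel_radius.
 * This is a ceiling division of kernel_radius by block_dim.
 */
static int min_halo_steps(int block_dim, int kernel_radius)
{
    assert(block_dim > 0 && kernel_radius > 0);
    return (kernel_radius + block_dim - 1) / block_dim;
}

/*
 * Mirrors the shape of the failing check, `8 * 1 >= 16`:
 * returns 1 if the halo is wide enough for the radius, 0 otherwise.
 */
static int halo_covers_radius(int block_dim, int halo_steps, int kernel_radius)
{
    return block_dim * halo_steps >= kernel_radius;
}
```

With a block dimension of 8, this gives 2 halo steps for a radius of 16 and 4 for a radius of 32. I have not confirmed that changing only that one constant keeps the correctness test passing.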

Has anyone been able to run this example with a different kernel radius? I would appreciate any help.