The best algorithm for a Gaussian filter in CUDA

Hello Everyone:
Happy new year! :haha:
This is my first time posting to the NVIDIA forum, and of course I am quite new to GPGPU (just a few weeks).
I was wondering: has anybody tried to implement a recursive Gaussian filter like the one Ian T. Young described in his paper? I do not mean Deriche's algorithm from the SDK examples, which has two disadvantages: it is not circularly symmetric in 2-D, and it is not the target filter, the Gaussian. From what I have read and from my experiments with the C code, Young's filter is the best Gaussian filter algorithm so far.
Ian T. Young's paper:
The improved parameters from another of his papers:
The boundary conditions for the filter: [attachment=8163:boundary…e_filter.pdf]

Anisotropic (not circularly symmetric) Gaussian filters with arbitrary orientation are also of great importance in image processing. Stanley and Bertram Shi proposed an algorithm that separates the anisotropic Gaussian filter into three one-dimensional Gaussian filters (the ordinary separable x and y filters plus one filter along the 45-degree or -45-degree diagonal), none of which needs interpolation. This makes the algorithm more parallel and well suited to GPGPU processing.
The paper by Stanley and Bertram E. Shi:

I am wondering whether someone with strong CUDA programming skills is interested in this code. :thumbup: If a CUDA version turned out to be much faster than the C code, it could become the standard Gaussian filter code for the GPU. It would also be very useful for me and for other researchers who want to use an NVIDIA graphics card to accelerate their whole program.

The C code I put together: (51.8 KB)


Stanley___TripleAxis_Decomposition.pdf (497 KB)
boundary_recursive_filter.pdf (72.1 KB)
Recursive_Gabor_Filtering.pdf (341 KB)
Recursive_implementation_of_the_Gaussian_filter.pdf (562 KB)

Is no one interested? :(

:wacko: :wacko: