Understanding CUDA box filter

Hey guys,

I am trying to understand how the CUDA box filter works. Here is a presentation with a very good example (http://www.nvidia.com/content/nvision2008/tech_presentations/Game_Developer_Track/NVISION08-Image_Processing_and_Video_with_CUDA.pdf). I understand how the rows are processed, but I don’t understand how the columns are processed, for example on slides 27 and 28.
Are the rows processed first, and once the row job is finished, are the columns processed with the same idea? A filter of width (2*r)+1? :(
Can somebody help me? :D

What I actually can’t understand is how the final value is calculated.
I understand that each thread calculates an average over a specific radius. But after that I can’t understand how these values are processed along the columns.
Is the calculation done for the rows first, and then the same calculation done for the columns?
Does each pixel value become the average of its neighbors along the row, and after that the same thing along the column?

Exactly. A box filter is a separable convolution.
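
To make the separability concrete, here is a minimal Python sketch. It is a CPU reference only, not the CUDA kernel from the presentation. The names `box_1d`, `box_separable`, and `box_direct` are made up for this illustration, and clamping coordinates at the image border is an assumption; the slides may treat edges differently. It suggests that a row pass followed by a column pass matches averaging the full (2r+1) x (2r+1) window directly.

```python
def box_1d(line, r):
    """Average each sample with its r neighbors on both sides."""
    n = len(line)
    width = 2 * r + 1
    out = []
    for i in range(n):
        total = 0.0
        for k in range(-r, r + 1):
            j = min(max(i + k, 0), n - 1)  # assumption: clamp at the border
            total += line[j]
        out.append(total / width)
    return out


def box_separable(img, r):
    """Row pass first, then a column pass over the row-pass result."""
    rows = [box_1d(row, r) for row in img]
    # zip(*rows) walks the columns; each column is filtered like a row.
    cols = [box_1d(list(col), r) for col in zip(*rows)]
    # Transpose back so the result is indexed as out[y][x].
    return [list(row) for row in zip(*cols)]


def box_direct(img, r):
    """Reference: average the whole (2r+1) x (2r+1) window in one go."""
    h, w = len(img), len(img[0])
    width = 2 * r + 1
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-r, r + 1):
                yy = min(max(y + dy, 0), h - 1)
                for dx in range(-r, r + 1):
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / (width * width)
    return out
```

The two-pass version touches 2*(2r+1) samples per pixel instead of (2r+1)^2, which is the usual reason for splitting the filter this way.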

Thanks for your reply @HannesF99
So all it does is add up (2r+1) pixel values and divide the sum by (2r+1) for the row, and then the same thing for the column?
That easy? :D

A box filter is conceptually easy, yes. See section 3.1.1 of http://www.bioss.ac.uk/people/chris/ch3.pdf
Using ‘running sums’ might give additional performance and is easy to implement on the CPU, but it is not straightforward to implement on the GPU.
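
As a rough illustration of the ‘running sums’ idea on the CPU, here is a minimal Python sketch for a single row. The name `box_1d_running` and the clamped border handling are assumptions for this example, not taken from the linked chapter or the slides. The window sum is updated by adding the sample that enters and subtracting the one that leaves, so the work per pixel does not grow with r.

```python
def box_1d_running(line, r):
    """Box filter of one row using a running window sum."""
    n = len(line)
    width = 2 * r + 1

    def at(j):
        # assumption: clamp out-of-range indices to the nearest valid sample
        return line[min(max(j, 0), n - 1)]

    # Full sum only once, for the window centered on sample 0.
    total = float(sum(at(k) for k in range(-r, r + 1)))
    out = [total / width]

    for i in range(1, n):
        # Slide the window one step: sample i+r enters, sample i-r-1 leaves.
        total += at(i + r) - at(i - r - 1)
        out.append(total / width)
    return out
```

Note that the loop carries `total` from one pixel to the next, so each output depends on the previous one.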


Thank you HannesF99