3D Separable Kernel: some questions

TranceMokes · March 16, 2009, 7:12pm

Hi,

The SDK examples include a 2D separable convolution kernel. Now I want to extend it to a 3D separable kernel. I have tried many different approaches, but I still have no solution.

This is the first time I'm working with CUDA this much, and some questions have come up.

Is it possible to extend the 2D separable kernel? I think so.

I have already programmed a benchmark, so I can see that there is a difference between my code and the CUDA version.

Row convolution is no problem, because it reads the row data in order: 1 2 3 4 5 6 7 8 9 10 … etc.

Column convolution is the problem, because it treats the columns of the next slice along the Z-axis as if they lay directly below the data of the previous slice. I will explain:

Slice 1      Slice 2

1 2 3 4      3 4 7 9
4 5 6 7      4 8 3 1
8 9 1 2      4 6 8 1

I want CUDA to do a separate column convolution on every slice, not treat the slices as one image.
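The per-slice behaviour described above can be sketched as a small CPU reference. This is only an illustration, not the SDK code: the function name `convolveColumnsPerSlice`, the memory layout (slices stored one after another, index = (z * height + y) * width + x), and the zero treatment of taps outside a slice are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Column (vertical) convolution applied to every slice independently.
// Assumed layout: index = (z * height + y) * width + x.
// Taps that fall outside the current slice are skipped (treated as zero),
// so the slice above or below never contributes to the result.
std::vector<float> convolveColumnsPerSlice(const std::vector<float>& src,
                                           int width, int height, int depth,
                                           const std::vector<float>& kernel)
{
    assert(kernel.size() % 2 == 1);  // odd length: 2 * radius + 1 taps
    assert(src.size() == static_cast<std::size_t>(width) * height * depth);

    const int radius = static_cast<int>(kernel.size() / 2);
    std::vector<float> dst(src.size(), 0.0f);

    for (int z = 0; z < depth; ++z) {
        // Offset of the first element of slice z.
        const std::size_t sliceBase =
            static_cast<std::size_t>(z) * width * height;
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                float sum = 0.0f;
                for (int k = -radius; k <= radius; ++k) {
                    const int yy = y + k;
                    if (yy < 0 || yy >= height) continue;  // stay inside this slice
                    sum += src[sliceBase + static_cast<std::size_t>(yy) * width + x]
                         * kernel[radius - k];
                }
                dst[sliceBase + static_cast<std::size_t>(y) * width + x] = sum;
            }
        }
    }
    return dst;
}
```

With the two 4x3 slices from the example and a kernel of three ones, the bottom-left value of slice 1 only sums 4 + 8, and the 3 at the top of slice 2 is not included. A result like this could serve as the reference in a benchmark that compares against the GPU output.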

Question time: I can separate the data along DATA_Z, but do I then need to launch more kernels? Would I lose much of the speedup because of this? I can't test it, because I don't have a working 3D separable convolution yet.

At the moment my program treats slice 1 and slice 2 as one slice:

Slice 1

1 2 3 4
4 5 6 7
8 9 1 2
3 4 7 9
4 8 3 1
4 6 8 1

I was thinking of using the Z dimension, but looking at the code of the separable convolution, I think that is impossible. Correct me if I'm wrong.

DATA_W = DATA_H = 256, DATA_Z = 10

dim3 blockGridRows(iDivUp(DATA_W, ROW_TILE_W), DATA_H*DATA_Z);

dim3 threadBlockRows(KERNEL_RADIUS_ALIGNED + ROW_TILE_W + KERNEL_RADIUS);

dim3 blockGridColumns(iDivUp(DATA_W, COLUMN_TILE_W), iDivUp(DATA_H, COLUMN_TILE_H), DATA_Z);
dim3 threadBlockColumns(COLUMN_TILE_W, 8, ???);
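One possible alternative to a third grid dimension is to keep the grid two-dimensional and fold DATA_Z into the y dimension, much as DATA_H*DATA_Z is used for the row grid. This may matter on hardware where grids are limited to two dimensions. The sketch below shows only the index arithmetic on the host. The helpers `foldedGridY` and `decodeBlockY` and the struct `TileCoord` are hypothetical names, and the tile height of 48 in the usage note is an assumption that happens to match the (16,6) column grid for DATA_H = 256.

```cpp
#include <cassert>

// Integer division rounding up; gives the same result as the SDK's iDivUp
// for positive arguments.
inline int iDivUp(int a, int b) { return (a + b - 1) / b; }

// Which tile inside a slice, and which slice, a block is responsible for.
struct TileCoord {
    int tileY;  // tile row index inside the slice
    int z;      // slice index
};

// Number of blocks in grid.y when every slice gets its own set of tile rows.
inline int foldedGridY(int dataH, int tileH, int dataZ)
{
    assert(dataH > 0 && tileH > 0 && dataZ > 0);
    return iDivUp(dataH, tileH) * dataZ;
}

// Inverse mapping: what a kernel would derive from blockIdx.y.
inline TileCoord decodeBlockY(int blockY, int dataH, int tileH)
{
    assert(blockY >= 0);
    const int tilesPerSlice = iDivUp(dataH, tileH);
    return TileCoord{ blockY % tilesPerSlice, blockY / tilesPerSlice };
}
```

For DATA_H = 256, a tile height of 48 and DATA_Z = 10, this gives 6 tile rows per slice and 60 blocks in y. Block 6 maps to tile 0 of slice 1, so a column tile never straddles two slices, provided the kernel also clamps its apron loads to the slice that `z` selects.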

Can someone explain to me why the column convolution needs fewer thread IDs?

blockGridRows x threadBlockRows: (2,256) x (152,1,1) = 77824 threads?

blockGridColumns x threadBlockColumns: (16,6) x (16,8) = 12288 threads?
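The two totals can be reproduced with a few lines of host code. The constant values below are assumptions: they are not stated in the post, but they are consistent with the quoted numbers (16 + 128 + 8 = 152 threads per row block, and 256 / 48 rounded up = 6 column tiles). `LaunchShape`, `ceilDiv`, `rowLaunch`, `columnLaunch` and `COLUMN_BLOCK_ROWS` are hypothetical names used only for this sketch.

```cpp
#include <cassert>

// Assumed tile constants, chosen to match the thread counts quoted above.
constexpr int KERNEL_RADIUS         = 8;
constexpr int KERNEL_RADIUS_ALIGNED = 16;
constexpr int ROW_TILE_W            = 128;
constexpr int COLUMN_TILE_W         = 16;
constexpr int COLUMN_TILE_H         = 48;
constexpr int COLUMN_BLOCK_ROWS     = 8;   // the literal 8 in threadBlockColumns

inline int ceilDiv(int a, int b) { return (a + b - 1) / b; }

struct LaunchShape {
    int gridX, gridY, blockX, blockY;
    long long totalThreads() const {
        return 1LL * gridX * gridY * blockX * blockY;
    }
};

// Row pass: one block per row tile, one thread per loaded element
// (left apron padded to the aligned radius, tile, right apron).
inline LaunchShape rowLaunch(int dataW, int dataH)
{
    assert(dataW > 0 && dataH > 0);
    return { ceilDiv(dataW, ROW_TILE_W), dataH,
             KERNEL_RADIUS_ALIGNED + ROW_TILE_W + KERNEL_RADIUS, 1 };
}

// Column pass: a block of COLUMN_TILE_W x 8 threads covers a tile that is
// COLUMN_TILE_H rows tall.
inline LaunchShape columnLaunch(int dataW, int dataH)
{
    assert(dataW > 0 && dataH > 0);
    return { ceilDiv(dataW, COLUMN_TILE_W), ceilDiv(dataH, COLUMN_TILE_H),
             COLUMN_TILE_W, COLUMN_BLOCK_ROWS };
}

// Tile rows each column thread has to cover if the tile is split evenly.
inline int rowsPerColumnThread() { return COLUMN_TILE_H / COLUMN_BLOCK_ROWS; }
```

Under these assumptions the row launch spends a thread on every loaded element, apron included, while each column block of 8 thread rows covers a 48-row tile. Each column thread would then have to loop over 6 rows, which is one plausible reason the column launch gets by with fewer thread IDs.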

Best regards,

Jorn
