Fast GPU convolution reference

Jimmy_Pettersson · November 24, 2013, 12:15pm

Hi!

I’m looking for the fastest available 2D convolution for 32-bit floating point to benchmark some of my own code against. Does anyone have any suggestions or benchmark results?

I noticed NPP has routines for this, and I’m sure there are MANY other implementations available. Can anyone give me some pointers?

Thanks!
Jim

eyalhir74 · November 24, 2013, 6:57pm

Silly question - did you try the SDK sample? How did your code stand up against that one?

pasoleatis · November 24, 2013, 7:22pm

Hello,

I run codes that perform convolutions as intermediate steps. I do them in k (inverse) space using the cuFFT library. It is easy to implement and very efficient if the range of the convolution is large, since you reduce everything to 3 FFTs (2 forward and 1 backward) and an element-wise matrix-matrix multiplication.
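
To make the steps concrete, here is a minimal CPU-side sketch of the same idea in plain Python, not the cuFFT code itself. It assumes a full linear convolution with zero padding; the function names (`fft_convolve2d`, `direct_convolve2d`, and the helpers) are made up for illustration, and the toy radix-2 FFT only stands in for what a GPU FFT library would do.

```python
import cmath

def fft(x, inverse=False):
    """Unnormalized radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft2(a, inverse=False):
    """2D transform: 1D FFT over every row, then over every column."""
    rows = [fft(r, inverse) for r in a]
    cols = [fft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def next_pow2(n):
    p = 1
    while p < n:
        p *= 2
    return p

def pad(a, rows, cols):
    """Copy a into the top-left corner of a zero-filled rows x cols array."""
    out = [[0j] * cols for _ in range(rows)]
    for i, r in enumerate(a):
        for j, v in enumerate(r):
            out[i][j] = complex(v)
    return out

def fft_convolve2d(image, kernel):
    """Full linear 2D convolution: 2 forward FFTs, a product, 1 inverse FFT."""
    oh = len(image) + len(kernel) - 1
    ow = len(image[0]) + len(kernel[0]) - 1
    # Power-of-two padding is required by this toy FFT only (assumption).
    ph, pw = next_pow2(oh), next_pow2(ow)
    fa = fft2(pad(image, ph, pw))    # forward FFT 1
    fb = fft2(pad(kernel, ph, pw))   # forward FFT 2
    prod = [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(fa, fb)]
    full = fft2(prod, inverse=True)  # inverse FFT
    scale = 1.0 / (ph * pw)          # the transforms above are unnormalized
    return [[full[i][j].real * scale for j in range(ow)] for i in range(oh)]

def direct_convolve2d(image, kernel):
    """Reference: direct spatial-domain full convolution."""
    oh = len(image) + len(kernel) - 1
    ow = len(image[0]) + len(kernel[0]) - 1
    out = [[0.0] * ow for _ in range(oh)]
    for i, row in enumerate(image):
        for j, v in enumerate(row):
            for m, krow in enumerate(kernel):
                for n, k in enumerate(krow):
                    out[i + m][j + n] += v * k
    return out
```

Comparing `fft_convolve2d(img, ker)` against `direct_convolve2d(img, ker)` on a small input is a quick sanity check. The direct loop costs on the order of (image size) x (kernel size), while the FFT route costs roughly the same regardless of kernel size, which is why it wins for large convolution ranges.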

CudaaduC · November 24, 2013, 7:42pm

Jim’s code always seems to be the fastest IMO, so please update us on your findings.
