Help: Shared memory vs. Caching in ConvolutionSeparable Example

Hello!

My question concerns the use of shared memory versus caching via texture lookups. In particular, I refer to the ‘convolutionSeparable’ example, in which a convolution is applied to an input array. The array is convolved by first processing the rows, then the columns (for caching speedups). This is done in two successive kernel functions. In each function, the corresponding data rows/columns are first read into a shared memory region, which is then used for doing the convolution. My question is: how does this speed things up? Isn’t that doubling the effort? If I read the texture values anyway, why not compute the convolution values for that row directly instead of first buffering the values in a shared segment, synching the threads and then doing all the calculation?
Since I am fairly new to CUDA, I apologize in advance if this is some kind of ‘noobish’ question, but I hope some of you can clear my mind about it :-)
Thanks a lot!

Each value is needed by many threads (every thread within KERNEL_RADIUS of it), but it only has to be fetched from global memory once. So instead of each thread doing 2*KERNEL_RADIUS reads from global memory, you do one global read per value and then 2*KERNEL_RADIUS reads from shared memory, which is much less costly.
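
To make this concrete, here is a minimal sketch of the shared-memory row pass (hypothetical names and simplified clamped border handling, not the exact SDK kernel; it assumes blockDim.x == TILE_W and a grid with one block row per image row):

// Sketch of a shared-memory row convolution (not the SDK's exact code).
// Each block loads a tile of its row plus a KERNEL_RADIUS halo on each
// side into shared memory, syncs, and then every thread computes its
// output entirely from the fast shared copy.

#define KERNEL_RADIUS 8
#define TILE_W        128

__constant__ float d_Kernel[2 * KERNEL_RADIUS + 1];

__global__ void convolutionRowSketch(float *d_Dst, const float *d_Src,
                                     int imageW)
{
    __shared__ float tile[TILE_W + 2 * KERNEL_RADIUS];

    const int rowStart = blockIdx.y * imageW;        // this block's row
    const int x        = blockIdx.x * TILE_W + threadIdx.x;

    // One global read per thread for the main tile (clamped at borders).
    tile[KERNEL_RADIUS + threadIdx.x] =
        d_Src[rowStart + min(max(x, 0), imageW - 1)];

    // The first KERNEL_RADIUS threads also fetch the left/right halos.
    if (threadIdx.x < KERNEL_RADIUS) {
        int left  = x - KERNEL_RADIUS;
        int right = x + TILE_W;
        tile[threadIdx.x] =
            d_Src[rowStart + min(max(left, 0), imageW - 1)];
        tile[KERNEL_RADIUS + TILE_W + threadIdx.x] =
            d_Src[rowStart + min(max(right, 0), imageW - 1)];
    }

    __syncthreads();  // tile complete; all further reads hit shared memory

    // 2*KERNEL_RADIUS + 1 reads per thread, all from shared memory.
    float sum = 0.0f;
    for (int k = -KERNEL_RADIUS; k <= KERNEL_RADIUS; k++)
        sum += d_Kernel[KERNEL_RADIUS + k]
             * tile[KERNEL_RADIUS + threadIdx.x + k];

    if (x < imageW)
        d_Dst[rowStart + x] = sum;
}

Without the shared tile, the inner loop would issue 2*KERNEL_RADIUS + 1 texture/global reads per thread, and neighbouring threads would redundantly re-fetch the same overlapping values; with it, each input value is fetched from global memory only once per block.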