Is it correct for mono image process

darot · January 22, 2009, 1:58am

Dear all:

After testing, I found that even when all loads and stores to global memory are coalesced, staging the data through shared memory still costs transfer time,

because I have to call __syncthreads() after the read, as in the code below.

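__shared__ float share[BLOCK_SIZESglLRX][BLOCK_SIZESglLRY];
	unsigned int xIndex, yIndex, index_in;
	// Global pixel coordinates handled by this thread
	xIndex = blockIdx.x * BLOCK_SIZESglLRX + threadIdx.x;
	yIndex = blockIdx.y * BLOCK_SIZESglLRY + threadIdx.y;
	index_in = yIndex * devImgSizeX + xIndex;
	// Stage one pixel per thread into shared memory; threads outside the image skip the load
	if (xIndex < devImgSizeX && yIndex < devImgSizeY) {
		share[threadIdx.x][threadIdx.y] = *(S + index_in);
	}
	// The barrier sits outside the conditional so every thread in the block reaches it
	__syncthreads();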

If I want to do a 2D convolution (not separable), such as a smoothing or Laplacian filter, I think using texture memory will be faster than using shared memory.

Is that right?
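
For reference, a shared-memory version of a non-separable 3x3 filter needs a one-pixel halo around each tile, which the one-pixel-per-thread load above does not fetch. Here is a minimal sketch, assuming a 4-neighbour Laplacian, zero padding at the image border, and a 16x16 block. `TILE`, `laplacianShared` and `runLaplacianShared` are illustrative names, not part of the original code, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

#define TILE 16  // hypothetical block edge; the kernel assumes blockDim == (TILE, TILE)

// 4-neighbour Laplacian computed from a shared tile with a 1-pixel halo.
// Pixels outside the image are treated as zero (an assumption of this sketch).
__global__ void laplacianShared(const float* src, float* dst, int w, int h)
{
    __shared__ float tile[TILE + 2][TILE + 2];  // indexed [row][col]

    int x0 = blockIdx.x * TILE - 1;  // image coordinates of tile[0][0]
    int y0 = blockIdx.y * TILE - 1;

    // Cooperative load: TILE*TILE threads fill (TILE+2)*(TILE+2) entries.
    int tid = threadIdx.y * TILE + threadIdx.x;
    for (int i = tid; i < (TILE + 2) * (TILE + 2); i += TILE * TILE) {
        int ty = i / (TILE + 2), tx = i % (TILE + 2);
        int gx = x0 + tx, gy = y0 + ty;
        bool inside = (gx >= 0 && gx < w && gy >= 0 && gy < h);
        tile[ty][tx] = inside ? src[gy * w + gx] : 0.0f;
    }
    __syncthreads();  // no thread has returned early, so all reach the barrier

    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < w && y < h) {
        int tx = threadIdx.x + 1, ty = threadIdx.y + 1;  // skip the halo
        dst[y * w + x] = tile[ty - 1][tx] + tile[ty + 1][tx]
                       + tile[ty][tx - 1] + tile[ty][tx + 1]
                       - 4.0f * tile[ty][tx];
    }
}

// Hypothetical host helper: filters a w*h image and returns the result.
std::vector<float> runLaplacianShared(const std::vector<float>& img, int w, int h)
{
    size_t bytes = size_t(w) * h * sizeof(float);
    float *dSrc = nullptr, *dDst = nullptr;
    cudaMalloc(&dSrc, bytes);
    cudaMalloc(&dDst, bytes);
    cudaMemcpy(dSrc, img.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((w + TILE - 1) / TILE, (h + TILE - 1) / TILE);
    laplacianShared<<<grid, block>>>(dSrc, dDst, w, h);

    std::vector<float> out(img.size());
    cudaMemcpy(out.data(), dDst, bytes, cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(dSrc);
    cudaFree(dDst);
    return out;
}
```

The tile is indexed `[row][col]` so that threads with consecutive `threadIdx.x` touch consecutive shared-memory words. Indexing it as `share[threadIdx.x][threadIdx.y]` makes neighbouring threads stride by a whole row, which may cause bank conflicts depending on the block size.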

tatou1234 · January 22, 2009, 7:46am

I’m not sure if it is faster, but I know that shared memory is very fast. Are you interested in computer vision? Do you know where I can find information about it and CUDA?

MisterAnderson42 · January 22, 2009, 12:35pm

Try it and find out!
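
One way to try it is to time the kernel with CUDA events. Here is a minimal sketch of the texture side, assuming the same 4-neighbour Laplacian, point filtering, and border addressing so that out-of-image fetches return zero. It uses the texture object API, which is newer than this thread. `laplacianTex` and `timeLaplacianTex` are hypothetical names, the 16x16 block is an arbitrary choice, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

// 4-neighbour Laplacian where every tap is a texture fetch.
__global__ void laplacianTex(cudaTextureObject_t tex, float* dst, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    float fx = x + 0.5f, fy = y + 0.5f;  // texel centres for unnormalized coordinates
    dst[y * w + x] = tex2D<float>(tex, fx, fy - 1.0f) + tex2D<float>(tex, fx, fy + 1.0f)
                   + tex2D<float>(tex, fx - 1.0f, fy) + tex2D<float>(tex, fx + 1.0f, fy)
                   - 4.0f * tex2D<float>(tex, fx, fy);
}

// Hypothetical helper: writes the filtered image to `out`, returns kernel time in ms.
float timeLaplacianTex(const std::vector<float>& img, int w, int h, std::vector<float>& out)
{
    // Put the image in a cudaArray and bind it to a texture object.
    cudaChannelFormatDesc fmt = cudaCreateChannelDesc<float>();
    cudaArray_t arr = nullptr;
    cudaMallocArray(&arr, &fmt, w, h);
    cudaMemcpy2DToArray(arr, 0, 0, img.data(), w * sizeof(float),
                        w * sizeof(float), h, cudaMemcpyHostToDevice);

    cudaResourceDesc res = {};
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;
    cudaTextureDesc td = {};
    td.addressMode[0] = cudaAddressModeBorder;  // out-of-range reads return 0
    td.addressMode[1] = cudaAddressModeBorder;
    td.filterMode = cudaFilterModePoint;
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;
    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &res, &td, nullptr);

    size_t bytes = size_t(w) * h * sizeof(float);
    float* dDst = nullptr;
    cudaMalloc(&dDst, bytes);
    dim3 block(16, 16);
    dim3 grid((w + 15) / 16, (h + 15) / 16);

    laplacianTex<<<grid, block>>>(tex, dDst, w, h);  // untimed warm-up launch

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    laplacianTex<<<grid, block>>>(tex, dDst, w, h);  // timed launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    out.resize(img.size());
    cudaMemcpy(out.data(), dDst, bytes, cudaMemcpyDeviceToHost);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dDst);
    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    return ms;
}
```

Wrapping a shared-memory kernel in the same event pair, on the same image and block size, gives a like-for-like number. A single launch on a small image is noisy, so averaging over many launches on a realistic image size is a safer basis for comparison.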
