Is this coalesced access? Global memory access in a for loop followed by a divergent while loop

whitewatercn · January 5, 2009, 8:44am

I have a kernel that accesses global memory in a for loop, followed by a data-dependent, divergent while loop.

[codebox]__global__ void my_kernel(int *g_data, int rows, int cols, int *g_results, …)
{
    int i = 0, j;
    extern __shared__ int s_data[];
    int word;
    int res = 0;
    const int tid = blockDim.x * blockIdx.x + threadIdx.x;

    /* some code to assign the s_data */
    __syncthreads();

    for (j = 0; j < rows; ++j)
    {
        //__syncthreads();
        word = g_data[j * cols + tid];
        // check i < cols first so s_data[i] is not read past the bound
        while (i < cols && s_data[i] < word)
        {
            ++i;
        }
        if (i < cols && s_data[i] == word)
            res += s_data[i];
        //__syncthreads();
    }
    g_results[tid] = res;
}[/codebox]

Is this coalesced access? I have tried adding __syncthreads() in the for loop, but the performance doesn’t improve.

I am new to CUDA programming, so any suggestions for my code are greatly appreciated.

Thanks

Sarnath · January 5, 2009, 8:59am

Yes, it is coalesced access… The read from g_data is coalesced…
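
To make the reasoning concrete, here is a minimal sketch of the same indexing pattern. The helper names (element_index, column_sum, reads_are_contiguous) and the sizes in main are made up for illustration and are not from the original kernel. The point is only that for a fixed j, thread tid reads element j*cols+tid, so neighbouring threads read neighbouring ints.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Index read by thread `tid` on row `j`, same formula as in the question.
// (Hypothetical helper, introduced only for this illustration.)
__host__ __device__ inline int element_index(int j, int cols, int tid) {
    return j * cols + tid;
}

// Each thread walks down one column. On every iteration of j, consecutive
// threads read consecutive ints of g_data.
__global__ void column_sum(const int *g_data, int rows, int cols, int *g_results) {
    const int tid = blockDim.x * blockIdx.x + threadIdx.x;
    if (tid >= cols) return;
    int res = 0;
    for (int j = 0; j < rows; ++j)
        res += g_data[element_index(j, cols, tid)];
    g_results[tid] = res;
}

// Host-side check: do the `width` threads starting at `first_tid` read
// adjacent elements on row j?
bool reads_are_contiguous(int j, int cols, int first_tid, int width) {
    const int base = element_index(j, cols, first_tid);
    for (int t = 1; t < width; ++t)
        if (element_index(j, cols, first_tid + t) != base + t) return false;
    return true;
}

int main() {
    const int rows = 4, cols = 64;  // illustrative sizes, an assumption
    static int h_data[rows * cols];
    int h_results[cols];
    for (int k = 0; k < rows * cols; ++k) h_data[k] = k % 7;

    int *d_data = nullptr, *d_results = nullptr;
    if (cudaMalloc(&d_data, sizeof(h_data)) != cudaSuccess ||
        cudaMalloc(&d_results, sizeof(h_results)) != cudaSuccess) {
        printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemcpy(d_data, h_data, sizeof(h_data), cudaMemcpyHostToDevice);
    column_sum<<<2, 32>>>(d_data, rows, cols, d_results);
    cudaError_t err = cudaMemcpy(h_results, d_results, sizeof(h_results),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("warp 0 contiguous on row 1: %d\n", reads_are_contiguous(1, cols, 0, 32));
    printf("column 0 sum: %d\n", h_results[0]);
    cudaFree(d_data);
    cudaFree(d_results);
    return 0;
}
```

Two caveats, as far as I understand it. On older hardware, alignment matters as well as contiguity, so if cols is not a multiple of 16 the start of each row may be misaligned for a half-warp and the reads may be split. And __syncthreads() is only a barrier; it does not change how the loads are grouped, which fits with it making no difference to your timings. The divergent while loop costs execution efficiency, but it does not affect whether the g_data read is coalesced.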
