global memory access disabled in a loop

KevinIfremer · July 3, 2009, 11:56am

Hi,

I have a problem with the following kernel code:

for(int x = kernelradius; x <= nbNiveaux - kernelradius - 1; x++)
{
    for(int y = kernelradius; y <= nbNiveaux - kernelradius - 1; y++)
    {
        sommediv = 0;
        unsigned int index_cmat = __umul24(i, nbNiveaux) + __umul24(__umul24(j, d), nbNiveaux) + x + __umul24(y , d); // (x,y)
        float cmatxy = Cmats[index_cmat];
        for(int kx = -kernelradius; kx <= kernelradius; kx++)
        {
            for(int ky = -kernelradius; ky <= kernelradius; ky++)
            {
                dx = x + kx;
                dy = y + ky;
                unsigned int index_cmatf = __umul24(i, nbNiveaux) + __umul24(__umul24(j, d), nbNiveaux) + dx + __umul24(dy , d); // (dx,dy)
                CmatsF[index_cmatf] = h[kernelradius + kx + (kernelradius + ky) * largeurFiltre[0]];
            }
        }
    }
}

CmatsF and h are global memory arrays. Reading elements of h outside the inner kx/ky loops works perfectly, but reading them inside those loops seems to crash the kernel (or CUDA), and global memory is cleared.

Can someone explain why I can't access h in these loops?

Thanks

KevinIfremer · July 3, 2009, 12:41pm

More generally, any read from global memory inside the loop crashes CUDA.

demnim · July 7, 2009, 7:29am

h[kernelradius + kx + (kernelradius + ky) * largeurFiltre[0]];

looks like it is going out of range. Check the values in device emulation mode with a printf.
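The range of that index can also be worked out on the host before launching anything. Below is a minimal sketch in plain C++ (it should also compile under nvcc). The names `IndexRange`, `filterIndexRange`, `filterAccessInBounds`, `filterWidth` and `hElems` are made up for illustration: `filterWidth` stands for whatever value `largeurFiltre[0]` holds, and `hElems` for the number of floats actually allocated for `h`. The sketch assumes the filter width is positive.

```cpp
#include <cassert>

// Inclusive range of indices visited by
//   h[kernelradius + kx + (kernelradius + ky) * filterWidth]
// when kx and ky both run over [-kernelradius, kernelradius].
struct IndexRange {
    int lo;
    int hi;
};

// filterWidth is a hypothetical stand-in for the value of largeurFiltre[0].
IndexRange filterIndexRange(int kernelradius, int filterWidth) {
    assert(kernelradius >= 0 && filterWidth > 0);
    // kernelradius + kx and kernelradius + ky each span 0 .. 2 * kernelradius.
    int span = 2 * kernelradius;
    IndexRange r;
    r.lo = 0;                          // reached at kx = ky = -kernelradius
    r.hi = span + span * filterWidth;  // reached at kx = ky = +kernelradius
    return r;
}

// hElems is a hypothetical name for the number of floats allocated for h.
// Returns true when every index in the range falls inside the allocation.
bool filterAccessInBounds(int kernelradius, int filterWidth, int hElems) {
    IndexRange r = filterIndexRange(kernelradius, filterWidth);
    return r.lo >= 0 && r.hi < hElems;
}
```

If the filter is meant to be square with `filterWidth == 2 * kernelradius + 1`, the largest index comes out as `filterWidth * filterWidth - 1`, which fits exactly. A larger value in `largeurFiltre[0]`, or a smaller allocation for `h`, pushes the last reads past the end of the array. Calling `filterAccessInBounds` with the real values just before the kernel launch is a cheap way to rule this out.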
