Constant memory pointer dereference issue

Hi,

I seem to have found a rather perplexing error when using constant memory. I’m currently working on my M.Sc. thesis, and my general area of study is acceleration of image analysis operations. Anyway, I’ll spare you the boring details and get straight to the issue at hand. When iterating over an image kernel (a filter kernel, i.e. a matrix) stored in constant memory, I noticed that the code returned all zeroes whenever the array index was a sum of the two loop variables. The following code example illustrates this:

__kernel void add(__constant float *a, __global float *answer)
{
	const int id_x = (int)get_global_id(0);
	int i,j;
	float sum = 0;

	for(i = 0; i < 3; i++)
	{
		for(j = 0; (j < 3); j++)
		{
			//This will not work
			sum +=  a[i+j];
			/*
			This will work:
			sum += a[j*i];

			So will this:
			sum += a[j];

			And this:
			sum += a[i];
			*/
		}
	}
	//Write result
	answer[id_x] = sum;
}

The size of the a array is 32 (the same as the global work size), so there’s no risk of going out of bounds or of running out of constant memory. This code runs just fine on the CPU, and it also runs fine on the GPU if *a is made __global. However, when running the code above on the GPU (specifically an 8600M GT in a 15" MacBookPro4,1 with the CUDA 3.0 SDK installed), it simply returns zeroes. I poked around some more and discovered a workaround: if I index with a third variable k that is incremented in the inner loop but never reset to 0 when the inner loop completes, the code functions properly:

int k = 0;

for(i = 0; i < 3; i++)
{
	for(j = 0; (j < 3); j++,k++)
	{
		//This works
		sum +=  a[i+k];
	}
}
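To tell "returns zeroes" apart from "returns the wrong non-zero value", it may help to have a host-side reference for each indexing variant. The following is only a minimal sketch in plain C, not the actual host code: the names reference_sum, fill_ramp and the index_mode enum are hypothetical, and the input a[n] = n is an assumption, since the real test data is not given above.

```c
#include <stddef.h>

/* Indexing variants taken from the kernel and the workaround above. */
enum index_mode {
    IDX_I_PLUS_J,   /* a[i+j]: returns zeroes on the GPU */
    IDX_I_TIMES_J,  /* a[j*i] */
    IDX_J_ONLY,     /* a[j]   */
    IDX_I_ONLY,     /* a[i]   */
    IDX_I_PLUS_K    /* a[i+k]: k is never reset */
};

/* Hypothetical helper: runs the same 3x3 double loop as the kernel body
   on the CPU and returns the sum for the chosen indexing variant. */
float reference_sum(const float *a, enum index_mode mode)
{
    float sum = 0.0f;
    int k = 0; /* running counter, incremented with j but never reset */

    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++, k++) {
            switch (mode) {
            case IDX_I_PLUS_J:  sum += a[i + j]; break;
            case IDX_I_TIMES_J: sum += a[j * i]; break;
            case IDX_J_ONLY:    sum += a[j];     break;
            case IDX_I_ONLY:    sum += a[i];     break;
            case IDX_I_PLUS_K:  sum += a[i + k]; break;
            }
        }
    }
    return sum;
}

/* Assumed test data: a[n] = n, so every expected sum is a small integer. */
void fill_ramp(float *a, size_t n)
{
    for (size_t idx = 0; idx < n; idx++)
        a[idx] = (float)idx;
}
```

With a 32-element ramp as input, the a[i+j] variant should give 18 and the a[i+k] variant 45. The two differ because a[i+k] reads elements up to index 10 rather than index 4, so the workaround avoids the zeroes but does not compute the same sum as the original loop.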

Have I missed some rule about pointer arithmetic and stumbled into the world of undefined behaviour, or is this a compiler error?

Welcome to the strange world of the OpenCL compiler :P

I don’t know, but I’ve had a similar issue: using str[i++] = 'a'; followed by str[i] = '\0'; gave me a crash. But if I do str[i] = 'a'; followed by str[i+1] = '\0'; it works :P
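For what it’s worth, in standard C the two forms write exactly the same bytes, so a difference in behaviour points at the compiler rather than the source. A minimal sketch in plain C, assuming the buffer has room for both writes (the function names are hypothetical, and this is not the original kernel code):

```c
/* Form 1: post-increment, then terminate at the new index. */
void append_postinc(char *str, int i)
{
    str[i++] = 'a';
    str[i] = '\0';
}

/* Form 2: keep the index fixed, terminate at i + 1. */
void append_offset(char *str, int i)
{
    str[i] = 'a';
    str[i + 1] = '\0';
}
```

The only difference is that the first form leaves the local i incremented afterwards; the contents of the buffer end up identical.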

If it’s working, move on to the next problem :P Don’t bother trying to understand these weird behaviours unless they are causing other problems somewhere.