With the code:
max1 = sdata[0];
// Halve size if size isn't a power of 2
size = ((size&(size-1))==0) ? size : size/2;
I’m getting the “Cannot tell what pointer points to, assuming global memory space” warning on the “size = …” line, even though “size” was declared as “unsigned long size;” (its value was read from an array, though).
When I comment out the “size = …” line, the warning is reported for the “max1 = …” line instead, even though it wasn’t reported there while the “size = …” line was present. The variable declarations are:
__shared__ float sdata[N_THREADS_IN_BLOCK*2 + 1];
float max1;
So in summary: when no code is commented out, the warning appears for the “size = …” line, and when the “size = …” line is commented out, the warning appears for the “max1 = …” line.
As a side note, I’m also getting these warnings when accessing 2D arrays (for example, a[i][tid]) that have been passed as arguments (float **a). Everything works well with the emulator, but this problem seems to be corrupting my results on the actual GPU.
Simply put, this message means that the compiler cannot determine what a pointer points to, and therefore cannot tell which memory space it refers to.
Normally, in the following example:
int i;
foo(&i);
a compiler can tell that any reference to the pointer parameter in foo points to the int i. This kind of analysis is used for various optimizations and correctness checks, but nvcc also needs it to determine which address space the pointer refers to.
This type of analysis gets very hairy when you start loading pointers from memory rather than passing them directly as parameters to a function call. For example:
int i;
int* iptr = &i;
foo(&iptr);
is hard:
int i;
int* iptr;
int* tmp = &i; // &&i is not valid; copy the address of i through a temporary
memcpy( &iptr, &tmp, sizeof(int*) );
foo(&iptr);
is very hard.
int i;
__shared__ int j;
int* iptr;
if( runtimeDeterminedVariable )
{
iptr = &i;
}
else
{
iptr = &j;
}
foo(&iptr);
is impossible.
In CUDA, if the compiler can’t figure out which memory space a dereferenced pointer refers to, it defaults to global memory. This produces incorrect results if the pointer actually points to shared memory. As of now there is no way to force it to use shared or global memory; it always uses global memory. You can get around this by making the compiler’s job easier, though:
For example, rather than allocating a multidimensional array, you might want to allocate a flat array and then index into it.
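Here is a minimal sketch of what the flat-array approach could look like, assuming a row-major layout. The kernel name scaleFlat, the host helper scaleOnDevice, and the matrix sizes are hypothetical and only for illustration; error checking is left out to keep it short. The point is that the kernel dereferences a single pointer that arrives directly as a parameter, instead of loading row pointers from memory as a[i][tid] does with float **a.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: scales every element of a rows x cols matrix stored
// as one contiguous row-major array. Element (i, j) lives at a[i * cols + j].
__global__ void scaleFlat(float* a, int rows, int cols, float factor)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;  // column index
    int i = blockIdx.y;                             // row index
    if (i < rows && j < cols)
    {
        // Only the kernel parameter itself is dereferenced; no pointer is
        // loaded from memory first.
        a[i * cols + j] *= factor;
    }
}

// Hypothetical host helper: copies the flat array to the device, runs the
// kernel, and copies the result back into h.
void scaleOnDevice(float* h, int rows, int cols, float factor)
{
    size_t bytes = (size_t)rows * cols * sizeof(float);
    float* d = 0;
    cudaMalloc((void**)&d, bytes);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);

    dim3 block(32, 1);
    dim3 grid((cols + block.x - 1) / block.x, rows);  // one grid row per matrix row
    scaleFlat<<<grid, block>>>(d, rows, cols, factor);

    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
}

int main()
{
    const int rows = 4, cols = 8;
    float h[rows * cols];
    for (int k = 0; k < rows * cols; ++k) h[k] = (float)k;

    scaleOnDevice(h, rows, cols, 2.0f);

    // Element (1, 3) started as 1 * cols + 3 = 11 and was doubled.
    assert(h[1 * cols + 3] == 22.0f);
    printf("h[1][3] = %f\n", h[1 * cols + 3]);
    return 0;
}
```

On the host side this also means one cudaMalloc and one cudaMemcpy for the whole matrix, rather than one allocation per row plus an array of row pointers.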