In what situations does the compiler use shared memory on its own? I'm currently compiling a kernel
in which I use an array of 512 ints in shared memory, plus 4 more variables in shared memory,
declared as follows:
__global__ void KDKernelBroad(unsigned int *OutNodes, unsigned int Level, unsigned int *OutIndices, const unsigned int IndicesMAX)
__shared__ unsigned int Counts[512];
__shared__ Split, TMin, TMax, TMid;
/// kernel code follows
but nvcc reports:
1>ptxas info : Compiling entry function '__globfunc__Z15NodeKernelBroadPjjS_j'
1>ptxas info : Used 30 registers, 2064+1104 bytes smem, 40 bytes cmem, 64 bytes cmem
OK, I can see my 2064 bytes (512 x 4 for the array plus 4 x 4 for the four variables), but what
the hell are those 1104 additional bytes of shared memory allocated by the compiler? Kernel
parameters don't take that much space!
What constructs allocate additional shared memory?
Looking at the PTX assembly, those 1104 bytes of shared memory are never referenced.
Any ideas what this might be?
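
One way to cross-check the ptxas figures might be to ask the CUDA runtime what it thinks the kernel's static shared memory size is. The following is only a minimal sketch under stated assumptions: SmemProbeKernel is a hypothetical stand-in, not the real kernel, and the float type of the four variables is assumed (the 2064-byte total only implies they are 4 bytes each). It compares the hand-computed size of the declared variables with the sharedSizeBytes field filled in by cudaFuncGetAttributes.

```cuda
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>

// Size of the shared variables as declared: 512 ints plus four 4-byte values.
// The float type of the four scalars is an assumption.
constexpr size_t kDeclaredSmemBytes =
    512 * sizeof(unsigned int) + 4 * sizeof(float);

// Hypothetical stand-in kernel with the same static shared layout as in the
// post. It is written for a block of 512 threads and is never launched here.
__global__ void SmemProbeKernel(unsigned int *out)
{
    __shared__ unsigned int Counts[512];
    __shared__ float Split, TMin, TMax, TMid;

    // Use every shared variable so the compiler has a reason to keep it.
    Counts[threadIdx.x] = out[threadIdx.x];
    if (threadIdx.x == 0) {
        Split = (float)out[0];
        TMin  = (float)out[1];
        TMax  = (float)out[2];
        TMid  = (float)out[3];
    }
    __syncthreads();
    out[threadIdx.x] = Counts[511 - threadIdx.x]
                     + (unsigned int)(Split + TMin + TMax + TMid);
}

int main()
{
    // Query the per-kernel resource usage recorded by the toolchain.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, SmemProbeKernel);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaFuncGetAttributes failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    printf("declared by hand : %zu bytes\n", kDeclaredSmemBytes);
    printf("sharedSizeBytes  : %zu bytes\n", attr.sharedSizeBytes);
    printf("registers/thread : %d\n", attr.numRegs);
    return 0;
}
```

If the runtime value differs from the hand-computed one, the difference would be whatever the toolchain adds on top of the declared variables for that target, which is the quantity in question here.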