Wrong shared memory allocation

Hello,

I’m using CUDA 2.0. I have written a kernel which uses shared memory. All shared memory variables are declared as follows:

[codebox]

__shared__ float errork[4], tmpData[1000];

[/codebox]

However, the nvcc output below makes it look as if tmpData has been allocated at double its size:

[codebox]

1>ptxas info : Compiling entry function ‘_globfunc__Z5kTestjPfS_jjfS

1>ptxas info : Used 6 registers, 4060+4044 bytes smem, 16 bytes cmem[1]

[/codebox]

If I change the declaration to:

[codebox]

__shared__ float errork[2], tmpData[1000];

[/codebox]

Then the nvcc output magically seems fine:

[codebox]

1>ptxas info : Compiling entry function ‘_globfunc__Z5kTestjPfS_jjfS

1>ptxas info : Used 6 registers, 4052+52 bytes smem, 16 bytes cmem[1]

[/codebox]

Hi,

I’ve asked this question in the past, but couldn't get an “official” answer :)

However, in both cases the amount of shared memory used is about 4060 bytes (4060 and 4052 in your two builds). As far as I can tell, the second number (i.e. the one after the + sign) is something internal, or a subset of the number to the left of the + sign.

You can ignore it.
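As a rough sanity check on the first number, here is a small host-side C++ sketch (it should also compile as part of a .cu file). It only redoes the arithmetic from the two declarations and the two ptxas lines quoted above. The helper names `declaredBytes`, `overheadBytes` and `printSmemReport` are made up for this example. The two arrays account for 4016 and 4008 bytes, and the remaining 44 bytes are identical in both builds. My assumption, not something ptxas states, is that this fixed remainder is kernel arguments and built-in variables, which as far as I know are kept in shared memory on this generation of hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

static_assert(sizeof(float) == 4, "this sketch assumes 4-byte floats");

// Bytes taken by:  __shared__ float errork[errorkLen], tmpData[tmpDataLen];
constexpr std::size_t declaredBytes(std::size_t errorkLen, std::size_t tmpDataLen) {
    return (errorkLen + tmpDataLen) * sizeof(float);
}

// Part of the first ptxas smem number that the declared arrays do not explain.
constexpr std::size_t overheadBytes(std::size_t reported, std::size_t declared) {
    return reported - declared;
}

// Compares the declarations with the first smem number from each ptxas line.
void printSmemReport() {
    const std::size_t declaredA = declaredBytes(4, 1000);  // errork[4] build
    const std::size_t declaredB = declaredBytes(2, 1000);  // errork[2] build
    const std::size_t reportedA = 4060;  // from "4060+4044 bytes smem"
    const std::size_t reportedB = 4052;  // from "4052+52 bytes smem"

    // The unexplained remainder should not depend on the size of errork.
    assert(overheadBytes(reportedA, declaredA) == overheadBytes(reportedB, declaredB));

    std::printf("errork[4]: declared %zu, reported %zu, remainder %zu\n",
                declaredA, reportedA, overheadBytes(reportedA, declaredA));
    std::printf("errork[2]: declared %zu, reported %zu, remainder %zu\n",
                declaredB, reportedB, overheadBytes(reportedB, declaredB));
}
```

Calling `printSmemReport()` from `main` prints both breakdowns. If tmpData had really been allocated twice, the declared part alone would come to roughly 8000 bytes, and neither first number shows that.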

eyal