Wrong shared memory allocation

ekon · May 20, 2009, 5:46pm

hello,

I’m using CUDA 2.0. I have written a kernel which uses shared memory. All shared memory variables are declared as follows:

[codebox]
__shared__ float errork[4], tmpData[1000];
[/codebox]

However, nvcc reports the following, which makes it look as if tmpData has been allocated at twice its size:

[codebox]
1>ptxas info : Compiling entry function '_globfunc__Z5kTestjPfS_jjfS'
1>ptxas info : Used 6 registers, 4060+4044 bytes smem, 16 bytes cmem[1]
[/codebox]

If I change the declaration to:

[codebox]
__shared__ float errork[2], tmpData[1000];
[/codebox]

Then the nvcc output magically seems fine:

[codebox]
1>ptxas info : Compiling entry function '_globfunc__Z5kTestjPfS_jjfS'
1>ptxas info : Used 6 registers, 4052+52 bytes smem, 16 bytes cmem[1]
[/codebox]

eyalhir74 · May 20, 2009, 7:19pm

Hi,

I’ve asked this question in the past, but couldn’t get an “official” answer :)

However, in both cases the amount of shared memory used is about 4060 bytes (in your case). The second number (i.e. the one after the + sign) is something internal, or a subset of the number to the left of the + sign.

You can ignore it.

eyal
