Where are the other 32 bytes of shared memory?

Hi everyone,

Recently, while optimizing my program, I found that the shared memory size reported in the cubin file is larger than what I actually use in the kernel (by exactly 32 bytes). Could someone tell me where this extra shared memory is used?

[codebox]/************************************************************************/
/* calculate the maximum and minimum of a vector                         */
/************************************************************************/
template <int N>  // non-type template parameter (instantiated with 2 in the cubin below); unused in this excerpt
__global__ void CalMaxMinD1(int np, float* idata, float* odatamax, float* odatamin)
{
	int tid = threadIdx.x;
	int i = __mul24(blockIdx.x, blockDim.x) + threadIdx.x;
	int gridsize = __mul24(gridDim.x, blockDim.x);

	__shared__ float maxtemp[256];
	__shared__ float mintemp[256];

	// load
	maxtemp[tid] = idata[i];
	mintemp[tid] = maxtemp[tid];
	i += gridsize;

	while (i < np)
	{
		float temp = idata[i];
		if (maxtemp[tid] < temp) { maxtemp[tid] = temp; }
		if (mintemp[tid] > temp) { mintemp[tid] = temp; }
		i += gridsize;
	}

	__syncthreads();

	// ... do the reduction work ...

	// output
	if (tid == 0)
	{
		odatamax[blockIdx.x] = maxtemp[0];
		odatamin[blockIdx.x] = mintemp[0];
	}
}[/codebox]
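The reduction step is only indicated by a comment above; it is the usual shared-memory tree reduction, roughly like the following sketch (assuming the kernel is launched with blockDim.x == 256 to match the array sizes):

[codebox]// rough sketch of the elided reduction, placed where the comment is
for (int s = blockDim.x / 2; s > 0; s >>= 1)
{
	if (tid < s)
	{
		if (maxtemp[tid] < maxtemp[tid + s]) { maxtemp[tid] = maxtemp[tid + s]; }
		if (mintemp[tid] > mintemp[tid + s]) { mintemp[tid] = mintemp[tid + s]; }
	}
	__syncthreads();
}[/codebox]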

In this code I only use two shared memory arrays, which is 256 * 4 bytes * 2 = 2048 bytes in total. Yet the cubin file shows that I used 2080 bytes of shared memory. Could you please tell me why?

[codebox]code {
	name = _Z11CalMaxMinD1ILi2EEviPfS0_S0_
	lmem = 0
	smem = 2080
	reg  = 8
	bar  = 1
	const {
		segname = const
		segnum  = 1
		offset  = 0
		bytes   = 4
		mem {
			0x0000001f
		}
	}
	bincode {.......
	}
}[/codebox]
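For reference, the static shared-memory figure from the cubin can also be read back at run time with cudaFuncGetAttributes. A minimal sketch (it assumes the kernel definition above is visible in the same translation unit; sharedSizeBytes includes the bytes the compiler reserves on top of the user arrays):

[codebox]#include <cstdio>
#include <cuda_runtime.h>

int main()
{
	// query the attributes of the instantiation named in the cubin (CalMaxMinD1<2>)
	cudaFuncAttributes attr;
	cudaFuncGetAttributes(&attr, CalMaxMinD1<2>);

	// sharedSizeBytes = statically allocated shared memory per block
	printf("static shared memory: %u bytes\n", (unsigned)attr.sharedSizeBytes);
	return 0;
}[/codebox]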

peter

CUDA uses some of the shared memory for storing the kernel parameters and the block/grid dimensions; that is where the extra 32 bytes go.
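The arithmetic works out once those are counted in. A rough sketch (it assumes a compute-capability 1.x device, where kernel arguments are passed through shared memory, device pointers are 4 bytes wide, and about 16 bytes are reserved for block/grid dimension bookkeeping):

[codebox]#include <cstdio>

int main()
{
	const int user_smem   = 2 * 256 * 4;   // maxtemp[256] + mintemp[256], 4-byte floats = 2048
	const int arg_smem    = 4 + 3 * 4;     // int np + three 4-byte float* arguments     = 16
	const int launch_smem = 16;            // assumed blockIdx/blockDim/gridDim bookkeeping
	printf("%d bytes\n", user_smem + arg_smem + launch_smem);   // prints 2080
	return 0;
}[/codebox]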

Thanks.

peter