Difference between statically allocated and dynamically allocated shared memory

In this code:

__global__ void kernel()
{
    __shared__ int a[64];
}

When I use CU_FUNC_ATTRIBUTE_SHARED_SIZE_BYTES to query this attribute of
the kernel, the result is 0!

I think a is static shared memory, but the result is zero.
Is there an explanation for this?

Your kernel doesn’t do anything. The compiler may have optimized out the usage of shared memory.

$ cat t2057.cu
#include <cstdio>
#include <iostream>
__global__ void k(int n){

  __shared__ int s[64];

#ifdef DO_SMTH

  for (int i = 0; i < 64; i++) s[i] = 0;
  for (int i = 0; i < n; i++) s[i] = i;
  int v = 0;
  for (int i = 0; i < 64; i++) v += s[i];
  printf("v = %d\n", v);
#endif
}

int main(){

  cudaFuncAttributes a;
  cudaFuncGetAttributes(&a, k);
  std::cout << a.sharedSizeBytes << std::endl;
}
$ nvcc -o t2057 t2057.cu
t2057.cu(5): warning: variable "s" was declared but never referenced

$ ./t2057
0
$ nvcc -o t2057 t2057.cu -DDO_SMTH
$ ./t2057
256
$

“use it or lose it”
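
On the static versus dynamic part of the title: as far as I know, sharedSizeBytes (and CU_FUNC_ATTRIBUTE_SHARED_SIZE_BYTES in the driver API) counts only statically allocated shared memory. Dynamically allocated shared memory, declared extern and sized at launch, is not included. A minimal sketch, assuming a hypothetical kernel k_dynamic and buffer out (neither is from the code above):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel using dynamically allocated shared memory: the array
// has no compile-time size, the launch supplies it.
__global__ void k_dynamic(int *out){
  extern __shared__ int d[];
  int t = threadIdx.x;
  d[t] = t;
  __syncthreads();
  out[t] = d[blockDim.x - 1 - t];   // read a value written by another thread
}

int main(){
  cudaFuncAttributes a;
  cudaFuncGetAttributes(&a, k_dynamic);
  // Only statically allocated shared memory is counted in this field
  printf("sharedSizeBytes = %zu\n", a.sharedSizeBytes);

  const int threads = 64;
  int *out;
  cudaMalloc(&out, threads * sizeof(int));
  // Third launch parameter: dynamic shared memory size in bytes
  k_dynamic<<<1, threads, threads * sizeof(int)>>>(out);
  int h[threads];
  cudaMemcpy(h, out, sizeof(h), cudaMemcpyDeviceToHost);
  printf("out[0] = %d\n", h[0]);
  cudaFree(out);
}
```

The kernel really does use 256 bytes of shared memory per block at run time, but that amount comes from the launch configuration, so I would not expect it to show up in the function attribute.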


I tried your code.

The key point is this line:
printf("v = %d\n", v);
With this line, sharedSizeBytes is 256.
Without it, sharedSizeBytes is 0.
You can try it.

Before, I also used shared memory the way you do, but without a printf to print anything,
like this:

__global__ void k(int n){
  __shared__ int s[64];
  for (int i = 0; i < 64; i++) s[i] = 0;
}

So the shared memory used is 0.
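
The printf itself is probably not what matters. What seems to count is whether the contents of s can affect anything observable outside the kernel. A store-only kernel has no observable effect, so the array can be dropped. A minimal sketch, assuming a hypothetical kernel k_store, an output buffer out, and a helper staticSharedBytes (none of these are from the thread), that writes the shared values back to global memory instead of printing:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, meant to be launched with 64 threads per block.
// Threads exchange values through shared memory and the result goes to
// global memory, so the shared array has an observable effect.
__global__ void k_store(int *out){
  __shared__ int s[64];
  int t = threadIdx.x;
  s[t] = t;
  __syncthreads();
  out[t] = s[63 - t];   // read a value written by another thread
}

// Hypothetical helper: statically allocated shared memory reported for a
// kernel, or -1 if the query fails.
long long staticSharedBytes(const void *func){
  cudaFuncAttributes a;
  if (cudaFuncGetAttributes(&a, func) != cudaSuccess) return -1;
  return (long long)a.sharedSizeBytes;
}

int main(){
  printf("%lld\n", staticSharedBytes((const void*)k_store));
}
```

I would expect this to report 256 (64 ints of 4 bytes each), the same as the -DDO_SMTH build, even though there is no printf anywhere in the kernel.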

Anyway, thanks for your answer!

Thanks for your answer!
