Compilation OK in emulation mode, not in release: __shared__ variable declaration dropped


I have code that compiles fine in emulation mode, but it does not compile when I build it to run on the GPU.

The compiler complains that a shared variable is not declared.

I have something like this in my code:

template <class T, int size>
struct stack {
  unsigned int n;   // number of elements currently stored
  T data[size];     // fixed-capacity storage

  __device__ __host__ void push(T t) { data[n] = t; n++; }
  __device__ __host__ T pop() { n--; return data[n]; }
};

typedef struct {
  int somedata;
} foo;

__shared__ stack<foo, 10> fooStack[BLOCK_SIZE];

typedef struct {
  __device__ __host__ void test()
  {
    foo myfoo;
    fooStack[threadIdx.x].push(myfoo); // <- here the compiler complains that fooStack was not declared...
  }
} bar;

My configuration:

8800 GTX, 32-bit Linux, CUDA v0.9

Thanks for your advice.


I'm assuming this is quite close to the entire kernel (if not the entire kernel). I am curious whether the scoping rules are completely different when structs are used in a kernel. Try declaring your stack inside the struct. My group has run into similar differences between running a kernel in emulation mode and running it on the device.

Thanks for the advice, but I think I found the error: test() is declared __device__ __host__, so I am trying to access shared memory in a function that is also declared __host__, which is not allowed.
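For anyone hitting the same error, here is a minimal sketch of one way to restructure the code along the lines of the fix above. It is not the original poster's final code. It assumes a recent CUDA toolkit rather than v0.9, the value of BLOCK_SIZE is a guess (the post does not give it), and run_once is a hypothetical host-side helper added only so the example can be launched. The two changes are that the functions touching shared memory are __device__ only, and the __shared__ array is declared inside the kernel and passed down by pointer.

```cuda
#include <cuda_runtime.h>

#define BLOCK_SIZE 64  // assumed value; not given in the original post

template <class T, int size>
struct stack {
    unsigned int n;   // number of elements currently stored
    T data[size];     // fixed-capacity storage

    // Device-only: these are called on stacks that live in shared memory.
    __device__ void push(T t) { data[n] = t; n++; }
    __device__ T pop() { n--; return data[n]; }
};

struct foo {
    int somedata;
};

// Device-only helper. It receives the stack by pointer, so it never
// has to name a __shared__ variable from outside the kernel.
__device__ void test(stack<foo, 10>* s, int value)
{
    foo myfoo;
    myfoo.somedata = value;
    s->push(myfoo);
}

__global__ void kernel(int* out)
{
    // Shared memory declared in device code: one stack per thread in the block.
    __shared__ stack<foo, 10> fooStack[BLOCK_SIZE];

    stack<foo, 10>* mine = &fooStack[threadIdx.x];
    mine->n = 0;  // shared memory is not zero-initialized
    test(mine, (int)threadIdx.x);
    out[threadIdx.x] = mine->pop().somedata;
}

// Hypothetical host-side driver: launches one block and copies the results back.
bool run_once(int* host_out)
{
    int* dev_out = 0;
    if (cudaMalloc(&dev_out, BLOCK_SIZE * sizeof(int)) != cudaSuccess) return false;
    kernel<<<1, BLOCK_SIZE>>>(dev_out);
    cudaError_t err = cudaMemcpy(host_out, dev_out, BLOCK_SIZE * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev_out);
    return err == cudaSuccess;
}
```

Calling run_once with a host array of BLOCK_SIZE ints should leave each thread's index in its own slot, since every thread pushes its index onto its own stack and pops it again. If the same push/pop logic is also needed on the host, a separate host-side stack type (or a __host__ __device__ stack that is never placed in shared memory from host code) is probably the simpler route.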