__device__ variable doubt

Can I read/write data to a device variable not allocated with cudaMalloc()?

For example…

__device__ __align__(16) float globalVar[2048];

__global__ void MyKernel()
{
   // One thread per element; the launch must not cover more than 2048 threads.
   const int idx = blockIdx.x * blockDim.x + threadIdx.x;

   globalVar[idx] = static_cast<float>(threadIdx.x);
}

Curiously, this works with -deviceemu, but when I compile it for the GPU, globalVar ends up containing garbage…

I’m using the CUDA 2.0 beta on Windows Vista x64 (x64 versions of the SDK and toolkit too).


  1. Do you really need static_cast? Can you try simply filling your array with 1s (ones)?
  2. How does your code check the values of this global variable?
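
For reference, here is a minimal sketch of one way to check the values from the host. It assumes a current toolkit, where cudaMemcpyFromSymbol takes the symbol itself and cudaDeviceSynchronize is available (older toolkits used cudaThreadSynchronize). The launch configuration (8 blocks of 256 threads), the bounds check, and the helper name runAndCheck are my own assumptions, not taken from the original post. A __device__ array lives in device memory, so the host cannot read it through the plain symbol name; it has to be copied back first.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Statically allocated device array, as in the question (no cudaMalloc needed).
__device__ __align__(16) float globalVar[2048];

__global__ void MyKernel()
{
    const int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < 2048)  // guard added for this sketch; not in the original kernel
        globalVar[idx] = static_cast<float>(threadIdx.x);
}

// Hypothetical helper: launches the kernel, copies the array back, verifies it.
bool runAndCheck()
{
    const int N = 2048;
    const int threads = 256;  // assumed block size
    static float host[N];

    MyKernel<<<N / threads, threads>>>();
    if (cudaGetLastError() != cudaSuccess) return false;       // launch error
    if (cudaDeviceSynchronize() != cudaSuccess) return false;  // execution error

    // Copy from the __device__ symbol into host memory (device-to-host by default).
    if (cudaMemcpyFromSymbol(host, globalVar, sizeof(host)) != cudaSuccess)
        return false;

    // Each element should hold the threadIdx.x of the thread that wrote it.
    for (int i = 0; i < N; ++i)
        if (host[i] != static_cast<float>(i % threads)) return false;
    return true;
}

int main()
{
    std::printf("%s\n", runAndCheck() ? "values match" : "mismatch or CUDA error");
    return 0;
}
```

If the host code instead reads globalVar directly, or passes its host-side address to a plain cudaMemcpy, that could explain why it looks fine in emulation but gives garbage on the GPU.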