Can I read/write data to a device variable not allocated with cudaMalloc()?
For example…
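// Statically declared device array (no cudaMalloc involved)
__device__ __align__(16) float globalVar[2048];

__global__ void MyKernel()
{
    const int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < 2048)  // guard against writing past the end of the array
        globalVar[idx] = static_cast<float>(threadIdx.x);
}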
Curiously, this works when I compile with -deviceemu, but when I compile it for the GPU, globalVar ends up containing garbage.
I’m using the CUDA 2.0 beta on Windows Vista x64 (with the x64 SDK and toolkit).
thx
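
For reference, here is a minimal sketch of how a statically declared __device__ array might be read back on the host. It assumes the copy goes through the runtime API's cudaMemcpyFromSymbol rather than passing globalVar directly to cudaMemcpy; in device emulation everything lives in host memory, which could explain why the direct approach appears to work there. The helper name readGlobalVar, the 256-thread block size, and the launch configuration are assumptions for illustration, not part of the original setup.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 2048

// Statically declared device array, as in the post above.
__device__ __align__(16) float globalVar[N];

__global__ void MyKernel()
{
    const int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N)  // guard against out-of-range threads
        globalVar[idx] = static_cast<float>(threadIdx.x);
}

// Hypothetical helper: launches the kernel and copies globalVar into
// a host buffer of at least N floats. Returns true on success.
bool readGlobalVar(float* host)
{
    const int threads = 256;         // assumed block size
    const int blocks  = N / threads; // 8 blocks cover all 2048 elements

    MyKernel<<<blocks, threads>>>();
    if (cudaGetLastError() != cudaSuccess)
        return false;

    // Copy through the symbol. The host-side address of a __device__
    // variable is not a valid device pointer for plain cudaMemcpy.
    const cudaError_t err =
        cudaMemcpyFromSymbol(host, globalVar, N * sizeof(float));
    if (err != cudaSuccess) {
        std::printf("copy failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}
```

Calling readGlobalVar with a host array of 2048 floats should fill it with each thread's threadIdx.x value. If a raw device pointer is needed instead, cudaGetSymbolAddress can supply one for use with cudaMemcpy.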