Reading/writing unallocated GPU memory

I have a program in Visual Studio.
I allocate some memory pointers on the GPU and then read/write them. The problem is that with a large amount of data, emulation mode behaves differently from the actual GPU.
When I change the program so that it always allocates 5 times more memory on the GPU, it works correctly on the GPU.

So I guess that I am reading/writing unallocated space.
Emulation mode doesn’t crash, so I guess the GPU emulator always allocates somewhat more memory than is requested in cudaMalloc.

Is there an easy way to detect when a kernel reads from or writes to unallocated GPU memory?

cuda-memcheck was added in the CUDA 3.0 beta just for this.
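For reference, the basic usage is simply to run your executable under the tool (the executable name here is a placeholder); it reports the kernel, thread, and address of each out-of-bounds global memory access:

```
cuda-memcheck ./my_app
```

Note that kernels run noticeably slower under cuda-memcheck, so it is a debugging tool rather than something to leave on in normal runs.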

Resolved the problem :) I thought I was reading/writing unallocated memory because the errors were non-deterministic and disappeared when I allocated more memory - but that was only by accident.

The real problem was not synchronizing threads after copying from global memory to shared memory :P
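For anyone hitting the same symptom, the pattern looks like this - a minimal sketch (the kernel name and the averaging operation are made up for illustration; only the barrier placement matters). Each thread stages one element into shared memory, then reads an element written by a *different* thread, so a `__syncthreads()` barrier is required between the write and the read:

```cuda
// Hypothetical kernel: stage global memory into shared memory,
// then read a neighbouring element written by another thread.
__global__ void neighborAvg(const float *in, float *out, int n)
{
    __shared__ float tile[256];          // one element per thread in the block
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    if (i < n)
        tile[threadIdx.x] = in[i];       // each thread writes its own slot

    __syncthreads();                     // without this barrier, the read below
                                         // may see stale data in tile[] slots
                                         // written by other threads

    if (i < n) {
        float left = (threadIdx.x > 0) ? tile[threadIdx.x - 1]
                                       : tile[threadIdx.x];
        out[i] = 0.5f * (left + tile[threadIdx.x]);
    }
}
```

Without the barrier the bug is exactly as described above: non-deterministic, timing-dependent failures that can vanish when allocation sizes or launch configurations change.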

But thanks for the advice - maybe this tool will help me in the future.