Bugs related to memory allocation

My program allocates and frees GPU memory heavily, using both cudaMalloc/cudaFree and glBufferData to maintain several STL-vector-like structures. It runs fine in emulation mode, but gives very strange results in release mode:
A small section in the middle of a block of memory I allocated with cudaMalloc becomes unwritable by my kernel (any write to it from the kernel silently fails), yet cudaMemcpy to it works. This doesn't happen if I allocate the memory twice, throw away the first pointer, and use only the second one.
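
To make the workaround concrete, it amounts to roughly the following. This is only a minimal sketch, not the actual code: the helper name allocSecondOfTwo is made up for illustration, and it assumes that "throwing away" the first pointer means never freeing it, so the second allocation lands at a different address.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper (name invented for this sketch): allocate `bytes`
// twice and hand back only the second pointer.
// Assumption: the first allocation is deliberately leaked, never freed,
// so the second cudaMalloc cannot simply reuse the same address.
static cudaError_t allocSecondOfTwo(void** out, size_t bytes)
{
    void* discarded = nullptr;

    // First allocation: its result is intentionally ignored after this.
    cudaError_t err = cudaMalloc(&discarded, bytes);
    if (err != cudaSuccess) {
        return err;
    }
    (void)discarded;  // thrown away on purpose

    // Second allocation: this is the pointer the kernels actually use.
    return cudaMalloc(out, bytes);
}
```

A caller would use it as `void* buf; allocSecondOfTwo(&buf, n);` in place of a single cudaMalloc, then pass buf to the kernel and release it with cudaFree as usual. It only hides the symptom and leaks one block per call, so it is a diagnostic aid rather than a fix.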