Hi, I am running into the strangest thing.
In my kernel, I attempt to update some global memory in a loop, like this:
    // float *gmem is an input array to my kernel.
    for (..) {
        gmem[ n ] += float_val;
    }
and I get a segmentation fault. As far as I can tell, the += is the problem: if I replace it with a plain =, the kernel behaves as expected, with no seg faults. Obviously, though, I need the incremental-sum behavior, so I can't just get rid of the +=. What could be the problem?
    // this runs with no seg faults
    for (..) {
        gmem[ n ] = float_val;
    }
Any ideas? Please help! My setup: Windows XP, CUDA 2.3, GPU with compute capability 1.3.
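To help isolate this, here is a minimal, self-contained sketch of the same pattern. It is not my actual kernel. The names accumulate, run_accumulate, len and iters are hypothetical, and it assumes each thread updates its own element, so n is the global thread index. The one difference it exercises is that += has to read gmem[ n ] before writing it, while = only writes, so the sketch initializes the buffer, guards the index, and checks the status returned by each runtime call.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread accumulates into its own element of gmem.
__global__ void accumulate(float *gmem, int len, int iters, float float_val)
{
    int n = blockIdx.x * blockDim.x + threadIdx.x;
    if (n >= len) return;              // guard against out-of-range indices
    for (int i = 0; i < iters; ++i) {
        gmem[n] += float_val;          // read-modify-write on global memory
    }
}

// Hypothetical host helper: runs the kernel and copies the result into out.
cudaError_t run_accumulate(float *out, int len, int iters, float float_val)
{
    float *dev = NULL;
    cudaError_t err = cudaMalloc((void **)&dev, len * sizeof(float));
    if (err != cudaSuccess) return err;

    // += reads the old value, so the buffer is zeroed before the launch.
    err = cudaMemset(dev, 0, len * sizeof(float));
    if (err == cudaSuccess) {
        int threads = 128;
        int blocks = (len + threads - 1) / threads;
        accumulate<<<blocks, threads>>>(dev, len, iters, float_val);
        err = cudaGetLastError();      // reports launch failures
    }
    if (err == cudaSuccess) {
        // The blocking copy also reports errors raised while the kernel ran.
        err = cudaMemcpy(out, dev, len * sizeof(float), cudaMemcpyDeviceToHost);
    }
    cudaFree(dev);
    return err;
}

int main()
{
    const int len = 256;
    float host[len];
    cudaError_t err = run_accumulate(host, len, 4, 1.0f);
    printf("status: %s\n", cudaGetErrorString(err));
    if (err == cudaSuccess) printf("host[0] = %f\n", host[0]);
    return err == cudaSuccess ? 0 : 1;
}
```

If this sketch runs cleanly on the same machine, the things to compare against the real kernel would be whether gmem really is a device pointer from cudaMalloc, whether n can leave the allocated range, and whether several threads update the same gmem[ n ] at once.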