I have a very strange problem. I have this C kernel code in CUDA. When I compile it in debug mode using the gdb=1 option, the executable runs correctly. But if I leave this option out, the compiled executable outputs 0 for all results.
I tried to fix it by explicitly combining two memory accesses in the kernel through an intermediate variable. After that change, the release executable works correctly.
Is this a compiler bug, or a problem in my code?