I am facing a weird problem. I have a kernel that does what it is supposed to do when I compile with the -G0 flag (the flag that generates debug information). But the same code doesn't work if I don't use -G0, and I am not sure why this is happening.
I have done something similar before, but there was no such problem that time. Any idea what the cause could be?
I am pretty sure that compiling device code with -G causes the compiler to spill just about everything into local memory so that the debugger can access it. It is possible that you have something like an out-of-bounds shared memory access (which is fatal on Fermi) that only causes problems in the non-debug build. I would suggest compiling without the debugging flag and running the code with cuda-memcheck to see if it finds anything.
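To make the suggestion concrete, here is a minimal sketch of the kind of bug I mean and one way to guard against it. It is not your kernel: the names `reverse_in_block`, `run_reverse` and the tile size `TILE` are made up for illustration. Without the `t < n` guard, threads past the end of the data would index shared and global memory out of bounds, which is exactly the sort of thing cuda-memcheck reports.

```cuda
#include <cassert>
#include <cstdio>
#include <cuda_runtime.h>

#define TILE 64  // hypothetical tile size, chosen only for this sketch

// Reverses the first n elements of data (n <= TILE) using one thread block.
__global__ void reverse_in_block(int *data, int n)
{
    __shared__ int tile[TILE];
    int t = threadIdx.x;

    // Guard: only threads that map to a valid element touch memory.
    if (t < n)
        tile[t] = data[t];
    __syncthreads();  // reached by every thread in the block

    // n - 1 - t lies in [0, n) whenever t < n, so the read stays in bounds.
    if (t < n)
        data[t] = tile[n - 1 - t];
}

// Host helper (hypothetical): copies n ints to the device, runs the kernel,
// and copies the result back. Returns the first CUDA error encountered.
cudaError_t run_reverse(int *host, int n)
{
    if (n <= 0 || n > TILE)
        return cudaErrorInvalidValue;

    int *dev = nullptr;
    size_t bytes = n * sizeof(int);

    cudaError_t err = cudaMalloc(&dev, bytes);
    if (err != cudaSuccess)
        return err;

    err = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        reverse_in_block<<<1, TILE>>>(dev, n);
        err = cudaGetLastError();           // launch configuration errors
    }
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();      // errors raised while the kernel ran
    if (err == cudaSuccess)
        err = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dev);
    return err;
}

int main()
{
    int values[TILE];
    for (int i = 0; i < TILE; ++i)
        values[i] = i;

    cudaError_t err = run_reverse(values, TILE);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    assert(values[0] == TILE - 1);
    assert(values[TILE - 1] == 0);
    printf("reverse ok\n");
    return 0;
}
```

Assuming the file is saved as app.cu (the name is arbitrary), build it without -G and run it under the checker with something like `nvcc -lineinfo -o app app.cu` followed by `cuda-memcheck ./app`. The -lineinfo flag is optional; it lets the tool point at source lines without turning on full debug code generation. Checking the return value of `cudaDeviceSynchronize()` after the launch is also worth doing in your real code, since a kernel fault otherwise only shows up at some later, unrelated call.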
:) :) :)
That was the problem, and now it's solved :)
Thanks a lot!
I am a little curious: where do you guys learn these things? Are they covered in the manual, or do you know them from experience?