Does anyone have an idea how to debug device code?

My program runs fine in emulation mode
but gives different results when run on the device.
Does anyone know how to step into kernel code while it runs on the device?
Please don’t tell me to compile it in emulation debug mode. That works fine, and I have already tried it.

At this time there is no way to debug code on the device. There has been talk about a gdb-based visual debugger; let’s hope that it will be released together with CUDA 2.0.

In general, when things work in emu mode but not on the device, you are not thinking parallel enough. Can you describe what the problem is?

Right now I don’t use parallelism at all, only one thread, just to get the program running first.

I’m working on a seismic processing module and porting it to CUDA to make it faster.

So now a debugger is needed to find out which values differ between emulation mode and device mode.

I have also had this kind of experience.

And in my case, the cause was a small mistake.

I forgot to initialize some values, and the default values on the CPU and on the GPU were different from each other.

Sometimes I also make the mistake of using GPU variables directly in CPU code. In this case, the program works well in emulation mode, but doesn’t work in device mode.
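
To illustrate that second mistake, here is a minimal sketch, assuming a `__device__` variable that the host wants to set and read. The names `d_scale`, `setScale` and `getScale` are made up for this example. Instead of touching the variable directly from CPU code, the value goes through `cudaMemcpyToSymbol` and `cudaMemcpyFromSymbol`:

```cuda
#include <cuda_runtime.h>

// Hypothetical device-side variable; it lives in GPU memory.
__device__ float d_scale;

// Reading or writing d_scale directly in host code does not access the
// value stored on the GPU, even though it may appear to work in
// emulation mode. Go through the runtime API instead.

// Copy a host value into the device variable. Returns true on success.
bool setScale(float value)
{
    return cudaMemcpyToSymbol(d_scale, &value, sizeof(float)) == cudaSuccess;
}

// Copy the device variable back into host memory. Returns true on success.
bool getScale(float* out)
{
    return cudaMemcpyFromSymbol(out, d_scale, sizeof(float)) == cudaSuccess;
}
```

The same idea applies to pointers returned by `cudaMalloc`: they should only be dereferenced inside kernels, and moved to or from the host with `cudaMemcpy`.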

If your code is still at a trial stage and simple, why don’t you post it on this forum? :)

I’m truly very sorry, but my code is production code for seismic processing and I can’t post it.

In fact, what I need is a debugger that can debug device code.

Well, at this time the only thing you can do is copy intermediate values from the GPU back to the CPU and compare them with the emulation results, to see where things go wrong. Are the differences very large? Or can they be explained by float on the GPU vs. double on the CPU?
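
A rough sketch of that comparison, assuming the intermediate result is a plain `float` array on the device and that a CPU reference for the same stage is available. The helper name `firstMismatch` and the tolerance parameter are made up for this example:

```cuda
#include <cstdio>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: copies n floats from the device buffer d_buf to the
// host and compares them with the CPU reference h_ref.
// Returns the index of the first element whose absolute difference exceeds
// tol (NaN is also reported), -1 if all elements agree, -2 if the copy fails.
long firstMismatch(const float* d_buf, const float* h_ref, size_t n, float tol)
{
    std::vector<float> h_copy(n);
    cudaError_t err = cudaMemcpy(h_copy.data(), d_buf, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        return -2;
    }
    for (size_t i = 0; i < n; ++i) {
        float diff = fabsf(h_copy[i] - h_ref[i]);
        if (!(diff <= tol)) {  // written this way so NaN counts as a mismatch
            printf("mismatch at %zu: gpu=%g cpu=%g\n", i, h_copy[i], h_ref[i]);
            return (long)i;
        }
    }
    return -1;
}
```

Calling something like this after each kernel, with a tolerance loose enough to absorb float vs. double rounding, narrows down the first stage where the device result really diverges.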

I have solved the problem. The memory I malloc’d on the CPU happened to be zero by default, while the memory allocated on the GPU was not.

So I filled that array with zeros and everything works fine.
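
For anyone hitting the same thing, a minimal sketch of the fix, assuming the array is a `float` buffer allocated with `cudaMalloc`. The helper name `allocZeroedDevice` is made up here. `cudaMalloc` does not clear the memory it returns, so the buffer is zero-filled explicitly with `cudaMemset`:

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: allocates n floats on the device and zero-fills them.
// Returns NULL if either the allocation or the fill fails.
float* allocZeroedDevice(size_t n)
{
    float* d_ptr = NULL;
    if (cudaMalloc((void**)&d_ptr, n * sizeof(float)) != cudaSuccess) {
        return NULL;
    }
    // cudaMemset writes a byte value; all-zero bytes represent 0.0f.
    if (cudaMemset(d_ptr, 0, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_ptr);
        return NULL;
    }
    return d_ptr;
}
```

On the host side, `calloc` (or an explicit `memset`) gives the same guarantee, so neither version depends on what the allocator happens to return.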

By the way, seismic processing is not very strict about the difference between float and double.