How do I debug all threads in device emulation mode?

I am using VC2005 (Visual C++ 2005) to debug CUDA code. It seems that I can only step through the first thread. How can I debug the other threads in device emulation mode?

Thank you.

Set a breakpoint inside the kernel, then run the debugger. It will break in the first thread of the first warp. Hit F5 to run to the next breakpoint, which will usually be the same breakpoint, this time in the second thread of the first warp. Keep hitting F5 and you'll keep moving from thread to thread.

Or you could add an if statement that selects the thread you want to debug, put a dummy line of code inside it, and set a breakpoint on that line. The debugger will then stop only in that thread. There is probably a better way to do it, but that is what I usually do.
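
For what it's worth, here is a minimal sketch of the if-statement trick. The kernel name `scale`, the macros `DEBUG_BLOCK` and `DEBUG_THREAD`, and the launch configuration are all made up for illustration; swap in your own kernel and the thread you care about. It assumes an emulation build, where the kernel runs on the host and the VC2005 debugger can stop inside it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical values: the block and thread you want to inspect.
#define DEBUG_BLOCK  0
#define DEBUG_THREAD 5

__global__ void scale(float *data, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;

    // Only the selected thread enters this branch.
    if (blockIdx.x == DEBUG_BLOCK && threadIdx.x == DEBUG_THREAD) {
        volatile int dummy = idx;  // set the breakpoint on this line
        (void)dummy;               // silence the unused-variable warning
    }

    if (idx < n)
        data[idx] *= 2.0f;
}

int main()
{
    const int n = 64;
    float host[n];
    for (int i = 0; i < n; ++i)
        host[i] = (float)i;

    float *dev = 0;
    cudaMalloc((void **)&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    // 2 blocks of 32 threads cover all 64 elements.
    scale<<<2, 32>>>(dev, n);

    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    printf("host[%d] = %f\n", DEBUG_THREAD, host[DEBUG_THREAD]);
    return 0;
}
```

The dummy variable is marked `volatile` so the compiler is less likely to optimize the line away, which would leave the breakpoint with nothing to attach to. Once execution stops there, you can step out of the `if` and follow that one thread through the rest of the kernel.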

That works. Thank you!