I have noticed that the Emulation Debug (EmuDebug) feature is deprecated in the latest 3.0 version of the CUDA SDK. For myself (and, I think, many others), my CUDA development flow uses Visual Studio Express with one CUDA-capable GPU. When I need to debug, I use EmuDebug mode and set (usually conditional) breakpoints to trace a specific thread in a specific thread block.

It is my understanding that when EmuDebug mode is removed, I will have to use NSIGHT to debug, which currently requires two systems, with the target system having at least one CUDA-capable GPU, and Visual Studio Standard or Professional ($$$). I understand that there may be development to allow NSIGHT to debug on a single system, but that still does not get past the requirement for MSVS Standard or Professional. Additionally, if you don't have integrated video, even a future single-system version of NSIGHT would need a second GPU ($). MSVS Standard or Professional is prohibitively expensive.

Unless I have mis-stated something above, I don't understand the rationale for removing the EmuDebug capability. If using NSIGHT requires MSVS Standard or Professional and (currently) two systems, I am requesting that NVIDIA reconsider its decision to remove the EmuDebug capability. If not, this will be a barrier to entry for CUDA adoption and development. If forum members agree, I urge you to petition NVIDIA on this (or your own) post.
There’s a thread that discusses the emulator deprecation here. I am also sad to see the emulator go, since I use it more than any of the other debugging tools, and I use emulation for more than just debugging: I add “heavyweight” checkpoint and analysis steps inside my algorithms to report intermediate status and validity. (For example, for my raytracer: how many rays are pending? Are there loaded voxels which have been orphaned but which I think are still valid? Have threads all shared their desired nodes with each other? Are there rays still pending which have actually finished? Is there some ray that seems stuck forever? Those questions are expensive and annoying to compute on the GPU just for debugging, but the emulator has no register limit and can printf() and assert() with no overhead or hassle.)
That said, I am sure NV is dropping the emulator not because they think it’s not useful, but because it takes too much time to support, time which could be focused on the more “complete” tools like Nexus or cuda-gdb. Unfortunately the emulator was more than just a debugger, so it’s sad to lose those unique abilities.
Supporting device emulation roughly halved the speed of feature development in the CUDA API, and there are much better alternatives available now (Ocelot). Ocelot is much, much better than device emulation ever was (automatic detection of warp synchronous programming! my heart is all aflutter just thinking about that).
Amdahl’s law!
It is hard to believe. If that happens, every Windows developer will need to write their own software CUDA kernel implementation just to check that their code works.