CUDA Toolkit 3.0 update: GPU HW debugging tools to replace device emulation

tmurray · January 23, 2010, 1:05am

Device emulation support in the CUDA C Runtime will be deprecated as of the CUDA Toolkit 3.0 production release.

Now that more sophisticated hardware debugging tools are available and more are on the way, we will be focusing on supporting these tools instead of the legacy device emulation functionality.

On Linux, use cuda-gdb and cuda-memcheck. Third-party solutions from Allinea and TotalView will also be available soon.

On MacOS, use cuda-memcheck. We’re working on a cuda-gdb port to MacOS for a future release and will provide a preview to all GPU Computing Registered Developers as soon as it’s ready.

On Windows, use the new Visual Studio integrated debugging and profiling tools code-named “Nexus.” Please see www.nvidia.com/nexus for details.

Deprecating the device emulation feature in this release means no further development or bug fixes for this feature will be made after this release, and device emulation will be removed entirely from the CUDA Toolkit 3.1 release.

SPWorley · January 23, 2010, 1:38am

I admit I abuse emulation mode for a lot of algorithm instrumentation… sometimes I stream out extra statistics or progress traces in emulation mode only (things like snapshotting all pending rays to a file at a particular point in the computation.) I do this for debugging but also for things like visualizations. Emulation mode lets me stick these extra steps into the compute since I can call extra host-only functions as needed.

This will likely still be possible using the GPU but now it may need an extra layer, similar to cuPrintf(), and it will clearly not be as easy. Alternatively I need to start looking into the advanced features of Ocelot.
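One way to picture that "extra layer" is a preallocated trace buffer: each thread reserves a slot with an atomic counter and writes a record, and the host reads the buffer back after the kernel finishes. The sketch below is plain C++ rather than CUDA, so `TraceRecord`, `TraceBuffer`, and `run_traced` are hypothetical names and the "threads" are just a loop; on a GPU the slot reservation would presumably be an `atomicAdd` on a counter in device memory. It is an illustration of the idea, not how cuPrintf is implemented.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record type: whatever a thread wants to report.
struct TraceRecord {
    int thread_id;
    float value;
};

// Fixed-capacity buffer that workers append to and the host drains later.
class TraceBuffer {
public:
    explicit TraceBuffer(std::size_t capacity) : records_(capacity), next_(0) {
        assert(capacity > 0 && "trace buffer needs at least one slot");
    }

    // Called from worker ("device") code. Reserves a slot atomically so that
    // concurrent callers never write to the same index. Returns false when
    // the buffer is full and the record was dropped.
    bool record(int thread_id, float value) {
        const std::size_t slot = next_.fetch_add(1);
        if (slot >= records_.size()) return false;
        records_[slot] = TraceRecord{thread_id, value};
        return true;
    }

    // Called from the host once all workers have finished.
    std::vector<TraceRecord> drain() const {
        const std::size_t used = std::min(next_.load(), records_.size());
        return std::vector<TraceRecord>(
            records_.begin(),
            records_.begin() + static_cast<std::ptrdiff_t>(used));
    }

private:
    std::vector<TraceRecord> records_;
    std::atomic<std::size_t> next_;
};

// Stand-in for a kernel launch: each logical thread computes a value and
// logs it. The workers run one after another here for simplicity.
std::vector<TraceRecord> run_traced(int num_threads, std::size_t capacity) {
    TraceBuffer trace(capacity);
    for (int tid = 0; tid < num_threads; ++tid) {
        const float partial = 0.5f * static_cast<float>(tid);
        trace.record(tid, partial);
    }
    return trace.drain();
}
```

The records returned by `drain()` can then be handed to any host-only code (file dumps, visualizations), which is the part emulation mode used to allow inline. Records beyond the capacity are dropped rather than overwriting earlier ones.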

Will Fermi be supported with the 3.0 toolkit (and therefore have legacy emulation ability) or will Fermi never be emulatable?

srhines · January 23, 2010, 1:45am

Emulation is its own target, so there is no distinction between Tesla and Fermi. But 3.0 should provide emulation support for all the features that Fermi brings.

SPWorley · January 23, 2010, 1:58am

I meant the new Fermi-specific extensions that are not in the current 3.0 beta toolkit, things like setting the shared memory mode, launching multiple kernels per GPU, specifying on-chip atomic globals, etc.

tmurray · January 23, 2010, 2:19am

Device emulation has nothing to do with any of that. Basically, “device emulation” is a misnomer–it never emulated a G80 or any other chip. Instead, it compiled for the CPU using the most obvious ways possible (this is why it was so slow). As a result, hardware-specific extensions are completely separate from device emulation.
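As a rough picture of what "compiling for the CPU in the most obvious way" can mean, the sketch below runs a kernel body once per (block, thread) index pair in nested loops. This is an assumption-laden illustration, not the toolkit's actual implementation: `Dim`, `vec_add_body`, `launch_serial`, and `vec_add` are hypothetical names, and the serial loop ignores `__syncthreads()` and shared memory entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for CUDA's built-in one-dimensional index variables.
struct Dim {
    unsigned x;
};

// Body of a "kernel": one logical thread adds one element. In real CUDA this
// would be a __global__ function reading blockIdx, blockDim and threadIdx.
void vec_add_body(Dim blockIdx, Dim blockDim, Dim threadIdx,
                  const float* a, const float* b, float* out, std::size_t n) {
    const std::size_t i =
        static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) out[i] = a[i] + b[i];
}

// "Launch" by visiting every (block, thread) pair in order on the CPU.
// No GPU feature is modeled; the body is ordinary host code.
void launch_serial(Dim gridDim, Dim blockDim,
                   const float* a, const float* b, float* out, std::size_t n) {
    for (unsigned block = 0; block < gridDim.x; ++block) {
        for (unsigned thread = 0; thread < blockDim.x; ++thread) {
            vec_add_body(Dim{block}, blockDim, Dim{thread}, a, b, out, n);
        }
    }
}

// Convenience wrapper that picks a grid size and runs the serial launch.
std::vector<float> vec_add(const std::vector<float>& a,
                           const std::vector<float>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<float> out(n);
    const unsigned threads_per_block = 4;
    const unsigned blocks =
        static_cast<unsigned>((n + threads_per_block - 1) / threads_per_block);
    launch_serial(Dim{blocks}, Dim{threads_per_block},
                  a.data(), b.data(), out.data(), n);
    return out;
}
```

Because the body is just host code, nothing in it depends on a particular chip, which is consistent with the point that hardware-specific extensions sit outside device emulation.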

E.D_Riedijk · January 23, 2010, 6:51am

Not that I am using device emulation, so I don’t care at all ;) But using Nexus means using versions of Windows that are known to have lower CUDA performance than XP…

nitin.life · January 23, 2010, 7:47am

I want nexus for linux :( … or else please let the deviceemu be… I do use it a lot when I am debugging my algorithm for math errors… hence I will miss it. :(

E.D_Riedijk · January 23, 2010, 8:49am

why not use gdb with ddd as graphical frontend???

hlr · January 23, 2010, 9:00am

As E.D. Riedijk noted, does this mean that windows developers are forced to use >=Windows Vista, >=Visual Studio 2008 and >=2 GPUs just to be able to debug their code? This can be quite a limitation. Especially the 2 GPU part. Even if you have two computers each running a GPU, you are blocking both computers during debugging.
Even a ported cudadbg would be helpful for Windows developers. Or developers could group together and quickly port gpuocelot :D.
Also a quick question about the developer page. I have created an account for Nexus through the developer page, but I can’t use my account there. Should I do it again to gain access to those resources or is a simpler way possible?

CapJo · January 23, 2010, 11:32am

I don’t see Visual Studio 2008 as a limitation, and debugging on the real hardware is quite important, since errors might only occur when running your code on the GPU and not in emulation mode. At the moment I have such a problem and I don’t know how to locate the line of code that causes this behavior.

It’s very probably impossible to run your graphics output and the debugger on one GPU. To inspect the memory on the GPU, the debugger has to stop execution on the GPU, and what will happen to your graphics output? So you need 2 GPUs.

Windows Vista / Windows 7 might be a limitation. NVIDIA don’t want to do the work twice and support two kinds of driver models for debugging, and I don’t know if it’s even possible to do that with the XP driver model. The overhead for starting a kernel is currently higher on Vista / Win 7, but that overhead might decrease in the future. You can also develop your CUDA applications on Vista and run your code later on Windows XP.

The hardware and software requirements are higher, but IMHO it’s worth it. Hardware debugging was on top of my own feature request list.

jma · January 23, 2010, 1:05pm

As an alternative, for an X-application (in Linux!) the second card could also be the one on your laptop (if you have one) - in which case just about any old piece of techno-trash will do.

erdooom · January 23, 2010, 2:18pm

Is there any chance that the emu code will be open-sourced? Once it’s deprecated, of course…

nitin.life · January 23, 2010, 4:23pm

Great point, +1 for that question (I would love it…)

nitin.life · January 23, 2010, 4:26pm

Hmm, thanks very much :) … I didn’t know much about DDD… (I’m a non-CS student) … looks nice… how complex is it to use?

hlr · January 23, 2010, 4:28pm

I understand the 2 GPU constraint, but currently under Linux you can use a second GPU in your system. For Nexus you need another box. And the problem with the new OS/compiler is mainly convenience. I would like to maintain the same system as my team, and they will surely not upgrade just for me.
I know it is required, but fewer limitations would help a lot.

erdooom · January 23, 2010, 7:32pm

Well, actually you don’t need a second box for Nexus. The main idea is that you can’t debug on the same GPU that you are using for display, which makes sense. So you either use a GPU from another box or a second one in the same box. The display GPU can be a simple one; if it’s in the same box, it needs to be an NVIDIA one. I do hope that the emu will continue in some form, so that you can always do some work even on a box without an NVIDIA GPU. We actually did just that: moved the whole team from VS 05 to VS 08 because of Nexus. But considering that VS 10 is in advanced stages of development, well, it’s always harder the bigger the gap.

tmurray · January 24, 2010, 4:56am

No–if you want something like that, just use Ocelot. It’s light years beyond device emulation anyway.

Gregory_Diamos · January 24, 2010, 2:09pm

Thanks for the vote of confidence Tim :)

CapJo · January 24, 2010, 5:56pm

Unfortunately, at the moment it’s only available for Linux :-(.

Is it hard to port it to Windows, maybe because of dependencies?

Gregory_Diamos · January 25, 2010, 12:12am

It would be difficult but not impossible to port to Windows. All of the major dependencies (LLVM, boost) have Windows support. The main difficulties would be in wrapping the interface to pthreads and Linux timers and changing the build system to use something other than autotools. I think that it could be done by one person in a few weeks. Unfortunately, no one in my lab even has Windows installed, so finding someone to actually do it would be the biggest problem.
