CUDA 2.1 Release

What’s new in CUDA 2.1
• Debugger support using gdb for CUDA
• Support for using a GPU that is not driving a display on Vista
(This was already supported on Windows XP, OSX and Linux)
• DX10 interop support, which accelerates communication with DX10 applications
• Visual Studio 2008 support for Windows XP and Windows Vista
• Just-in-time (JIT) compilation, for applications that dynamically generate CUDA kernels
• C++ templates are now supported in CUDA kernels
• Support for recent releases of Linux, including Fedora 9, OpenSUSE 11 and Ubuntu 8.04
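
To illustrate the C++ template support listed above, here is a minimal sketch of a templated kernel. The names scaleKernel and scaleOnDevice are hypothetical and chosen only for this example; the host helper uses std::vector for brevity and assumes a toolkit that accepts it in .cu files.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Templated kernel (hypothetical example): multiplies every element by `factor`.
// The element type T is resolved at compile time for each instantiation.
template <typename T>
__global__ void scaleKernel(T* data, T factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Hypothetical host helper: copies the vector to the device, launches the
// templated kernel, and copies the result back. Returns false on any CUDA error
// (for example, when no CUDA-capable device is present).
template <typename T>
bool scaleOnDevice(std::vector<T>& host, T factor)
{
    const int n = static_cast<int>(host.size());
    if (n == 0) return true;
    const size_t bytes = n * sizeof(T);

    T* dev = NULL;
    if (cudaMalloc((void**)&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, &host[0], bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;
        // Explicit instantiation at the launch site: one kernel per type T.
        scaleKernel<T><<<blocks, threads>>>(dev, factor, n);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(&host[0], dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}
```

Calling scaleOnDevice on a std::vector<float> and on a std::vector<int> from the same source file produces two separate kernel instantiations, which is the point of the feature: one kernel body serves several element types.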

New CUDA SDK samples for CUDA 2.1
• smokeParticles (volumetric particle shadows)
• DX10 interop samples: simpleD3D10 and simpleD3D10Texture

Known Issues
In this release, #pragma unroll sometimes fails to unroll a loop because the compiler limits the size of loop bodies it will unroll, which can reduce performance relative to CUDA 2.0. You can override this limit on the nvcc command line with the following flag:

nvcc -Xopencc -OPT:unroll_size=200000

In most cases, this should override the built-in loop unrolling limits. Unless a kernel uses #pragma unroll and shows a significant performance drop from CUDA 2.0, this flag should not be used.
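
For context, here is a small sketch of the kind of kernel the issue applies to: a loop with a compile-time trip count marked with #pragma unroll. The kernel firKernel, the helper runFir, and the constant TAPS are hypothetical names for this example only. Whether a given loop is actually affected by the limit depends on the size of its body; this one is small and is shown only to illustrate the pragma.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

#define TAPS 8  // compile-time trip count (hypothetical value for this sketch)

// Each thread computes a weighted sum of TAPS consecutive inputs.
// `in` must hold at least n + TAPS - 1 elements.
__global__ void firKernel(const float* in, const float* weights, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float acc = 0.0f;
    // Ask the compiler to fully unroll this loop.
    #pragma unroll
    for (int k = 0; k < TAPS; ++k) {
        acc += in[i + k] * weights[k];
    }
    out[i] = acc;
}

// Hypothetical host helper. Returns false on bad sizes or any CUDA error.
bool runFir(const std::vector<float>& in, const std::vector<float>& w,
            std::vector<float>& out)
{
    const int n = static_cast<int>(out.size());
    if (n == 0 || w.size() != TAPS || in.size() < out.size() + TAPS - 1) return false;

    float *dIn = NULL, *dW = NULL, *dOut = NULL;
    bool ok = cudaMalloc((void**)&dIn, in.size() * sizeof(float)) == cudaSuccess
           && cudaMalloc((void**)&dW, w.size() * sizeof(float)) == cudaSuccess
           && cudaMalloc((void**)&dOut, n * sizeof(float)) == cudaSuccess;
    ok = ok
      && cudaMemcpy(dIn, &in[0], in.size() * sizeof(float), cudaMemcpyHostToDevice) == cudaSuccess
      && cudaMemcpy(dW, &w[0], w.size() * sizeof(float), cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threads = 128;
        firKernel<<<(n + threads - 1) / threads, threads>>>(dIn, dW, dOut, n);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(&out[0], dOut, n * sizeof(float), cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    // cudaFree accepts NULL, so this is safe even if an allocation failed.
    cudaFree(dIn);
    cudaFree(dW);
    cudaFree(dOut);
    return ok;
}
```

If a kernel like this turns out to be slower under CUDA 2.1 than under 2.0, rebuilding it with the flag shown above is the suggested way to check whether the unroll limit is the cause. The -Xopencc option is specific to the compiler in this toolkit generation and may not be accepted by other nvcc releases.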

See also the CUDA FAQ update for the 2.1 release.

181.20 for Windows XP
181.20 for Windows XP x64
181.20 for Vista 32-bit
181.20 for Vista 64-bit
180.22 for Linux 32-bit
180.22 for Linux 64-bit

Direct Download URLs for CUDA 2.1 Release

CUDA Toolkit
(EULA with redist rights)

(Windows)

CUDA Visual Profiler
(User Manual)…_15Dec08.tar.gz

CUDA Debugger

Please discuss the CUDA 2.1 release in this thread.