Yesterday, we released a batch of new tools and updates for registered developers.
The big news is our OpenCL performance profiler for the GPU. We also released updated OpenCL drivers (now packaged with all the other drivers instead of being inside the SDK) and several new SDK code samples to help developers using OpenCL.
The OpenCL Visual Profiler will be included in the next release of the CUDA Toolkit.
The OpenCL Visual Profiler uses the extensive performance instrumentation in NVIDIA’s OpenCL drivers and hardware performance signals designed into NVIDIA GPUs to provide developers with insight into performance bottlenecks and opportunities for optimization. Key features include:
Profiling of actual hardware signals, kernel efficiency, and instruction issue rate
Timing of memory copies between system memory and device memory
Customizable graphs to help developers focus in on problem areas
Basic auto-analysis to reveal warp serialization problems
Easy import/export to CSV for custom analysis
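If you want to try the "custom analysis" route with an exported CSV, here is a minimal Python sketch of what that could look like. It totals time per kernel or copy and estimates the effective bandwidth of memory copies. The column names (method, gpu_time_us, bytes), the sample rows, and the helper functions are all hypothetical placeholders for illustration, not the profiler's actual export schema, so adjust them to match the header row of your own export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample export. The column names and values are assumptions
# made for illustration; they are NOT the profiler's documented format.
SAMPLE_CSV = """method,gpu_time_us,bytes
memcpyHtoD,120.0,1048576
vectorAdd,35.5,0
vectorAdd,36.5,0
memcpyDtoH,110.0,1048576
"""


def summarize(csv_text):
    """Aggregate call count, total time (us), and total bytes per method."""
    totals = defaultdict(lambda: {"calls": 0, "time_us": 0.0, "bytes": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["method"]]
        entry["calls"] += 1
        entry["time_us"] += float(row["gpu_time_us"])
        entry["bytes"] += int(row["bytes"])
    return dict(totals)


def bandwidth_gb_per_s(num_bytes, time_us):
    """Effective bandwidth in GB/s (1 GB = 1e9 bytes).

    One byte per microsecond equals 1e6 bytes per second, i.e. 1e-3 GB/s.
    """
    return (num_bytes / time_us) / 1e3


if __name__ == "__main__":
    for method, stats in summarize(SAMPLE_CSV).items():
        line = f"{method}: {stats['calls']} call(s), {stats['time_us']:.1f} us total"
        if stats["bytes"] > 0:
            gbps = bandwidth_gb_per_s(stats["bytes"], stats["time_us"])
            line += f", about {gbps:.2f} GB/s"
        print(line)
```

To use it on a real export, read the file with open(path, newline="") and pass its contents to summarize() after renaming the columns to whatever your export actually contains.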
Support for multi-GPU performance scaling has been added to most of the SDK code samples for OpenCL.
We also added a few DirectCompute samples, if you’re interested in that sort of thing.
The drivers and SDK code samples in this release are compatible with CUDA Toolkit 2.3, which is publicly available at www.nvidia.com/cuda.
Finally, we also released our OpenCL Best Practices Guide, designed to help developers using OpenCL on the CUDA architecture implement high performance parallel algorithms and understand best practices for GPU Computing. Chapters on the following topics and more are included in the guide:
Heterogeneous Computing with OpenCL
Performance Optimization Strategies
The OpenCL Best Practices Guide will also be included in the next release of the CUDA Toolkit, but you can get a copy now at www.nvidia.com/opencl.