ANNOUNCING NVIDIA® cuDNN – GPU Accelerated Machine Learning
NVIDIA cuDNN is a GPU-accelerated library of primitives for deep neural networks. It emphasizes performance, ease of use, and low memory overhead. NVIDIA cuDNN is designed to be integrated into higher-level machine learning frameworks, such as UC Berkeley’s popular Caffe software. The simple, drop-in design allows developers to focus on designing and implementing neural net models rather than tuning for performance, while still achieving the high performance that modern parallel computing hardware affords.
Key Features
- Forward and backward convolution routines, tuned for NVIDIA GPUs
- Always optimized for latest NVIDIA GPU architectures
- Arbitrary dimension ordering, striding, and subregions for 4D tensors
- Forward and backward passes for common layer types (ReLU, Sigmoid, Tanh, pooling, softmax)
- Context-based API allows for easy multithreading (see the sketch below)
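To give a feel for the context-based API, here is a minimal sketch in C: it creates a cuDNN handle, describes a 4D activation tensor, and runs a softmax forward pass. The calls shown (cudnnCreate, cudnnSetTensor4dDescriptor, cudnnSoftmaxForward, and related functions) are part of the cuDNN API, but this example is illustrative rather than taken from the announcement, and exact signatures vary across cuDNN releases; consult the documentation for your version.

```c
/* Minimal cuDNN sketch: handle + tensor descriptor + softmax forward pass.
 * Signatures reflect recent cuDNN releases and may differ by version. */
#include <cudnn.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    const int n = 1, c = 10, h = 1, w = 1;   /* one 10-class prediction */
    float h_in[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

    float *d_in, *d_out;
    cudaMalloc((void **)&d_in,  sizeof(h_in));
    cudaMalloc((void **)&d_out, sizeof(h_in));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    /* The context ("handle") holds library state; each host thread can
     * create its own, which is what makes multithreaded use simple. */
    cudnnHandle_t handle;
    cudnnCreate(&handle);

    /* Describe the input/output tensor: NCHW layout, 32-bit floats. */
    cudnnTensorDescriptor_t desc;
    cudnnCreateTensorDescriptor(&desc);
    cudnnSetTensor4dDescriptor(desc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT,
                               n, c, h, w);

    /* y = alpha * softmax(x) + beta * y */
    const float alpha = 1.0f, beta = 0.0f;
    cudnnStatus_t status = cudnnSoftmaxForward(
        handle, CUDNN_SOFTMAX_ACCURATE, CUDNN_SOFTMAX_MODE_CHANNEL,
        &alpha, desc, d_in, &beta, desc, d_out);
    printf("softmax forward: %s\n", cudnnGetErrorString(status));

    cudnnDestroyTensorDescriptor(desc);
    cudnnDestroy(handle);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

The same pattern of handle, descriptors, and a compute call applies to the convolution, pooling, and activation routines as well.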
Visit the Parallel ForAll Blog for an introduction to cuDNN, visit here to download the library, and visit here for more information on accelerating machine learning with GPUs.
Stephen Jones
Product Manager – Strategic Alliances