Programming Efficiently with the NVIDIA CUDA 11.3 Compiler Toolchain

jwitsoe · April 16, 2021, 12:41am

Originally published at: Programming Efficiently with the NVIDIA CUDA 11.3 Compiler Toolchain | NVIDIA Technical Blog

The CUDA 11.3 release of the CUDA C++ compiler toolchain incorporates new features aimed at improving developer productivity and code performance. NVIDIA is introducing cu++filt, a standalone demangler tool that allows you to decode mangled function names to aid source code correlation. Starting with this release, the NVRTC shared library versioning scheme is relaxed to…
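To illustrate what demangling means here, the following is a minimal host-side sketch and not the cu++filt tool itself. It assumes a GCC or Clang toolchain, where `abi::__cxa_demangle` from `<cxxabi.h>` decodes Itanium C++ ABI names; the helper name `demangle` and the sample symbols are hypothetical and chosen only for illustration.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the readable form of a mangled name,
// or the input unchanged if the name cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With null output buffer arguments, the result is malloc-allocated.
    char* result = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || result == nullptr) {
        return mangled;
    }
    std::string readable(result);
    std::free(result);  // caller owns the buffer returned by __cxa_demangle
    return readable;
}
```

For example, `demangle("_Z6kernelPfi")` yields `kernel(float*, int)`, which is the kind of mapping from a symbol in a binary back to a source-level signature that a demangler provides.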

haichengw · April 22, 2021, 4:27am

CUDA 11.3 significantly improves the performance of Ampere/Turing/Volta Tensor Core kernels.

298 TFLOPS was recorded on A100 when benchmarking the FP16 GEMM from CUTLASS, an open-source CUDA DL/HPC library (GitHub - NVIDIA/cutlass: CUDA Templates for Linear Algebra Subroutines). This is 14% higher than with CUDA 11.2. FP32 (via TF32) GEMM improves by 39% and can reach 143 TFLOPS. The same speedup applies to the CONV kernels.

Also, see the discussion here: CUDA 11.3 significantly improved the performance of CUTLASS · Discussion #241 · NVIDIA/cutlass · GitHub

agmmal · July 1, 2021, 8:37am

How do I use the toolkit to build CUDA for my custom x86-64 OS (built with Yocto), with support for an NVIDIA GPU card, using my Ubuntu x86-64 host system?
Thanks.
