Accelerating HPC Applications with NVIDIA Nsight Compute Roofline Analysis

Originally published at:

Writing high-performance software is no simple task. Once your code compiles and runs, a new challenge arises: understanding how it performs on the available hardware. Different platforms, whether CPUs, GPUs, or something else, have different hardware limitations, such as available memory bandwidth and theoretical…

It was great to collaborate with some of the foremost experts on Roofline Analysis and the Nsight Compute engineering team to create this example. If you have any questions or comments, please let us know.