Analysis-Driven Optimization: Analyzing and Improving Performance with NVIDIA Nsight Compute, Part 2

Originally published at: https://developer.nvidia.com/blog/analysis-driven-optimization-analyzing-and-improving-performance-with-nvidia-nsight-compute-part-2/

In part 1, I introduced the code for profiling, covered the basic ideas of analysis-driven optimization (ADO), and got you started with the Nsight Compute profiler. In part 2, you apply what you learned to improve the performance of the code and then continue the analysis and optimization process. Refactoring To refactor the code based…