How to calculate the exact gain after certain optimization?

quan.luo.101 · November 18, 2025, 7:17pm

I’ve run Nsight Compute on my kernel. I can see a large warp stall from “Stall Long Scoreboard”, and I know how to optimize it.

However, before optimizing, is there a metric in Nsight Compute that tells me how much gain I can expect after optimizing all the stalls?

For example, if Stall Long Scoreboard accounts for 13.7 cycles per instruction, how much gain can I get after optimizing it? Is there a theoretical way to estimate that?

veraj · November 19, 2025, 3:05am

Hi, @quan.luo.101

Thanks for using Nsight Compute.
Please check whether “2. Profiling Guide” in the Nsight Compute 13.0 documentation helps.
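
As a rough illustration of the kind of estimate asked about above, here is a minimal back-of-the-envelope sketch. It is not a formula taken from Nsight Compute. It assumes a simple model in which each scheduler issues at most one instruction per cycle and, with N active warps that each wait W cycles between instructions, issues at a rate of about min(1, N / W). The 13.7 cycles comes from the question; the total of 20.0 cycles per issued instruction and the active warp counts are hypothetical placeholders to be replaced with values from the Warp State Statistics and Scheduler Statistics sections of your own report.

```python
def issue_rate(active_warps: float, cycles_per_inst: float) -> float:
    """Approximate instructions issued per cycle by one scheduler.

    Assumed model: each of `active_warps` warps issues once every
    `cycles_per_inst` cycles, capped at one instruction per cycle.
    """
    if active_warps <= 0 or cycles_per_inst <= 0:
        raise ValueError("active_warps and cycles_per_inst must be positive")
    return min(1.0, active_warps / cycles_per_inst)


def estimated_speedup(stall_cycles: float,
                      total_cycles_per_inst: float,
                      active_warps: float,
                      fraction_removed: float = 1.0) -> float:
    """Optimistic speedup if `fraction_removed` of one stall reason vanishes.

    stall_cycles          -- cycles per instruction spent in the stall reason
    total_cycles_per_inst -- warp cycles per issued instruction (all reasons)
    active_warps          -- active warps per scheduler
    """
    if not 0.0 <= fraction_removed <= 1.0:
        raise ValueError("fraction_removed must be in [0, 1]")
    if stall_cycles < 0 or stall_cycles > total_cycles_per_inst:
        raise ValueError("stall_cycles must be within the total")
    # A warp cannot issue more often than once per cycle.
    new_total = max(1.0, total_cycles_per_inst - fraction_removed * stall_cycles)
    old_rate = issue_rate(active_warps, total_cycles_per_inst)
    new_rate = issue_rate(active_warps, new_total)
    return new_rate / old_rate


if __name__ == "__main__":
    # 13.7 is from the question; 20.0 and the warp counts are hypothetical.
    for warps in (4, 16):
        s = estimated_speedup(13.7, 20.0, active_warps=warps)
        print(f"{warps} active warps/scheduler: estimated speedup {s:.2f}x")
```

Under this model the same stall is worth much less when enough other warps are already available to hide the latency. Treat the result as an optimistic bound: removing one stall reason usually exposes others (memory bandwidth, issue slots, dependencies), so the measured gain after re-profiling is typically lower.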

