NSight 3.0 - CUDA Instruction Count

GeorgT · January 8, 2013, 3:07pm

Hi,

I am using NSight 3.0 VSE and doing a remote analysis. The target system runs Windows Server 2012 and contains two Tesla K10 cards. The host system runs Windows 7 with VS 2008. Both target and host are 64-bit.

To analyse my kernel, I ran one of my test apps on the remote machine and selected 12 Experiments to be run on the second (in terms of call order) of my two kernels (I filter it using the “Kernels to Profile” input field in the experiment settings of the nvact window). One of the experiments is “Instruction Count” from the “Source-Level Experiments” group. The activity type was “Profile CUDA Application”.

The analysis runs fine, but the results show something strange: the “CUDA Instruction Count” experiment reports code lines that the analysed kernel (the second one) should never reach; only the first kernel should reach them. The line in question is the return statement of a device function that is called by the first kernel.

Is this a bug, or does it point to a problem in my code?

Georg

Edit: I am using CUDA 5.0 and compiling solely for sm_30.

Greg · January 10, 2013, 6:53am

The Source-Level Experiments collect information per SASS instruction and roll the information up to PTX and C source code. If the kernel did not have line information, or if optimization significantly modified the code, the tool will not provide a good roll-up of SASS instruction statistics to higher-level source lines.

In your case I’m not sure whether your compiler is generating poor line information for your kernel or whether line information was not generated at all.

Did you enable “Generate Line Information” in your Release configuration?

If you run the experiment on a Debug kernel you should get very accurate line information. This can be useful for certain types of debugging, such as looking at control flow statistics, examining memory access patterns, and finding the source lines that generate double-precision instructions (your other question).

GeorgT · January 10, 2013, 7:58am

Thanks for the information. These are the nvcc flags: “-code=sm_30 -arch=compute_30 -lineinfo -Xptxas -v”

Jeff_Davis · January 16, 2013, 3:55am

Do you see the Instruction Count information using a Debug configuration?
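For reference, the two builds being compared in this thread can be sketched as nvcc command lines. The Release-style flags are the ones quoted above. The Debug-style line assumes that a Debug configuration corresponds to nvcc's `-G` option, which generates device debug information and disables device code optimization. The file name `kernels.cu` and the output names are hypothetical placeholders, not taken from the original posts.

```shell
#!/bin/sh
# Release-style flags as quoted in the thread: optimized code plus line information.
RELEASE_FLAGS="-arch=compute_30 -code=sm_30 -lineinfo -Xptxas -v"

# Debug-style flags (assumption: Debug configuration == nvcc -G).
# -G emits device debug info and turns off device code optimization.
DEBUG_FLAGS="-arch=compute_30 -code=sm_30 -G"

# Hypothetical source file name, for illustration only.
SRC="kernels.cu"

if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
  # Build both variants so the Instruction Count results can be compared.
  nvcc $RELEASE_FLAGS -c "$SRC" -o kernels_release.obj
  nvcc $DEBUG_FLAGS -c "$SRC" -o kernels_debug.obj
else
  # No compiler or source file available: just print the commands.
  echo "nvcc $RELEASE_FLAGS -c $SRC -o kernels_release.obj"
  echo "nvcc $DEBUG_FLAGS -c $SRC -o kernels_debug.obj"
fi
```

If the unexpected source lines disappear in the `-G` build, that would suggest the attribution comes from how the optimized code is mapped back to source lines rather than from the kernel actually executing them.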
