The command is:
nv-nsight-cu-cli -f sumMatrix
The output is:
sumMatrixOnGPU2D <<<(512,512), (32,32)>>> elapsed 1.790660 s
==PROF== Profiling - 1: 0%....50%....100%
==PROF== Report: profile.nsight-cuprof-report
sumMatrixOnGPU2D, 2019-Aug-14 17:31:28
Section: GPU Speed Of Light
---------------------------------------------------------------------- --------------- ------------------------------
Memory Frequency n/a
SOL FB n/a
Elapsed Cycles n/a
SM Frequency n/a
Memory [%] n/a
Duration n/a
SOL L2 n/a
SOL TEX n/a
SM [%] n/a
---------------------------------------------------------------------- --------------- ------------------------------
Section: Compute Workload Analysis
---------------------------------------------------------------------- --------------- ------------------------------
Executed Ipc Active n/a
Executed Ipc Elapsed n/a
Issued Ipc Active n/a
Issue Slots Busy n/a
SM Busy n/a
---------------------------------------------------------------------- --------------- ------------------------------
Section: Memory Workload Analysis
---------------------------------------------------------------------- --------------- ------------------------------
Memory Throughput n/a
Mem Busy n/a
Max Bandwidth n/a
L2 Hit Rate n/a
Mem Pipes Busy n/a
L1 Hit Rate n/a
---------------------------------------------------------------------- --------------- ------------------------------
Section: Scheduler Statistics
---------------------------------------------------------------------- --------------- ------------------------------
Active Warps Per Scheduler n/a
Eligible Warps Per Scheduler n/a
No Eligible n/a
Instructions Per Active Issue Slot n/a
Issued Warp Per Scheduler n/a
One or More Eligible n/a
---------------------------------------------------------------------- --------------- ------------------------------
Section: Warp State Statistics
---------------------------------------------------------------------- --------------- ------------------------------
Avg. Not Predicated Off Threads Per Warp n/a
Avg. Active Threads Per Warp n/a
Warp Cycles Per Executed Instruction n/a
Warp Cycles Per Issued Instruction n/a
Warp Cycles Per Issue Active n/a
---------------------------------------------------------------------- --------------- ------------------------------
Section: Instruction Statistics
---------------------------------------------------------------------- --------------- ------------------------------
Avg. Executed Instructions Per Scheduler n/a
Executed Instructions n/a
Avg. Issued Instructions Per Scheduler n/a
Issued Instructions n/a
---------------------------------------------------------------------- --------------- ------------------------------
Section: Launch Statistics
---------------------------------------------------------------------- --------------- ------------------------------
Block Size 1,024
Grid Size 262,144
Registers Per Thread register/thread 16
Shared Memory Configuration Size Kbyte 48
Dynamic Shared Memory Per Block byte/block 0
Static Shared Memory Per Block byte/block 0
Threads thread 268,435,456
Waves Per SM 8,738.13
---------------------------------------------------------------------- --------------- ------------------------------
Section: Occupancy
---------------------------------------------------------------------- --------------- ------------------------------
Block Limit SM block 16
Block Limit Registers register 4
Block Limit Local Mem byte nan
Block Limit Warps warp 1
Achieved Active Warps Per SM n/a
Achieved Occupancy n/a
Theoretical Active Warps per SM warp/cycle 32
Theoretical Occupancy % 100
---------------------------------------------------------------------- --------------- ------------------------------
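As a sanity check, the Launch Statistics values that were reported do agree with the launch configuration printed by the program, so the profiler appears to see the kernel launch itself and only the hardware counter metrics are missing. A minimal Python sketch of that arithmetic follows. The warp size of 32 and the reading of Waves Per SM as "grid size divided by the number of blocks resident at once" are my assumptions, not something the tool output states.

```python
# Launch configuration copied from the program's own output line:
#   sumMatrixOnGPU2D <<<(512,512), (32,32)>>>
grid_dim = (512, 512)
block_dim = (32, 32)

block_size = block_dim[0] * block_dim[1]   # threads per block
grid_size = grid_dim[0] * grid_dim[1]      # blocks in the grid
total_threads = block_size * grid_size     # threads in the whole launch

WARP_SIZE = 32  # assumption: standard warp size on NVIDIA GPUs
warps_per_block = block_size // WARP_SIZE

# Assumption: Waves Per SM = grid size / blocks resident on the GPU at once,
# so dividing the grid size by the reported value estimates that count.
reported_waves_per_sm = 8738.13            # value from the report
resident_blocks = round(grid_size / reported_waves_per_sm)

print(f"Block Size       {block_size:,}")
print(f"Grid Size        {grid_size:,}")
print(f"Threads          {total_threads:,}")
print(f"Warps per block  {warps_per_block}")
print(f"Resident blocks  {resident_blocks}")
```

The first three printed values match the Block Size, Grid Size, and Threads rows, and 32 warps per block is consistent with "Block Limit Warps 1" and "Theoretical Active Warps per SM 32" in the Occupancy section.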
Why are there so many n/a values in the output?
Thanks in advance.