I want to measure global memory load efficiency, but when I run
nvprof --metrics gld_efficiency ./HellowWorld.exe
it shows:
nvprof --metrics gld_throughput .\HellowWorld.exe 32 32
======== Warning: Skipping profiling on device 0 since profiling is not supported on devices with compute capability greater than 7.2
==12656== NVPROF is profiling process 12656, command: .\HellowWorld.exe 32 32
SumOnGPU Time Cost 6.138 ms
sumMatrixOnGPU2D <<<(512,512), (32,32)>>>
==12656== Profiling application: .\HellowWorld.exe 32 32
==12656== Profiling result:
No events/metrics were profiled.
How can I measure global memory load efficiency on an RTX 2080 Ti, which has compute capability 7.5?
When I run it on the 1080 Ti (another GPU in the same computer), it shows:
nvprof --metrics gld_throughput .\HellowWorld.exe 32 32
======== Warning: Skipping profiling on device 0 since profiling is not supported on devices with compute capability greater than 7.2
==16568== NVPROF is profiling process 16568, command: .\HellowWorld.exe 32 32
==16568== Error: Internal profiling error 4292:1.
SumOnGPU Time Cost 18.090 ms
======== Error: CUDA profiling error.
OS: Windows 10 x64
CUDA: 10.0
In my code, I use
cudaSetDevice();
to select the GPU, but nvprof still shows "Skipping profiling on device 0". Should I hide the 2080 Ti from the program with an environment variable?
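
To check which index each card gets, a minimal sketch like the following could be used. It is not my actual program. It only lists the devices the CUDA runtime sees, along with their compute capability. The 7.2 cutoff is an assumption taken solely from the nvprof warning above, not from any documented API.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties(%d) failed: %s\n",
                         dev, cudaGetErrorString(err));
            return 1;
        }

        // Assumption: the limit comes from the nvprof warning text
        // ("compute capability greater than 7.2" is not supported).
        bool withinNvprofLimit =
            (prop.major < 7) || (prop.major == 7 && prop.minor <= 2);

        std::printf("device %d: %s, compute capability %d.%d, %s\n",
                    dev, prop.name, prop.major, prop.minor,
                    withinNvprofLimit ? "within the nvprof limit"
                                      : "above the nvprof limit");
    }
    return 0;
}
```

The index printed for each card is the one that cudaSetDevice() expects, so this should show whether "device 0" in the warning is the 2080 Ti or the 1080 Ti.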