Hi, I am running various CUDA samples (e.g. nbody and BlackScholes) on my TK1 and trying to understand the performance statistics reported by tegrastats (and by cat /sys/devices/platform/host1x/gk20a.0/load). I have edited the BlackScholes sample so that it runs on the GPU for many minutes (> 10 min), to rule out any kind of statistical sampling issue. Note that I have manually overclocked the GPU and brought all 4 CPU cores online. For the entire time the simulation is making CUDA calls, I get the following from tegrastats:
RAM 542/1746MB (lfb 170x4MB) cpu [0%,100%,0%,0%]@2320 EMC 71%@924 AVP 0%@40 VDE 120 GR3D 0%@804 EDP limit 2320
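In case it helps anyone reproduce this, here is a minimal Python sketch that pulls the GR3D field out of a line like the one above, so the load can be logged over a long run. The regex assumes the "GR3D <load>%@<MHz>" layout seen in that line; other tegrastats versions may format it differently, and `parse_gr3d` is just a hypothetical helper name.

```python
import re

# Sample line copied from the tegrastats output above.
SAMPLE = ("RAM 542/1746MB (lfb 170x4MB) cpu [0%,100%,0%,0%]@2320 "
          "EMC 71%@924 AVP 0%@40 VDE 120 GR3D 0%@804 EDP limit 2320")

# Assumed field layout: "GR3D <load>%@<MHz>", as it appears in the sample line.
GR3D_RE = re.compile(r"GR3D (\d+)%@(\d+)")


def parse_gr3d(line):
    """Return (load_percent, freq_mhz) for the GR3D field, or None if absent."""
    match = GR3D_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    print(parse_gr3d(SAMPLE))
```

Feeding each tegrastats line through this while the CUDA job runs should show whether the reported load ever leaves zero.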
Wouldn't this indicate that the GPU is never used? Note that at the end of the run the BlackScholes executable prints the CPU and GPU times, so I assume it is calling CUDA routines.
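One way to cross-check tegrastats is to sample the sysfs load file directly while the kernel is running. The sketch below is only an illustration: the path is the one mentioned in this post and may differ on other L4T releases, the function names are hypothetical, and the value is treated as a raw integer because I am not sure of its units.

```python
import time

# Path taken from this post; it may differ on other boards or L4T releases.
LOAD_PATH = "/sys/devices/platform/host1x/gk20a.0/load"


def read_load(path=LOAD_PATH):
    """Read the raw integer exposed by the driver (units not assumed here)."""
    with open(path) as f:
        return int(f.read().strip())


def sample_load(path=LOAD_PATH, count=10, interval_s=0.1):
    """Collect `count` readings, sleeping `interval_s` seconds between them."""
    samples = []
    for _ in range(count):
        samples.append(read_load(path))
        time.sleep(interval_s)
    return samples


if __name__ == "__main__":
    # Run this in a second terminal while the CUDA sample is executing.
    readings = sample_load(count=50, interval_s=0.1)
    print("min", min(readings), "max", max(readings))
```

If every reading stays at zero during a multi-minute GPU run, the issue is likely in how the load is reported rather than in the sampling interval.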
Why is the GPU load always zero in this case? (I have also seen this when I run nbody, even though nbody gets performance similar to what is reported here for the TK1 GPU: https://www.pugetsystems.com/labs/hpc/NVIDIA-Jetson-TK1-CUDA-performance-569/ …) Do I need to turn on the counters or set some environment variable? I checked many other posts that report this issue and found no clear answer. Thanks.