How to check whether I have used all the GPU computing resources on the TX2

Hi.
I run inference in a single thread on the TX2 and monitor it with "sudo ~/tegrastats":

RAM 4245/7846MB (lfb 4x2MB) CPU [48%@2018,0%@2034,0%@2035,42%@2016,37%@2025,45%@2022] EMC_FREQ 17%@1866 GR3D_FREQ 57%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@36C Tdiode@40.5C PMIC@100C thermal@41.7C VDD_IN 9265/9279 VDD_CPU 1372/1397 VDD_GPU 3276/3164 VDD_SOC 1220/1174 VDD_WIFI 115/134 VDD_DDR 2131/2147
RAM 4245/7846MB (lfb 4x2MB) CPU [46%@2036,0%@2034,0%@2036,43%@2036,35%@2036,49%@2035] EMC_FREQ 17%@1866 GR3D_FREQ 50%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@36C Tdiode@40.75C PMIC@100C thermal@41.9C VDD_IN 9265/9279 VDD_CPU 1372/1396 VDD_GPU 3200/3166 VDD_SOC 1143/1173 VDD_WIFI 230/138 VDD_DDR 2131/2146
RAM 4245/7846MB (lfb 4x2MB) CPU [42%@2005,0%@2036,0%@2035,42%@2019,30%@2016,46%@2016] EMC_FREQ 17%@1866 GR3D_FREQ 75%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@36C Tdiode@40.75C PMIC@100C thermal@41.9C VDD_IN 9265/9278 VDD_CPU 1372/1395 VDD_GPU 3048/3161 VDD_SOC 1143/1172 VDD_WIFI 288/144 VDD_DDR 2150/2147
RAM 4245/7846MB (lfb 4x2MB) CPU [45%@2034,0%@2035,0%@2035,42%@2037,33%@2036,44%@2035] EMC_FREQ 17%@1866 GR3D_FREQ 45%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42C PLL@41.5C Tboard@36C Tdiode@40.75C PMIC@100C thermal@41.9C VDD_IN 8845/9262 VDD_CPU 1295/1391 VDD_GPU 3125/3160 VDD_SOC 1143/1171 VDD_WIFI 96/142 VDD_DDR 2092/2145
RAM 4245/7846MB (lfb 4x2MB) CPU [43%@2036,1%@2034,0%@2035,43%@2035,31%@2034,43%@2032] EMC_FREQ 17%@1866 GR3D_FREQ 76%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@36C Tdiode@40.75C PMIC@100C thermal@41.9C VDD_IN 8960/9251 VDD_CPU 1295/1388 VDD_GPU 3125/3159 VDD_SOC 1143/1170 VDD_WIFI 76/140 VDD_DDR 2112/2143
RAM 4247/7846MB (lfb 266x1MB) CPU [50%@2015,0%@2035,1%@2036,44%@2020,30%@2020,41%@2018] EMC_FREQ 17%@1866 GR3D_FREQ 69%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@36C Tdiode@40.25C PMIC@100C thermal@41.9C VDD_IN 9303/9253 VDD_CPU 1372/1387 VDD_GPU 2971/3152 VDD_SOC 1296/1174 VDD_WIFI 192/142 VDD_DDR 2112/2142
RAM 4247/7846MB (lfb 266x1MB) CPU [45%@2035,0%@2035,0%@2036,42%@2035,30%@2036,44%@2035] EMC_FREQ 17%@1866 GR3D_FREQ 71%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@36C Tdiode@40.75C PMIC@100C thermal@42.2C VDD_IN 9226/9252 VDD_CPU 1372/1387 VDD_GPU 3276/3156 VDD_SOC 1220/1176 VDD_WIFI 57/139 VDD_DDR 2131/2142
RAM 4246/7846MB (lfb 266x1MB) CPU [44%@2034,0%@2034,0%@2035,45%@2034,30%@2035,42%@2035] EMC_FREQ 17%@1866 GR3D_FREQ 69%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@36C Tdiode@40.75C PMIC@100C thermal@41.9C VDD_IN 9112/9248 VDD_CPU 1295/1384 VDD_GPU 3276/3160 VDD_SOC 1143/1175 VDD_WIFI 134/139 VDD_DDR 2092/2140
RAM 4247/7846MB (lfb 266x1MB) CPU [40%@2023,0%@2035,0%@2035,41%@2035,27%@2031,46%@2029] EMC_FREQ 17%@1866 GR3D_FREQ 64%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@36C Tdiode@40.75C PMIC@100C thermal@41.9C VDD_IN 8998/9240 VDD_CPU 1295/1381 VDD_GPU 3201/3161 VDD_SOC 1143/1174 VDD_WIFI 134/138 VDD_DDR 2131/2140
RAM 4247/7846MB (lfb 265x1MB) CPU [45%@2034,1%@2036,0%@2036,41%@2037,33%@2035,43%@2035] EMC_FREQ 17%@1866 GR3D_FREQ 75%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@37C Tdiode@40.75C PMIC@100C thermal@41.7C VDD_IN 8921/9230 VDD_CPU 1372/1381 VDD_GPU 3201/3163 VDD_SOC 1143/1173 VDD_WIFI 38/135 VDD_DDR 2150/2140
RAM 4247/7846MB (lfb 265x1MB) CPU [46%@2034,1%@2034,1%@2034,41%@2036,33%@2035,48%@2034] EMC_FREQ 17%@1866 GR3D_FREQ 49%@1300 APE 150 MTS fg 0% bg 0% BCPU@41.5C MCPU@41.5C GPU@42.5C PLL@41.5C Tboard@37C Tdiode@41C PMIC@100C thermal@41.7C VDD_IN 8998/9223 VDD_CPU 1295/1378 VDD_GPU 3125/3161 VDD_SOC 1143/1172 VDD_WIFI 76/134 VDD_DDR 2112/2139
RAM 4248/7846MB (lfb 230x1MB) CPU [43%@2012,1%@2035,0%@2034,50%@2011,31%@2019,47%@2019] EMC_FREQ 17%@1866 GR3D_FREQ 69%@1300 APE 150 MTS fg 0% bg 0% BCPU@42C MCPU@42C GPU@42.5C PLL@42C Tboard@37C Tdiode@41C PMIC@100C thermal@41.9C VDD_IN 9150/9221 VDD_CPU 1448/1380 VDD_GPU 3123/3160 VDD_SOC 1372/1177 VDD_WIFI 76/132 VDD_DDR 2150/2140
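
A minimal Python sketch for summarizing the GR3D_FREQ readings in output like the above, instead of reading them off line by line. It assumes the "GR3D_FREQ <percent>%@<MHz>" layout seen in these logs; the helper names (gr3d_loads, summarize) are made up for illustration, and the sample lines are shortened excerpts of the first three log lines.

```python
import re
from statistics import mean

# Matches e.g. "GR3D_FREQ 57%@1300". The "@<MHz>" part is made optional
# in case a given tegrastats build does not print the frequency (assumption).
GR3D_RE = re.compile(r"GR3D_FREQ (\d+)%(?:@(\d+))?")


def gr3d_loads(lines):
    """Return the GR3D_FREQ utilization percentages found in tegrastats lines."""
    loads = []
    for line in lines:
        match = GR3D_RE.search(line)
        if match:
            loads.append(int(match.group(1)))
    return loads


def summarize(loads):
    """Return sample count, min, max and mean of the utilization values."""
    return {
        "samples": len(loads),
        "min": min(loads),
        "max": max(loads),
        "mean": mean(loads),
    }


# Shortened excerpts of the first three log lines above.
SAMPLE = [
    "EMC_FREQ 17%@1866 GR3D_FREQ 57%@1300 APE 150",
    "EMC_FREQ 17%@1866 GR3D_FREQ 50%@1300 APE 150",
    "EMC_FREQ 17%@1866 GR3D_FREQ 75%@1300 APE 150",
]

if __name__ == "__main__":
    print(summarize(gr3d_loads(SAMPLE)))
```

To use it on a real capture, redirect the tegrastats output to a file and pass the file's lines to gr3d_loads in place of SAMPLE.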

GR3D_FREQ did not reach 100%, so I assumed I had not used all the GPU computing resources. I then ran a second thread doing inference with the same model as the first thread and checked GR3D_FREQ again. It did not rise, and the inference time of the model in each of the two threads was slower than with a single thread. Have the GPU computing resources reached their upper limit even though GR3D_FREQ does not reach 100%?

Hi,

It depends on whether the application is memory-bound or compute-bound.
Try optimizing your application with this guide:
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#performance-guidelines
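
As a rough illustration of the memory-bound versus compute-bound distinction, here is a small Python sketch of the usual roofline-style check: compare a kernel's arithmetic intensity (FLOPs per byte moved) with the ratio of peak compute to peak memory bandwidth. The function name and all numbers are placeholders for illustration, not measured TX2 specifications; substitute figures for your own device and profiled kernel.

```python
def classify(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Classify a kernel with a simple roofline-style comparison.

    flops          -- floating-point operations the kernel performs
    bytes_moved    -- bytes it reads from and writes to device memory
    peak_flops     -- device peak compute in FLOP/s
    peak_bandwidth -- device peak memory bandwidth in bytes/s
    """
    intensity = flops / bytes_moved      # FLOP per byte for this kernel
    ridge = peak_flops / peak_bandwidth  # FLOP per byte where the limits meet
    return "memory-bound" if intensity < ridge else "compute-bound"


# Placeholder device figures (NOT TX2 specs): 1 TFLOP/s and 50 GB/s,
# which gives a ridge point of 20 FLOP/byte.
PEAK_FLOPS = 1e12
PEAK_BANDWIDTH = 50e9

if __name__ == "__main__":
    # 2 FLOP/byte is below the ridge point, so memory limits this kernel.
    print(classify(2e9, 1e9, PEAK_FLOPS, PEAK_BANDWIDTH))
    # 100 FLOP/byte is above the ridge point, so compute limits this kernel.
    print(classify(1e11, 1e9, PEAK_FLOPS, PEAK_BANDWIDTH))
```

The FLOP and byte counts for a real kernel would normally come from a profiler rather than from hand estimates.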

Thanks.