GPU usage when running TensorRT inference models on DLA

Please provide the following info (tick the boxes after creating this topic):
Software Version
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
[*] DRIVE OS 6.0.4 SDK
other

Target Operating System
[*] Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
DRIVE AGX Orin Developer Kit (not sure its number)
[*] other

SDK Manager Version
1.9.1.10844
[*] other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
[*] other

When we ran our deep learning model on DLA0 and DLA1 through TensorRT, it was clear that only a few operators were running on the GPU, yet tegrastats reported GR3D_FREQ at 40% during the test.

However, profiling with Nsight Systems showed that the GPU accounts for very little of the activity and time, as shown in the figure below.

What explains this discrepancy, and how is tegrastats’ GPU share calculated?

Dear @haihua.wei,
Do you see any difference when you change the interval parameter in tegrastats?
Nsight here shows a timeline view of the application's execution. Tegrastats shows how much of the GPU is in use at the moment it is sampled, and the displayed output is refreshed at a fixed interval. It does not indicate whether the GPU is in use constantly.
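To compare readings across different interval settings, it can help to log the tegrastats output to a file and summarize the GR3D_FREQ field afterwards. Below is a minimal Python sketch of that idea. The sample log lines are hypothetical and heavily abbreviated (real tegrastats lines carry many more fields), and the helper name `gr3d_samples` is made up for illustration.

```python
import re
from statistics import mean

# Matches the GPU load field, e.g. "GR3D_FREQ 40%".
GR3D_PATTERN = re.compile(r"GR3D_FREQ (\d+)%")

def gr3d_samples(lines):
    """Return the GR3D_FREQ percentage found on each line that has one."""
    samples = []
    for line in lines:
        match = GR3D_PATTERN.search(line)
        if match:
            samples.append(int(match.group(1)))
    return samples

# Hypothetical, abbreviated log lines (not real captured output).
log = [
    "RAM 5120/28902MB GR3D_FREQ 40%",
    "RAM 5121/28902MB GR3D_FREQ 0%",
    "RAM 5121/28902MB GR3D_FREQ 38%",
]

samples = gr3d_samples(log)
print("samples:", samples)
print("mean GR3D_FREQ:", mean(samples))
```

Running this over logs captured at several interval values shows whether the reported percentage actually moves with the sampling period.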

We haven’t tried tegrastats with --interval yet. We'll try it and see what happens.

We tried tegrastats --interval and still got the same result.

Are there other tools to visualize GPU usage, memory bandwidth usage, and MAC utilization statistics?

Dear @haihua.wei,
Did you check the NVIDIA Nsight Compute tool? It can be used to identify issues at the CUDA kernel level and get guidance on them. It also tracks various metrics. Please see the Nsight Compute documentation as well.

Let me check internally on tegrastats behavior and update you.

Dear @haihua.wei,
“How is tegrastats’ GPU share calculated?”

GPU utilization here indicates the fraction of sampled cycles in which the GPU was active during a period. So even if GPU utilization shows 99%, it doesn’t mean the GPU has reached its maximum computing power. I hope this clarifies it.
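The distinction between "active" and "fully used" can be illustrated with a small Python sketch. This is a simplified toy model, not how tegrastats is implemented: each entry in the hypothetical `loads` list is an assumed fraction of the GPU's compute capacity in use during one sample, and a sample counts as active if any work at all was running.

```python
def busy_fraction(loads):
    """Fraction of samples in which the GPU was doing any work at all."""
    return sum(1 for load in loads if load > 0) / len(loads)

def mean_load(loads):
    """Average fraction of compute capacity actually used per sample."""
    return sum(loads) / len(loads)

# Hypothetical per-sample loads: a few small kernels, each using about 5%
# of the GPU, run in 4 out of 10 samples (assumed values for illustration).
loads = [0.05, 0.0, 0.0, 0.05, 0.0, 0.05, 0.0, 0.0, 0.05, 0.0]

print(f"active samples: {busy_fraction(loads):.0%}")
print(f"capacity used:  {mean_load(loads):.0%}")
```

In this toy model the active-sample figure is 40% while the compute capacity used is about 2%, which is consistent with a high GR3D_FREQ reading alongside a nearly empty GPU row in a timeline view.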

Where can I download the Drive OS version of Nsight Compute? Can it be used to monitor DLA?
@SivaRamaKrishnaNV

Dear @haihua.wei,
Do you see /usr/local/cuda/bin/ncu-ui?
For DLA tracing, we need to use nsys.

Thanks, we found it.
@SivaRamaKrishnaNV
