GR3D_FREQ 0%@921 seems low?

I’m using an app based on deepstream-test3 with 4 test RTSP camera sources at 1080p. The pgie interval is set to 4.

I notice that tegrastats reports very low GR3D_FREQ values: generally 0%@921, occasionally 5 or 6%, and rarely a jump to 60% or even 99%, but mostly it reports 0%.

Is this right? I thought that running 4 streams batched through the nvinfer element would really heat up the GPU.
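
For reference, this is roughly how I watch the value (a sketch; whether tegrastats supports --interval and the exact output layout depend on the L4T release):

$ sudo tegrastats --interval 1000 | grep --line-buffered -o 'GR3D_FREQ [0-9]*%@[0-9]*'
GR3D_FREQ 0%@921
GR3D_FREQ 0%@921
GR3D_FREQ 6%@921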

Please refer to https://docs.nvidia.com/metropolis/deepstream/dev-guide/index.html#page/DeepStream%2520Development%2520Guide%2Fdeepstream_performance.html%23wwpID0E5HA and https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%2520Linux%2520Driver%2520Package%2520Development%2520Guide%2FAppendixTegraStats.html

Hi,
Which platform are you using? Here are the performance numbers for both dGPU and Tegra devices; you can find the data there:
https://docs.nvidia.com/metropolis/deepstream/dev-guide/index.html#page/DeepStream%2520Development%2520Guide%2Fdeepstream_performance.html%23wwpID0EQHA

I have already read both of those links; however, they do not explain anything about typical usage values or what I should expect to see.

I already run these every time:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks
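
To double-check that the clocks really are pinned, I also look at the following (output trimmed; the exact wording differs between L4T versions, but on my Nano the GPU line looks roughly like this):

$ sudo jetson_clocks --show
...
GPU MinFreq=921600000 MaxFreq=921600000 CurrentFreq=921600000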

Is 921 a typical frequency? I have seen other (old) posts where people report values over 1100.

Is it normal to see such low GPU usage percentages?

My pipeline is:

rtsp --|
rtsp --|-- decodebin (using nvv4l2h265dec) -> streammux -> pgie (interval 4) -> tracker -> streamdemux --|
rtsp --|
rtsp --|

After the streamdemux I have 4 tees, each linked to a fakesink (one per source). I then dynamically add other elements when a pgie detection is found in the metadata.
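
Roughly, a gst-launch equivalent of the static part of the pipeline would look something like the sketch below (hand-written, so treat it as illustrative: the RTSP URLs and the tracker library path are placeholders, the config file is the deepstream-test3 one, and only one demux branch is shown; the real app builds all of this programmatically):

$ gst-launch-1.0 \
    nvstreammux name=mux batch-size=4 width=1920 height=1080 live-source=1 batched-push-timeout=40000 ! \
    nvinfer config-file-path=dstest3_pgie_config.txt batch-size=4 interval=4 ! \
    nvtracker ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so ! \
    nvstreamdemux name=demux \
    uridecodebin uri=rtsp://camera1/stream ! mux.sink_0 \
    uridecodebin uri=rtsp://camera2/stream ! mux.sink_1 \
    uridecodebin uri=rtsp://camera3/stream ! mux.sink_2 \
    uridecodebin uri=rtsp://camera4/stream ! mux.sink_3 \
    demux.src_0 ! tee name=t0 ! queue ! fakesink

(and likewise a tee + fakesink on demux.src_1 through demux.src_3).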

So when I run this pipeline without any detections, i.e. it is all just streaming to fakesinks, I would still expect the GPU to be under load, since we are inferencing on 4 streams continuously. But mostly I just see a GR3D value of 0%. Occasionally I’ll see 60%, and maybe once every 3 or 4 minutes I’ll see 99%.

Can you please provide some guidance on these numbers and what we should typically expect?

If I’m seeing low CPU and low GR3D values, does that mean I should increase the number of surfaces used by the decoder (in the decodebin)?

By the way, my app is based on deepstream-test3.

Something else I’m interested to know about is the model engine file. I reuse an already-generated one so that I don’t have to wait a couple of minutes for it to be generated every time. Is this bad practice? I’m wondering if the model engine file would be different depending on the batch size I specify? But I don’t know the batch size until runtime…

Forgot to mention: I am using a Jetson Nano.

[from amycao]
Inference takes GPU resources, so when your load is low, low GPU usage is expected. In your case the interval is 4, so inference only runs on every fourth frame; the other three frames skip inference, so low GPU load is expected.

If I’m seeing low CPU and low GR3D values, does that mean I should increase the number of surfaces used by the decoder (in the decodebin)?

By the way, my app is based on deepstream-test3.

[from amycao]
You do not need to increase the surface count. You can set the pgie interval to a lower value such as 0 or 1, so the GIE does inferencing on every frame or every second frame.
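
For example (assuming the interval is configured in the nvinfer config file rather than set on the element in code), the relevant line in dstest3_pgie_config.txt would just be:

[property]
# 0 = run inference on every frame, 1 = only on every second frame
interval=1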

Something else I’m interested to know about is the model engine file. I reuse an already-generated one so that I don’t have to wait a couple of minutes for it to be generated every time. Is this bad practice? I’m wondering if the model engine file would be different depending on the batch size I specify? But I don’t know the batch size until runtime…

[from amycao]
But in your pipeline you have 4 RTSP streams?
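
In other words, the engine is built for a specific maximum batch size, and the filename DeepStream generates typically encodes it, so a config along these lines only reuses the engine when it matches (the model and filename below follow the deepstream-test3 defaults and the usual naming pattern; treat them as an example):

[property]
batch-size=4
# the _b4_ part is the batch size this engine was serialized for
model-engine-file=resnet10.caffemodel_b4_gpu0_fp16.engine

If the configured batch size (or precision) does not match the engine, nvinfer falls back to rebuilding it at start-up, which is exactly the couple-of-minutes wait. So reusing a pre-built engine is fine as long as the pipeline always runs with the batch size it was built for, here 4 streams.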

Thanks @amycao, that is helpful.

My last question before closing this thread is about the GPU frequency: is 921 an expected value (after running jetson_clocks)?

[from amycao]
Yes, the maximum (MAXN) GPU frequency on the Nano is 921 MHz.
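
For the record, this can be confirmed on the device itself; on a Nano running L4T 32.x the GPU devfreq node is typically the one below (the sysfs path can differ between releases):

$ cat /sys/devices/gpu.0/devfreq/57000000.gpu/max_freq
921600000
$ cat /sys/devices/gpu.0/devfreq/57000000.gpu/cur_freq
921600000

921600000 Hz is the 921 MHz that tegrastats prints after the @.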