DeepStream 7.1 runs SUPER SLOW after JetPack 6.1 update

I recently upgraded my Jetson Orin Nano to:
JetPack 6.1
L4T: 36.4.0
CUDA Arch BIN: 8.7
CUDA: 12.6.68
TensorRT: 10.3.0.30

I installed OpenCV 4.10.0 with CUDA support using DustyNV’s method.

I then ran the first DeepStream test:

deepstream-app -c source30_1080p_dec_infer-resnet_tiled_display_int8.txt

This used to take a couple of minutes the first time you ran it, and then the demo itself was lightning fast. The second time you ran it, it started almost straight away.

Now with JetPack 6.1 it takes 32 minutes to start the first time, and the frame rate is 8.27 seconds per frame. It takes forever to run.

Every frame has an error message:

There may be a timestamping problem, or this computer is too slow.
WARNING from sink_sub_bin_sink1: A lot of buffers are being dropped.

When I launch it a second time, it still takes another 32 minutes to load. There is an initial warning message:

WARNING: [TRT]: DLA requests all profiles have same min, max, and opt value. All dla layers are falling back to GPU
WARNING: Serialize engine failed because of file path: /opt/nvidia/deepstream/deepstream-7.1/samples/models/Primary_Detector/resnet18_trafficcamnet_pruned.onnx_b30_gpu0_int8.engine opened error

Why is JetPack 6.1 hundreds of times slower than JetPack 5? Should I revert, since performance has dropped almost to the level of my old Jetson Nano days?

Clearly I am doing something wrong. Have I missed installing some prerequisites?

About this issue, do you have the resnet18_trafficcamnet_pruned.onnx_b30_gpu0_int8.engine file generated in the /opt/nvidia/deepstream/deepstream-7.1/samples/models/Primary_Detector/ directory after running it once?

Can you attach the fps printed separately in both scenarios?

There is no resnet18_trafficcamnet_pruned.onnx_b30_gpu0_int8.engine in that folder. There is a resnet18_trafficcamnet_pruned.onnx file.

While running under JetPack 6.1 with MAXN enabled I got a throttling warning this time. I didn’t get this when using 15 watts. My power supply is the genuine NVIDIA 45 watt unit.

I can't show how it used to run, as I would have to start over again and flash my NVMe drive with the old JetPack 5 version. Believe me, it was super smooth.

The engine file not being found in this directory may be caused by a permission issue, which causes the file to be regenerated on every run. You can run the sudo chown -R user:user /opt/nvidia/ command first.
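To confirm the permission theory before changing ownership, a check along these lines could be used. This is only a minimal sketch: check_engine_dir is a made-up helper name, not part of DeepStream, and it simply tests whether the current user can write to the directory where the engine file should be saved.

```shell
# Hypothetical helper (not a DeepStream tool): report whether the
# current user can write to the given directory.
check_engine_dir() {
    dir="$1"
    if [ -w "$dir" ]; then
        echo "writable"
    else
        # Also printed when the directory does not exist.
        echo "not writable"
    fi
}

# Directory taken from the "Serialize engine failed" warning above.
check_engine_dir /opt/nvidia/deepstream/deepstream-7.1/samples/models/Primary_Detector
```

If this prints "not writable", the serialized engine cannot be saved there, which would be consistent with it being rebuilt on every launch.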

About the perf issue, could you run the command below to boost the clocks first?

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

In sudo chown -R user:user /opt/nvidia/, user should be my username, right? So: sudo chown -R simon:simon /opt/nvidia/

I had already set nvpmodel and jetson_clocks.

I did this and it still did not create the file you mentioned and it still looks like it will take 32 minutes to load. I am waiting for it.

It finished eventually and this time it did create the file resnet18_trafficcamnet_pruned.onnx_b30_gpu0_int8.engine in that folder

I checked on the NVIDIA site and it says the Jetson Orin Nano should be able to handle 30 fps.

Yes. This way the engine file will not be regenerated on the second run.

It says it can support 8 streams at 30 FPS for H.264. You are running 30 streams in this scenario. You can try testing with 8 streams instead.
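As rough arithmetic on why 30 streams struggle: a budget of 8 streams at 30 FPS is about 240 frames per second in total, and spreading that over 30 streams leaves about 8 FPS each at best. The sketch below assumes throughput divides evenly across streams, which a real pipeline will not do exactly; per_stream_fps is a made-up helper name.

```shell
# Hypothetical helper: best-case per-stream FPS when a fixed frame budget
# (budget_streams * budget_fps) is shared evenly by actual_streams.
# Integer arithmetic, so results are rounded down.
per_stream_fps() {
    budget_streams="$1"
    budget_fps="$2"
    actual_streams="$3"
    echo $(( budget_streams * budget_fps / actual_streams ))
}

per_stream_fps 8 30 30   # 30 streams sharing an 8x30 budget: prints 8
per_stream_fps 8 30 8    # 8 streams: prints 30
```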

Well, that worked. I feel like an idiot. However, I am still getting throttling. The 8 displays (I modified the 4 screen version source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt to give me 8 feeds of H265) ran smoothly at 20 fps initially, then 15 fps once throttling started. I looked at the Jetson Power GUI and it shows instantaneous VDD_IN peaking at 15 watts. I thought that MAXN would let me go to 25 watts. Am I missing something?

The CPU is now clocking up to 1700 and the GPU up to 1 GHz, so the rest of the ‘super’ features in JetPack 6.1 seem to be working.

You can see it briefly got to 16 watts. Or should I add VDD_IN, SOC and CPU GPU CV figures together?

Hi,
As a quick solution, please fall back to JetPack 5, and share with us the steps to reproduce the issue on JetPack 6.1. We will set up an Orin Nano developer kit and check.

I really don’t want to have to go through all the setup again to test JetPack 5. I will stick with JetPack 6.1.
I note I was getting 15 - 20 fps on 8 feeds of the sample H265 video, when the documentation says I should be able to run 13 H265 feeds at 30 fps. So I am getting only about 50% of the stated performance at best, and probably much less since I had 8 feeds instead of 13.

Have you followed our Guide to modify the configuration file before running?
Change the following items in the config file

I hadn’t seen that. I have now disabled display rendering, disabled OSD, and set the batch size to 12 for the primary GIE and streammux. Re-running now with MAXN and jetson_clocks running.
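For anyone wanting to double-check those settings, something like this could list the relevant keys from a config file. It is only a sketch: show_perf_keys is a made-up helper, and the sample file is an illustrative fragment written for this example (not the shipped config), using my understanding that type=1 under a sink group selects the fake sink.

```shell
# Hypothetical helper: print section headers plus the enable, type and
# batch-size keys from a deepstream-app style config file.
show_perf_keys() {
    grep -E '^\[|^(enable|type|batch-size)=' "$1"
}

# Illustrative sample only, not a complete or official config.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
[sink0]
enable=1
type=1
sync=0
[osd]
enable=0
[streammux]
batch-size=12
[primary-gie]
batch-size=12
EOF

show_perf_keys "$sample"   # prints every line above except sync=0
rm -f "$sample"
```

Pointing the helper at the real config path instead of the sample shows at a glance whether the sink, OSD and batch sizes are what you intended.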

I assume that to enable the IOU tracker I just comment out the existing tracker and uncomment the tracker with IOU in the name. Since I am running in MAXN I assumed I would beat the fps figures in the table, and I did.

With 12 streams of H265 I got 40 to 42 fps. With 16 streams I got 36 fps.
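To compare those two runs, it may help to look at total frames per second across all streams. This sketch assumes the figures above are per-stream fps; aggregate_fps is a made-up helper name.

```shell
# Hypothetical helper: total frames per second across all streams,
# assuming the reported figure is per stream.
aggregate_fps() {
    streams="$1"
    fps_per_stream="$2"
    echo $(( streams * fps_per_stream ))
}

aggregate_fps 12 40   # prints 480
aggregate_fps 12 42   # prints 504
aggregate_fps 16 36   # prints 576
```

Under that assumption, the 16 stream run processes more frames overall even though each stream is slower.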

Thrilled!