Deepstream TRTIS Example Apps on Jetsons extremely slow

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
Both Jetson Nano Developer Kit and Jetson Xavier NX Developer Kit

• DeepStream Version

Docker image used: nvcr.io/nvidia/deepstream-l4t:5.1-21.02-samples

• JetPack Version (valid for Jetson only)

jetson-nx-jp451-sd-card-image.zip and jetson-nano-jp451-sd-card-image.zip (JetPack 4.5.1)

• TensorRT Version
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)

Bug / Question
• How to reproduce the issue? (For bugs: include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)

Flash the Jetson (both Nano and Xavier NX).

Run the following commands:

xhost +

sudo docker run -it --rm \
  --net=host \
  --runtime nvidia \
  -e DISPLAY=$DISPLAY \
  -w /opt/nvidia/deepstream/deepstream-5.1 \
  -v $(pwd):/code \
  -v $(pwd)/files/faster_rcnn_inception_v2:/opt/nvidia/deepstream/deepstream-5.1/samples/trtis_model_repo/faster_rcnn_inception_v2 \
  -v /tmp/.X11-unix/:/tmp/.X11-unix \
  nvcr.io/nvidia/deepstream-l4t:5.1-21.02-samples

Inside the Docker

cd /opt/nvidia/deepstream/deepstream-5.1/samples

./prepare_ds_trtis_model_repo.sh

cd /opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app-trtis

apt-get update && apt-get install -y ffmpeg

deepstream-app -c source1_primary_classifier.txt

• Requirement details (For new requirements: include the module name, i.e. which plugin or which sample application, and the function description.)

N/A

Basically I am trying to run the DeepStream Triton Inference Server examples on both a Jetson Nano Developer Kit and a Jetson Xavier NX Developer Kit.

When I try to run the example on either platform, the unit slows to a complete halt: even the on-screen clock and the mouse freeze. After a long while (1 hour+), the Xavier eventually showed a static image of a bus, and the terminal printed some text indicating 0 fps, but then it froze again.

What am I doing wrong here?

Did you boost the CPU/GPU/DDR clocks with the commands below?
If not, please give them a try.

$ sudo nvpmodel -m 0
$ sudo jetson_clocks
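These two commands do not persist across reboots. If you want the board to come up in max-performance mode every time, one option is a small systemd unit that runs them at boot. This is just a sketch; the unit name is my own choice, and the binary paths (/usr/sbin/nvpmodel, /usr/bin/jetson_clocks) should be verified with `which` on your JetPack image:

```ini
# /etc/systemd/system/jetson-maxperf.service  (hypothetical unit name)
[Unit]
Description=Lock Jetson clocks to maximum at boot
After=multi-user.target

[Service]
Type=oneshot
ExecStart=/usr/sbin/nvpmodel -m 0
ExecStart=/usr/bin/jetson_clocks
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

Enable it with `sudo systemctl enable jetson-maxperf.service`. Note that mode 0 draws the most power, so on a Nano make sure you are on a barrel-jack supply, not USB.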

I can confirm that this does make it run on the Xavier, although I get a warning that the system is throttling due to overcurrent.

The perf looks like this:

deepstream-app -c source1_primary_classifier.txt

**PERF: 0.00 (0.00)
**PERF: 0.00 (0.00)
**PERF: 0.00 (0.00)
**PERF: 0.00 (0.00)
**PERF: 0.60 (0.59)
**PERF: 18.96 (8.44)
**PERF: 22.36 (12.61)

It also only shows the goldfish (the first image in the video) and none of the others.

I’ll have a play about with more examples.

Running:
deepstream-app -c source1_primary_detector.txt

on the Xavier gives very stuttery performance and overcurrent/throttling warnings. It also seems to stall the video (for minutes!) halfway through. Is this expected?

**PERF: 0.00 (0.00)
**PERF: 0.00 (0.00)
**PERF: 0.00 (0.00)
**PERF: 4.27 (3.86)
**PERF: 5.08 (4.88)
**PERF: 5.94 (5.28)
**PERF: 5.97 (5.01)
**PERF: 0.00 (3.82)
**PERF: 0.00 (3.10)
**PERF: 0.00 (2.63)
**PERF: 0.00 (2.25)
**PERF: 0.00 (1.97)
**PERF: 0.00 (1.77)
**PERF: 0.00 (1.61)
**PERF: 0.00 (1.46)

**PERF: FPS 0 (Avg)
**PERF: 0.00 (1.34)
**PERF: 0.00 (1.25)
**PERF: 0.00 (1.16)
**PERF: 0.00 (1.08)
**PERF: 0.00 (1.01)
**PERF: 0.00 (0.96)
**PERF: 0.00 (0.91)
**PERF: 0.00 (0.86)
**PERF: 0.00 (0.81)
**PERF: 0.00 (0.78)
**PERF: 0.00 (0.74)
**PERF: 0.00 (0.71)
**PERF: 0.00 (0.68)
**PERF: 0.00 (0.65)
**PERF: 0.00 (0.63)
**PERF: 0.00 (0.61)
**PERF: 0.00 (0.59)
**PERF: 0.00 (0.57)
**PERF: 0.00 (0.55)
**PERF: 0.00 (0.53)

**PERF: FPS 0 (Avg)
**PERF: 0.00 (0.51)
**PERF: 0.00 (0.50)
**PERF: 0.00 (0.48)
**PERF: 0.00 (0.47)
**PERF: 0.00 (0.46)
**PERF: 0.00 (0.44)
**PERF: 0.00 (0.43)
**PERF: 0.00 (0.42)
**PERF: 0.00 (0.41)
**PERF: 0.00 (0.40)
**PERF: 0.00 (0.39)
**PERF: 0.00 (0.38)
**PERF: 0.00 (0.37)
**PERF: 0.00 (0.37)
**PERF: 0.00 (0.36)
**PERF: 0.00 (0.35)
**PERF: 0.00 (0.34)
**PERF: 0.00 (0.34)
**PERF: 0.00 (0.33)
**PERF: 0.00 (0.32)

**PERF: FPS 0 (Avg)
**PERF: 0.04 (0.35)
**PERF: 4.70 (0.43)
**PERF: 3.35 (0.43)
**PERF: 0.00 (0.43)
**PERF: 0.00 (0.42)
**PERF: 0.00 (0.41)
**PERF: 0.00 (0.40)
**PERF: 0.00 (0.40)
**PERF: 0.00 (0.39)
**PERF: 0.41 (0.44)
**PERF: 5.98 (0.53)
**PERF: 6.79 (0.63)
**PERF: 7.15 (0.72)
**PERF: 7.49 (0.83)
**PERF: 7.61 (0.93)
**PERF: 7.66 (1.03)
**PERF: 7.75 (1.13)
**PERF: 7.73 (1.23)
**PERF: 7.75 (1.32)
**PERF: 7.51 (1.40)

**PERF: FPS 0 (Avg)
**PERF: 7.73 (1.49)
**PERF: 7.75 (1.58)
**PERF: 7.73 (1.66)
**PERF: 7.74 (1.74)
**PERF: 7.43 (1.82)
**PERF: 7.66 (1.89)
**PERF: 7.67 (1.96)
**PERF: 7.70 (2.04)
**PERF: 7.68 (2.11)
**PERF: 7.80 (2.18)
**PERF: 7.74 (2.24)
**PERF: 7.79 (2.31)
**PERF: 7.79 (2.38)
**PERF: 7.77 (2.44)
**PERF: 7.76 (2.50)
**PERF: 7.91 (2.56)
**PERF: 7.75 (2.62)
**PERF: 7.21 (2.67)
**PERF: 7.88 (2.73)
**PERF: 7.79 (2.79)

**PERF: FPS 0 (Avg)
**PERF: 7.85 (2.84)
**PERF: 7.77 (2.89)
**PERF: 7.81 (2.95)
**PERF: 7.76 (3.00)

Sorry for the delay! We were on holiday for the past few days.

Could you share the tegrastats log captured with below steps?

  1. Run the test, e.g.
     deepstream-app -c source1_primary_detector.txt
  2. Capture the tegrastats log in another terminal:
     $ sudo tegrastats
     Share the tegrastats log; ~30 seconds of output is enough.
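If it helps to eyeball the log before posting it, the relevant fields can be pulled out with a few lines of Python. This is a sketch that assumes the typical JetPack 4.x tegrastats line format (the exact fields vary between releases):

```python
import re


def parse_tegrastats_line(line):
    """Extract RAM usage (MB) and GPU load (%) from one tegrastats line.

    Assumes the JetPack 4.x format, e.g.:
    RAM 2749/7772MB (...) ... GR3D_FREQ 45%@905 ...
    Returns an empty dict if nothing matches.
    """
    stats = {}
    ram = re.search(r"RAM (\d+)/(\d+)MB", line)
    if ram:
        stats["ram_used_mb"] = int(ram.group(1))
        stats["ram_total_mb"] = int(ram.group(2))
    gpu = re.search(r"GR3D_FREQ (\d+)%", line)
    if gpu:
        stats["gpu_load_pct"] = int(gpu.group(1))
    return stats


# Example line in the assumed format (values are made up):
sample = ("RAM 2749/7772MB (lfb 684x4MB) SWAP 0/3886MB (cached 0MB) "
          "CPU [15%@1190,8%@1190] GR3D_FREQ 45%@905")
print(parse_tegrastats_line(sample))
```

A GR3D_FREQ stuck near 0% while the pipeline reports 0 fps would point at the pipeline stalling rather than the GPU being saturated.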

I can reproduce the fps you saw on the NX.
I think it's expected: Triton performs worse than native TensorRT. I would recommend running the model with TensorRT at FP16 or INT8 precision to get better performance.
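For reference, running through native TensorRT means using an nvinfer (gst-nvinfer) config instead of the Triton one, and the precision is selected there with network-mode. A minimal sketch of the relevant section (the engine/model file names below are placeholders, not files shipped with the samples):

```ini
# Sketch of a gst-nvinfer config; file names are placeholders.
[property]
gpu-id=0
model-engine-file=model_b1_gpu0_fp16.engine
# network-mode: 0=FP32, 1=INT8, 2=FP16
network-mode=2
batch-size=1
```

With network-mode=2, TensorRT builds (or loads) an FP16 engine, which is usually a large speedup on Jetson compared with FP32 through Triton. INT8 (network-mode=1) is faster still but needs a calibration file.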