Back-to-back detector performance

Hi

Is there a way to get higher performance when running back-to-back detectors with DeepStream 5.0 on Xavier NX?

We tested the back-to-back detectors sample with DS5 on NX, and the inference was not smooth. The speed appears to be below 15 FPS.
The GPU usage was also around 99% (see attached).
We have already set nvpmodel to mode 2 and enabled jetson_clocks.
Moreover, modifying the interval property made no visible difference.

Is this normal?

Thank you for any advice,

Hi,

Please try the following command to maximize the performance:

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

If the above cannot meet your requirement, please run the pipeline with FP16 or INT8 mode.
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinfer.html#gst-nvinfer-file-configuration-specifications

[property]
...
network-mode=1  # 0: FP32 1: INT8 2: FP16

Thanks.

Hi AastaLLL,

Thank you for your prompt support.

May I check again?
On NX, should we set nvpmodel to mode 0 or mode 2 for maximum performance?

We set network-mode to 1 and got the warning “INT8 calibration file not specified, trying FP16” (see attached).
However, using FP16 already seems better than FP32.
We would just like to know: is it possible to use INT8?

Thank you,

Hi,

1.
You can choose nvpmodel based on your use case.

Mode 0 has a higher clock rate, but only two CPU cores enabled.
Mode 2 enables all of the CPU cores, but at a lower clock rate.

So if your task is not CPU-intensive (i.e., it does not parallelize across more than two cores), mode 0 should give you better performance.
https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%2520Linux%2520Driver%2520Package%2520Development%2520Guide%2Fpower_management_jetson_xavier.html%23wwpID0E0NO0HA
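To confirm which mode is active before and after switching, nvpmodel can query the current power mode (a minimal sketch; the exact mode names printed vary by L4T release and board):

$ sudo nvpmodel -q    # prints the current NV Power Mode and its ID
$ sudo nvpmodel -m 0  # switch to mode 0
$ sudo jetson_clocks  # then lock clocks to their maximum for that mode

Note that jetson_clocks pins the clocks within the currently selected nvpmodel mode, so run it after changing the mode, not before.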

2.
INT8 requires a GPU compute capability of 7.x or higher, and Xavier NX (compute capability 7.2) meets this requirement.
The error indicates that you need a calibration table to map floating-point values to integers.

Based on your configuration, nvinfer should find a calibration table named cal_trt.bin.
Could you first check whether that file exists in your environment?
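For reference, a minimal sketch of the relevant part of the nvinfer configuration (the int8-calib-file property is from the gst-nvinfer file configuration specification; the cal_trt.bin name follows the sample config, so adjust the path to your environment):

[property]
...
network-mode=1
# Path to the INT8 calibration table. If this file is missing,
# nvinfer falls back to FP16 (or FP32), as seen in your log.
int8-calib-file=cal_trt.bin

Also note that if a serialized engine file was already built in FP16, it should be deleted (or the model-engine-file path updated) so that a new INT8 engine is generated.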

Thanks.