• Hardware Platform: GPU
• DeepStream Version: 6.1 (nvcr.io/nvidia/deepstream:6.1-triton)
• JetPack Version: -
• TensorRT Version: 8.2.5.1
• NVIDIA GPU Driver Version: 515.57
• Issue Type: Question/Bug?
Hi, we are investigating whether we can use the nvinferserver plugin instead of the nvinfer plugin for real-time inference (i.e. > 30 FPS).
In the past, we ran real-time inference with the nvinfer plugin using a YoloV5 model converted to TensorRT, at resolutions of 640x640 and 1088x1920 (we have a separate TRT engine for each resolution). In this setup, we achieved at least 30 FPS with either model.
Since nvinferserver offers more options for deploying unoptimized models for demo and test purposes, we tried to run the same GStreamer pipelines with nvinferserver in place of nvinfer.
• How to reproduce the issue?
Gst-launch pipelines:
nvinfer:
GST_DEBUG=markout:5 GST_PLUGIN_PATH=/gstreamer_timestamp_marking/src/ gst-launch-1.0 filesrc location=/videos/video.mp4 ! qtdemux ! video/x-h264 ! h264parse ! avdec_h264 ! videorate ! video/x-raw,framerate=30/1 ! videoconvert ! nvvideoconvert ! m.sink_0 nvstreammux name=m batch-size=1 width=1920 height=1080 ! markin ! nvinfer config-file-path=/model_repo/nvinfer_config_object_detection_yolov5_trt.txt ! markout ! nvvideoconvert ! nvdsosd ! nvvideoconvert ! videoconvert ! fpsdisplaysink video-sink="nveglglessink" sync=1 2> /model_repo/nvinfer_time_measures.txt
nvinferserver:
GST_DEBUG=markout:5 GST_PLUGIN_PATH=/gstreamer_timestamp_marking/src/ gst-launch-1.0 filesrc location=/videos/video.mp4 ! qtdemux ! video/x-h264 ! h264parse ! avdec_h264 ! videorate ! video/x-raw,framerate=30/1 ! videoconvert ! nvvideoconvert ! m.sink_0 nvstreammux name=m batch-size=1 width=1920 height=1080 ! markin ! nvinferserver config-file-path=/model_repo/nvinferserver_config_object_detection_yolov5_trt.txt ! markout ! nvvideoconvert ! nvdsosd ! nvvideoconvert ! videoconvert ! fpsdisplaysink video-sink="nveglglessink" sync=1 2> /model_repo/nvinferserver_trt_time_measures.txt
For all tests we used the following configuration:
We set batch size to 1 and interval to 0. Other parameters such as the IoU threshold, confidence threshold, process_mode, etc. were set identically to the nvinfer settings. Both nvinfer and nvinferserver use the same TRT engine.
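For reference, the nvinferserver settings above correspond roughly to a config like the following (a minimal sketch only; the model name, repo root, and preprocessing values are placeholders, and the exact field names should be checked against the Gst-nvinferserver documentation):

```
infer_config {
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 1          # matches the nvinfer batch-size=1 setting
  backend {
    triton {
      model_name: "yolov5_trt"   # placeholder model name
      version: -1
      model_repo {
        root: "/model_repo"      # placeholder repo root
        strict_model_config: true
      }
    }
  }
  preprocess {
    network_format: IMAGE_FORMAT_RGB
    tensor_order: TENSOR_ORDER_LINEAR
  }
}
input_control {
  process_mode: PROCESS_MODE_FULL_FRAME
  interval: 0                # matches the nvinfer interval=0 setting
}
```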
Results:
- For the 640x640 TRT model, the inference times were identical for nvinfer and nvinferserver; the model ran at 30 FPS with both plugins.
- For the 1088x1920 TRT model, nvinferserver was significantly slower than nvinfer: nvinfer ran at 30 FPS, but nvinferserver only reached ~23 FPS. Forcing the 1088x1920 model to 30 FPS with nvinferserver resulted in many frame drops and laggy video.
Question:
Did we miss something that could explain why nvinferserver performs worse than nvinfer as the image resolution increases?
Follow-up Question:
Can you recommend a way to get more reliable inference timestamping? → How can we accurately measure, in ms, how long it takes for a frame to pass through the inference plugin within the DeepStream framework?
We used markin and markout (GitHub - trawn3333/gstreamer_timestamp_marking: Utility elements for measuring timestamps between arbitrary ranges in gstreamer pipelines) to track the time spent in the inference plugin at 640x640 resolution. This approach yielded:
- an average of 65 ms with nvinfer
- an average of 65 ms with nvinferserver
Since we suspected inaccurate/false measurements, we also implemented our own timestamping based on pad probes that write a timestamp into the GStreamer buffer when a frame enters the inference plugin and read it back when the frame leaves, and we obtained exactly the same inference times.
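The probe logic we used can be sketched like this (a simplified, hypothetical version: in the real pipeline, enter() and leave() are called from Gst.PadProbeType.BUFFER probes on the plugin's sink and src pads, with frames matched by their buffer PTS):

```python
import time

class LatencyTracker:
    """Tracks per-frame latency across a pipeline element.

    enter() is meant to be called from a buffer probe on the element's
    sink pad, leave() from a probe on its src pad; frames are matched
    by presentation timestamp (PTS).
    """

    def __init__(self):
        self._in_flight = {}    # pts -> wall-clock entry time (ns)
        self.latencies_ms = []  # completed per-frame latencies

    def enter(self, pts, now_ns=None):
        # Record when the frame entered the element.
        self._in_flight[pts] = time.monotonic_ns() if now_ns is None else now_ns

    def leave(self, pts, now_ns=None):
        # Compute latency when the same frame leaves the element.
        now = time.monotonic_ns() if now_ns is None else now_ns
        start = self._in_flight.pop(pts, None)
        if start is not None:
            self.latencies_ms.append((now - start) / 1e6)

    def average_ms(self):
        return sum(self.latencies_ms) / len(self.latencies_ms)
```

Note that this measures per-frame latency through the element, which is not the same as throughput: if the element processes frames in a pipelined or overlapping fashion, the stream can run faster than 1 / latency.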
The issue is that if those measurements were correct, the stream could only run at roughly 15 FPS (1000 ms / 65 ms ≈ 15.4). However, from our observations, the stream runs at 60 FPS without any issues, so the per-frame inference time must be below ~16 ms!
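The arithmetic behind that claim, under the (possibly wrong) assumption that frames are processed strictly one after another:

```python
latency_ms = 65.0  # measured per-frame time in the inference plugin

# Maximum FPS if frames never overlap inside the plugin.
max_fps_if_serial = 1000.0 / latency_ms   # ~15.4 FPS

# Per-frame time budget available at the observed 60 FPS.
frame_budget_ms = 1000.0 / 60.0           # ~16.7 ms

print(round(max_fps_if_serial, 1), round(frame_budget_ms, 1))
```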
Best,
Alex