Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GPU (RTX 2080 Ti), plus limited testing on Jetson
• DeepStream Version 5.0
• TensorRT Version 7
• NVIDIA GPU Driver Version: 450 (CUDA 11.0)
We are evaluating the suitability of NVIDIA GPUs for a video analytics application. We have used deepstream-app as the basis for the evaluation and added support for the NVIDIA Optical Flow and nvdsanalytics plugins to the pipeline. We have tested performance on an RTX 2080 Ti and on Jetson (testing on Jetson has been minimal so far, as it does not support Optical Flow). For these tests we disabled the Optical Flow and nvdsanalytics plugins and limited ourselves to the YoloV3 model provided by NVIDIA as part of the samples.
We need to process as many input camera streams (1080p @ 30 FPS over RTSP) as possible (say, 8+) and generate one output RTSP stream per input, with each output stream carrying the detection overlay (OSD).
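To make the output side concrete, below is a rough, self-contained gst-python sketch (not our actual modified deepstream-app; stream counts, ports, and mount paths are made up, and videotestsrc stands in for the per-stream OSD output). The only point it illustrates is that each output RTSP stream needs its own encode branch, i.e. one NVENC session per output stream:

```
#!/usr/bin/env python3
# Illustrative sketch only: one NVENC encode branch per output RTSP stream.
# Element names are from the standard DeepStream 5.0 dGPU plugin set;
# ports, paths and counts below are made up for the example.
import gi
gi.require_version("Gst", "1.0")
gi.require_version("GstRtspServer", "1.0")
from gi.repository import Gst, GstRtspServer, GLib

Gst.init(None)

NUM_STREAMS = 4       # hypothetical number of output streams
UDP_BASE_PORT = 5400  # hypothetical local RTP hand-off ports

# One encode branch per stream: raw video -> NVMM -> NVENC H.264 -> RTP -> local UDP.
# In the real pipeline the input would be the per-stream nvdsosd output.
for i in range(NUM_STREAMS):
    p = Gst.parse_launch(
        "videotestsrc is-live=true "
        "! video/x-raw,width=1920,height=1080,framerate=30/1 "
        "! nvvideoconvert ! video/x-raw(memory:NVMM),format=NV12 "
        "! nvv4l2h264enc bitrate=4000000 "
        "! h264parse ! rtph264pay "
        f"! udpsink host=127.0.0.1 port={UDP_BASE_PORT + i} sync=false"
    )
    p.set_state(Gst.State.PLAYING)

# RTSP server re-exposes each local RTP stream as rtsp://<host>:8554/ds-test<i>.
server = GstRtspServer.RTSPServer()
server.props.service = "8554"
for i in range(NUM_STREAMS):
    factory = GstRtspServer.RTSPMediaFactory()
    factory.set_launch(
        f'( udpsrc name=pay0 port={UDP_BASE_PORT + i} buffer-size=524288 '
        'caps="application/x-rtp, media=video, clock-rate=90000, '
        'encoding-name=(string)H264, payload=96" )'
    )
    factory.set_shared(True)
    server.get_mount_points().add_factory(f"/ds-test{i}", factory)
server.attach(None)

GLib.MainLoop().run()
```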
The problem with the RTX 2080 Ti is that it is limited to 3 concurrent NVENC encoder sessions.
A Quadro or similar GPU, on the other hand, does not restrict the number of concurrent sessions. However, it is not clear how many concurrent NVENC sessions such a GPU can actually sustain while streaming 1080p @ 30 FPS over RTSP (or even 720p @ 30 FPS).
The NVIDIA whitepaper on the Turing platform did not help answer this. Our own testing on the RTX 2080 Ti (at 1080p and even 720p) has not been very encouraging so far.
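(For reference, one way to watch the concurrent NVENC session count and encoder throughput while the pipelines are running is a small poll loop like the sketch below; it assumes the encoder.stats query fields of nvidia-smi are available on this driver.)

```
import subprocess, time

# Poll NVENC session count and throughput once per second
# (assumes nvidia-smi on this driver supports the encoder.stats.* fields).
QUERY = "encoder.stats.sessionCount,encoder.stats.averageFps,encoder.stats.averageLatency"

while True:
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    print(out)  # session count, average encode FPS, average encode latency
    time.sleep(1)
```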
So:
- Which GPU is suitable for implementing deep learning inference (using DeepStream) plus RTSP streaming of 4+ and 8+ streams?
- The performance of DeepStream (deepstream-app) does not seem to change, and continues to be poor (jitter, significant end-to-end delay, and buffer caching), when incoming streams of lower resolution (e.g. 720p) are used. Would lowering the incoming frame resolution improve performance? (The kind of latency/buffering settings we are referring to are sketched below.)
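For context, these are the kinds of latency/buffering knobs involved, shown here being set via gst-python on the standard dGPU elements; the values are illustrative, not our exact settings:

```
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# Jitter buffer on the RTSP source: larger values smooth jitter but add delay.
src = Gst.ElementFactory.make("rtspsrc", "src")
src.set_property("latency", 200)                  # milliseconds

# The batcher scales every input to a fixed resolution before inference.
mux = Gst.ElementFactory.make("nvstreammux", "mux")
mux.set_property("live-source", True)
mux.set_property("width", 1920)
mux.set_property("height", 1080)
mux.set_property("batch-size", 8)
mux.set_property("batched-push-timeout", 40000)   # microseconds

# Sinks that sync to buffer timestamps can add apparent end-to-end delay.
sink = Gst.ElementFactory.make("udpsink", "sink")
sink.set_property("sync", False)
```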
Thanks for your inputs.