For an end-to-end application (read image file → pre-process → inference using the network → post-process, without displaying output), is it generally recommended, considering only FPS, to use DeepStream or the TensorRT Python/C++ API?
Is the DeepStream C++ API faster than the TensorRT C++ API for the above pipeline?
When should the entire pipeline be built with the TensorRT C++ API, and when should DS be used?
Talking exclusively about the DNN part, is it done faster by DS or by the TRT C++ API?
If the TRT C++ API does the DNN part faster, will that gain outweigh the speedup DS provides in the other stages, such as image reading, pre-processing, and post-processing?
Apologies if I wasn’t clear before.
P.S.:
In the first post, by "TRT C++ API" I meant doing the DNN part with the TRT C++ API and the rest with separate C++ code.
Talking exclusively about the DNN part, is it done faster by DS or by the TRT C++ API?
For the DNN itself, DS calls the TRT API to do inference, so the two have the same performance.
But for the whole pipeline, DS has many optimizations around the DNN(s), such as reducing memory copies to cut memory traffic, which helps DNN performance, and running several DNNs in parallel.
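To make the point above concrete, here is a rough sketch of why pipeline-level optimizations can change FPS even when the inference time is identical. The stage latencies are made-up numbers, not measurements, and the function names are only for illustration. It compares running the stages strictly one after another with overlapping them across frames, where the slowest stage becomes the bottleneck.

```python
def serial_fps(stage_ms):
    """Throughput when a frame finishes every stage before the next frame starts."""
    return 1000.0 / sum(stage_ms.values())


def pipelined_fps(stage_ms):
    """Throughput when stages overlap across frames; the slowest stage limits FPS."""
    return 1000.0 / max(stage_ms.values())


# Hypothetical per-frame latencies in milliseconds (assumed values, not measured).
stage_ms = {"read": 4.0, "preprocess": 3.0, "infer": 8.0, "postprocess": 1.0}

print(f"serial:    {serial_fps(stage_ms):.1f} FPS")     # 1000 / 16 = 62.5
print(f"pipelined: {pipelined_fps(stage_ms):.1f} FPS")  # 1000 / 8  = 125.0
```

With these assumed numbers the inference stage costs 8 ms either way, yet the end-to-end throughput doubles once the other stages stop adding to the per-frame time.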
Is support for other frameworks such as TF and PyTorch the ONLY reason one might want to use TRTIS?
Also, will the DNN part of "DS with nvinfer" be faster than TRTIS, since DS has a TensorRT backend whereas TRTIS doesn't? (Correct me if this assumption is wrong.)
One last query:
One of NVIDIA's websites says that TRTIS supports TensorRT as well.
The same website says "It runs models concurrently on GPUs maximizing utilization". Can you please expand on this statement?
I'm developing an application for object detection using YOLO. I'm just trying to evaluate which of TRTIS and DS will be faster for my application. Any comments on that?
About perf, I have explained above. You can also benchmark TRTIS and DS yourself on your own model and data.
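For the benchmarking suggestion, a minimal timing harness could look like the sketch below. `measure_fps` and `dummy_pipeline` are hypothetical names, and `dummy_pipeline` is only a CPU stand-in; in practice you would replace it with a function that pushes one image through the DS, TRTIS, or TRT-based pipeline you want to compare.

```python
import time


def measure_fps(process_frame, frames, warmup=5):
    """Return frames per second for process_frame over frames, after a short warm-up."""
    # Warm-up runs are excluded so one-time setup costs do not skew the result.
    for frame in frames[:warmup]:
        process_frame(frame)

    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    # For GPU pipelines, make sure all asynchronous work has finished
    # before the timer is stopped, otherwise the FPS will look too high.
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed


def dummy_pipeline(frame):
    """Placeholder for read -> pre-process -> inference -> post-process."""
    return sum(frame)


# Synthetic input standing in for real images (assumption for illustration).
frames = [list(range(1000)) for _ in range(200)]
fps = measure_fps(dummy_pipeline, frames)
print(f"{fps:.1f} FPS")
```

Running the same harness with the same input images against each candidate gives directly comparable end-to-end FPS numbers.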