According to NVIDIA's official website, blogs, and forums, Tesla GPUs (such as the P4) are the recommended choice for INT8-accelerated inference, while GTX GPUs (such as the 1080) are not recommended for acceleration.
However, I have run the comparison several times, and the GTX 1080 consistently appears to be faster than the P4. Here is my test environment:
- DeepStream 3.0
- TensorRT 5.0
- one GTX 1080 GPU and one Tesla P4 GPU
- Ubuntu 18.04
- 1080p H.264 video
With INT8 acceleration running SSD object-detector inference, the GTX 1080 can handle 24 streams in real time, but the Tesla P4 can only handle 20; GPU utilization on both is between 60% and 70%.
With INT8 acceleration running YOLOv3 object-detector inference, the GTX 1080 can handle 8 streams in real time, but the Tesla P4 can only handle 4; GPU utilization on both is between 80% and 90%.
I used the nvinfer plugin in both test pipelines. The SSD inference is based on the demo from the DeepStream 3.0 release package, and the YOLOv3 inference is based on the GitHub sample (https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/tree/master/yolo/samples/objectDetector_YoloV3).
So, can anyone tell me why the 1080 is faster than the P4 with INT8 acceleration? Thanks!