GTX 1080 faster than Tesla P4 with INT8 acceleration?

Hello,

According to NVIDIA's official website, blogs, and forums, Tesla GPUs (such as the P4) are a good choice for INT8-accelerated inference, while GTX GPUs (such as the 1080) are not recommended for it.

But I have tried several times, and it seems that the GTX 1080 is faster than the P4. Below is my test environment:

  1. DeepStream 3.0
  2. TensorRT 5.0
  3. one GTX 1080 GPU and one Tesla P4 GPU
  4. Ubuntu 18.04
  5. 1080p H.264 video
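For reference, INT8 mode is enabled through the nvinfer config file, which builds the TensorRT engine underneath. As a minimal sketch, the equivalent TensorRT 5.0 Python API calls look roughly like this (the network parsing step and `my_calibrator` are placeholders, not my actual test code):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()
# ... parse the SSD / YOLOv3 model into `network` here (omitted) ...

# Both cards are compute capability 6.1, so fast INT8 (dp4a) is available:
assert builder.platform_has_fast_int8

builder.max_workspace_size = 1 << 30      # 1 GiB build workspace
builder.int8_mode = True                  # enable INT8 kernels
builder.int8_calibrator = my_calibrator   # placeholder: an IInt8EntropyCalibrator2

engine = builder.build_cuda_engine(network)
```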

Results:

  1. With INT8 acceleration running SSD object-detection inference, the GTX 1080 can run 24 streams in real time, but the Tesla P4 can only run 20 streams in real time. GPU utilization on both is between 60% and 70%;

  2. With INT8 acceleration running YOLOv3 object-detection inference, the GTX 1080 can run 8 streams in real time, but the Tesla P4 can only run 4 streams in real time. GPU utilization on both is between 80% and 90%;
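(The GPU-Util figures above are from nvidia-smi; here is a small pynvml sketch that samples the same counter, assuming the pynvml package is installed:)

```python
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # 0 = first GPU; use 1 for the other card

for _ in range(10):
    # Same counter that nvidia-smi reports as GPU-Util:
    util = pynvml.nvmlDeviceGetUtilizationRates(handle)
    sm_clock = pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_SM)
    print(f"GPU util: {util.gpu}%  SM clock: {sm_clock} MHz")
    time.sleep(1)

pynvml.nvmlShutdown()
```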

I used the nvinfer plugin in both test pipelines. The SSD inference follows the demo from the DeepStream 3.0 release package, and the YOLOv3 inference follows the GitHub sample (https://github.com/NVIDIA-AI-IOT/deepstream_reference_apps/tree/master/yolo/samples/objectDetector_YoloV3).

So, can anyone tell me why the 1080 is faster than the P4 with INT8 acceleration? Thanks!

Actually, the GTX 1080 and the Tesla P4 are both GP104 parts with the same number of CUDA cores.

Both have 2560 CUDA cores.

The difference is clock speed and power budget: the GTX 1080 boosts to about 1733 MHz at 180 W, while the Tesla P4 boosts to only around 1.1 GHz within its 50-75 W envelope, so the 1080's peak INT8 throughput is higher.
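As a rough sanity check, here is a back-of-envelope estimate of peak INT8 throughput via the dp4a instruction (one dp4a per core per clock = 4 multiplies + 4 adds = 8 integer ops); the boost clocks are the commonly quoted figures, not values measured on these specific cards:

```python
def peak_int8_tops(cuda_cores, boost_clock_mhz, ops_per_core_per_clock=8):
    """Peak INT8 (dp4a) throughput in tera-operations per second."""
    return cuda_cores * boost_clock_mhz * 1e6 * ops_per_core_per_clock / 1e12

# Assumed boost clocks: ~1733 MHz (GTX 1080), ~1063 MHz (Tesla P4)
print(f"GTX 1080: {peak_int8_tops(2560, 1733):.1f} TOPS")  # ~35.5 TOPS
print(f"Tesla P4: {peak_int8_tops(2560, 1063):.1f} TOPS")  # ~21.8 TOPS (datasheet: 22 TOPS)
```

That gap lines up with the 1080 coming out ahead in your stream counts.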

Also, per NVIDIA's policy, the 1080 isn't licensed for inference deployment (the GeForce driver EULA restricts datacenter use).

OK, Thanks!

So the result is reasonable.