Inference time difference between Tesla K80 and GTX 1080 in TensorFlow

Hi, all.
I have a question.
I run an object detection application in TensorFlow, but the K80's inference time is higher than the GTX 1080's.

A: 1x Tesla K80, Windows Server 2012 R2, CUDA 9.0
B: 1x GTX 1080, Windows 7, CUDA 9.0

A inference time: 3.2–3.4 s
B inference time: 1.6–1.8 s

Why?


It could be due to several factors, including system setup and storage configuration. Also make sure the TensorRT version used to optimize the model is the same as the TensorRT version used for inference.
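One more thing worth ruling out: if your measurements include the very first inference call, they also include one-time graph and kernel initialization, which can inflate naive timings. A minimal, framework-agnostic timing harness might look like the sketch below (the `run_inference` callable is a placeholder; in real use you would pass something like `lambda: sess.run(output_tensor, feed_dict={...})`):

```python
import statistics
import time

def benchmark(run_inference, warmup=3, iters=10):
    """Time a zero-argument inference callable.

    Discards `warmup` initial calls (the first TensorFlow session.run
    typically includes one-time graph/kernel initialization), then
    returns the median of `iters` timed calls in seconds.
    """
    for _ in range(warmup):
        run_inference()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_inference()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Stand-in CPU workload for illustration; replace with your
# actual sess.run call when benchmarking the model.
median_s = benchmark(lambda: sum(i * i for i in range(100000)))
print(f"median inference time: {median_s:.4f} s")
```

The median is used rather than the mean so that a single slow outlier run does not skew the result.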

Can you provide details on the platforms you are using?

Linux distro and version
GPU type
NVIDIA driver version
CUDA version
cuDNN version
Python version [if using Python]
TensorFlow version
TensorRT version
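Most of the items in the checklist above can be collected with a short script like the sketch below. This is only an assumption about how you might gather the information: the `nvidia-smi` and `nvcc` calls are guarded since they may not be on PATH, and the TensorFlow import is optional.

```python
import platform
import shutil
import subprocess

def tool_version(cmd):
    """Return the first line of a CLI tool's output, or None if unavailable."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        out = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
        return out.stdout.strip().splitlines()[0] if out.stdout.strip() else None
    except (OSError, subprocess.SubprocessError):
        return None

info = {
    "OS": f"{platform.system()} {platform.release()}",
    "Python": platform.python_version(),
    # nvidia-smi reports both the GPU name and the driver version.
    "GPU/driver": tool_version(["nvidia-smi",
                                "--query-gpu=name,driver_version",
                                "--format=csv,noheader"]),
    # nvcc --version reports the installed CUDA toolkit release.
    "CUDA (nvcc)": tool_version(["nvcc", "--version"]),
}

try:
    import tensorflow as tf  # optional; only if TensorFlow is installed
    info["TensorFlow"] = tf.__version__
except ImportError:
    info["TensorFlow"] = None

for key, value in info.items():
    print(f"{key}: {value}")
```

Pasting the output of this script (plus your cuDNN and TensorRT versions) into the thread would cover the checklist in one go.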

To help us debug, can you share a small repro containing the model, inference code, and sample input data that demonstrates the performance difference?