RTX3070 performance with TensorRT


I am running ResNet18 inference on an RTX3070 and get a throughput of less than 300/s at batch size 50. In the same environment, an RTX2080Ti reaches 500/s.
Is this a normal result, or do the drivers still need to be optimized?


TensorRT Version: 7.2.1
GPU Type: RTX3070
Nvidia Driver Version: 455.45
CUDA Version: 11.1
CUDNN Version: 8.0.5
Operating System + Version: Ubuntu 16.04


Hi @gahang,

Could you please share the model and the script needed to reproduce this issue?
Also, if possible, could you please share the verbose logs and profiler output from both test runs?
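
To make the two runs directly comparable, it may help to time both GPUs with the same harness. Below is a minimal sketch in plain Python. `measure_throughput` and `infer_fn` are hypothetical names, not part of TensorRT. `infer_fn` stands in for whatever call runs one batch through the engine, and it is assumed to block until the GPU has finished (for example, by synchronizing the stream before returning).

```python
import time


def measure_throughput(infer_fn, batch_size, warmup=10, iters=100):
    """Return samples per second for a callable that processes one batch.

    infer_fn is a hypothetical stand-in for the real inference call and is
    assumed to return only after the batch has fully completed.
    """
    # Warm-up runs are excluded from timing so one-time setup costs
    # (lazy initialization, clock ramp-up) do not skew the result.
    for _ in range(warmup):
        infer_fn()

    start = time.perf_counter()
    for _ in range(iters):
        infer_fn()
    elapsed = time.perf_counter() - start

    # Total samples processed divided by wall-clock seconds.
    return batch_size * iters / elapsed


if __name__ == "__main__":
    # Dummy workload used only to show the call; replace with the real inference call.
    rate = measure_throughput(lambda: time.sleep(0.001), batch_size=50)
    print(f"{rate:.1f} samples/s")
```

Reporting the number as samples per second (batch size times iterations, divided by elapsed time) also makes it unambiguous whether a figure such as 300/s counts images or batches.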