PeopleNet pruned INT8 model inference speed

Description

When I convert the PeopleNet pruned INT8 model from .etlt to .engine and run inference, I get the expected detections, but not the required speed: inference is slower than with the pruned ResNet34 model. Is this due to the MX130 GPU? (A rough timing sketch is included under Steps To Reproduce below.)
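
To rule the GPU in or out, I checked whether the MX130 reports native fast-INT8 support at all; as far as I know, TensorRT only has fast INT8 kernels on GPUs of compute capability 6.1 and above. A minimal sketch using the TensorRT Python API:

    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # platform_has_fast_int8 is False on GPUs without native INT8 (DP4A)
    # support, i.e. compute capability below 6.1; INT8 layers then fall
    # back to slower kernels.
    print("fast INT8:", builder.platform_has_fast_int8)
    print("fast FP16:", builder.platform_has_fast_fp16)

If this prints False, I assume the INT8 engine is running on fallback kernels, which might explain the slowdown.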

Environment

TensorRT Version: 7.2
GPU Type: MX130
Nvidia Driver Version: 455.12
CUDA Version: 11.1
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.8
Baremetal or Container (if container which image + tag):


Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered
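
I don't have a standalone repro script yet; below is a sketch of the timing loop I use (the engine file name and the implicit-batch execute() call are assumptions based on my tlt-converter setup, and the buffer allocation follows the standard TensorRT Python sample pattern):

    import time

    import pycuda.autoinit  # creates the CUDA context
    import pycuda.driver as cuda
    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    # Deserialize the engine converted from the .etlt file.
    with open("peoplenet_int8.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())

    context = engine.create_execution_context()

    # Allocate page-locked host buffers and device buffers for every
    # binding; keeping references alive prevents the device memory from
    # being freed mid-run.
    host_bufs, dev_bufs, bindings = [], [], []
    for binding in engine:
        size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        host_mem = cuda.pagelocked_empty(size, dtype)
        dev_mem = cuda.mem_alloc(host_mem.nbytes)
        host_bufs.append(host_mem)
        dev_bufs.append(dev_mem)
        bindings.append(int(dev_mem))

    # Warm up, then average the latency over repeated runs. execute() is
    # synchronous, so no extra stream synchronization is needed here.
    for _ in range(10):
        context.execute(batch_size=1, bindings=bindings)

    runs = 100
    start = time.perf_counter()
    for _ in range(runs):
        context.execute(batch_size=1, bindings=bindings)
    print("mean latency: %.2f ms" % ((time.perf_counter() - start) * 1000.0 / runs))

For a second opinion, trtexec --loadEngine=peoplenet_int8.engine reports GPU compute times for the same engine.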
