RTX3090 vs A100


RTX3090 detections are much worse than A100 detections


TensorRT Version:
GPU Type: RTX3090 (on desktop) and A100 (on virtual machine)
Nvidia Driver Version: 525.60.13
CUDA Version: 12.0 (the Docker image says it uses 11.7)
CUDNN Version:
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): python3.8
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

I trained a YOLOv4 model on the desktop PC (RTX3090) with TAO Toolkit.
I converted the .etlt file to an RTX3090-compatible TensorRT engine file (model.plan) on the desktop PC, inside a Docker container. Then I converted the .etlt file to an A100-compatible TensorRT engine file (model.plan) on the VM, using the same Docker image.

Steps To Reproduce

I use the same Triton-based Docker image for inference on both the desktop PC (RTX3090) and the VM (A100). The RTX3090 detections are much poorer: not only are the confidence levels lower for the same object, but the RTX3090 often misses objects that are detected with high confidence on the A100 VM. The A100 detections are acceptable, but the RTX3090 detections are really noisy. Is this normal? Any advice on how to improve the RTX3090 detections with the same model?
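To put numbers on the difference, the two sets of detections for the same image could be matched by class and box overlap. The following is only a minimal sketch in plain Python: the `(label, confidence, x1, y1, x2, y2)` tuple layout, the function names, and the 0.5 IoU threshold are assumptions made for illustration, not something Triton or TAO Toolkit outputs in this form.

```python
from typing import List, Tuple

# Hypothetical detection layout: (label, confidence, x1, y1, x2, y2)
Detection = Tuple[str, float, float, float, float, float]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[2], b[2]), max(a[3], b[3])
    ix2, iy2 = min(a[4], b[4]), min(a[5], b[5])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[4] - a[2]) * (a[5] - a[3])
    area_b = (b[4] - b[2]) * (b[5] - b[3])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def compare_detections(
    ref: List[Detection],
    test: List[Detection],
    iou_threshold: float = 0.5,  # assumed threshold, tune as needed
):
    """Greedily match reference detections (e.g. A100) to test ones (e.g. RTX3090).

    Returns (matched, missing, extra):
      matched: (ref_det, test_det, confidence_drop) for same-class overlapping boxes
      missing: reference detections with no counterpart in the test set
      extra:   test detections with no counterpart in the reference set
    """
    matched, missing = [], []
    unused = list(range(len(test)))
    # Handle the most confident reference detections first.
    for r in sorted(ref, key=lambda d: d[1], reverse=True):
        best_j, best_iou = None, iou_threshold
        for j in unused:
            t = test[j]
            if t[0] != r[0]:
                continue  # only match boxes of the same class
            overlap = iou(r, t)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is None:
            missing.append(r)
        else:
            unused.remove(best_j)
            matched.append((r, test[best_j], r[1] - test[best_j][1]))
    extra = [test[j] for j in unused]
    return matched, missing, extra
```

Running `compare_detections(a100_dets, rtx3090_dets)` per image gives the confidence drop for each matched object, plus the objects found only on one GPU, which makes it easier to report how large the gap actually is.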


This looks like a TAO Toolkit related issue. We will move this post to the TAO Toolkit forum.


I am not sure. Can you confirm that I should get the same detections on the RTX3090 and also on a T4 if I use the same CUDA driver?