Description
Environment
TensorRT Version : 7.2.2.1
GPU Type : host RTX 3090, target Jetson TX2
Nvidia Driver Version : 455.32
CUDA Version : 11.1
CUDNN Version : 8.0.5.43
Operating System + Version : host Ubuntu 18.04, target Jetson TX2 (JetPack 4.4)
Python Version (if applicable) : 3.8.5
TensorFlow Version (if applicable) :
PyTorch Version (if applicable) : 1.9.0+cu111
Baremetal or Container (if container which image + tag) : container nvcr.io/nvidia/tensorrt:20.12-py3
Relevant Files
Steps To Reproduce
The original plan was to build the model in PyTorch at fp32, export it to ONNX, and only then convert it to fp16 or int8 in TensorRT. But after a lot of debugging, it seems I have to decide from the very beginning to do the conversion in fp16. Is the workflow supposed to work like this, or is there a problem with my code? I took the sample code from the TensorRT repository and ran it as-is. Or is it because my current host (Ubuntu 18.04 + RTX 3090) does not support fp16?
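For reference, the usual workflow does not require committing to fp16 up front: the exported ONNX model stays fp32, and the precision is chosen per engine at build time through a builder flag. Below is a minimal sketch against the TensorRT 7.x Python API; the file name `model.onnx` and the helper name `build_engine` are placeholders, and it assumes the model has already been exported to ONNX.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, use_fp16=True):
    """Parse an fp32 ONNX model and build a TensorRT engine,
    optionally enabling fp16 kernels at build time."""
    builder = trt.Builder(TRT_LOGGER)
    # TensorRT 7 requires an explicit-batch network for ONNX models.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, TRT_LOGGER)

    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            return None

    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB scratch space
    # Enable fp16 only when the GPU has fast fp16 support;
    # both the RTX 3090 (Ampere) and the TX2 (Pascal) do.
    if use_fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)

    return builder.build_engine(network, config)

engine = build_engine("model.onnx")  # placeholder path
```

Note also that TensorRT engines are not portable between GPUs or TensorRT versions, so an engine intended for the TX2 has to be built on the TX2 itself, not on the RTX 3090 host.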