Description
I am debugging QAT with pytorch_quantization (TensorRT/tools/pytorch-quantization at main · NVIDIA/TensorRT · GitHub).
The ONNX model with Q/DQ nodes can be exported from the .pth checkpoint successfully.
But when I try to convert this simple ONNX model to a TensorRT engine, it fails:
[01/13/2023-08:20:26] [V] [TRT] Removing QuantizeLinear_70
[01/13/2023-08:20:26] [V] [TRT] Removing DequantizeLinear_41
[01/13/2023-08:20:26] [V] [TRT] Removing DequantizeLinear_44
[01/13/2023-08:20:26] [V] [TRT] ConstWeightsFusion: Fusing conv23.weight + QuantizeLinear_43 with Conv_45
[01/13/2023-08:20:26] [E] Error[2]: [graphOptimizer.cpp::fusePattern::1777] Error Code 2: Internal Error (Assertion matchPattern(context, first) && matchBackend(first) failed. )
[01/13/2023-08:20:26] [E] Error[2]: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[01/13/2023-08:20:26] [E] Engine could not be created from network
[01/13/2023-08:20:26] [E] Building engine failed
[01/13/2023-08:20:26] [E] Failed to create engine from model or file.
[01/13/2023-08:20:26] [E] Engine set up failed
Environment
I am testing TensorRT inside Docker.
Platform : Orin
Jetpack Version : 5.0.2-b231
TensorRT Version : 8.4.1
CUDA Version : 11.4
CUDNN Version : 8.4.1
Operating System + Version : ubuntu20.04
Python Version (if applicable) : 3.8.10
PyTorch Version (if applicable): 1.10
Baremetal or Container (if container which image + tag): dustynv/ros:noetic-ros-base-l4t-r35.1.0 (GitHub - dusty-nv/jetson-containers: Machine Learning Containers for NVIDIA Jetson and JetPack-L4T)
Relevant Files
0.pth.onnx (173.1 KB)
0.pth.onnx.log (98.0 KB)
Steps To Reproduce
trtexec --verbose --nvtxMode=verbose --buildOnly --workspace=8192 --onnx=0.pth.onnx --saveEngine=0.pth.onnx.engine --timingCacheFile=./timing.cache --profilingVerbosity=detailed --fp16 --int8
Another thread reporting the same error: https://forums.developer.nvidia.com/t/error-code-2-internal-error-assertion-matchpattern-context-first-matchbackend-first-failed/223247/7