[TensorRT] Failed to load engine file converted by torch2trt

Description

I am trying to deploy yolact_edge using the TensorRT C++ APIs. I convert the original PyTorch model to an INT8 .trt model with torch2trt. The original model is split into modules, such as the backbone, the FPN, the protonet, the prediction head…
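
For reference, each module is converted with torch2trt's usual API; a minimal sketch of the pattern (the module, input shape, and calibration data below are placeholders for illustration, not the exact code from the repo):

import torch
import torch.nn as nn
from torch2trt import torch2trt

# Placeholder module and input standing in for a yolact_edge
# submodule (backbone, FPN, protonet, ...); shapes are assumptions.
module = nn.Conv2d(3, 64, kernel_size=3).cuda().eval()
x = torch.randn(1, 3, 550, 550).cuda()

# Per torch2trt's calibration convention, each dataset item is a
# list of input tensors; random data here is only for illustration.
calib_dataset = [[torch.randn(1, 3, 550, 550).cuda()] for _ in range(16)]

model_trt = torch2trt(module, [x], int8_mode=True,
                      int8_calib_dataset=calib_dataset)

# torch2trt returns a TRTModule, saved as a regular PyTorch checkpoint.
torch.save(model_trt.state_dict(), 'module_bs_1.trt')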

After the conversion, I tried to load the converted .trt model using the C++ APIs, but it raises errors like “Serialization Error in verifyHeader: 0 (Magic tag does not match)”. I noticed that this error is usually caused by mixing different TensorRT versions, but the TensorRT version I used to convert and to load the model is the same.

Then I decided to convert the original PyTorch model to an .onnx model first; however, I failed to convert one of the modules. The export just raised a segmentation fault (core dumped) while converting fpn_phase_1 to an .onnx model (you can check the exact fpn_phase_1 module in the yolact.py file). I don’t know how to fix that error.
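
For completeness, the per-module export uses the standard torch.onnx.export call; a minimal sketch of what I mean (the placeholder module and input shape below are assumptions, the real calls are in yolact.py):

import torch
import torch.nn as nn

# Placeholder standing in for fpn_phase_1; the real module and its
# input shapes come from yolact.py.
module = nn.Conv2d(512, 256, kernel_size=1).cuda().eval()
dummy_input = torch.randn(1, 512, 69, 69).cuda()

torch.onnx.export(
    module,
    dummy_input,
    'fpn_phase_1.onnx',
    opset_version=11,            # assumed opset
    input_names=['input'],
    output_names=['output'],
)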

Environment

TensorRT Version: TensorRT 7.2.3.4
GPU Type: RTX 3080
Nvidia Driver Version: 460.91.03
CUDA Version: 11.1
CUDNN Version: 8.2.1
Operating System + Version: Ubuntu 20.04 LTS
Python Version (if applicable): 3.7.11
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.9.0
Baremetal or Container (if container which image + tag):

Relevant Files

eval.py is used to load the model and run inference. During inference, the model is split into modules, and each module is converted to both an .onnx and a .trt model using torch.onnx.export and torch2trt, respectively.

yolact.py is the main model script. I added some torch.onnx.export code to convert modules to .onnx, so if you want to check the export part, just search for torch.onnx. I didn’t complete the conversion of the prediction module, because I don’t know how to do it: there are several prediction layers, and torch2trt converts all of them together. I don’t know how to replicate that with torch.onnx.export.
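
To illustrate what I mean, a hypothetical wrapper might look like the sketch below (untested, not code from yolact.py; PredictionWrapper, the head, and the shapes are all made up for illustration):

import torch
import torch.nn as nn

class PredictionWrapper(nn.Module):
    # Hypothetical wrapper: run the shared prediction head over all
    # feature levels in a single forward pass, so torch.onnx.export
    # sees one module, similar to how torch2trt converts the layers
    # together.
    def __init__(self, head):
        super().__init__()
        self.head = head

    def forward(self, p3, p4, p5, p6, p7):
        outs = [self.head(p) for p in (p3, p4, p5, p6, p7)]
        return torch.cat([o.flatten(start_dim=1) for o in outs], dim=1)

# Illustrative usage with a dummy head and dummy feature maps.
wrapper = PredictionWrapper(nn.Conv2d(256, 255, 3, padding=1)).cuda().eval()
feats = [torch.randn(1, 256, s, s).cuda() for s in (69, 35, 18, 9, 5)]
out = wrapper(*feats)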

Steps To Reproduce

Just set up the environment following the instructions, and run inference:

python eval.py --trained_model=weights/yolact_edge_54_800000.pth --score_threshold=0.3 --top_k=100 --image=input_image.png:output_image.png

Hi, please refer to the links below to perform inference in INT8.

Thanks!

Hi @NVES

Thanks for your reply!
Do you mean that the “Serialization Error in verifyHeader: 0 (Magic tag does not match)” is caused by an incompatibility between Python INT8 and C++ INT8? The main problem for me, however, is that the .trt engine file converted directly from the .pth model using torch2trt can’t be loaded with the C++ APIs, even though the whole procedure uses the same TensorRT version.

I will attach the output from testing the .trt engine files with trtexec:

./trtexec --loadEngine=/home/hilbert/catkin_ws/src/trt_test/weights/fp16/yolact_edge_54_800000.pth.backbone_bs_1.trt --batch=1 --verbose
&&&& RUNNING TensorRT.trtexec # ./trtexec --loadEngine=/home/hilbert/catkin_ws/src/trt_test/weights/fp16/yolact_edge_54_800000.pth.backbone_bs_1.trt --batch=1 --verbose
[11/01/2021-09:56:52] [I] === Model Options ===
[11/01/2021-09:56:52] [I] Format: *
[11/01/2021-09:56:52] [I] Model: 
[11/01/2021-09:56:52] [I] Output:
[11/01/2021-09:56:52] [I] === Build Options ===
[11/01/2021-09:56:52] [I] Max batch: 1
[11/01/2021-09:56:52] [I] Workspace: 16 MiB
[11/01/2021-09:56:52] [I] minTiming: 1
[11/01/2021-09:56:52] [I] avgTiming: 8
[11/01/2021-09:56:52] [I] Precision: FP32
[11/01/2021-09:56:52] [I] Calibration: 
[11/01/2021-09:56:52] [I] Refit: Disabled
[11/01/2021-09:56:52] [I] Safe mode: Disabled
[11/01/2021-09:56:52] [I] Save engine: 
[11/01/2021-09:56:52] [I] Load engine: /home/hilbert/catkin_ws/src/trt_test/weights/fp16/yolact_edge_54_800000.pth.backbone_bs_1.trt
[11/01/2021-09:56:52] [I] Builder Cache: Enabled
[11/01/2021-09:56:52] [I] NVTX verbosity: 0
[11/01/2021-09:56:52] [I] Tactic sources: Using default tactic sources
[11/01/2021-09:56:52] [I] Input(s)s format: fp32:CHW
[11/01/2021-09:56:52] [I] Output(s)s format: fp32:CHW
[11/01/2021-09:56:52] [I] Input build shapes: model
[11/01/2021-09:56:52] [I] Input calibration shapes: model
[11/01/2021-09:56:52] [I] === System Options ===
[11/01/2021-09:56:52] [I] Device: 0
[11/01/2021-09:56:52] [I] DLACore: 
[11/01/2021-09:56:52] [I] Plugins:
[11/01/2021-09:56:52] [I] === Inference Options ===
[11/01/2021-09:56:52] [I] Batch: 1
[11/01/2021-09:56:52] [I] Input inference shapes: model
[11/01/2021-09:56:52] [I] Iterations: 10
[11/01/2021-09:56:52] [I] Duration: 3s (+ 200ms warm up)
[11/01/2021-09:56:52] [I] Sleep time: 0ms
[11/01/2021-09:56:52] [I] Streams: 1
[11/01/2021-09:56:52] [I] ExposeDMA: Disabled
[11/01/2021-09:56:52] [I] Data transfers: Enabled
[11/01/2021-09:56:52] [I] Spin-wait: Disabled
[11/01/2021-09:56:52] [I] Multithreading: Disabled
[11/01/2021-09:56:52] [I] CUDA Graph: Disabled
[11/01/2021-09:56:52] [I] Separate profiling: Disabled
[11/01/2021-09:56:52] [I] Skip inference: Disabled
[11/01/2021-09:56:52] [I] Inputs:
[11/01/2021-09:56:52] [I] === Reporting Options ===
[11/01/2021-09:56:52] [I] Verbose: Enabled
[11/01/2021-09:56:52] [I] Averages: 10 inferences
[11/01/2021-09:56:52] [I] Percentile: 99
[11/01/2021-09:56:52] [I] Dump refittable layers:Disabled
[11/01/2021-09:56:52] [I] Dump output: Disabled
[11/01/2021-09:56:52] [I] Profile: Disabled
[11/01/2021-09:56:52] [I] Export timing to JSON file: 
[11/01/2021-09:56:52] [I] Export output to JSON file: 
[11/01/2021-09:56:52] [I] Export profile to JSON file: 
[11/01/2021-09:56:52] [I] 
[11/01/2021-09:56:52] [I] === Device Information ===
[11/01/2021-09:56:52] [I] Selected Device: GeForce RTX 3080
[11/01/2021-09:56:52] [I] Compute Capability: 8.6
[11/01/2021-09:56:52] [I] SMs: 68
[11/01/2021-09:56:52] [I] Compute Clock Rate: 1.77 GHz
[11/01/2021-09:56:52] [I] Device Global Memory: 9995 MiB
[11/01/2021-09:56:52] [I] Shared Memory per SM: 100 KiB
[11/01/2021-09:56:52] [I] Memory Bus Width: 320 bits (ECC disabled)
[11/01/2021-09:56:52] [I] Memory Clock Rate: 9.501 GHz
[11/01/2021-09:56:52] [I] 
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::Region_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::CropAndResize version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::Proposal version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::Split version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[11/01/2021-09:56:52] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[11/01/2021-09:56:53] [E] [TRT] coreReadArchive.cpp (32) - Serialization Error in verifyHeader: 0 (Magic tag does not match)
[11/01/2021-09:56:53] [E] [TRT] INVALID_STATE: std::exception
[11/01/2021-09:56:53] [E] [TRT] INVALID_CONFIG: Deserialize the cuda engine failed.
[11/01/2021-09:56:53] [E] Engine creation failed
[11/01/2021-09:56:53] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # ./trtexec --loadEngine=/home/hilbert/catkin_ws/src/trt_test/weights/fp16/yolact_edge_54_800000.pth.backbone_bs_1.trt --batch=1 --verbose

OK, I’ve solved the problem. It seems I have to serialize the engine held by the .trt model and save it as a .engine file; after that, the C++ APIs work fine.
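
For anyone who hits the same error: my understanding (an assumption, not verified against torch2trt internals) is that the .trt file written via torch.save is a PyTorch checkpoint wrapping the engine bytes, so trtexec and IRuntime::deserializeCudaEngine see pickle data instead of the TensorRT magic tag. A minimal sketch of the re-serialization step (file names are placeholders):

import torch
from torch2trt import TRTModule

# Load the torch2trt-wrapped module; the .trt file is a regular
# PyTorch checkpoint, which is why trtexec can't parse it directly.
model_trt = TRTModule()
model_trt.load_state_dict(torch.load('backbone_bs_1.trt'))

# Write the raw serialized engine that trtexec and the C++
# runtime expect.
with open('backbone_bs_1.engine', 'wb') as f:
    f.write(model_trt.engine.serialize())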

Hi, I ran into the same problem and solved it the same way. Do you have any idea why this method works, or what the cause of this kind of problem might be?

Hi,

We recommend that you run on the latest TensorRT version, 8.2 EA.
Please make sure your model is valid. Were you able to generate the ONNX model?

The following similar issue may be helpful.

Thank you.