Help converting a pytorch model to TensorRT

I am trying to convert a PyTorch model used for SiamRPN tracking to run on the Xavier NX and have been having significant trouble.

The project github is here: https://github.com/STVIR/pysot, and the model is located here:
https://drive.google.com/drive/folders/1t62x56Jl7baUzPTo0QrC4jJnwvPZm-2m

When I try the following, I get the error shown below.
Thanks very much for any guidance.

I ran trtexec directly on the .pth file and got the following error:

$ /usr/src/tensorrt/bin/trtexec --onnx=model.pth
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=model.pth
[07/01/2020-12:43:58] [I] === Model Options ===
[07/01/2020-12:43:58] [I] Format: ONNX
[07/01/2020-12:43:58] [I] Model: model.pth
[07/01/2020-12:43:58] [I] Output:
[07/01/2020-12:43:58] [I] === Build Options ===
[07/01/2020-12:43:58] [I] Max batch: 1
[07/01/2020-12:43:58] [I] Workspace: 16 MB
[07/01/2020-12:43:58] [I] minTiming: 1
[07/01/2020-12:43:58] [I] avgTiming: 8
[07/01/2020-12:43:58] [I] Precision: FP32
[07/01/2020-12:43:58] [I] Calibration: 
[07/01/2020-12:43:58] [I] Safe mode: Disabled
[07/01/2020-12:43:58] [I] Save engine: 
[07/01/2020-12:43:58] [I] Load engine: 
[07/01/2020-12:43:58] [I] Builder Cache: Enabled
[07/01/2020-12:43:58] [I] NVTX verbosity: 0
[07/01/2020-12:43:58] [I] Inputs format: fp32:CHW
[07/01/2020-12:43:58] [I] Outputs format: fp32:CHW
[07/01/2020-12:43:58] [I] Input build shapes: model
[07/01/2020-12:43:58] [I] Input calibration shapes: model
[07/01/2020-12:43:58] [I] === System Options ===
[07/01/2020-12:43:58] [I] Device: 0
[07/01/2020-12:43:58] [I] DLACore: 
[07/01/2020-12:43:58] [I] Plugins:
[07/01/2020-12:43:58] [I] === Inference Options ===
[07/01/2020-12:43:58] [I] Batch: 1
[07/01/2020-12:43:58] [I] Input inference shapes: model
[07/01/2020-12:43:58] [I] Iterations: 10
[07/01/2020-12:43:58] [I] Duration: 3s (+ 200ms warm up)
[07/01/2020-12:43:58] [I] Sleep time: 0ms
[07/01/2020-12:43:58] [I] Streams: 1
[07/01/2020-12:43:58] [I] ExposeDMA: Disabled
[07/01/2020-12:43:58] [I] Spin-wait: Disabled
[07/01/2020-12:43:58] [I] Multithreading: Disabled
[07/01/2020-12:43:58] [I] CUDA Graph: Disabled
[07/01/2020-12:43:58] [I] Skip inference: Disabled
[07/01/2020-12:43:58] [I] Inputs:
[07/01/2020-12:43:58] [I] === Reporting Options ===
[07/01/2020-12:43:58] [I] Verbose: Disabled
[07/01/2020-12:43:58] [I] Averages: 10 inferences
[07/01/2020-12:43:58] [I] Percentile: 99
[07/01/2020-12:43:58] [I] Dump output: Disabled
[07/01/2020-12:43:58] [I] Profile: Disabled
[07/01/2020-12:43:58] [I] Export timing to JSON file: 
[07/01/2020-12:43:58] [I] Export output to JSON file: 
[07/01/2020-12:43:58] [I] Export profile to JSON file: 
[07/01/2020-12:43:58] [I] 
----------------------------------------------------------------
Input filename:   model.pth
ONNX IR version:  0.0.0
Opset version:    0
Producer name:    
Producer version: 
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[07/01/2020-12:44:00] [E] [TRT] Network must have at least one output
[07/01/2020-12:44:00] [E] [TRT] Network validation failed.
[07/01/2020-12:44:00] [E] Engine creation failed
[07/01/2020-12:44:00] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=model.pth

Hi,

There seems to be some misunderstanding here.

Please note that trtexec converts an ONNX model into a TensorRT engine; it cannot read a .pth file.
Please export the ONNX model from PyTorch first, instead of passing the .pth file directly:
https://pytorch.org/docs/master/onnx.html

Thanks.

Julie Bareeva at LearnOpenCV just released a mini tutorial on converting a PyTorch model to TensorRT.


I will try following that and post here with the results.

First, I tried the basic example from the page you linked:
https://pytorch.org/docs/master/onnx.html

and created an ONNX model as shown there, using the code

import torch
import torchvision

dummy_input = torch.randn(10, 3, 224, 224, device='cuda')
model = torchvision.models.alexnet(pretrained=True).cuda()

# Providing input and output names sets the display names for values
# within the model's graph. Setting these does not change the semantics
# of the graph; it is only for readability.
#
# The inputs to the network consist of the flat list of inputs (i.e.
# the values you would pass to the forward() method) followed by the
# flat list of parameters. You can partially specify names, i.e. provide
# a list here shorter than the number of inputs to the model, and we will
# only set that subset of names, starting from the beginning.
input_names = [ "actual_input_1" ] + [ "learned_%d" % i for i in range(16) ]
output_names = [ "output1" ]

torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True, input_names=input_names, output_names=output_names)

This created the file “alexnet.onnx”.

I then ran the command

/usr/src/tensorrt/bin/trtexec --onnx=alexnet.onnx

and got the result:
        &&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=alexnet.onnx
        [07/11/2020-20:12:25] [I] === Model Options ===
        [07/11/2020-20:12:25] [I] Format: ONNX
        [07/11/2020-20:12:25] [I] Model: alexnet.onnx
        [07/11/2020-20:12:25] [I] Output:
        [07/11/2020-20:12:25] [I] === Build Options ===
        [07/11/2020-20:12:25] [I] Max batch: 1
        [07/11/2020-20:12:25] [I] Workspace: 16 MB
        [07/11/2020-20:12:25] [I] minTiming: 1
        [07/11/2020-20:12:25] [I] avgTiming: 8
        [07/11/2020-20:12:25] [I] Precision: FP32
        [07/11/2020-20:12:25] [I] Calibration: 
        [07/11/2020-20:12:25] [I] Safe mode: Disabled
        [07/11/2020-20:12:25] [I] Save engine: 
        [07/11/2020-20:12:25] [I] Load engine: 
        [07/11/2020-20:12:25] [I] Builder Cache: Enabled
        [07/11/2020-20:12:25] [I] NVTX verbosity: 0
        [07/11/2020-20:12:25] [I] Inputs format: fp32:CHW
        [07/11/2020-20:12:25] [I] Outputs format: fp32:CHW
        [07/11/2020-20:12:25] [I] Input build shapes: model
        [07/11/2020-20:12:25] [I] Input calibration shapes: model
        [07/11/2020-20:12:25] [I] === System Options ===
        [07/11/2020-20:12:25] [I] Device: 0
        [07/11/2020-20:12:25] [I] DLACore: 
        [07/11/2020-20:12:25] [I] Plugins:
        [07/11/2020-20:12:25] [I] === Inference Options ===
        [07/11/2020-20:12:25] [I] Batch: 1
        [07/11/2020-20:12:25] [I] Input inference shapes: model
        [07/11/2020-20:12:25] [I] Iterations: 10
        [07/11/2020-20:12:25] [I] Duration: 3s (+ 200ms warm up)
        [07/11/2020-20:12:25] [I] Sleep time: 0ms
        [07/11/2020-20:12:25] [I] Streams: 1
        [07/11/2020-20:12:25] [I] ExposeDMA: Disabled
        [07/11/2020-20:12:25] [I] Spin-wait: Disabled
        [07/11/2020-20:12:25] [I] Multithreading: Disabled
        [07/11/2020-20:12:25] [I] CUDA Graph: Disabled
        [07/11/2020-20:12:25] [I] Skip inference: Disabled
        [07/11/2020-20:12:25] [I] Inputs:
        [07/11/2020-20:12:25] [I] === Reporting Options ===
        [07/11/2020-20:12:25] [I] Verbose: Disabled
        [07/11/2020-20:12:25] [I] Averages: 10 inferences
        [07/11/2020-20:12:25] [I] Percentile: 99
        [07/11/2020-20:12:25] [I] Dump output: Disabled
        [07/11/2020-20:12:25] [I] Profile: Disabled
        [07/11/2020-20:12:25] [I] Export timing to JSON file: 
        [07/11/2020-20:12:25] [I] Export output to JSON file: 
        [07/11/2020-20:12:25] [I] Export profile to JSON file: 
        [07/11/2020-20:12:25] [I] 
        ----------------------------------------------------------------
        Input filename:   alexnet.onnx
        ONNX IR version:  0.0.4
        Opset version:    9
        Producer name:    pytorch
        Producer version: 1.3
        Domain:           
        Model version:    0
        Doc string:       
        ----------------------------------------------------------------
        [07/11/2020-20:12:31] [W] [TRT] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
        [07/11/2020-20:12:31] [W] [TRT] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
        [07/11/2020-20:12:31] [W] [TRT] Calling isShapeTensor before the entire network is constructed may result in an inaccurate result.
        [07/11/2020-20:12:31] [I] [TRT] 
        [07/11/2020-20:12:31] [I] [TRT] --------------- Layers running on DLA: 
        [07/11/2020-20:12:31] [I] [TRT] 
        [07/11/2020-20:12:31] [I] [TRT] --------------- Layers running on GPU: 
        [07/11/2020-20:12:31] [I] [TRT] (Unnamed Layer* 0) [Convolution] + (Unnamed Layer* 1) [Activation], (Unnamed Layer* 2) [Pooling], (Unnamed Layer* 3) [Convolution] + (Unnamed Layer* 4) [Activation], (Unnamed Layer* 5) [Pooling], (Unnamed Layer* 6) [Convolution] + (Unnamed Layer* 7) [Activation], (Unnamed Layer* 8) [Convolution] + (Unnamed Layer* 9) [Activation], (Unnamed Layer* 10) [Convolution] + (Unnamed Layer* 11) [Activation], (Unnamed Layer* 12) [Pooling], (Unnamed Layer* 13) [Pooling], (Unnamed Layer* 14) [Shuffle], (Unnamed Layer* 16) [Constant], (Unnamed Layer* 17) [Matrix Multiply], (Unnamed Layer* 18) [Constant] + (Unnamed Layer* 19) [Shuffle], (Unnamed Layer* 20) [ElementWise] + (Unnamed Layer* 21) [Activation], (Unnamed Layer* 23) [Constant], (Unnamed Layer* 24) [Matrix Multiply], (Unnamed Layer* 25) [Constant] + (Unnamed Layer* 26) [Shuffle], (Unnamed Layer* 27) [ElementWise] + (Unnamed Layer* 28) [Activation], (Unnamed Layer* 30) [Constant], (Unnamed Layer* 31) [Matrix Multiply], (Unnamed Layer* 32) [Constant] + (Unnamed Layer* 33) [Shuffle], (Unnamed Layer* 34) [ElementWise], 
        [07/11/2020-20:12:37] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
        [07/11/2020-20:12:45] [I] [TRT] Detected 1 inputs and 1 output network tensors.
        [07/11/2020-20:12:46] [I] Starting inference threads
        [07/11/2020-20:12:49] [I] Warmup completed 8 queries over 200 ms
        [07/11/2020-20:12:49] [I] Timing trace has 119 queries over 3.06582 s
        [07/11/2020-20:12:49] [I] Trace averages of 10 runs:
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 24.3995 ms - Host latency: 24.6575 ms (end to end 24.6674 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 24.6877 ms - Host latency: 24.9496 ms (end to end 24.997 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 25.0552 ms - Host latency: 25.3201 ms (end to end 25.3291 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 25.6317 ms - Host latency: 25.8955 ms (end to end 25.9071 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 25.8685 ms - Host latency: 26.1265 ms (end to end 26.1363 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 25.6341 ms - Host latency: 25.8938 ms (end to end 25.9044 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 26.0824 ms - Host latency: 26.3414 ms (end to end 26.3499 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 25.715 ms - Host latency: 25.9726 ms (end to end 25.9819 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 25.7232 ms - Host latency: 25.984 ms (end to end 26.0108 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 25.7114 ms - Host latency: 25.9733 ms (end to end 25.9825 ms)
        [07/11/2020-20:12:49] [I] Average on 10 runs - GPU latency: 25.7021 ms - Host latency: 25.9606 ms (end to end 25.9693 ms)
        [07/11/2020-20:12:49] [I] Host latency
        [07/11/2020-20:12:49] [I] min: 24.4 ms (end to end 24.4104 ms)
        [07/11/2020-20:12:49] [I] max: 28.1171 ms (end to end 28.1255 ms)
        [07/11/2020-20:12:49] [I] mean: 25.7483 ms (end to end 25.7624 ms)
        [07/11/2020-20:12:49] [I] median: 25.8259 ms (end to end 25.8373 ms)
        [07/11/2020-20:12:49] [I] percentile: 27.9119 ms at 99% (end to end 27.9301 ms at 99%)
        [07/11/2020-20:12:49] [I] throughput: 38.8151 qps
        [07/11/2020-20:12:49] [I] walltime: 3.06582 s
        [07/11/2020-20:12:49] [I] GPU Compute
        [07/11/2020-20:12:49] [I] min: 24.1408 ms
        [07/11/2020-20:12:49] [I] max: 27.8528 ms
        [07/11/2020-20:12:49] [I] mean: 25.4878 ms
        [07/11/2020-20:12:49] [I] median: 25.5653 ms
        [07/11/2020-20:12:49] [I] percentile: 27.6552 ms at 99%
        [07/11/2020-20:12:49] [I] total compute time: 3.03305 s
        &&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=alexnet.onnx

Some questions:

  1. It looks like it successfully created a TensorRT engine. Where does that engine live now? I don’t see it.

  2. I am still not completely sure how to convert my own model; it is not like the simple example shown here.

Thanks!

The models I am looking to convert are located at the links in my first post.

Hi,

Sorry for the late update.

Saving the TensorRT engine to a file is disabled by default.
If you want the serialized engine file, please add --saveEngine=<file> when running trtexec:

$ /usr/src/tensorrt/bin/trtexec --onnx=alexnet.onnx --saveEngine=alexnet.trt

For your customized model, please generate the ONNX model from PyTorch first.
Then you can run inference on it with TensorRT using the command above.

Thanks.