Issues while converting ONNX to TRT

sparsh-b · September 2, 2020, 6:00am

Do I need to specify the names of the ONNX model’s input and output nodes when converting it to TRT?

I’m using this code (I’m also uploading the code file, as I wasn’t able to indent it properly here):
import torch
import pycuda.driver as cuda
import pycuda.autoinit
import numpy as np
import tensorrt as trt

ONNX_FILE_PATH = "resnet50_ssd_onnx1.5.onnx"
TRT_LOGGER = trt.Logger()

def build_engine(onnx_file_path):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network()
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # Builder settings: 1 GiB workspace, batch size 1, FP16 where supported
    builder.max_workspace_size = 1 << 30
    builder.max_batch_size = 1
    if builder.platform_has_fast_fp16:
        builder.fp16_mode = True

    # Read the ONNX file and hand its bytes to the parser
    with open(onnx_file_path, 'rb') as model:
        print('Beginning ONNX file parsing')
        parser.parse(model.read())
    print('Completed parsing of ONNX file')

    print("network.num_layers = ", network.num_layers)
    print('Building an engine...')
    engine = builder.build_cuda_engine(network)
    print("Completed creating Engine")
    return engine

def main():
    engine = build_engine(ONNX_FILE_PATH)
    # Serialize the engine and write it to disk
    buf = engine.serialize()
    file_name = "resnet50_ssd.engine"
    with open(file_name, 'wb') as f:
        f.write(buf)

if __name__ == "__main__":
    main()

Output:
Beginning ONNX file parsing
Completed parsing of ONNX file
network.num_layers = 0
Building an engine...
[TensorRT] ERROR: Network must have at least one output
[TensorRT] ERROR: Network validation failed.
Completed creating Engine
Traceback (most recent call last):
  File "onnx_to_trt.py", line 39, in <module>
    main()
  File "onnx_to_trt.py", line 34, in main
    buf = engine.serialize()
AttributeError: 'NoneType' object has no attribute 'serialize'

TensorRT version: 7.1.3.0
Torch version: 1.6.0

P.S.:
I’ll share the ONNX model shortly.

Thanks!

onnx_to_trt.zip (690 Bytes)

sparsh-b · September 2, 2020, 6:29am

This is the onnx model:

AastaLLL · September 2, 2020, 7:18am

Hi,

I just tested your model with trtexec, and TensorRT can run inference on it without issue.
Would you mind double-checking on your side?

$ /usr/src/tensorrt/bin/trtexec --onnx=resnet50_ssd_onnx1.5.onnx

&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=resnet50_ssd_onnx1.5.onnx
[09/02/2020-15:15:43] [I] === Model Options ===
[09/02/2020-15:15:43] [I] Format: ONNX
[09/02/2020-15:15:43] [I] Model: resnet50_ssd_onnx1.5.onnx
[09/02/2020-15:15:43] [I] Output:
[09/02/2020-15:15:43] [I] === Build Options ===
[09/02/2020-15:15:43] [I] Max batch: 1
[09/02/2020-15:15:43] [I] Workspace: 16 MB
[09/02/2020-15:15:43] [I] minTiming: 1
[09/02/2020-15:15:43] [I] avgTiming: 8
[09/02/2020-15:15:43] [I] Precision: FP32
[09/02/2020-15:15:43] [I] Calibration:
[09/02/2020-15:15:43] [I] Safe mode: Disabled
[09/02/2020-15:15:43] [I] Save engine:
[09/02/2020-15:15:43] [I] Load engine:
[09/02/2020-15:15:43] [I] Builder Cache: Enabled
[09/02/2020-15:15:43] [I] NVTX verbosity: 0
[09/02/2020-15:15:43] [I] Inputs format: fp32:CHW
[09/02/2020-15:15:43] [I] Outputs format: fp32:CHW
[09/02/2020-15:15:43] [I] Input build shapes: model
[09/02/2020-15:15:43] [I] Input calibration shapes: model
[09/02/2020-15:15:43] [I] === System Options ===
[09/02/2020-15:15:43] [I] Device: 0
[09/02/2020-15:15:43] [I] DLACore:
[09/02/2020-15:15:43] [I] Plugins:
[09/02/2020-15:15:43] [I] === Inference Options ===
[09/02/2020-15:15:43] [I] Batch: 1
[09/02/2020-15:15:43] [I] Input inference shapes: model
[09/02/2020-15:15:43] [I] Iterations: 10
[09/02/2020-15:15:43] [I] Duration: 3s (+ 200ms warm up)
[09/02/2020-15:15:43] [I] Sleep time: 0ms
[09/02/2020-15:15:43] [I] Streams: 1
[09/02/2020-15:15:43] [I] ExposeDMA: Disabled
[09/02/2020-15:15:43] [I] Spin-wait: Disabled
[09/02/2020-15:15:43] [I] Multithreading: Disabled
[09/02/2020-15:15:43] [I] CUDA Graph: Disabled
[09/02/2020-15:15:43] [I] Skip inference: Disabled
[09/02/2020-15:15:43] [I] Inputs:
[09/02/2020-15:15:43] [I] === Reporting Options ===
[09/02/2020-15:15:43] [I] Verbose: Disabled
[09/02/2020-15:15:43] [I] Averages: 10 inferences
[09/02/2020-15:15:43] [I] Percentile: 99
[09/02/2020-15:15:43] [I] Dump output: Disabled
[09/02/2020-15:15:43] [I] Profile: Disabled
[09/02/2020-15:15:43] [I] Export timing to JSON file:
[09/02/2020-15:15:43] [I] Export output to JSON file:
[09/02/2020-15:15:43] [I] Export profile to JSON file:
[09/02/2020-15:15:43] [I]
----------------------------------------------------------------
Input filename:   resnet50_ssd_onnx1.5.onnx
ONNX IR version:  0.0.6
Opset version:    9
Producer name:    pytorch
Producer version: 1.6
Domain:
Model version:    0
Doc string:
----------------------------------------------------------------
[09/02/2020-15:15:44] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[09/02/2020-15:15:44] [I] [TRT]
[09/02/2020-15:15:44] [I] [TRT] --------------- Layers running on DLA:
[09/02/2020-15:15:44] [I] [TRT]
[09/02/2020-15:15:44] [I] [TRT] --------------- Layers running on GPU:
[09/02/2020-15:15:44] [I] [TRT] Conv_0 + Relu_2, MaxPool_3, Conv_4 + Relu_6, Conv_7 + Relu_9, Conv_10, Conv_12 + Add_14 + Relu_15, Conv_16 + Relu_18, Conv_19 + Relu_21, Conv_22 + Add_24 + Relu_25, Conv_26 + Relu_28, Conv_29 + Relu_31, Conv_32 + Add_34 + Relu_35, Conv_36 + Relu_38, Conv_39 + Relu_41, Conv_42, Conv_44 + Add_46 + Relu_47, Conv_48 + Relu_50, Conv_51 + Relu_53, Conv_54 + Add_56 + Relu_57, Conv_58 + Relu_60, Conv_61 + Relu_63, Conv_64 + Add_66 + Relu_67, Conv_68 + Relu_70, Conv_71 + Relu_73, Conv_74 + Add_76 + Relu_77, Conv_78 + Relu_80, Conv_81 + Relu_83, Conv_84, Conv_86 + Add_88 + Relu_89, Conv_90 + Relu_92, Conv_93 + Relu_95, Conv_96 + Add_98 + Relu_99, Conv_100 + Relu_102, Conv_103 + Relu_105, Conv_106 + Add_108 + Relu_109, Conv_110 + Relu_112, Conv_113 + Relu_115, Conv_116 + Add_118 + Relu_119, Conv_120 + Relu_122, Conv_123 + Relu_125, Conv_126 + Add_128 + Relu_129, Conv_130 + Relu_132, Conv_133 + Relu_135, Conv_136 + Add_138 + Relu_139, Conv_170 || Conv_177, Reshape_183, Reshape_176, Conv_140 + Relu_142, Conv_143 + Relu_145, Conv_184 || Conv_191, Reshape_197, Reshape_190, Conv_146 + Relu_148, Conv_149 + Relu_151, Conv_198 || Conv_205, Reshape_211, Reshape_204, Conv_152 + Relu_154, Conv_155 + Relu_157, Conv_212 || Conv_219, Reshape_225, Reshape_218, Conv_158 + Relu_160, Conv_161 + Relu_163, Conv_226 || Conv_233, Reshape_239, Reshape_232, Conv_164 + Relu_166, Conv_167 + Relu_169, Conv_240 || Conv_247, Reshape_253, 534 copy, 556 copy, 578 copy, 600 copy, 622 copy, 644 copy, Reshape_246, 523 copy, 545 copy, 567 copy, 589 copy, 611 copy, 633 copy,
[09/02/2020-15:15:54] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[09/02/2020-15:17:44] [I] [TRT] Detected 1 inputs and 14 output network tensors.
[09/02/2020-15:17:44] [I] Starting inference threads
[09/02/2020-15:17:48] [I] Warmup completed 4 queries over 200 ms
[09/02/2020-15:17:48] [I] Timing trace has 56 queries over 3.09555 s
[09/02/2020-15:17:48] [I] Trace averages of 10 runs:
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0649 ms - Host latency: 55.2661 ms (end to end 55.2778 ms, enqueue 2.28474 ms)
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0191 ms - Host latency: 55.2195 ms (end to end 55.2305 ms, enqueue 2.17393 ms)
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0887 ms - Host latency: 55.2896 ms (end to end 55.2999 ms, enqueue 2.37067 ms)
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0917 ms - Host latency: 55.2921 ms (end to end 55.3034 ms, enqueue 2.31514 ms)
[09/02/2020-15:17:48] [I] Average on 10 runs - GPU latency: 55.0704 ms - Host latency: 55.2717 ms (end to end 55.2831 ms, enqueue 2.17344 ms)
[09/02/2020-15:17:48] [I] Host Latency
[09/02/2020-15:17:48] [I] min: 55.1567 ms (end to end 55.168 ms)
[09/02/2020-15:17:48] [I] max: 55.5354 ms (end to end 55.5509 ms)
[09/02/2020-15:17:48] [I] mean: 55.2665 ms (end to end 55.2776 ms)
[09/02/2020-15:17:48] [I] median: 55.2511 ms (end to end 55.2613 ms)
[09/02/2020-15:17:48] [I] percentile: 55.5354 ms at 99% (end to end 55.5509 ms at 99%)
[09/02/2020-15:17:48] [I] throughput: 18.0905 qps
[09/02/2020-15:17:48] [I] walltime: 3.09555 s
[09/02/2020-15:17:48] [I] Enqueue Time
[09/02/2020-15:17:48] [I] min: 2.00232 ms
[09/02/2020-15:17:48] [I] max: 2.57129 ms
[09/02/2020-15:17:48] [I] median: 2.21167 ms
[09/02/2020-15:17:48] [I] GPU Compute
[09/02/2020-15:17:48] [I] min: 54.9579 ms
[09/02/2020-15:17:48] [I] max: 55.3339 ms
[09/02/2020-15:17:48] [I] mean: 55.0662 ms
[09/02/2020-15:17:48] [I] median: 55.0502 ms
[09/02/2020-15:17:48] [I] percentile: 55.3339 ms at 99%
[09/02/2020-15:17:48] [I] total compute time: 3.0837 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=resnet50_ssd_onnx1.5.onnx

Thanks.

petercoffin80 · September 2, 2020, 7:27am

I’d recommend simply checking GitHub for this problem. You should find related solutions there in no time.

sparsh-b · September 2, 2020, 7:43am

Hi!

Yes, it ran without any issue. Here’s the output log:

&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=resnet50_ssd_onnx1.5.onnx
[09/02/2020-13:06:18] [I] === Model Options ===
[09/02/2020-13:06:18] [I] Format: ONNX
[09/02/2020-13:06:18] [I] Model: resnet50_ssd_onnx1.5.onnx
[09/02/2020-13:06:18] [I] Output:
[09/02/2020-13:06:18] [I] === Build Options ===
[09/02/2020-13:06:18] [I] Max batch: 1
[09/02/2020-13:06:18] [I] Workspace: 16 MB
[09/02/2020-13:06:18] [I] minTiming: 1
[09/02/2020-13:06:18] [I] avgTiming: 8
[09/02/2020-13:06:18] [I] Precision: FP32
[09/02/2020-13:06:18] [I] Calibration:
[09/02/2020-13:06:18] [I] Safe mode: Disabled
[09/02/2020-13:06:18] [I] Save engine:
[09/02/2020-13:06:18] [I] Load engine:
[09/02/2020-13:06:18] [I] Builder Cache: Enabled
[09/02/2020-13:06:18] [I] NVTX verbosity: 0
[09/02/2020-13:06:18] [I] Inputs format: fp32:CHW
[09/02/2020-13:06:18] [I] Outputs format: fp32:CHW
[09/02/2020-13:06:18] [I] Input build shapes: model
[09/02/2020-13:06:18] [I] Input calibration shapes: model
[09/02/2020-13:06:18] [I] === System Options ===
[09/02/2020-13:06:18] [I] Device: 0
[09/02/2020-13:06:18] [I] DLACore:
[09/02/2020-13:06:18] [I] Plugins:
[09/02/2020-13:06:18] [I] === Inference Options ===
[09/02/2020-13:06:18] [I] Batch: 1
[09/02/2020-13:06:18] [I] Input inference shapes: model
[09/02/2020-13:06:18] [I] Iterations: 10
[09/02/2020-13:06:18] [I] Duration: 3s (+ 200ms warm up)
[09/02/2020-13:06:18] [I] Sleep time: 0ms
[09/02/2020-13:06:18] [I] Streams: 1
[09/02/2020-13:06:18] [I] ExposeDMA: Disabled
[09/02/2020-13:06:18] [I] Spin-wait: Disabled
[09/02/2020-13:06:18] [I] Multithreading: Disabled
[09/02/2020-13:06:18] [I] CUDA Graph: Disabled
[09/02/2020-13:06:18] [I] Skip inference: Disabled
[09/02/2020-13:06:18] [I] Inputs:
[09/02/2020-13:06:18] [I] === Reporting Options ===
[09/02/2020-13:06:18] [I] Verbose: Disabled
[09/02/2020-13:06:18] [I] Averages: 10 inferences
[09/02/2020-13:06:18] [I] Percentile: 99
[09/02/2020-13:06:18] [I] Dump output: Disabled
[09/02/2020-13:06:18] [I] Profile: Disabled
[09/02/2020-13:06:18] [I] Export timing to JSON file:
[09/02/2020-13:06:18] [I] Export output to JSON file:
[09/02/2020-13:06:18] [I] Export profile to JSON file:
[09/02/2020-13:06:18] [I]

Input filename: resnet50_ssd_onnx1.5.onnx
ONNX IR version: 0.0.6
Opset version: 9
Producer name: pytorch
Producer version: 1.6
Domain:
Model version: 0
Doc string:

[09/02/2020-13:06:23] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[09/02/2020-13:06:43] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[09/02/2020-13:08:05] [I] [TRT] Detected 1 inputs and 14 output network tensors.
[09/02/2020-13:08:05] [I] Starting inference threads
[09/02/2020-13:08:09] [I] Warmup completed 1 queries over 200 ms
[09/02/2020-13:08:09] [I] Timing trace has 12 queries over 3.5892 s
[09/02/2020-13:08:09] [I] Trace averages of 10 runs:
[09/02/2020-13:08:09] [I] Average on 10 runs - GPU latency: 298.723 ms - Host latency: 299.117 ms (end to end 299.127 ms, enqueue 12.4263 ms)
[09/02/2020-13:08:09] [I] Host Latency
[09/02/2020-13:08:09] [I] min: 297.5 ms (end to end 297.511 ms)
[09/02/2020-13:08:09] [I] max: 301.41 ms (end to end 301.421 ms)
[09/02/2020-13:08:09] [I] mean: 299.089 ms (end to end 299.099 ms)
[09/02/2020-13:08:09] [I] median: 298.948 ms (end to end 298.958 ms)
[09/02/2020-13:08:09] [I] percentile: 301.41 ms at 99% (end to end 301.421 ms at 99%)
[09/02/2020-13:08:09] [I] throughput: 3.34337 qps
[09/02/2020-13:08:09] [I] walltime: 3.5892 s
[09/02/2020-13:08:09] [I] Enqueue Time
[09/02/2020-13:08:09] [I] min: 3.70299 ms
[09/02/2020-13:08:09] [I] max: 18.671 ms
[09/02/2020-13:08:09] [I] median: 13.4623 ms
[09/02/2020-13:08:09] [I] GPU Compute
[09/02/2020-13:08:09] [I] min: 297.108 ms
[09/02/2020-13:08:09] [I] max: 301.017 ms
[09/02/2020-13:08:09] [I] mean: 298.696 ms
[09/02/2020-13:08:09] [I] median: 298.555 ms
[09/02/2020-13:08:09] [I] percentile: 301.017 ms at 99%
[09/02/2020-13:08:09] [I] total compute time: 3.58435 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=resnet50_ssd_onnx1.5.onnx

Also, if it helps: I ran inference on the ONNX model directly as well, and that went smoothly too.

sparsh-b · September 2, 2020, 7:51am

Hi @petercoffin80

Thanks for your input.
I already did that. I tried the solution in [TensorRT] ERROR: Network must have at least one output · Issue #183 · NVIDIA/TensorRT · GitHub, but to no avail, because network.num_layers returns 0 for me (I shared this in the error log in my first post above as well).

I also tried simplifying the ONNX model with GitHub - daquexian/onnx-simplifier: Simplify your onnx model. But the simplified model throws the same error:
Network must have at least one output
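For readers who hit the same symptom: a network with zero layers right after parsing suggests that parser.parse() returned False and its errors were never printed. The thread does not confirm the root cause. One commonly reported cause with TensorRT 7 is creating the network without the explicit-batch flag, which the ONNX parser in that release expects. The following is a minimal sketch under that assumption, using the TensorRT 7 Python API; the function name parse_onnx_explicit_batch and its optional trt argument are hypothetical and exist only so the helper can be imported without TensorRT installed.

```python
def parse_onnx_explicit_batch(onnx_file_path, trt=None):
    """Parse an ONNX file into an explicit-batch network and raise on parser errors.

    Hypothetical helper: `trt` may be passed in; otherwise tensorrt is imported lazily.
    """
    if trt is None:
        import tensorrt as trt  # requires a machine with TensorRT installed

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)

    # Assumption: the TensorRT 7 ONNX parser needs a network created with EXPLICIT_BATCH.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_file_path, "rb") as model:
        ok = parser.parse(model.read())

    # parse() returns False on failure; collect the recorded errors instead of ignoring them.
    if not ok:
        errors = [str(parser.get_error(i)) for i in range(parser.num_errors)]
        raise RuntimeError("ONNX parsing failed:\n" + "\n".join(errors))

    return builder, network, parser
```

With a check like this, a failed parse stops the script with the parser's own message instead of continuing to an engine build that returns None.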

AastaLLL · September 3, 2020, 9:12am

Hi,

The fact that trtexec runs correctly indicates that your model can be compiled and executed with TensorRT.
You can check the TensorRT sample below to see whether anything is missing in your implementation.

/usr/src/tensorrt/samples/trtexec/

Thanks.

sparsh-b · September 3, 2020, 4:35pm

I’m in a bit of a hurry.
The workaround I’m using is to convert ONNX → TRT with the onnx2trt command-line tool mentioned in GitHub - onnx/onnx-tensorrt: ONNX-TensorRT: TensorRT backend for ONNX.

I’ll update if I solve the above issue.

Thanks!

sparsh-b · September 10, 2020, 11:16am

onnx2trt had some issues.

I finally used the --saveEngine argument of trtexec to save the .engine file and used it in DeepStream.

I generated an ONNX model with dynamic axes using tensorrt-utils/alexnet_onnx.py at master · rmccorm4/tensorrt-utils · GitHub and passed it to trtexec.
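As a rough illustration of what exporting with dynamic axes can look like in PyTorch (this is not the referenced script, just a sketch): torch.onnx.export accepts a dynamic_axes mapping from tensor name to the dimensions that may vary. The helper names below are hypothetical, the input name "input" is taken from the trtexec command in this post, and any output names are placeholders.

```python
def dynamic_batch_axes(input_names, output_names, axis_name="batch"):
    """Mark dimension 0 of every named tensor as a variable-size batch axis."""
    names = list(input_names) + list(output_names)
    return {name: {0: axis_name} for name in names}


def export_with_dynamic_batch(model, dummy_input, onnx_path, input_names, output_names):
    """Export `model` to ONNX with a dynamic batch dimension (hypothetical helper)."""
    import torch  # imported lazily; requires PyTorch

    torch.onnx.export(
        model,
        dummy_input,
        onnx_path,
        input_names=list(input_names),
        output_names=list(output_names),
        dynamic_axes=dynamic_batch_axes(input_names, output_names),
    )
```

For example, dynamic_batch_axes(["input"], ["scores", "boxes"]) marks the batch dimension of the input and of two placeholder outputs as dynamic.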

The exact command I used was:
/usr/src/tensorrt/bin/trtexec --saveEngine=resnet50_ssd_dyn32_onnx1.6.engine --onnx=resnet50_ssd_dyn_onnx1.6.onnx --minShapes=input:1x3x300x300 --optShapes=input:2x3x300x300 --maxShapes=input:16x3x300x300
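The shape arguments follow the pattern name:dim1xdim2x..., given once each for the minimum, optimal and maximum input shape. If the same build needs to be scripted, a small sketch like the following could assemble the command; the helper names are hypothetical, and only the flags that appear in the command above are used.

```python
import subprocess


def format_shape(name, dims):
    """Render a shape as trtexec expects it, e.g. input:1x3x300x300."""
    dims_text = "x".join(str(d) for d in dims)
    return f"{name}:{dims_text}"


def trtexec_command(onnx_path, engine_path, input_name, min_dims, opt_dims, max_dims,
                    trtexec="/usr/src/tensorrt/bin/trtexec"):
    """Build the argument list for a dynamic-shape trtexec engine build."""
    return [
        trtexec,
        f"--saveEngine={engine_path}",
        f"--onnx={onnx_path}",
        f"--minShapes={format_shape(input_name, min_dims)}",
        f"--optShapes={format_shape(input_name, opt_dims)}",
        f"--maxShapes={format_shape(input_name, max_dims)}",
    ]


def build_engine_with_trtexec(*args, **kwargs):
    """Run trtexec and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(trtexec_command(*args, **kwargs), check=True)
```

Calling trtexec_command("resnet50_ssd_dyn_onnx1.6.onnx", "resnet50_ssd_dyn32_onnx1.6.engine", "input", (1, 3, 300, 300), (2, 3, 300, 300), (16, 3, 300, 300)) reproduces the argument list of the command shown above.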
