ONNX to TensorRT conversion

Description

Error converting an ONNX model to TensorRT (see Steps To Reproduce).

Environment

TensorRT Version: 7
GPU Type: 930M
Nvidia Driver Version: 440
CUDA Version: 10.2
CUDNN Version: 7.6.5
Operating System + Version: ubuntu 18.04
Python Version (if applicable): 3.7
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Steps To Reproduce

I saved a TF 2.1 float32 model using model.save().
After that I converted this model to ONNX with the tf2onnx Python API,
using both opset 10 and opset 11.
For TensorRT I used the command below. The opset 11 model failed with this output:

./trtexec --explicitBatch --onnx=model_op11.onnx

[04/12/2020-15:37:34] [I] === Model Options ===
[04/12/2020-15:37:34] [I] Format: ONNX
[04/12/2020-15:37:34] [I] Model: model_op11.onnx
[04/12/2020-15:37:34] [I] Output:
[04/12/2020-15:37:34] [I] === Build Options ===
[04/12/2020-15:37:34] [I] Max batch: explicit
[04/12/2020-15:37:34] [I] Workspace: 16 MB
[04/12/2020-15:37:34] [I] minTiming: 1
[04/12/2020-15:37:34] [I] avgTiming: 8
[04/12/2020-15:37:34] [I] Precision: FP32
[04/12/2020-15:37:34] [I] Calibration:
[04/12/2020-15:37:34] [I] Safe mode: Disabled
[04/12/2020-15:37:34] [I] Save engine:
[04/12/2020-15:37:34] [I] Load engine:
[04/12/2020-15:37:34] [I] Inputs format: fp32:CHW
[04/12/2020-15:37:34] [I] Outputs format: fp32:CHW
[04/12/2020-15:37:34] [I] Input build shapes: model
[04/12/2020-15:37:34] [I] === System Options ===
[04/12/2020-15:37:34] [I] Device: 0
[04/12/2020-15:37:34] [I] DLACore:
[04/12/2020-15:37:34] [I] Plugins:
[04/12/2020-15:37:34] [I] === Inference Options ===
[04/12/2020-15:37:34] [I] Batch: Explicit
[04/12/2020-15:37:34] [I] Iterations: 10
[04/12/2020-15:37:34] [I] Duration: 3s (+ 200ms warm up)
[04/12/2020-15:37:34] [I] Sleep time: 0ms
[04/12/2020-15:37:34] [I] Streams: 1
[04/12/2020-15:37:34] [I] ExposeDMA: Disabled
[04/12/2020-15:37:34] [I] Spin-wait: Disabled
[04/12/2020-15:37:34] [I] Multithreading: Disabled
[04/12/2020-15:37:34] [I] CUDA Graph: Disabled
[04/12/2020-15:37:34] [I] Skip inference: Disabled
[04/12/2020-15:37:34] [I] Inputs:
[04/12/2020-15:37:34] [I] === Reporting Options ===
[04/12/2020-15:37:34] [I] Verbose: Disabled
[04/12/2020-15:37:34] [I] Averages: 10 inferences
[04/12/2020-15:37:34] [I] Percentile: 99
[04/12/2020-15:37:34] [I] Dump output: Disabled
[04/12/2020-15:37:34] [I] Profile: Disabled
[04/12/2020-15:37:34] [I] Export timing to JSON file:
[04/12/2020-15:37:34] [I] Export output to JSON file:
[04/12/2020-15:37:34] [I] Export profile to JSON file:
[04/12/2020-15:37:34] [I]

Input filename: model_op11.onnx
ONNX IR version: 0.0.6
Opset version: 11
Producer name: tf2onnx
Producer version: 1.6.0
Domain:
Model version: 0
Doc string:

[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
terminate called after throwing an instance of 'std::out_of_range'
  what():  Attribute not found: pads
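
For readers skimming the log above: the same INT64 warning repeats many times, and the line that actually stops the run is the exception at the very end. A minimal sketch, assuming the log has been saved as text, that collapses identical warning lines so the final error is easier to spot. The helper name `summarize_warnings` and the shortened `sample_log` are hypothetical and only for illustration; they are not part of trtexec.

```python
import re
from collections import Counter

# Matches the "[MM/DD/YYYY-HH:MM:SS] " prefix that trtexec puts on each line.
TIMESTAMP = re.compile(r"^\[\d{2}/\d{2}/\d{4}-\d{2}:\d{2}:\d{2}\]\s*")


def summarize_warnings(log_text):
    """Count identical [W] lines, ignoring their timestamps."""
    counts = Counter()
    for line in log_text.splitlines():
        message = TIMESTAMP.sub("", line.strip())
        if message.startswith("[W]"):
            counts[message] += 1
    return counts


# Shortened excerpt of the log above (hypothetical sample, two warnings only).
sample_log = """\
[04/12/2020-15:37:34] [I] Precision: FP32
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/12/2020-15:37:34] [W] [TRT] onnx2trt_utils.cpp:198: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
terminate called after throwing an instance of 'std::out_of_range'
what(): Attribute not found: pads
"""

warning_counts = summarize_warnings(sample_log)
for message, count in warning_counts.items():
    # Print each distinct warning once, with how often it occurred.
    print(f"{count}x {message[:70]}...")
```

Run against the full log, this should reduce the warning block to a single counted line, leaving "Attribute not found: pads" as the part worth investigating.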

Hi,

Could you please share the script and model file to reproduce this issue so we can better help?

Thanks


This is my model, saved from TensorFlow 2 and converted to ONNX:
https://easyupload.io/q61v8v


I sent my models. Now it's your turn to answer!

Thanks for sharing the model file, we will look into it and update you accordingly.

Hi,
The ONNX parser shipped in the TensorRT 7 binaries does not support opset 11 padding.
It looks like opset 11 padding support was just added to master here:

Can you build the latest open-source parser and try to generate the model again?

Thanks


Hello,
Thanks for your answer. I built the open-source parser, and the results are printed below.

  • Is this result acceptable for my weak GPU? Isn't min: 4166.33 ms too high?
  • What does this warning mean? “Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles”
  • To summarize: do you think this level of optimization is acceptable?

./trtexec --explicitBatch --onnx=op11.onnx --workspace=1024 --saveEngine=op11.trt
[04/20/2020-12:08:21] [I] === Model Options ===
[04/20/2020-12:08:21] [I] Format: ONNX
[04/20/2020-12:08:21] [I] Model: op11.onnx
[04/20/2020-12:08:21] [I] Output:
[04/20/2020-12:08:21] [I] === Build Options ===
[04/20/2020-12:08:21] [I] Max batch: explicit
[04/20/2020-12:08:21] [I] Workspace: 1024 MB
[04/20/2020-12:08:21] [I] minTiming: 1
[04/20/2020-12:08:21] [I] avgTiming: 8
[04/20/2020-12:08:21] [I] Precision: FP32
[04/20/2020-12:08:21] [I] Calibration:
[04/20/2020-12:08:21] [I] Safe mode: Disabled
[04/20/2020-12:08:21] [I] Save engine: op11.trt
[04/20/2020-12:08:21] [I] Load engine:
[04/20/2020-12:08:21] [I] Inputs format: fp32:CHW
[04/20/2020-12:08:21] [I] Outputs format: fp32:CHW
[04/20/2020-12:08:21] [I] Input build shapes: model
[04/20/2020-12:08:21] [I] === System Options ===
[04/20/2020-12:08:21] [I] Device: 0
[04/20/2020-12:08:21] [I] DLACore:
[04/20/2020-12:08:21] [I] Plugins:
[04/20/2020-12:08:21] [I] === Inference Options ===
[04/20/2020-12:08:21] [I] Batch: Explicit
[04/20/2020-12:08:21] [I] Iterations: 10
[04/20/2020-12:08:21] [I] Duration: 3s (+ 200ms warm up)
[04/20/2020-12:08:21] [I] Sleep time: 0ms
[04/20/2020-12:08:21] [I] Streams: 1
[04/20/2020-12:08:21] [I] ExposeDMA: Disabled
[04/20/2020-12:08:21] [I] Spin-wait: Disabled
[04/20/2020-12:08:21] [I] Multithreading: Disabled
[04/20/2020-12:08:21] [I] CUDA Graph: Disabled
[04/20/2020-12:08:21] [I] Skip inference: Disabled
[04/20/2020-12:08:21] [I] Inputs:
[04/20/2020-12:08:21] [I] === Reporting Options ===
[04/20/2020-12:08:21] [I] Verbose: Disabled
[04/20/2020-12:08:21] [I] Averages: 10 inferences
[04/20/2020-12:08:21] [I] Percentile: 99
[04/20/2020-12:08:21] [I] Dump output: Disabled
[04/20/2020-12:08:21] [I] Profile: Disabled
[04/20/2020-12:08:21] [I] Export timing to JSON file:
[04/20/2020-12:08:21] [I] Export output to JSON file:
[04/20/2020-12:08:21] [I] Export profile to JSON file:
[04/20/2020-12:08:21] [I]

Input filename: op11.onnx
ONNX IR version: 0.0.6
Opset version: 11
Producer name: tf2onnx
Producer version: 1.6.0
Domain:
Model version: 0
Doc string:

[04/20/2020-12:08:22] [W] [TRT] /workspace/TensorRT/parsers/onnx/onnx2trt_utils.cpp:235: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with same pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with caffe pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with same pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with caffe pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with same pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with caffe pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with same pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with caffe pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with same pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with caffe pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with same pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with caffe pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with same pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with caffe pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with same pad mode
[04/20/2020-12:08:22] [W] [TRT] Can’t fuse pad and convolution with caffe pad mode
[04/20/2020-12:08:52] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.

[04/20/2020-12:22:18] [I] [TRT] Detected 1 inputs and 15 output network tensors.
[04/20/2020-12:22:19] [W] [TRT] Current optimization profile is: 0. Please ensure there are no enqueued operations pending in this context prior to switching profiles
[04/20/2020-12:23:06] [I] Warmup completed 0 queries over 200 ms
[04/20/2020-12:23:06] [I] Timing trace has 0 queries over 46.3995 s
[04/20/2020-12:23:06] [I] Trace averages of 10 runs:
[04/20/2020-12:23:06] [I] Average on 10 runs - GPU latency: 4220.6 ms - Host latency: 4224.96 ms (end to end 8437.71 ms)
[04/20/2020-12:23:06] [I] Host latency
[04/20/2020-12:23:06] [I] min: 4170.64 ms (end to end 8388.33 ms)
[04/20/2020-12:23:06] [I] max: 4247.51 ms (end to end 8460.77 ms)
[04/20/2020-12:23:06] [I] mean: 4224.96 ms (end to end 8437.71 ms)
[04/20/2020-12:23:06] [I] median: 4226.46 ms (end to end 8441.91 ms)
[04/20/2020-12:23:06] [I] percentile: 4247.51 ms at 99% (end to end 8460.77 ms at 99%)
[04/20/2020-12:23:06] [I] throughput: 0 qps
[04/20/2020-12:23:06] [I] walltime: 46.3995 s
[04/20/2020-12:23:06] [I] GPU Compute
[04/20/2020-12:23:06] [I] min: 4166.33 ms
[04/20/2020-12:23:06] [I] max: 4242.56 ms
[04/20/2020-12:23:06] [I] mean: 4220.6 ms
[04/20/2020-12:23:06] [I] median: 4222.12 ms
[04/20/2020-12:23:06] [I] percentile: 4242.56 ms at 99%
[04/20/2020-12:23:06] [I] total compute time: 42.206 s
&&&& PASSED TensorRT.trtexec # ./trtexec --explicitBatch --onnx=op11.onnx --workspace=1024 --saveEngine=op11.trt
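
To make the numbers above easier to compare with another run, here is a minimal sketch that pulls the "GPU Compute" statistics out of a saved trtexec log. It assumes the log layout shown above (a "GPU Compute" line followed by min/max/mean/median lines in ms). The function name `parse_gpu_compute` is hypothetical, and the sample text is copied from this log.

```python
import re

# Matches lines such as "[I] mean: 4220.6 ms" and captures the label and value.
STAT_LINE = re.compile(r"\[I\]\s+(min|max|mean|median):\s+([\d.]+)\s+ms")


def parse_gpu_compute(log_text):
    """Return the min/max/mean/median values (ms) listed under 'GPU Compute'."""
    stats = {}
    in_section = False
    for line in log_text.splitlines():
        if line.rstrip().endswith("[I] GPU Compute"):
            in_section = True  # values before this line belong to "Host latency"
            continue
        if in_section:
            match = STAT_LINE.search(line)
            if match:
                stats[match.group(1)] = float(match.group(2))
    return stats


sample_log = """\
[04/20/2020-12:23:06] [I] Host latency
[04/20/2020-12:23:06] [I] min: 4170.64 ms (end to end 8388.33 ms)
[04/20/2020-12:23:06] [I] GPU Compute
[04/20/2020-12:23:06] [I] min: 4166.33 ms
[04/20/2020-12:23:06] [I] max: 4242.56 ms
[04/20/2020-12:23:06] [I] mean: 4220.6 ms
[04/20/2020-12:23:06] [I] median: 4222.12 ms
[04/20/2020-12:23:06] [I] percentile: 4242.56 ms at 99%
[04/20/2020-12:23:06] [I] total compute time: 42.206 s
"""

gpu_stats = parse_gpu_compute(sample_log)
# Inferences per second implied by the mean GPU time of one inference.
inferences_per_second = 1000.0 / gpu_stats["mean"]
print(gpu_stats)
print(f"about {inferences_per_second:.2f} inferences per second")
```

Parsing a second log the same way (for example from a different precision mode) gives two dictionaries whose "mean" values can be divided to get a rough speedup factor.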

Hi,

Could you please check the performance of the model before and after conversion?
Also, if FP16 and INT8 are supported on your GPU, please try to generate the model in FP16/INT8 mode.

Thanks
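
One way to follow the FP16/INT8 suggestion is to rerun the same trtexec command with the `--fp16` or `--int8` flag added. The sketch below only builds and prints the command lines; it does not run them. The helper `trtexec_command` and the engine file names are hypothetical, and the remaining flags are the ones already used earlier in this thread. Whether reduced precision helps at all depends on the GPU, which this sketch does not check.

```python
def trtexec_command(onnx_path, engine_path, precision="fp32", workspace_mb=1024):
    """Build a trtexec argument list for the requested precision mode."""
    cmd = [
        "./trtexec",
        "--explicitBatch",
        f"--onnx={onnx_path}",
        f"--workspace={workspace_mb}",
        f"--saveEngine={engine_path}",
    ]
    if precision == "fp16":
        cmd.append("--fp16")
    elif precision == "int8":
        cmd.append("--int8")
    elif precision != "fp32":
        raise ValueError(f"unknown precision: {precision}")
    return cmd


# One command per precision mode, each saving to its own (hypothetical) engine file.
commands = {
    mode: trtexec_command("op11.onnx", f"op11_{mode}.trt", mode)
    for mode in ("fp32", "fp16", "int8")
}

for mode, cmd in commands.items():
    print(mode, "->", " ".join(cmd))
```

Running each printed command and comparing the reported "GPU Compute" mean against the FP32 run shows whether a reduced-precision mode is worth pursuing. A timing-only INT8 run says nothing about accuracy, which would need proper calibration.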