Error creating a TensorRT engine from an ONNX file with trtexec

I am working on a scream detection model. I converted a .pb file to ONNX with this command:

```
python -m tf2onnx.convert --saved-model /home/hipe/Documents/code/training/model --output model.onnx
```

When I try to build a TensorRT engine from the ONNX file, I run into the errors below.

Below is the log when I run:

```
./trtexec --onnx=model.onnx
```

&&&& RUNNING TensorRT.trtexec [TensorRT v8200] # ./trtexec --onnx=model.onnx
[06/30/2022-11:23:41] [I] === Model Options ===
[06/30/2022-11:23:41] [I] Format: ONNX
[06/30/2022-11:23:41] [I] Model: model.onnx
[06/30/2022-11:23:41] [I] Output:
[06/30/2022-11:23:41] [I] === Build Options ===
[06/30/2022-11:23:41] [I] Max batch: explicit batch
[06/30/2022-11:23:41] [I] Workspace: 16 MiB
[06/30/2022-11:23:41] [I] minTiming: 1
[06/30/2022-11:23:41] [I] avgTiming: 8
[06/30/2022-11:23:41] [I] Precision: FP32
[06/30/2022-11:23:41] [I] Calibration:
[06/30/2022-11:23:41] [I] Refit: Disabled
[06/30/2022-11:23:41] [I] Sparsity: Disabled
[06/30/2022-11:23:41] [I] Safe mode: Disabled
[06/30/2022-11:23:41] [I] Strict mode: Disabled
[06/30/2022-11:23:41] [I] Restricted mode: Disabled
[06/30/2022-11:23:41] [I] Save engine:
[06/30/2022-11:23:41] [I] Load engine:
[06/30/2022-11:23:41] [I] Profiling verbosity: 0
[06/30/2022-11:23:41] [I] Tactic sources: Using default tactic sources
[06/30/2022-11:23:41] [I] timingCacheMode: local
[06/30/2022-11:23:41] [I] timingCacheFile:
[06/30/2022-11:23:41] [I] Input(s)s format: fp32:CHW
[06/30/2022-11:23:41] [I] Output(s)s format: fp32:CHW
[06/30/2022-11:23:41] [I] Input build shapes: model
[06/30/2022-11:23:41] [I] Input calibration shapes: model
[06/30/2022-11:23:41] [I] === System Options ===
[06/30/2022-11:23:41] [I] Device: 0
[06/30/2022-11:23:41] [I] DLACore:
[06/30/2022-11:23:41] [I] Plugins:
[06/30/2022-11:23:41] [I] === Inference Options ===
[06/30/2022-11:23:41] [I] Batch: Explicit
[06/30/2022-11:23:41] [I] Input inference shapes: model
[06/30/2022-11:23:41] [I] Iterations: 10
[06/30/2022-11:23:41] [I] Duration: 3s (+ 200ms warm up)
[06/30/2022-11:23:41] [I] Sleep time: 0ms
[06/30/2022-11:23:41] [I] Streams: 1
[06/30/2022-11:23:41] [I] ExposeDMA: Disabled
[06/30/2022-11:23:41] [I] Data transfers: Enabled
[06/30/2022-11:23:41] [I] Spin-wait: Disabled
[06/30/2022-11:23:41] [I] Multithreading: Disabled
[06/30/2022-11:23:41] [I] CUDA Graph: Disabled
[06/30/2022-11:23:41] [I] Separate profiling: Disabled
[06/30/2022-11:23:41] [I] Time Deserialize: Disabled
[06/30/2022-11:23:41] [I] Time Refit: Disabled
[06/30/2022-11:23:41] [I] Skip inference: Disabled
[06/30/2022-11:23:41] [I] Inputs:
[06/30/2022-11:23:41] [I] === Reporting Options ===
[06/30/2022-11:23:41] [I] Verbose: Disabled
[06/30/2022-11:23:41] [I] Averages: 10 inferences
[06/30/2022-11:23:41] [I] Percentile: 99
[06/30/2022-11:23:41] [I] Dump refittable layers:Disabled
[06/30/2022-11:23:41] [I] Dump output: Disabled
[06/30/2022-11:23:41] [I] Profile: Disabled
[06/30/2022-11:23:41] [I] Export timing to JSON file:
[06/30/2022-11:23:41] [I] Export output to JSON file:
[06/30/2022-11:23:41] [I] Export profile to JSON file:
[06/30/2022-11:23:41] [I]
[06/30/2022-11:23:41] [I] === Device Information ===
[06/30/2022-11:23:41] [I] Selected Device: Quadro RTX 8000
[06/30/2022-11:23:41] [I] Compute Capability: 7.5
[06/30/2022-11:23:41] [I] SMs: 72
[06/30/2022-11:23:41] [I] Compute Clock Rate: 1.77 GHz
[06/30/2022-11:23:41] [I] Device Global Memory: 48601 MiB
[06/30/2022-11:23:41] [I] Shared Memory per SM: 64 KiB
[06/30/2022-11:23:41] [I] Memory Bus Width: 384 bits (ECC disabled)
[06/30/2022-11:23:41] [I] Memory Clock Rate: 7.001 GHz
[06/30/2022-11:23:41] [I]
[06/30/2022-11:23:41] [I] TensorRT version: 8204
[06/30/2022-11:23:42] [I] [TRT] [MemUsageChange] Init CUDA: CPU +321, GPU +0, now: CPU 332, GPU 334 (MiB)
[06/30/2022-11:23:42] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 332 MiB, GPU 334 MiB
[06/30/2022-11:23:42] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 467 MiB, GPU 368 MiB
[06/30/2022-11:23:42] [I] Start parsing network model
[06/30/2022-11:23:42] [I] [TRT] ----------------------------------------------------------------
[06/30/2022-11:23:42] [I] [TRT] Input filename: model.onnx
[06/30/2022-11:23:42] [I] [TRT] ONNX IR version: 0.0.4
[06/30/2022-11:23:42] [I] [TRT] Opset version: 9
[06/30/2022-11:23:42] [I] [TRT] Producer name: tf2onnx
[06/30/2022-11:23:42] [I] [TRT] Producer version: 1.9.2
[06/30/2022-11:23:42] [I] [TRT] Domain:
[06/30/2022-11:23:42] [I] [TRT] Model version: 0
[06/30/2022-11:23:42] [I] [TRT] Doc string:
[06/30/2022-11:23:42] [I] [TRT] ----------------------------------------------------------------
[06/30/2022-11:23:42] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[06/30/2022-11:23:42] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[06/30/2022-11:23:42] [W] [TRT] parsers/onnx/ShapedWeights.cpp:171: Weights StatefulPartitionedCall/sequential/dense/MatMul/ReadVariableOp:0 has been transposed with permutation of (1, 0)! If you plan on overwriting the weights with the Refitter API, the new weights must be pre-transposed.
[06/30/2022-11:23:42] [I] Finish parsing network model
[06/30/2022-11:23:42] [W] Dynamic dimensions required for input: lstm_input, but no shapes were provided. Automatically overriding shape to: 1x110x16
[06/30/2022-11:23:42] [E] Error[4]: [graphShapeAnalyzer.cpp::processCheck::581] Error Code 4: Internal Error (StatefulPartitionedCall/sequential/lstm/PartitionedCall/while_loop:7: tensor volume exceeds (2^31)-1, dimensions are [2147483647,1,16])
[06/30/2022-11:23:42] [E] Error[2]: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
[06/30/2022-11:23:42] [E] Engine could not be created from network
[06/30/2022-11:23:42] [E] Building engine failed
[06/30/2022-11:23:42] [E] Failed to create engine from model.
[06/30/2022-11:23:42] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8200] # ./trtexec --onnx=model.onnx
Segmentation fault (core dumped)

Hi,

Could you please try the latest TensorRT version, 8.4 GA?
If you still face the issue, please share the ONNX model that reproduces it with us for better debugging.
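In the meantime, you can also pass trtexec an explicit input shape instead of relying on the automatic override, assuming 1x110x16 really is the intended shape of lstm_input:

```
./trtexec --onnx=model.onnx --shapes=lstm_input:1x110x16
```

If the oversized loop dimension comes from a constant baked into the model, this alone will not fix it, but it removes the shape guess as a variable.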

Thank you.

Hello,
Thank you for your reply.
I will check my versions, rerun with the latest TensorRT version, and send you the log details.
Please don’t close this topic; I will post an update on my issue.
Thank you.

Hi Team,

Please find the ONNX model attached.

model.onnx (31.9 KB)

saved_model.pb (1.41 MB)

Hi,

Currently, TensorRT does not support tensors with more than 2^31-1 elements. The failing tensor here is [2147483647, 1, 16], i.e. 2147483647 × 1 × 16 ≈ 3.4 × 10^10 elements, far past that limit. Note that 2147483647 is exactly the INT32 maximum the parser warned it was clamping INT64 weights to, so the loop bound itself appears to have been exported as INT64_MAX.
We do not have a workaround except modifying the network; see the sketch after the log below.

[07/05/2022-06:37:10] [E] Error[4]: [graphShapeAnalyzer.cpp::processCheck::587] Error Code 4: Internal Error (StatefulPartitionedCall/sequential/lstm/PartitionedCall/while_loop:7: tensor volume exceeds (2^31)-1, dimensions are [2147483647,1,16])
[07/05/2022-06:37:10] [E] Error[2]: [builder.cpp::buildSerializedNetwork::636] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[07/05/2022-06:37:10] [E] Engine could not be created from network
[07/05/2022-06:37:10] [E] Building engine failed
[07/05/2022-06:37:10] [E] Failed to create engine from model.
[07/05/2022-06:37:10] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8401] # /opt/tensorrt/bin/trtexec --onnx=model.onnx --verbose
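
If you do want to attempt that, below is a minimal sketch of what modifying the network could look like. This is an assumption-laden starting point, not official tooling: it assumes the oversized dimension comes from an INT64 loop bound (e.g. INT64_MAX) baked into the graph by tf2onnx, which the parser then clamps to 2147483647, and SEQ_LEN = 110 plus the file names are our choices based on the 1x110x16 input shape in your log.

```python
# Sketch: hunt for the INT64 constant that becomes the 2147483647 loop
# dimension after clamping, and pin it to the real sequence length.
import numpy as np
import onnx
from onnx import numpy_helper

INT32_MAX = np.iinfo(np.int32).max  # 2147483647, the clamped value
SEQ_LEN = 110                       # lstm_input is 1x110x16 per the log

model = onnx.load("model.onnx")

# INT64 initializers at or above the INT32 ceiling are what the parser
# warned about clamping.
for init in model.graph.initializer:
    arr = numpy_helper.to_array(init)
    if arr.dtype == np.int64 and np.any(arr >= INT32_MAX):
        print("suspect initializer:", init.name, arr)
        init.CopyFrom(numpy_helper.from_array(
            np.full_like(arr, SEQ_LEN), name=init.name))

# The same value can also hide inside Constant nodes.
for node in model.graph.node:
    if node.op_type == "Constant":
        for attr in node.attribute:
            if attr.name == "value":
                arr = numpy_helper.to_array(attr.t)
                if arr.dtype == np.int64 and np.any(arr >= INT32_MAX):
                    print("suspect Constant:", node.name, arr)
                    attr.t.CopyFrom(numpy_helper.from_array(
                        np.full_like(arr, SEQ_LEN)))

onnx.save(model, "model_fixed.onnx")
```

After saving, re-running trtexec on model_fixed.onnx should show the while_loop dimension as 110 instead of 2147483647 if the guess was right. If the constant is produced inside the loop body or by another subgraph, a plain scan like this will miss it, and you would need onnx-graphsurgeon or a re-export instead.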

Thank you.

Hello,

Could you please suggest an alternative approach I can work with?

Thank you!

Sorry, I don’t think we have an alternative workaround on the TensorRT side.
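
One thing you could try on the export side, offered as a hedged suggestion rather than a verified fix: your log shows the model was converted at opset 9, and re-converting with a newer opset via tf2onnx’s --opset flag may change how the LSTM while loop is lowered:

```
python -m tf2onnx.convert --saved-model /home/hipe/Documents/code/training/model --opset 13 --output model.onnx
```

If the trip count still comes out as INT64_MAX, editing the graph as sketched above remains the only approach we know of.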