I am working on a scream detection model. I converted a TensorFlow SavedModel (.pb) to ONNX with this command:

```
python -m tf2onnx.convert --saved-model /home/hipe/Documents/code/training/model --output model.onnx
```
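For reference, I did not pass `--opset`, so tf2onnx used its default (opset 9, as the log below confirms). One thing I could try, though I am not certain it changes how the LSTM's while loop is exported, is pinning a newer opset that TensorRT 8.2 still supports:

```
# Hypothetical re-export pinning a newer opset via tf2onnx's --opset flag
python -m tf2onnx.convert --saved-model /home/hipe/Documents/code/training/model \
    --output model.onnx --opset 13
```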
When I try to build a TensorRT engine from the resulting ONNX file, the build fails. Below is the full log from running `./trtexec --onnx=model.onnx`:

```
&&&& RUNNING TensorRT.trtexec [TensorRT v8200] # ./trtexec --onnx=model.onnx
[06/30/2022-11:23:41] [I] === Model Options ===
[06/30/2022-11:23:41] [I] Format: ONNX
[06/30/2022-11:23:41] [I] Model: model.onnx
[06/30/2022-11:23:41] [I] Output:
[06/30/2022-11:23:41] [I] === Build Options ===
[06/30/2022-11:23:41] [I] Max batch: explicit batch
[06/30/2022-11:23:41] [I] Workspace: 16 MiB
[06/30/2022-11:23:41] [I] minTiming: 1
[06/30/2022-11:23:41] [I] avgTiming: 8
[06/30/2022-11:23:41] [I] Precision: FP32
[06/30/2022-11:23:41] [I] Calibration:
[06/30/2022-11:23:41] [I] Refit: Disabled
[06/30/2022-11:23:41] [I] Sparsity: Disabled
[06/30/2022-11:23:41] [I] Safe mode: Disabled
[06/30/2022-11:23:41] [I] Strict mode: Disabled
[06/30/2022-11:23:41] [I] Restricted mode: Disabled
[06/30/2022-11:23:41] [I] Save engine:
[06/30/2022-11:23:41] [I] Load engine:
[06/30/2022-11:23:41] [I] Profiling verbosity: 0
[06/30/2022-11:23:41] [I] Tactic sources: Using default tactic sources
[06/30/2022-11:23:41] [I] timingCacheMode: local
[06/30/2022-11:23:41] [I] timingCacheFile:
[06/30/2022-11:23:41] [I] Input(s)s format: fp32:CHW
[06/30/2022-11:23:41] [I] Output(s)s format: fp32:CHW
[06/30/2022-11:23:41] [I] Input build shapes: model
[06/30/2022-11:23:41] [I] Input calibration shapes: model
[06/30/2022-11:23:41] [I] === System Options ===
[06/30/2022-11:23:41] [I] Device: 0
[06/30/2022-11:23:41] [I] DLACore:
[06/30/2022-11:23:41] [I] Plugins:
[06/30/2022-11:23:41] [I] === Inference Options ===
[06/30/2022-11:23:41] [I] Batch: Explicit
[06/30/2022-11:23:41] [I] Input inference shapes: model
[06/30/2022-11:23:41] [I] Iterations: 10
[06/30/2022-11:23:41] [I] Duration: 3s (+ 200ms warm up)
[06/30/2022-11:23:41] [I] Sleep time: 0ms
[06/30/2022-11:23:41] [I] Streams: 1
[06/30/2022-11:23:41] [I] ExposeDMA: Disabled
[06/30/2022-11:23:41] [I] Data transfers: Enabled
[06/30/2022-11:23:41] [I] Spin-wait: Disabled
[06/30/2022-11:23:41] [I] Multithreading: Disabled
[06/30/2022-11:23:41] [I] CUDA Graph: Disabled
[06/30/2022-11:23:41] [I] Separate profiling: Disabled
[06/30/2022-11:23:41] [I] Time Deserialize: Disabled
[06/30/2022-11:23:41] [I] Time Refit: Disabled
[06/30/2022-11:23:41] [I] Skip inference: Disabled
[06/30/2022-11:23:41] [I] Inputs:
[06/30/2022-11:23:41] [I] === Reporting Options ===
[06/30/2022-11:23:41] [I] Verbose: Disabled
[06/30/2022-11:23:41] [I] Averages: 10 inferences
[06/30/2022-11:23:41] [I] Percentile: 99
[06/30/2022-11:23:41] [I] Dump refittable layers:Disabled
[06/30/2022-11:23:41] [I] Dump output: Disabled
[06/30/2022-11:23:41] [I] Profile: Disabled
[06/30/2022-11:23:41] [I] Export timing to JSON file:
[06/30/2022-11:23:41] [I] Export output to JSON file:
[06/30/2022-11:23:41] [I] Export profile to JSON file:
[06/30/2022-11:23:41] [I]
[06/30/2022-11:23:41] [I] === Device Information ===
[06/30/2022-11:23:41] [I] Selected Device: Quadro RTX 8000
[06/30/2022-11:23:41] [I] Compute Capability: 7.5
[06/30/2022-11:23:41] [I] SMs: 72
[06/30/2022-11:23:41] [I] Compute Clock Rate: 1.77 GHz
[06/30/2022-11:23:41] [I] Device Global Memory: 48601 MiB
[06/30/2022-11:23:41] [I] Shared Memory per SM: 64 KiB
[06/30/2022-11:23:41] [I] Memory Bus Width: 384 bits (ECC disabled)
[06/30/2022-11:23:41] [I] Memory Clock Rate: 7.001 GHz
[06/30/2022-11:23:41] [I]
[06/30/2022-11:23:41] [I] TensorRT version: 8204
[06/30/2022-11:23:42] [I] [TRT] [MemUsageChange] Init CUDA: CPU +321, GPU +0, now: CPU 332, GPU 334 (MiB)
[06/30/2022-11:23:42] [I] [TRT] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 332 MiB, GPU 334 MiB
[06/30/2022-11:23:42] [I] [TRT] [MemUsageSnapshot] End constructing builder kernel library: CPU 467 MiB, GPU 368 MiB
[06/30/2022-11:23:42] [I] Start parsing network model
[06/30/2022-11:23:42] [I] [TRT] ----------------------------------------------------------------
[06/30/2022-11:23:42] [I] [TRT] Input filename: model.onnx
[06/30/2022-11:23:42] [I] [TRT] ONNX IR version: 0.0.4
[06/30/2022-11:23:42] [I] [TRT] Opset version: 9
[06/30/2022-11:23:42] [I] [TRT] Producer name: tf2onnx
[06/30/2022-11:23:42] [I] [TRT] Producer version: 1.9.2
[06/30/2022-11:23:42] [I] [TRT] Domain:
[06/30/2022-11:23:42] [I] [TRT] Model version: 0
[06/30/2022-11:23:42] [I] [TRT] Doc string:
[06/30/2022-11:23:42] [I] [TRT] ----------------------------------------------------------------
[06/30/2022-11:23:42] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[06/30/2022-11:23:42] [W] [TRT] parsers/onnx/onnx2trt_utils.cpp:392: One or more weights outside the range of INT32 was clamped
[06/30/2022-11:23:42] [W] [TRT] parsers/onnx/ShapedWeights.cpp:171: Weights StatefulPartitionedCall/sequential/dense/MatMul/ReadVariableOp:0 has been transposed with permutation of (1, 0)! If you plan on overwriting the weights with the Refitter API, the new weights must be pre-transposed.
[06/30/2022-11:23:42] [I] Finish parsing network model
[06/30/2022-11:23:42] [W] Dynamic dimensions required for input: lstm_input, but no shapes were provided. Automatically overriding shape to: 1x110x16
[06/30/2022-11:23:42] [E] Error[4]: [graphShapeAnalyzer.cpp::processCheck::581] Error Code 4: Internal Error (StatefulPartitionedCall/sequential/lstm/PartitionedCall/while_loop:7: tensor volume exceeds (2^31)-1, dimensions are [2147483647,1,16])
[06/30/2022-11:23:42] [E] Error[2]: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
[06/30/2022-11:23:42] [E] Engine could not be created from network
[06/30/2022-11:23:42] [E] Building engine failed
[06/30/2022-11:23:42] [E] Failed to create engine from model.
[06/30/2022-11:23:42] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8200] # ./trtexec --onnx=model.onnx
Segmentation fault (core dumped)
```
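Two things stand out to me in the log. First, the failing dimension 2147483647 is exactly INT32_MAX, which seems to line up with the earlier warning that values outside the INT32 range were clamped, so I suspect the while loop's trip count (or a shape tensor feeding it) was an INT64 value that got clamped during parsing. Second, `lstm_input` is dynamic and trtexec auto-overrode it to 1x110x16; I can pass that shape explicitly (a sketch, assuming 1x110x16 really is batch x timesteps x features for my model), though if the root cause is the while loop itself, this alone probably won't help:

```
# Pass an explicit shape for the dynamic input via trtexec's --shapes option
./trtexec --onnx=model.onnx --shapes=lstm_input:1x110x16
```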