TensorRT build engine failed nanoSAM

jacob.aizner · February 7, 2024, 8:11am

Hi,

I'm trying to build the nanoSAM model.
When I build the engine, I get the following error:
[E] Error[4]: [graph.cpp::symbolicExecute::539] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor).

I tried converting the mobile_sam_decoder model to INT32, but I still get the same problem.

The full output:
&&&& RUNNING TensorRT.trtexec [TensorRT v8601] # /usr/src/tensorrt/bin/trtexec --onnx=data/sam32.onnx --saveEngine=data/mobile_sam_mask_decoder.engine --minShapes=point_coords:1x1x2,point_labels:1x1 --optShapes=point_coords:1x1x2,point_labels:1x1 --maxShapes=point_coords:1x10x2,point_labels:1x10
[02/07/2024-08:01:59] [I] === Model Options ===
[02/07/2024-08:01:59] [I] Format: ONNX
[02/07/2024-08:01:59] [I] Model: data/sam32.onnx
[02/07/2024-08:01:59] [I] Output:
[02/07/2024-08:01:59] [I] === Build Options ===
[02/07/2024-08:01:59] [I] Max batch: explicit batch
[02/07/2024-08:01:59] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default
[02/07/2024-08:01:59] [I] minTiming: 1
[02/07/2024-08:01:59] [I] avgTiming: 8
[02/07/2024-08:01:59] [I] Precision: FP32
[02/07/2024-08:01:59] [I] LayerPrecisions:
[02/07/2024-08:01:59] [I] Layer Device Types:
[02/07/2024-08:01:59] [I] Calibration:
[02/07/2024-08:01:59] [I] Refit: Disabled
[02/07/2024-08:01:59] [I] Version Compatible: Disabled
[02/07/2024-08:01:59] [I] TensorRT runtime: full
[02/07/2024-08:01:59] [I] Lean DLL Path:
[02/07/2024-08:01:59] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[02/07/2024-08:01:59] [I] Exclude Lean Runtime: Disabled
[02/07/2024-08:01:59] [I] Sparsity: Disabled
[02/07/2024-08:01:59] [I] Safe mode: Disabled
[02/07/2024-08:01:59] [I] Build DLA standalone loadable: Disabled
[02/07/2024-08:01:59] [I] Allow GPU fallback for DLA: Disabled
[02/07/2024-08:01:59] [I] DirectIO mode: Disabled
[02/07/2024-08:01:59] [I] Restricted mode: Disabled
[02/07/2024-08:01:59] [I] Skip inference: Disabled
[02/07/2024-08:01:59] [I] Save engine: data/mobile_sam_mask_decoder.engine
[02/07/2024-08:01:59] [I] Load engine:
[02/07/2024-08:01:59] [I] Profiling verbosity: 0
[02/07/2024-08:01:59] [I] Tactic sources: Using default tactic sources
[02/07/2024-08:01:59] [I] timingCacheMode: local
[02/07/2024-08:01:59] [I] timingCacheFile:
[02/07/2024-08:01:59] [I] Heuristic: Disabled
[02/07/2024-08:01:59] [I] Preview Features: Use default preview flags.
[02/07/2024-08:01:59] [I] MaxAuxStreams: -1
[02/07/2024-08:01:59] [I] BuilderOptimizationLevel: -1
[02/07/2024-08:01:59] [I] Input(s)s format: fp32:CHW
[02/07/2024-08:01:59] [I] Output(s)s format: fp32:CHW
[02/07/2024-08:01:59] [I] Input build shape: point_coords=1x1x2+1x1x2+1x10x2
[02/07/2024-08:01:59] [I] Input build shape: point_labels=1x1+1x1+1x10
[02/07/2024-08:01:59] [I] Input calibration shapes: model
[02/07/2024-08:01:59] [I] === System Options ===
[02/07/2024-08:01:59] [I] Device: 0
[02/07/2024-08:01:59] [I] DLACore:
[02/07/2024-08:01:59] [I] Plugins:
[02/07/2024-08:01:59] [I] setPluginsToSerialize:
[02/07/2024-08:01:59] [I] dynamicPlugins:
[02/07/2024-08:01:59] [I] ignoreParsedPluginLibs: 0
[02/07/2024-08:01:59] [I]
[02/07/2024-08:01:59] [I] === Inference Options ===
[02/07/2024-08:01:59] [I] Batch: Explicit
[02/07/2024-08:01:59] [I] Input inference shape: point_labels=1x1
[02/07/2024-08:01:59] [I] Input inference shape: point_coords=1x1x2
[02/07/2024-08:01:59] [I] Iterations: 10
[02/07/2024-08:01:59] [I] Duration: 3s (+ 200ms warm up)
[02/07/2024-08:01:59] [I] Sleep time: 0ms
[02/07/2024-08:01:59] [I] Idle time: 0ms
[02/07/2024-08:01:59] [I] Inference Streams: 1
[02/07/2024-08:01:59] [I] ExposeDMA: Disabled
[02/07/2024-08:01:59] [I] Data transfers: Enabled
[02/07/2024-08:01:59] [I] Spin-wait: Disabled
[02/07/2024-08:01:59] [I] Multithreading: Disabled
[02/07/2024-08:01:59] [I] CUDA Graph: Disabled
[02/07/2024-08:01:59] [I] Separate profiling: Disabled
[02/07/2024-08:01:59] [I] Time Deserialize: Disabled
[02/07/2024-08:01:59] [I] Time Refit: Disabled
[02/07/2024-08:01:59] [I] NVTX verbosity: 0
[02/07/2024-08:01:59] [I] Persistent Cache Ratio: 0
[02/07/2024-08:01:59] [I] Inputs:
[02/07/2024-08:01:59] [I] === Reporting Options ===
[02/07/2024-08:01:59] [I] Verbose: Disabled
[02/07/2024-08:01:59] [I] Averages: 10 inferences
[02/07/2024-08:01:59] [I] Percentiles: 90,95,99
[02/07/2024-08:01:59] [I] Dump refittable layers:Disabled
[02/07/2024-08:01:59] [I] Dump output: Disabled
[02/07/2024-08:01:59] [I] Profile: Disabled
[02/07/2024-08:01:59] [I] Export timing to JSON file:
[02/07/2024-08:01:59] [I] Export output to JSON file:
[02/07/2024-08:01:59] [I] Export profile to JSON file:
[02/07/2024-08:01:59] [I]
[02/07/2024-08:01:59] [I] === Device Information ===
[02/07/2024-08:01:59] [I] Selected Device: Quadro M4000
[02/07/2024-08:01:59] [I] Compute Capability: 5.2
[02/07/2024-08:01:59] [I] SMs: 13
[02/07/2024-08:01:59] [I] Device Global Memory: 8112 MiB
[02/07/2024-08:01:59] [I] Shared Memory per SM: 96 KiB
[02/07/2024-08:01:59] [I] Memory Bus Width: 256 bits (ECC disabled)
[02/07/2024-08:01:59] [I] Application Compute Clock Rate: 0.7725 GHz
[02/07/2024-08:01:59] [I] Application Memory Clock Rate: 3.005 GHz
[02/07/2024-08:01:59] [I]
[02/07/2024-08:01:59] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[02/07/2024-08:01:59] [I]
[02/07/2024-08:01:59] [I] TensorRT version: 8.6.1
[02/07/2024-08:01:59] [I] Loading standard plugins
[02/07/2024-08:01:59] [I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 13, GPU 681 (MiB)
[02/07/2024-08:02:09] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +128, GPU +0, now: CPU 217, GPU 681 (MiB)
[02/07/2024-08:02:09] [I] Start parsing network model.
[02/07/2024-08:02:09] [I] [TRT] ----------------------------------------------------------------
[02/07/2024-08:02:09] [I] [TRT] Input filename: data/sam32.onnx
[02/07/2024-08:02:09] [I] [TRT] ONNX IR version: 0.0.9
[02/07/2024-08:02:09] [I] [TRT] Opset version: 16
[02/07/2024-08:02:09] [I] [TRT] Producer name: onnx-typecast
[02/07/2024-08:02:09] [I] [TRT] Producer version:
[02/07/2024-08:02:09] [I] [TRT] Domain:
[02/07/2024-08:02:09] [I] [TRT] Model version: 0
[02/07/2024-08:02:09] [I] [TRT] Doc string:
[02/07/2024-08:02:09] [I] [TRT] ----------------------------------------------------------------
[02/07/2024-08:02:09] [W] [TRT] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[02/07/2024-08:02:09] [E] Error[4]: [graph.cpp::symbolicExecute::539] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[02/07/2024-08:02:09] [E] [TRT] ModelImporter.cpp:771: While parsing node number 146 [Tile -> "/Tile_output_0"]:
[02/07/2024-08:02:09] [E] [TRT] ModelImporter.cpp:772: --- Begin node ---
[02/07/2024-08:02:09] [E] [TRT] ModelImporter.cpp:773: input: "/Unsqueeze_3_output_0"
input: "/Reshape_2_output_0"
output: "/Tile_output_0"
name: "/Tile"
op_type: "Tile"

[02/07/2024-08:02:09] [E] [TRT] ModelImporter.cpp:774: --- End node ---
[02/07/2024-08:02:09] [E] [TRT] ModelImporter.cpp:777: ERROR: ModelImporter.cpp:195 In function parseGraph:
[6] Invalid Node - /Tile
[graph.cpp::symbolicExecute::539] Error Code 4: Internal Error (/OneHot: an IIOneHotLayer cannot be used to compute a shape tensor)
[02/07/2024-08:02:09] [E] Failed to parse onnx file
[02/07/2024-08:02:09] [I] Finished parsing network model. Parse time: 0.0837755
[02/07/2024-08:02:09] [E] Parsing model failed
[02/07/2024-08:02:09] [E] Failed to create engine from model or file.
[02/07/2024-08:02:09] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8601] # /usr/src/tensorrt/bin/trtexec --onnx=data/sam32.onnx --saveEngine=data/mobile_sam_mask_decoder.engine --minShapes=point_coords:1x1x2,point_labels:1x1 --optShapes=point_coords:1x1x2,point_labels:1x1 --maxShapes=point_coords:1x10x2,point_labels:1x10

AakankshaS · February 29, 2024, 4:55pm

Hi @jacob.aizner ,
Is it possible to constant fold using polygraphy first?
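
For anyone following along, here is a rough sketch of what that could look like: fold constants with Polygraphy's `surgeon sanitize --fold-constants`, then rerun the same trtexec command on the folded file. The input path, engine path and shape flags are copied from the log above; the `_folded` output name and the helper functions are made up for illustration. It assumes Polygraphy and its ONNX dependencies (onnx, onnx-graphsurgeon, onnxruntime) are installed. Whether folding actually removes `/OneHot` from the shape computation depends on the model, so treat this as something to try rather than a confirmed fix.

```python
import shlex
import subprocess

# sam32.onnx and the engine path come from the original post.
# The "_folded" file name is a hypothetical choice for this sketch.
SRC_ONNX = "data/sam32.onnx"
FOLDED_ONNX = "data/sam32_folded.onnx"
ENGINE = "data/mobile_sam_mask_decoder.engine"


def fold_command(src, dst):
    """Polygraphy call that constant-folds `src` and writes the result to `dst`."""
    return ["polygraphy", "surgeon", "sanitize", src, "--fold-constants", "-o", dst]


def trtexec_command(onnx_path, engine_path):
    """The trtexec invocation from the original post, pointed at `onnx_path`."""
    return [
        "/usr/src/tensorrt/bin/trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        "--minShapes=point_coords:1x1x2,point_labels:1x1",
        "--optShapes=point_coords:1x1x2,point_labels:1x1",
        "--maxShapes=point_coords:1x10x2,point_labels:1x10",
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: this only prints the two commands.
    # Pass dry_run=False to actually fold the model and build the engine.
    run_all([
        fold_command(SRC_ONNX, FOLDED_ONNX),
        trtexec_command(FOLDED_ONNX, ENGINE),
    ])
```

The two printed commands can also be pasted into a shell directly. If the folded model still fails at the same `/Tile` node, the parse error from the new run is the useful thing to post back here.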
