Hi @archr.
I appreciate your help.
I have two questions.
- In the example ONNX model, the shape tensor is cast to int64 before it is used. Is it still treated as a float shape tensor?
Can't the cast to int64 avoid the limitation on float shape tensors?
- The network I am actually interested in does not take a shape tensor as a network input.
A simplified structure follows.
... - Conv - Shape - Cast(to float32) - SomeShapeCalculation - Cast(to int64) - Resize
... - BatchedNMS_TRT - Slice - Cast(to int64) - Resize
Does this violate the restriction?
I have an example ONNX file for the BatchedNMS_TRT case, so I have attached it.
The trtexec --verbose log follows.
&&&& RUNNING TensorRT.trtexec [TensorRT v8001] # ./trtexec --onnx=/home/nvidia/nms_followed_by_resize_cause_internal_error.onnx --workspace=64 --saveEngine=/home/nvidia/engine.plan --buildOnly --verbose
[01/11/2022-12:03:58] [I] === Model Options ===
[01/11/2022-12:03:58] [I] Format: ONNX
[01/11/2022-12:03:58] [I] Model: /home/nvidia/nms_followed_by_resize_cause_internal_error.onnx
[01/11/2022-12:03:58] [I] Output:
[01/11/2022-12:03:58] [I] === Build Options ===
[01/11/2022-12:03:58] [I] Max batch: explicit
[01/11/2022-12:03:58] [I] Workspace: 64 MiB
[01/11/2022-12:03:58] [I] minTiming: 1
[01/11/2022-12:03:58] [I] avgTiming: 8
[01/11/2022-12:03:58] [I] Precision: FP32
[01/11/2022-12:03:58] [I] Calibration:
[01/11/2022-12:03:58] [I] Refit: Disabled
[01/11/2022-12:03:58] [I] Sparsity: Disabled
[01/11/2022-12:03:58] [I] Safe mode: Disabled
[01/11/2022-12:03:58] [I] Restricted mode: Disabled
[01/11/2022-12:03:58] [I] Save engine: /home/nvidia/engine.plan
[01/11/2022-12:03:58] [I] Load engine:
[01/11/2022-12:03:58] [I] NVTX verbosity: 0
[01/11/2022-12:03:58] [I] Tactic sources: Using default tactic sources
[01/11/2022-12:03:58] [I] timingCacheMode: local
[01/11/2022-12:03:58] [I] timingCacheFile:
[01/11/2022-12:03:58] [I] Input(s)s format: fp32:CHW
[01/11/2022-12:03:58] [I] Output(s)s format: fp32:CHW
[01/11/2022-12:03:58] [I] Input build shapes: model
[01/11/2022-12:03:58] [I] Input calibration shapes: model
[01/11/2022-12:03:58] [I] === System Options ===
[01/11/2022-12:03:58] [I] Device: 0
[01/11/2022-12:03:58] [I] DLACore:
[01/11/2022-12:03:58] [I] Plugins:
[01/11/2022-12:03:58] [I] === Inference Options ===
[01/11/2022-12:03:58] [I] Batch: Explicit
[01/11/2022-12:03:58] [I] Input inference shapes: model
[01/11/2022-12:03:58] [I] Iterations: 10
[01/11/2022-12:03:58] [I] Duration: 3s (+ 200ms warm up)
[01/11/2022-12:03:58] [I] Sleep time: 0ms
[01/11/2022-12:03:58] [I] Streams: 1
[01/11/2022-12:03:58] [I] ExposeDMA: Disabled
[01/11/2022-12:03:58] [I] Data transfers: Enabled
[01/11/2022-12:03:58] [I] Spin-wait: Disabled
[01/11/2022-12:03:58] [I] Multithreading: Disabled
[01/11/2022-12:03:58] [I] CUDA Graph: Disabled
[01/11/2022-12:03:58] [I] Separate profiling: Disabled
[01/11/2022-12:03:58] [I] Time Deserialize: Disabled
[01/11/2022-12:03:58] [I] Time Refit: Disabled
[01/11/2022-12:03:58] [I] Skip inference: Enabled
[01/11/2022-12:03:58] [I] Inputs:
[01/11/2022-12:03:58] [I] === Reporting Options ===
[01/11/2022-12:03:58] [I] Verbose: Enabled
[01/11/2022-12:03:58] [I] Averages: 10 inferences
[01/11/2022-12:03:58] [I] Percentile: 99
[01/11/2022-12:03:58] [I] Dump refittable layers:Disabled
[01/11/2022-12:03:58] [I] Dump output: Disabled
[01/11/2022-12:03:58] [I] Profile: Disabled
[01/11/2022-12:03:58] [I] Export timing to JSON file:
[01/11/2022-12:03:58] [I] Export output to JSON file:
[01/11/2022-12:03:58] [I] Export profile to JSON file:
[01/11/2022-12:03:58] [I]
[01/11/2022-12:03:58] [I] === Device Information ===
[01/11/2022-12:03:58] [I] Selected Device: Xavier
[01/11/2022-12:03:58] [I] Compute Capability: 7.2
[01/11/2022-12:03:58] [I] SMs: 8
[01/11/2022-12:03:58] [I] Compute Clock Rate: 1.377 GHz
[01/11/2022-12:03:58] [I] Device Global Memory: 15824 MiB
[01/11/2022-12:03:58] [I] Shared Memory per SM: 96 KiB
[01/11/2022-12:03:58] [I] Memory Bus Width: 256 bits (ECC disabled)
[01/11/2022-12:03:58] [I] Memory Clock Rate: 1.377 GHz
[01/11/2022-12:03:58] [I]
[01/11/2022-12:03:58] [I] TensorRT version: 8001
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Region_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::ScatterND version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::CropAndResize version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Proposal version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Split version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[01/11/2022-12:04:00] [I] [TRT] [MemUsageChange] Init CUDA: CPU +354, GPU +0, now: CPU 372, GPU 2249 (MiB)
[01/11/2022-12:04:00] [I] Start parsing network model
[01/11/2022-12:04:00] [I] [TRT] ----------------------------------------------------------------
[01/11/2022-12:04:00] [I] [TRT] Input filename: /home/nvidia/nms_followed_by_resize_cause_internal_error.onnx
[01/11/2022-12:04:00] [I] [TRT] ONNX IR version: 0.0.7
[01/11/2022-12:04:00] [I] [TRT] Opset version: 11
[01/11/2022-12:04:00] [I] [TRT] Producer name:
[01/11/2022-12:04:00] [I] [TRT] Producer version:
[01/11/2022-12:04:00] [I] [TRT] Domain:
[01/11/2022-12:04:00] [I] [TRT] Model version: 0
[01/11/2022-12:04:00] [I] [TRT] Doc string:
[01/11/2022-12:04:00] [I] [TRT] ----------------------------------------------------------------
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::GridAnchor_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::GridAnchorRect_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::NMS_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Reorg_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Region_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Clip_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::LReLU_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::PriorBox_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Normalize_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::ScatterND version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::RPROI_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::BatchedNMS_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::BatchedNMSDynamic_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::FlattenConcat_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::CropAndResize version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::DetectionLayer_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::EfficientNMS_ONNX_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::EfficientNMS_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Proposal version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::ProposalLayer_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::PyramidROIAlign_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::ResizeNearest_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Split version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::SpecialSlice_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Adding network input: image with dtype: float32, dimensions: (1, 3, 224, 224)
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: image for ONNX tensor: image
[01/11/2022-12:04:00] [V] [TRT] Adding network input: boxes with dtype: float32, dimensions: (16, 100, 1, 4)
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: boxes for ONNX tensor: boxes
[01/11/2022-12:04:00] [V] [TRT] Adding network input: scores with dtype: float32, dimensions: (16, 100, 20)
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: scores for ONNX tensor: scores
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: slice1.starts
[01/11/2022-12:04:00] [W] [TRT] onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: slice1.ends
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: slice1.axes
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: resize_3.roi
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: resize_3.scales
[01/11/2022-12:04:00] [V] [TRT] Parsing node: node_of_num_detections [BatchedNMS_TRT]
[01/11/2022-12:04:00] [V] [TRT] Searching for input: boxes
[01/11/2022-12:04:00] [V] [TRT] Searching for input: scores
[01/11/2022-12:04:00] [V] [TRT] node_of_num_detections [BatchedNMS_TRT] inputs: [boxes -> (16, 100, 1, 4)[FLOAT]], [scores -> (16, 100, 20)[FLOAT]],
[01/11/2022-12:04:00] [I] [TRT] No importer registered for op: BatchedNMS_TRT. Attempting to import as plugin.
[01/11/2022-12:04:00] [I] [TRT] Searching for plugin: BatchedNMS_TRT, plugin_version: 1, plugin_namespace:
[01/11/2022-12:04:00] [W] [TRT] builtin_op_importers.cpp:4552: Attribute scoreBits not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[01/11/2022-12:04:00] [I] [TRT] Successfully created plugin: BatchedNMS_TRT
[01/11/2022-12:04:00] [V] [TRT] Registering layer: node_of_num_detections for ONNX node: node_of_num_detections
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: num_detections for ONNX tensor: num_detections
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: nmsed_boxes for ONNX tensor: nmsed_boxes
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: nmsed_scores for ONNX tensor: nmsed_scores
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: nmsed_classes for ONNX tensor: nmsed_classes
[01/11/2022-12:04:00] [V] [TRT] node_of_num_detections [BatchedNMS_TRT] outputs: [num_detections -> (16)[INT32]], [nmsed_boxes -> (16, 50, 4)[FLOAT]], [nmsed_scores -> (16, 50)[FLOAT]], [nmsed_classes -> (16, 50)[FLOAT]],
[01/11/2022-12:04:00] [V] [TRT] Parsing node: node_of_slice_1 [Slice]
[01/11/2022-12:04:00] [V] [TRT] Searching for input: nmsed_boxes
[01/11/2022-12:04:00] [V] [TRT] Searching for input: slice1.starts
[01/11/2022-12:04:00] [V] [TRT] Searching for input: slice1.ends
[01/11/2022-12:04:00] [V] [TRT] Searching for input: slice1.axes
[01/11/2022-12:04:00] [V] [TRT] node_of_slice_1 [Slice] inputs: [nmsed_boxes -> (16, 50, 4)[FLOAT]], [slice1.starts -> (2)[INT32]], [slice1.ends -> (2)[INT32]], [slice1.axes -> (2)[INT32]],
[01/11/2022-12:04:00] [V] [TRT] Registering layer: node_of_slice_1 for ONNX node: node_of_slice_1
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: slice_1 for ONNX tensor: slice_1
[01/11/2022-12:04:00] [V] [TRT] node_of_slice_1 [Slice] outputs: [slice_1 -> (1, 1, 4)[FLOAT]],
[01/11/2022-12:04:00] [V] [TRT] Parsing node: node_of_cast_2 [Cast]
[01/11/2022-12:04:00] [V] [TRT] Searching for input: slice_1
[01/11/2022-12:04:00] [V] [TRT] node_of_cast_2 [Cast] inputs: [slice_1 -> (1, 1, 4)[FLOAT]],
[01/11/2022-12:04:00] [V] [TRT] Casting to type: int32
[01/11/2022-12:04:00] [V] [TRT] Registering layer: node_of_cast_2 for ONNX node: node_of_cast_2
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: cast_2 for ONNX tensor: cast_2
[01/11/2022-12:04:00] [V] [TRT] node_of_cast_2 [Cast] outputs: [cast_2 -> (1, 1, 4)[INT32]],
[01/11/2022-12:04:00] [V] [TRT] Parsing node: node_of_scaled_image [Resize]
[01/11/2022-12:04:00] [V] [TRT] Searching for input: image
[01/11/2022-12:04:00] [V] [TRT] Searching for input: resize_3.roi
[01/11/2022-12:04:00] [V] [TRT] Searching for input: resize_3.scales
[01/11/2022-12:04:00] [V] [TRT] Searching for input: cast_2
[01/11/2022-12:04:00] [V] [TRT] node_of_scaled_image [Resize] inputs: [image -> (1, 3, 224, 224)[FLOAT]], [resize_3.roi -> ()[FLOAT]], [resize_3.scales -> ()[FLOAT]], [cast_2 -> (1, 1, 4)[INT32]],
[01/11/2022-12:04:00] [V] [TRT] Registering layer: node_of_scaled_image for ONNX node: node_of_scaled_image
[01/11/2022-12:04:00] [E] Error[9]: [graph.cpp::computeInputExecutionUses::519] Error Code 9: Internal Error (node_of_num_detections: IPluginV2Layer cannot be used to compute a shape tensor)
[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:720: While parsing node number 3 [Resize -> "scaled_image"]:
[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:722: input: "image"
input: "resize_3.roi"
input: "resize_3.scales"
input: "cast_2"
output: "scaled_image"
op_type: "Resize"
[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:723: --- End node ---
[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:726: ERROR: ModelImporter.cpp:179 In function parseGraph:
[6] Invalid Node - node_of_scaled_image
[graph.cpp::computeInputExecutionUses::519] Error Code 9: Internal Error (node_of_num_detections: IPluginV2Layer cannot be used to compute a shape tensor)
[01/11/2022-12:04:00] [E] Failed to parse onnx file
[01/11/2022-12:04:00] [I] Finish parsing network model
[01/11/2022-12:04:00] [E] Parsing model failed
[01/11/2022-12:04:00] [E] Engine creation failed
[01/11/2022-12:04:00] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8001] # ./trtexec --onnx=/home/nvidia/nms_followed_by_resize_cause_internal_error.onnx --workspace=64 --saveEngine=/home/nvidia/engine.plan --buildOnly --verbose
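To make the dependency in the log above easier to see, here is a rough Python sketch that traces which nodes feed the fourth input of the Resize node (the `sizes` input in opset 11). The `NODES` table is hand-copied from the parser messages in the log, and `producers_of` and `PLUGIN_OPS` are hypothetical names made up for this illustration; it does not use the ONNX or TensorRT APIs and only mirrors what the log already reports.

```python
# Hand-written model of the four nodes reported by the parser log.
# node name -> (op type, input tensor names, output tensor names)
NODES = {
    "node_of_num_detections": (
        "BatchedNMS_TRT",
        ["boxes", "scores"],
        ["num_detections", "nmsed_boxes", "nmsed_scores", "nmsed_classes"],
    ),
    "node_of_slice_1": (
        "Slice",
        ["nmsed_boxes", "slice1.starts", "slice1.ends", "slice1.axes"],
        ["slice_1"],
    ),
    "node_of_cast_2": ("Cast", ["slice_1"], ["cast_2"]),
    "node_of_scaled_image": (
        "Resize",
        ["image", "resize_3.roi", "resize_3.scales", "cast_2"],
        ["scaled_image"],
    ),
}

# Assumption: only this op was imported as a plugin, as the log states.
PLUGIN_OPS = {"BatchedNMS_TRT"}


def producers_of(tensor, nodes=NODES):
    """Return (node name, op type) pairs upstream of `tensor`, in traversal order."""
    # Map every output tensor back to the node that produces it.
    producer = {out: name for name, (_, _, outs) in nodes.items() for out in outs}
    chain, stack, seen = [], [tensor], set()
    while stack:
        name = producer.get(stack.pop())
        if name is None or name in seen:
            continue  # network input, initializer, or already visited node
        seen.add(name)
        op, inputs, _ = nodes[name]
        chain.append((name, op))
        stack.extend(inputs)
    return chain


# The fourth Resize input (index 3) is the one used as a shape tensor here.
sizes_tensor = NODES["node_of_scaled_image"][1][3]
chain = producers_of(sizes_tensor)
plugins_upstream = [name for name, op in chain if op in PLUGIN_OPS]

print("sizes input:", sizes_tensor)
print("upstream nodes:", chain)
print("plugin nodes feeding the shape tensor:", plugins_upstream)
```

In this model the only plugin upstream of `cast_2` is `node_of_num_detections`, which is the node named in the "IPluginV2Layer cannot be used to compute a shape tensor" message.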
Thanks.
nms_followed_by_resize_cause_internal_error.onnx (833 Bytes)