Floor - Cast - Resize (or Slice) causes internal error

maminus · December 9, 2021, 10:37am

Hi, NVIDIA guys.

I’m facing an internal error when using trtexec to convert an ONNX model to an engine.
I have provided a minimal test case that reproduces the error. Would you please fix this issue?

1) Provide details on the platforms you are using:

Linux distro and version: JetPack 4.6
GPU type: Jetson AGX Xavier Developer Kit
Nvidia driver version: N/A
CUDA version: 10.2
CUDNN version: 8.2.1
Python version: N/A
Tensorflow version: N/A
TensorRT version: 8.0.1
Operating System + Version: L4T 32.6.1
PyTorch Version: N/A
Baremetal or Container (if container which image + tag): Baremetal

$ cat /etc/nv_tegra_release
# R32 (release), REVISION: 6.1, GCID: 27863751, BOARD: t186ref, EABI: aarch64, DATE: Mon Jul 26 19:36:31 UTC 2021

2) Describe the issue

If an ONNX model contains the following sequence of nodes, TensorRT raises an internal error:

- a node with a float-type output
- a Cast to int64
- a “Resize” or “Slice” node that takes the cast tensor as its second or a subsequent input

For example, in Floor - Cast(int64) - Resize, the Resize node takes the cast result as its “sizes” input.

The trtexec command outputs the following message:

[graph.cpp::computeInputExecutionUses::519] Error Code 9: Internal Error (node_of_floor_0: IUnaryLayer cannot be used to compute a shape tensor)

It seems that TensorRT inspects the “sizes” tensor as a shape tensor, but cannot evaluate the Floor node that produces it.

I tried to narrow down the conditions under which this issue occurs,
and got the following results.

ONNX nodes	Result
float - Floor - Cast - Resize	causes internal error
float - Floor - Cast - Slice	causes internal error
float - Relu - Cast - Slice	causes internal error
int - Mul - Slice	OK (no error)
float - Floor - Cast - Mul	OK (no error)

Without the Cast, conversion is OK. Without Resize or Slice, conversion is also OK.

So the Cast node is involved in this issue, and only Resize or Slice nodes trigger it.

3) Provide supporting code or data files

I have attached ONNX files that reproduce this error, along with the logs from the failing runs.

The attached zip contains the following:

floor_cast_resize_or_slice.zip
  +-- floor_followed_by_resize_cause_internal_error.onnx : ONNX file at "Floor - Cast - Resize" case
  +-- floor_resize.log                                   : console log when trtexec run in "Floor - Cast - Resize" case
  +-- floor_followed_by_slice_cause_internal_error.onnx  : ONNX file at "Floor - Cast - Slice" case
  +-- floor_slice.log                                    : console log when trtexec run in "Floor - Cast - Slice" case

The ONNX files pass the ONNX model checker (onnx.checker.check_model(model) reports no error) and run inference correctly with onnxruntime.

4) Reproducibility

Steps to reproduce:

- unzip the attached zip file
- run trtexec

# ./trtexec --onnx=/home/nvidia/floor_followed_by_resize_cause_internal_error.onnx --workspace=64 --saveEngine=/home/nvidia/engine.plan --buildOnly --verbose

Please see the attached logs for the full error output.
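
When a verbose log is as long as the attached ones, a small filter can pull out just the internal-error lines. The following is a minimal sketch, assuming the message layout seen in this thread (Error Code N: kind (layer: message)). `ERROR_RE` and `extract_errors` are hypothetical helper names, not part of trtexec or TensorRT.

```python
import re

# Matches the error format seen in this thread's logs, e.g.
# "Error Code 9: Internal Error (node_of_floor_0: IUnaryLayer cannot ...)".
# Other TensorRT messages may be shaped differently; this is an assumption.
ERROR_RE = re.compile(r"Error Code (\d+): ([^(]+?) \(([^:()]+): (.+)\)\s*$")


def extract_errors(log_text):
    """Return unique (code, kind, layer, message) tuples in first-seen order."""
    found = []
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            item = (int(match.group(1)), match.group(2), match.group(3), match.group(4))
            if item not in found:  # the same error is usually logged more than once
                found.append(item)
    return found


sample = (
    "[12/02/2021-22:09:43] [E] Error[9]: [graph.cpp::computeInputExecutionUses::519] "
    "Error Code 9: Internal Error (node_of_floor_0: IUnaryLayer cannot be used to "
    "compute a shape tensor)\n"
    "[12/02/2021-22:09:43] [E] Failed to parse onnx file\n"
)
print(extract_errors(sample))
```

Reading a saved log would then look like `extract_errors(open("floor_resize.log").read())`.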

Thanks.

floor_cast_resize_or_slice.zip (5.2 KB)

NVES · December 9, 2021, 5:11pm

Hi,
Please share the ONNX model and the script, if you have not already, so that we can assist you better.
In the meantime, you can try a few things:

1) validating your model with the below snippet

check_model.py

import sys
import onnx

# Path to the ONNX model, passed as the first command-line argument.
filename = sys.argv[1]
model = onnx.load(filename)
# Raises an exception if the model is invalid; returns None otherwise.
onnx.checker.check_model(model)

2) Try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
If you are still facing the issue, please share the trtexec “--verbose” log for further debugging.
Thanks!

maminus · December 10, 2021, 1:33pm

Hi @NVES.

Thanks for the reply.

validating your model with the below snippet

I tried it.
check_model.py is OK (no error).

I’m still facing this issue.

The trtexec “--verbose” log is included in the zip attached above,
but I am pasting the log below again.

&&&& RUNNING TensorRT.trtexec [TensorRT v8001] # ./trtexec --onnx=/home/nvidia/floor_followed_by_resize_cause_internal_error.onnx --workspace=64 --saveEngine=/home/nvidia/det.plan --buildOnly --verbose
[12/02/2021-22:09:41] [I] === Model Options ===
[12/02/2021-22:09:41] [I] Format: ONNX
[12/02/2021-22:09:41] [I] Model: /home/nvidia/floor_followed_by_resize_cause_internal_error.onnx
[12/02/2021-22:09:41] [I] Output:
[12/02/2021-22:09:41] [I] === Build Options ===
[12/02/2021-22:09:41] [I] Max batch: explicit
[12/02/2021-22:09:41] [I] Workspace: 64 MiB
[12/02/2021-22:09:41] [I] minTiming: 1
[12/02/2021-22:09:41] [I] avgTiming: 8
[12/02/2021-22:09:41] [I] Precision: FP32
[12/02/2021-22:09:41] [I] Calibration:
[12/02/2021-22:09:41] [I] Refit: Disabled
[12/02/2021-22:09:41] [I] Sparsity: Disabled
[12/02/2021-22:09:41] [I] Safe mode: Disabled
[12/02/2021-22:09:41] [I] Restricted mode: Disabled
[12/02/2021-22:09:41] [I] Save engine: /home/nvidia/det.plan
[12/02/2021-22:09:41] [I] Load engine:
[12/02/2021-22:09:41] [I] NVTX verbosity: 0
[12/02/2021-22:09:41] [I] Tactic sources: Using default tactic sources
[12/02/2021-22:09:41] [I] timingCacheMode: local
[12/02/2021-22:09:41] [I] timingCacheFile:
[12/02/2021-22:09:41] [I] Input(s)s format: fp32:CHW
[12/02/2021-22:09:41] [I] Output(s)s format: fp32:CHW
[12/02/2021-22:09:41] [I] Input build shapes: model
[12/02/2021-22:09:41] [I] Input calibration shapes: model
[12/02/2021-22:09:41] [I] === System Options ===
[12/02/2021-22:09:41] [I] Device: 0
[12/02/2021-22:09:41] [I] DLACore:
[12/02/2021-22:09:41] [I] Plugins:
[12/02/2021-22:09:41] [I] === Inference Options ===
[12/02/2021-22:09:41] [I] Batch: Explicit
[12/02/2021-22:09:41] [I] Input inference shapes: model
[12/02/2021-22:09:41] [I] Iterations: 10
[12/02/2021-22:09:41] [I] Duration: 3s (+ 200ms warm up)
[12/02/2021-22:09:41] [I] Sleep time: 0ms
[12/02/2021-22:09:41] [I] Streams: 1
[12/02/2021-22:09:41] [I] ExposeDMA: Disabled
[12/02/2021-22:09:41] [I] Data transfers: Enabled
[12/02/2021-22:09:41] [I] Spin-wait: Disabled
[12/02/2021-22:09:41] [I] Multithreading: Disabled
[12/02/2021-22:09:41] [I] CUDA Graph: Disabled
[12/02/2021-22:09:41] [I] Separate profiling: Disabled
[12/02/2021-22:09:41] [I] Time Deserialize: Disabled
[12/02/2021-22:09:41] [I] Time Refit: Disabled
[12/02/2021-22:09:41] [I] Skip inference: Enabled
[12/02/2021-22:09:41] [I] Inputs:
[12/02/2021-22:09:41] [I] === Reporting Options ===
[12/02/2021-22:09:41] [I] Verbose: Enabled
[12/02/2021-22:09:41] [I] Averages: 10 inferences
[12/02/2021-22:09:41] [I] Percentile: 99
[12/02/2021-22:09:41] [I] Dump refittable layers:Disabled
[12/02/2021-22:09:41] [I] Dump output: Disabled
[12/02/2021-22:09:41] [I] Profile: Disabled
[12/02/2021-22:09:41] [I] Export timing to JSON file:
[12/02/2021-22:09:41] [I] Export output to JSON file:
[12/02/2021-22:09:41] [I] Export profile to JSON file:
[12/02/2021-22:09:41] [I]
[12/02/2021-22:09:42] [I] === Device Information ===
[12/02/2021-22:09:42] [I] Selected Device: Xavier
[12/02/2021-22:09:42] [I] Compute Capability: 7.2
[12/02/2021-22:09:42] [I] SMs: 8
[12/02/2021-22:09:42] [I] Compute Clock Rate: 1.377 GHz
[12/02/2021-22:09:42] [I] Device Global Memory: 15824 MiB
[12/02/2021-22:09:42] [I] Shared Memory per SM: 96 KiB
[12/02/2021-22:09:42] [I] Memory Bus Width: 256 bits (ECC disabled)
[12/02/2021-22:09:42] [I] Memory Clock Rate: 1.377 GHz
[12/02/2021-22:09:42] [I]
[12/02/2021-22:09:42] [I] TensorRT version: 8001
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::Region_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::ScatterND version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::CropAndResize version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::Proposal version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::Split version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[12/02/2021-22:09:42] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[12/02/2021-22:09:43] [I] [TRT] [MemUsageChange] Init CUDA: CPU +353, GPU +0, now: CPU 371, GPU 2940 (MiB)
[12/02/2021-22:09:43] [I] Start parsing network model
[12/02/2021-22:09:43] [I] [TRT] ----------------------------------------------------------------
[12/02/2021-22:09:43] [I] [TRT] Input filename:   /home/nvidia/floor_followed_by_resize_cause_internal_error.onnx
[12/02/2021-22:09:43] [I] [TRT] ONNX IR version:  0.0.7
[12/02/2021-22:09:43] [I] [TRT] Opset version:    11
[12/02/2021-22:09:43] [I] [TRT] Producer name:
[12/02/2021-22:09:43] [I] [TRT] Producer version:
[12/02/2021-22:09:43] [I] [TRT] Domain:
[12/02/2021-22:09:43] [I] [TRT] Model version:    0
[12/02/2021-22:09:43] [I] [TRT] Doc string:
[12/02/2021-22:09:43] [I] [TRT] ----------------------------------------------------------------
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::GridAnchor_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::GridAnchorRect_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::NMS_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::Reorg_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::Region_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::Clip_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::LReLU_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::PriorBox_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::Normalize_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::ScatterND version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::RPROI_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::BatchedNMS_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::BatchedNMSDynamic_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::FlattenConcat_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::CropAndResize version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::DetectionLayer_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::EfficientNMS_ONNX_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::EfficientNMS_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::Proposal version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::ProposalLayer_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::PyramidROIAlign_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::ResizeNearest_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::Split version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::SpecialSlice_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 1
[12/02/2021-22:09:43] [V] [TRT] Adding network input: image with dtype: float32, dimensions: (1, 3, 224, 224)
[12/02/2021-22:09:43] [V] [TRT] Registering tensor: image for ONNX tensor: image
[12/02/2021-22:09:43] [V] [TRT] Adding network input: output_size with dtype: float32, dimensions: (4)
[12/02/2021-22:09:43] [V] [TRT] Registering tensor: output_size for ONNX tensor: output_size
[12/02/2021-22:09:43] [V] [TRT] Importing initializer: resize_2.roi
[12/02/2021-22:09:43] [V] [TRT] Importing initializer: resize_2.scales
[12/02/2021-22:09:43] [V] [TRT] Parsing node: node_of_floor_0 [Floor]
[12/02/2021-22:09:43] [V] [TRT] Searching for input: output_size
[12/02/2021-22:09:43] [V] [TRT] node_of_floor_0 [Floor] inputs: [output_size -> (4)[FLOAT]],
[12/02/2021-22:09:43] [V] [TRT] Registering layer: node_of_floor_0 for ONNX node: node_of_floor_0
[12/02/2021-22:09:43] [V] [TRT] Registering tensor: floor_0 for ONNX tensor: floor_0
[12/02/2021-22:09:43] [V] [TRT] node_of_floor_0 [Floor] outputs: [floor_0 -> (4)[FLOAT]],
[12/02/2021-22:09:43] [V] [TRT] Parsing node: node_of_cast_1 [Cast]
[12/02/2021-22:09:43] [V] [TRT] Searching for input: floor_0
[12/02/2021-22:09:43] [V] [TRT] node_of_cast_1 [Cast] inputs: [floor_0 -> (4)[FLOAT]],
[12/02/2021-22:09:43] [V] [TRT] Casting to type: int32
[12/02/2021-22:09:43] [V] [TRT] Registering layer: node_of_cast_1 for ONNX node: node_of_cast_1
[12/02/2021-22:09:43] [V] [TRT] Registering tensor: cast_1 for ONNX tensor: cast_1
[12/02/2021-22:09:43] [V] [TRT] node_of_cast_1 [Cast] outputs: [cast_1 -> (4)[INT32]],
[12/02/2021-22:09:43] [V] [TRT] Parsing node: node_of_scaled_image [Resize]
[12/02/2021-22:09:43] [V] [TRT] Searching for input: image
[12/02/2021-22:09:43] [V] [TRT] Searching for input: resize_2.roi
[12/02/2021-22:09:43] [V] [TRT] Searching for input: resize_2.scales
[12/02/2021-22:09:43] [V] [TRT] Searching for input: cast_1
[12/02/2021-22:09:43] [V] [TRT] node_of_scaled_image [Resize] inputs: [image -> (1, 3, 224, 224)[FLOAT]], [resize_2.roi -> ()[FLOAT]], [resize_2.scales -> ()[FLOAT]], [cast_1 -> (4)[INT32]],
[12/02/2021-22:09:43] [V] [TRT] Registering layer: node_of_scaled_image for ONNX node: node_of_scaled_image
[12/02/2021-22:09:43] [E] Error[9]: [graph.cpp::computeInputExecutionUses::519] Error Code 9: Internal Error (node_of_floor_0: IUnaryLayer cannot be used to compute a shape tensor)
[12/02/2021-22:09:43] [E] [TRT] ModelImporter.cpp:720: While parsing node number 2 [Resize -> "scaled_image"]:
[12/02/2021-22:09:43] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[12/02/2021-22:09:43] [E] [TRT] ModelImporter.cpp:722: input: "image"
input: "resize_2.roi"
input: "resize_2.scales"
input: "cast_1"
output: "scaled_image"
op_type: "Resize"

[12/02/2021-22:09:43] [E] [TRT] ModelImporter.cpp:723: --- End node ---
[12/02/2021-22:09:43] [E] [TRT] ModelImporter.cpp:726: ERROR: ModelImporter.cpp:179 In function parseGraph:
[6] Invalid Node - node_of_scaled_image
[graph.cpp::computeInputExecutionUses::519] Error Code 9: Internal Error (node_of_floor_0: IUnaryLayer cannot be used to compute a shape tensor)
[12/02/2021-22:09:43] [E] Failed to parse onnx file
[12/02/2021-22:09:43] [I] Finish parsing network model
[12/02/2021-22:09:43] [E] Parsing model failed
[12/02/2021-22:09:43] [E] Engine creation failed
[12/02/2021-22:09:43] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8001] # ./trtexec --onnx=/home/nvidia/floor_followed_by_resize_cause_internal_error.onnx --workspace=64 --saveEngine=/home/nvidia/det.plan --buildOnly --verbose

Thanks.

spolisetty · January 10, 2022, 5:21am

Hi @maminus,

Our team is looking into this issue. Let us get back to you soon.

Thank you.

archr · January 10, 2022, 6:03pm

Thanks for the concise reproducers. There are two problems:

1) The examples use floating-point shape tensors as network inputs, and shape-tensor I/O is limited to Int32. This limitation is buried in the C++ documentation for ITensor::isShapeTensor:

//! If a tensor is a shape tensor and becomes an engine input or output,
//! then ICudaEngine::isShapeBinding will be true for that tensor.
//! Such a shape tensor must have type Int32.

Shape tensors are tensors whose values are used to compute the dimensions of tensors. https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#exe_shape_tensors has the formal rules on what is considered a shape tensor (for 8.4, they can be float too, as long as they are not I/O tensors).

2) TensorRT did not diagnose violation of the restriction, and instead plowed ahead until the assertion failure.

It’s too late to relax (1) in TensorRT 8.4. (2) we’ll fix. Since floating-point shape-tensor I/O won’t be available, I was wondering if you have a way to avoid it in the networks of real interest.
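
To make the Int32 restriction on shape-tensor I/O concrete, here is a toy sketch in Python. It is not TensorRT's actual analysis: it only walks backward from the shape-carrying inputs of Resize and Slice (every input after the first, as described in the report) and lists the network inputs on that path that are not Int32. The function name, the tuple-based graph encoding, and the dtype strings are all assumptions made for illustration.

```python
# Toy model of the restriction described above; NOT TensorRT's real analysis.
# Each node is (op_type, input_names, output_names).

# Per the report in this thread: for Resize and Slice, every input after the
# first one is used to compute dimensions.
SHAPE_CONSUMERS = {"Resize", "Slice"}


def float_shape_inputs(nodes, network_inputs):
    """Return network inputs that feed a shape computation but are not int32.

    nodes: list of (op_type, inputs, outputs) tuples.
    network_inputs: dict mapping input name -> dtype string, e.g. "float32".
    """
    producer = {out: node for node in nodes for out in node[2]}

    # Seed the search with every tensor wired into a shape slot.
    pending = [name for op, ins, _ in nodes if op in SHAPE_CONSUMERS for name in ins[1:]]
    seen = set()
    offenders = set()
    while pending:
        name = pending.pop()
        if name in seen:
            continue
        seen.add(name)
        if name in network_inputs:
            if network_inputs[name] != "int32":
                offenders.add(name)
        elif name in producer:
            # Whatever feeds a shape tensor is itself part of the shape computation.
            pending.extend(producer[name][1])
        # Anything else (e.g. an initializer) is a build-time constant: ignored.
    return sorted(offenders)


floor_cast_resize = [
    ("Floor", ["output_size"], ["floor_0"]),
    ("Cast", ["floor_0"], ["cast_1"]),
    ("Resize", ["image", "resize_2.roi", "resize_2.scales", "cast_1"], ["scaled_image"]),
]
floor_cast_mul = [
    ("Floor", ["x"], ["floor_0"]),
    ("Cast", ["floor_0"], ["cast_1"]),
    ("Mul", ["cast_1", "cast_1"], ["y"]),
]

print(float_shape_inputs(floor_cast_resize, {"image": "float32", "output_size": "float32"}))
print(float_shape_inputs(floor_cast_mul, {"x": "float32"}))
```

On the reproducer's graph this flags output_size, the float network input that ends up feeding “sizes”. The Floor - Cast - Mul chain is not flagged because nothing consumes its result as a shape.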

maminus · January 11, 2022, 3:23am

Hi @archr.

I appreciate your help.

I have two questions.

1) In the example ONNX, the shape tensor is cast to int64 before it is used. Is it still treated as a float shape tensor?

2) Can “cast to int64” not avoid the limitation on float shape tensors?
The network I am really interested in has no shape tensor as a network input.
The following is its simplified structure.
- ... - Conv - Shape - Cast(to float32) - SomeShapeCalculation - Cast(to int64) - Resize
- ... - BatchedNMS_TRT - Slice - Cast(to int64) - Resize
Does it violate the restriction?

I have an example ONNX file for the BatchedNMS_TRT case, so I have attached it.

The trtexec “--verbose” log follows.

&&&& RUNNING TensorRT.trtexec [TensorRT v8001] # ./trtexec --onnx=/home/nvidia/nms_followed_by_resize_cause_internal_error.onnx --workspace=64 --saveEngine=/home/nvidia/engine.plan --buildOnly --verbose
[01/11/2022-12:03:58] [I] === Model Options ===
[01/11/2022-12:03:58] [I] Format: ONNX
[01/11/2022-12:03:58] [I] Model: /home/nvidia/nms_followed_by_resize_cause_internal_error.onnx
[01/11/2022-12:03:58] [I] Output:
[01/11/2022-12:03:58] [I] === Build Options ===
[01/11/2022-12:03:58] [I] Max batch: explicit
[01/11/2022-12:03:58] [I] Workspace: 64 MiB
[01/11/2022-12:03:58] [I] minTiming: 1
[01/11/2022-12:03:58] [I] avgTiming: 8
[01/11/2022-12:03:58] [I] Precision: FP32
[01/11/2022-12:03:58] [I] Calibration:
[01/11/2022-12:03:58] [I] Refit: Disabled
[01/11/2022-12:03:58] [I] Sparsity: Disabled
[01/11/2022-12:03:58] [I] Safe mode: Disabled
[01/11/2022-12:03:58] [I] Restricted mode: Disabled
[01/11/2022-12:03:58] [I] Save engine: /home/nvidia/engine.plan
[01/11/2022-12:03:58] [I] Load engine:
[01/11/2022-12:03:58] [I] NVTX verbosity: 0
[01/11/2022-12:03:58] [I] Tactic sources: Using default tactic sources
[01/11/2022-12:03:58] [I] timingCacheMode: local
[01/11/2022-12:03:58] [I] timingCacheFile:
[01/11/2022-12:03:58] [I] Input(s)s format: fp32:CHW
[01/11/2022-12:03:58] [I] Output(s)s format: fp32:CHW
[01/11/2022-12:03:58] [I] Input build shapes: model
[01/11/2022-12:03:58] [I] Input calibration shapes: model
[01/11/2022-12:03:58] [I] === System Options ===
[01/11/2022-12:03:58] [I] Device: 0
[01/11/2022-12:03:58] [I] DLACore:
[01/11/2022-12:03:58] [I] Plugins:
[01/11/2022-12:03:58] [I] === Inference Options ===
[01/11/2022-12:03:58] [I] Batch: Explicit
[01/11/2022-12:03:58] [I] Input inference shapes: model
[01/11/2022-12:03:58] [I] Iterations: 10
[01/11/2022-12:03:58] [I] Duration: 3s (+ 200ms warm up)
[01/11/2022-12:03:58] [I] Sleep time: 0ms
[01/11/2022-12:03:58] [I] Streams: 1
[01/11/2022-12:03:58] [I] ExposeDMA: Disabled
[01/11/2022-12:03:58] [I] Data transfers: Enabled
[01/11/2022-12:03:58] [I] Spin-wait: Disabled
[01/11/2022-12:03:58] [I] Multithreading: Disabled
[01/11/2022-12:03:58] [I] CUDA Graph: Disabled
[01/11/2022-12:03:58] [I] Separate profiling: Disabled
[01/11/2022-12:03:58] [I] Time Deserialize: Disabled
[01/11/2022-12:03:58] [I] Time Refit: Disabled
[01/11/2022-12:03:58] [I] Skip inference: Enabled
[01/11/2022-12:03:58] [I] Inputs:
[01/11/2022-12:03:58] [I] === Reporting Options ===
[01/11/2022-12:03:58] [I] Verbose: Enabled
[01/11/2022-12:03:58] [I] Averages: 10 inferences
[01/11/2022-12:03:58] [I] Percentile: 99
[01/11/2022-12:03:58] [I] Dump refittable layers:Disabled
[01/11/2022-12:03:58] [I] Dump output: Disabled
[01/11/2022-12:03:58] [I] Profile: Disabled
[01/11/2022-12:03:58] [I] Export timing to JSON file:
[01/11/2022-12:03:58] [I] Export output to JSON file:
[01/11/2022-12:03:58] [I] Export profile to JSON file:
[01/11/2022-12:03:58] [I]
[01/11/2022-12:03:58] [I] === Device Information ===
[01/11/2022-12:03:58] [I] Selected Device: Xavier
[01/11/2022-12:03:58] [I] Compute Capability: 7.2
[01/11/2022-12:03:58] [I] SMs: 8
[01/11/2022-12:03:58] [I] Compute Clock Rate: 1.377 GHz
[01/11/2022-12:03:58] [I] Device Global Memory: 15824 MiB
[01/11/2022-12:03:58] [I] Shared Memory per SM: 96 KiB
[01/11/2022-12:03:58] [I] Memory Bus Width: 256 bits (ECC disabled)
[01/11/2022-12:03:58] [I] Memory Clock Rate: 1.377 GHz
[01/11/2022-12:03:58] [I]
[01/11/2022-12:03:58] [I] TensorRT version: 8001
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::GridAnchorRect_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Region_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::ScatterND version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::CropAndResize version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::EfficientNMS_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Proposal version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::Split version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[01/11/2022-12:03:58] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
[01/11/2022-12:04:00] [I] [TRT] [MemUsageChange] Init CUDA: CPU +354, GPU +0, now: CPU 372, GPU 2249 (MiB)
[01/11/2022-12:04:00] [I] Start parsing network model
[01/11/2022-12:04:00] [I] [TRT] ----------------------------------------------------------------
[01/11/2022-12:04:00] [I] [TRT] Input filename:   /home/nvidia/nms_followed_by_resize_cause_internal_error.onnx
[01/11/2022-12:04:00] [I] [TRT] ONNX IR version:  0.0.7
[01/11/2022-12:04:00] [I] [TRT] Opset version:    11
[01/11/2022-12:04:00] [I] [TRT] Producer name:
[01/11/2022-12:04:00] [I] [TRT] Producer version:
[01/11/2022-12:04:00] [I] [TRT] Domain:
[01/11/2022-12:04:00] [I] [TRT] Model version:    0
[01/11/2022-12:04:00] [I] [TRT] Doc string:
[01/11/2022-12:04:00] [I] [TRT] ----------------------------------------------------------------
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::GridAnchor_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::GridAnchorRect_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::NMS_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Reorg_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Region_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Clip_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::LReLU_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::PriorBox_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Normalize_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::ScatterND version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::RPROI_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::BatchedNMS_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::BatchedNMSDynamic_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::FlattenConcat_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::CropAndResize version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::DetectionLayer_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::EfficientNMS_ONNX_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::EfficientNMS_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Proposal version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::ProposalLayer_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::PyramidROIAlign_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::ResizeNearest_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::Split version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::SpecialSlice_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 1
[01/11/2022-12:04:00] [V] [TRT] Adding network input: image with dtype: float32, dimensions: (1, 3, 224, 224)
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: image for ONNX tensor: image
[01/11/2022-12:04:00] [V] [TRT] Adding network input: boxes with dtype: float32, dimensions: (16, 100, 1, 4)
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: boxes for ONNX tensor: boxes
[01/11/2022-12:04:00] [V] [TRT] Adding network input: scores with dtype: float32, dimensions: (16, 100, 20)
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: scores for ONNX tensor: scores
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: slice1.starts
[01/11/2022-12:04:00] [W] [TRT] onnx2trt_utils.cpp:364: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: slice1.ends
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: slice1.axes
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: resize_3.roi
[01/11/2022-12:04:00] [V] [TRT] Importing initializer: resize_3.scales
[01/11/2022-12:04:00] [V] [TRT] Parsing node: node_of_num_detections [BatchedNMS_TRT]
[01/11/2022-12:04:00] [V] [TRT] Searching for input: boxes
[01/11/2022-12:04:00] [V] [TRT] Searching for input: scores
[01/11/2022-12:04:00] [V] [TRT] node_of_num_detections [BatchedNMS_TRT] inputs: [boxes -> (16, 100, 1, 4)[FLOAT]], [scores -> (16, 100, 20)[FLOAT]],
[01/11/2022-12:04:00] [I] [TRT] No importer registered for op: BatchedNMS_TRT. Attempting to import as plugin.
[01/11/2022-12:04:00] [I] [TRT] Searching for plugin: BatchedNMS_TRT, plugin_version: 1, plugin_namespace:
[01/11/2022-12:04:00] [W] [TRT] builtin_op_importers.cpp:4552: Attribute scoreBits not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[01/11/2022-12:04:00] [I] [TRT] Successfully created plugin: BatchedNMS_TRT
[01/11/2022-12:04:00] [V] [TRT] Registering layer: node_of_num_detections for ONNX node: node_of_num_detections
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: num_detections for ONNX tensor: num_detections
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: nmsed_boxes for ONNX tensor: nmsed_boxes
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: nmsed_scores for ONNX tensor: nmsed_scores
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: nmsed_classes for ONNX tensor: nmsed_classes
[01/11/2022-12:04:00] [V] [TRT] node_of_num_detections [BatchedNMS_TRT] outputs: [num_detections -> (16)[INT32]], [nmsed_boxes -> (16, 50, 4)[FLOAT]], [nmsed_scores -> (16, 50)[FLOAT]], [nmsed_classes -> (16, 50)[FLOAT]],
[01/11/2022-12:04:00] [V] [TRT] Parsing node: node_of_slice_1 [Slice]
[01/11/2022-12:04:00] [V] [TRT] Searching for input: nmsed_boxes
[01/11/2022-12:04:00] [V] [TRT] Searching for input: slice1.starts
[01/11/2022-12:04:00] [V] [TRT] Searching for input: slice1.ends
[01/11/2022-12:04:00] [V] [TRT] Searching for input: slice1.axes
[01/11/2022-12:04:00] [V] [TRT] node_of_slice_1 [Slice] inputs: [nmsed_boxes -> (16, 50, 4)[FLOAT]], [slice1.starts -> (2)[INT32]], [slice1.ends -> (2)[INT32]], [slice1.axes -> (2)[INT32]],
[01/11/2022-12:04:00] [V] [TRT] Registering layer: node_of_slice_1 for ONNX node: node_of_slice_1
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: slice_1 for ONNX tensor: slice_1
[01/11/2022-12:04:00] [V] [TRT] node_of_slice_1 [Slice] outputs: [slice_1 -> (1, 1, 4)[FLOAT]],
[01/11/2022-12:04:00] [V] [TRT] Parsing node: node_of_cast_2 [Cast]
[01/11/2022-12:04:00] [V] [TRT] Searching for input: slice_1
[01/11/2022-12:04:00] [V] [TRT] node_of_cast_2 [Cast] inputs: [slice_1 -> (1, 1, 4)[FLOAT]],
[01/11/2022-12:04:00] [V] [TRT] Casting to type: int32
[01/11/2022-12:04:00] [V] [TRT] Registering layer: node_of_cast_2 for ONNX node: node_of_cast_2
[01/11/2022-12:04:00] [V] [TRT] Registering tensor: cast_2 for ONNX tensor: cast_2
[01/11/2022-12:04:00] [V] [TRT] node_of_cast_2 [Cast] outputs: [cast_2 -> (1, 1, 4)[INT32]],
[01/11/2022-12:04:00] [V] [TRT] Parsing node: node_of_scaled_image [Resize]
[01/11/2022-12:04:00] [V] [TRT] Searching for input: image
[01/11/2022-12:04:00] [V] [TRT] Searching for input: resize_3.roi
[01/11/2022-12:04:00] [V] [TRT] Searching for input: resize_3.scales
[01/11/2022-12:04:00] [V] [TRT] Searching for input: cast_2
[01/11/2022-12:04:00] [V] [TRT] node_of_scaled_image [Resize] inputs: [image -> (1, 3, 224, 224)[FLOAT]], [resize_3.roi -> ()[FLOAT]], [resize_3.scales -> ()[FLOAT]], [cast_2 -> (1, 1, 4)[INT32]],
[01/11/2022-12:04:00] [V] [TRT] Registering layer: node_of_scaled_image for ONNX node: node_of_scaled_image
[01/11/2022-12:04:00] [E] Error[9]: [graph.cpp::computeInputExecutionUses::519] Error Code 9: Internal Error (node_of_num_detections: IPluginV2Layer cannot be used to compute a shape tensor)
[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:720: While parsing node number 3 [Resize -> "scaled_image"]:
[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:721: --- Begin node ---
[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:722: input: "image"
input: "resize_3.roi"
input: "resize_3.scales"
input: "cast_2"
output: "scaled_image"
op_type: "Resize"

[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:723: --- End node ---
[01/11/2022-12:04:00] [E] [TRT] ModelImporter.cpp:726: ERROR: ModelImporter.cpp:179 In function parseGraph:
[6] Invalid Node - node_of_scaled_image
[graph.cpp::computeInputExecutionUses::519] Error Code 9: Internal Error (node_of_num_detections: IPluginV2Layer cannot be used to compute a shape tensor)
[01/11/2022-12:04:00] [E] Failed to parse onnx file
[01/11/2022-12:04:00] [I] Finish parsing network model
[01/11/2022-12:04:00] [E] Parsing model failed
[01/11/2022-12:04:00] [E] Engine creation failed
[01/11/2022-12:04:00] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8001] # ./trtexec --onnx=/home/nvidia/nms_followed_by_resize_cause_internal_error.onnx --workspace=64 --saveEngine=/home/nvidia/engine.plan --buildOnly --verbose

Thanks.
nms_followed_by_resize_cause_internal_error.onnx (833 Bytes)

archr · January 11, 2022, 5:34pm

Alas there’s no way to wiggle out of the shape tensor restrictions with casts. The shape tensor algebra is restricted so that the dimensions of any tensor can be computed solely from build-time information and the shape tensor inputs. Runtime execution is strictly partitioned into:

1. Compute shapes of GPU-side tensors on the CPU.
2. Compute GPU-side tensor values.

Conv - Shape - Cast(to float32) - SomeShapeCalculation - Cast(to int64) - Resize is supposed to work in TensorRT 8.4, as long as the shape calculation consists only of elementwise/unary arithmetic and shuffle/slice/concat/gather rearrangement.
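To make the rule concrete, here is a minimal Python sketch that classifies a tensor as usable or not usable as a shape tensor. It is only a toy model of the restriction described in this reply, not TensorRT's actual algorithm. The function name, the graph encoding, and the exact set of "safe" ops are assumptions made for illustration.

```python
# Toy model of the shape-tensor restriction. NOT TensorRT's real implementation.
# The op set below is an assumption based on the description in this thread.
SHAPE_SAFE_OPS = {"Cast", "Floor", "Mul", "Add", "Slice", "Concat", "Gather", "Reshape"}


def computable_as_shape_tensor(tensor, graph):
    """Return True if `tensor` depends only on build-time info and shapes.

    `graph` maps each tensor name to (op_type, [input tensor names]).
    """
    op, inputs = graph[tensor]
    if op in ("Constant", "Shape"):
        # Constants are known at build time; Shape yields dimensions, which
        # can be computed on the CPU even if its input lives on the GPU.
        return True
    if op in SHAPE_SAFE_OPS:
        return all(computable_as_shape_tensor(i, graph) for i in inputs)
    # Data inputs, convolutions, plugins, etc. produce values that only
    # exist on the GPU at execution time.
    return False


# Conv - Shape - Cast - (Mul, Floor) - Cast - Resize: rooted at a Shape node.
ok_chain = {
    "x": ("Input", []),
    "conv": ("Conv", ["x"]),
    "shape": ("Shape", ["conv"]),
    "as_float": ("Cast", ["shape"]),
    "half": ("Constant", []),
    "scaled": ("Mul", ["as_float", "half"]),
    "floored": ("Floor", ["scaled"]),
    "sizes": ("Cast", ["floored"]),
}

# The failing chain from the log above: rooted at a plugin output.
plugin_chain = {
    "boxes": ("Input", []),
    "nmsed_boxes": ("BatchedNMS_TRT", ["boxes"]),
    "slice_1": ("Slice", ["nmsed_boxes"]),
    "cast_2": ("Cast", ["slice_1"]),
}

print(computable_as_shape_tensor("sizes", ok_chain))       # True
print(computable_as_shape_tensor("cast_2", plugin_chain))  # False
```

The first chain passes because every value feeding "sizes" traces back to a Shape node or a constant. The second fails because it traces back to data produced by the plugin.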

BatchedNMS_TRT - Slice - Cast(to int64) - Resize is going to be a problem because it depends on a plugin, and plugins currently don’t support shape-tensor inputs/outputs. There’s a hacky workaround using empty tensors: add dummy inputs and outputs to the plugin that have a zero constant dimension (so they take no space) but whose other dimensions are the shape-tensor integers of interest. But the plugin is still going to be restricted to the arithmetic operations in DimensionOperation (declared in include/NvInferRuntime.h). That is, there’s no way to make the calculation depend on values in a non-shape tensor.
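The empty-tensor idea can be pictured with a small Python sketch. It is only an illustration: a real plugin would express this in C++, and the `DIM_OPS` table below is a hypothetical Python stand-in for some of the operation names in DimensionOperation (check NvInferRuntime.h for the authoritative list). The dummy dimensions reuse the 224 x 224 image size from the log above.

```python
from math import prod

# Python stand-ins for integer operations named in DimensionOperation.
# Illustrative only; the real enum is a C++ type in NvInferRuntime.h.
DIM_OPS = {
    "kSUM": lambda a, b: a + b,
    "kPROD": lambda a, b: a * b,
    "kMAX": max,
    "kMIN": min,
    "kSUB": lambda a, b: a - b,
    "kFLOOR_DIV": lambda a, b: a // b,
    "kCEIL_DIV": lambda a, b: -(-a // b),
}


def element_count(dims):
    """Number of elements in a tensor with the given dimensions."""
    return prod(dims)


# Hypothetical dummy plugin input: a constant 0 dimension, followed by the
# integers we want to pass along (here the image height and width).
dummy_in = (0, 224, 224)

# Hypothetical dummy plugin output: dimensions derived from the input
# dimensions using only the integer operations above.
dummy_out = (
    0,
    DIM_OPS["kFLOOR_DIV"](dummy_in[1], 2),
    DIM_OPS["kCEIL_DIV"](dummy_in[2], 3),
)

print(element_count(dummy_in), element_count(dummy_out))  # 0 0
print(dummy_out)                                          # (0, 112, 75)
```

Both tensors hold zero elements, yet their dimensions carry integers. Note that every output dimension is computed from other dimensions, never from tensor values such as the NMS results, which is exactly the limitation stated above.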

maminus · January 12, 2022, 9:59am

Thank you for your explanation.

The questions have been cleared up.

I’ll try to modify my model, and I’m looking forward to the release of TensorRT 8.4.

I marked this thread as resolved.

Thanks.
