Description
I ran into the following error while converting an ONNX model to TensorRT with the tensorrt Python package. I am new to TensorRT, so I don't know how to debug it.
[03/28/2025-16:13:47] [TRT] [V] Running: NMSCastFusion on NonMaxSuppression_1537_716
[03/28/2025-16:13:47] [TRT] [E] [myelinBuilderUtils.cpp::nvinfer1::builder::`anonymous-namespace'::getMyelinSupportTypeNoConstraints::1027] Error Code 2: Internal Error (Assertion !scopedOp failed. DeviceToShapeHostNode should not have been inserted into a scope)
I have also attached the full conversion output (trt_conv.txt). In case it is relevant: the ONNX model contains one loop in the post-processing phase, which iterates over a dynamic length (the batch size) and contains an NMS node. (I had to use torch.jit.script to preserve the dynamic nature of the model; a sketch of the pattern follows below.) How can I fix this error?
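For illustration only, here is a minimal sketch of the pattern I mean (this is not the actual model code; the function name postprocess, the tensor shapes, and the iou_threshold argument are all invented for the example). A Python loop over the batch dimension like this survives torch.jit.script, and on export it should end up in the ONNX graph as a Loop node whose body contains a NonMaxSuppression node:

    from typing import List

    import torch
    import torchvision


    @torch.jit.script
    def postprocess(
        boxes: torch.Tensor, scores: torch.Tensor, iou_threshold: float
    ) -> List[torch.Tensor]:
        # Hypothetical shapes: boxes (batch, num_boxes, 4), scores (batch, num_boxes).
        keep: List[torch.Tensor] = []
        # The loop over the dynamic batch dimension survives scripting, which is
        # why the exported ONNX graph contains a Loop wrapping the NMS node.
        for b in range(boxes.size(0)):
            keep.append(torchvision.ops.nms(boxes[b], scores[b], iou_threshold))
        return keep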
Environment
TensorRT Version: 10.9.0.34
GPU Type: RTX-4080
Nvidia Driver Version: 571.96
CUDA Version: 12.8
CUDNN Version: 9.5.1
Operating System + Version: Windows 11 Pro 10.0.26100
Python Version (if applicable): 3.10.11
TensorFlow Version (if applicable): N/A
PyTorch Version (if applicable): 2.6.0
Baremetal or Container (if container which image + tag): N/A
Relevant Files
Conversion log: trt_conv.txt (2.6 MB)
ONNX model: model.onnx (Google Drive)
Steps To Reproduce
I wrote the following script to perform the ONNX-to-TensorRT conversion:
import tensorrt as trt


def convert_dynamic_onnx_to_tensorrt(
    onnx_file_path, trt_file_path, min_shape, opt_shape, max_shape
):
    TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
    builder = trt.Builder(TRT_LOGGER)
    # Build a strongly typed network (tensor types are taken from the ONNX model)
    network_flags = 1 << int(trt.NetworkDefinitionCreationFlag.STRONGLY_TYPED)
    network = builder.create_network(network_flags)
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # Parse the ONNX model
    with open(onnx_file_path, "rb") as onnx_file:
        if not parser.parse(onnx_file.read()):
            for error in range(parser.num_errors):
                TRT_LOGGER.log(
                    trt.Logger.ERROR, f"ONNX Parser Error: {parser.get_error(error)}"
                )
            raise RuntimeError("Failed to parse ONNX file.")

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB

    # Optimization profile covering the dynamic batch and spatial dimensions
    profile = builder.create_optimization_profile()
    profile.set_shape("images", min=min_shape, opt=opt_shape, max=max_shape)
    profile.set_shape(
        "orig_image_shapes",
        min=(min_shape[0], 2),
        opt=(opt_shape[0], 2),
        max=(max_shape[0], 2),
    )
    config.add_optimization_profile(profile)

    # Build the engine
    TRT_LOGGER.log(trt.Logger.INFO, "Building TensorRT engine with dynamic shapes...")
    engine = builder.build_engine_with_config(network, config)
    if engine is None:
        raise RuntimeError("Failed to build TensorRT engine")

    # Save the engine to file
    with open(trt_file_path, "wb") as trt_file:
        trt_file.write(engine.serialize())
    TRT_LOGGER.log(trt.Logger.INFO, f"TensorRT engine saved to {trt_file_path}")


if __name__ == "__main__":
    convert_dynamic_onnx_to_tensorrt(
        onnx_file_path="model.onnx",
        trt_file_path="model.trt",
        min_shape=(1, 3, 224, 224),
        opt_shape=(8, 3, 512, 512),
        max_shape=(16, 3, 1024, 1024),
    )
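For what it's worth, here is an untested cross-check I could also run (my own assumption, not a known workaround): building the same model as a default, weakly typed network, to see whether the Myelin assertion only fires in strongly typed mode. Apart from the creation flags, this variant uses build_serialized_network, which returns the serialized engine directly:

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
    builder = trt.Builder(TRT_LOGGER)
    # 0 = no creation flags, i.e. a default (weakly typed) network
    network = builder.create_network(0)
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open("model.onnx", "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError("Failed to parse ONNX file.")

    config = builder.create_builder_config()
    # Same dynamic-shape profile as in the script above
    profile = builder.create_optimization_profile()
    profile.set_shape(
        "images", min=(1, 3, 224, 224), opt=(8, 3, 512, 512), max=(16, 3, 1024, 1024)
    )
    profile.set_shape("orig_image_shapes", min=(1, 2), opt=(8, 2), max=(16, 2))
    config.add_optimization_profile(profile)

    # build_serialized_network returns the serialized engine (IHostMemory) directly
    serialized = builder.build_serialized_network(network, config)
    if serialized is None:
        raise RuntimeError("Failed to build TensorRT engine")
    with open("model_weak.trt", "wb") as f:
        f.write(serialized)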