Error while trying to convert ONNX to TensorRT engine

Description

I came across this error during ONNX-to-TensorRT conversion with the tensorrt package in Python. I'm new to TensorRT, so I don't know how to debug it.

[03/28/2025-16:13:47] [TRT] [V] Running: NMSCastFusion on NonMaxSuppression_1537_716
[03/28/2025-16:13:47] [TRT] [E] [myelinBuilderUtils.cpp::nvinfer1::builder::`anonymous-namespace'::getMyelinSupportTypeNoConstraints::1027] Error Code 2: Internal Error (Assertion !scopedOp failed. DeviceToShapeHostNode should not have been inserted into a scope)

I've also included the full conversion output (trt_conv.txt). I don't know if this is useful, but the ONNX model has one loop in the post-processing phase with a dynamic trip count (the batch size) and an NMS node inside it (so I had to use torch.jit.script to preserve the model's dynamic behavior). How can I fix this error?

Environment

TensorRT Version: 10.9.0.34
GPU Type: RTX-4080
Nvidia Driver Version: 571.96
CUDA Version: 12.8
CUDNN Version: 9.5.1
Operating System + Version: Windows 11 Pro 10.0.26100
Python Version (if applicable): 3.10.11
TensorFlow Version (if applicable): N/A
PyTorch Version (if applicable): 2.6.0
Baremetal or Container (if container which image + tag): N/A

Relevant Files

Conversion log: trt_conv.txt (2.6 MB)
ONNX model: model.onnx - Google Drive

Steps To Reproduce

I wrote this script to convert the ONNX model to a TensorRT engine.

import tensorrt as trt


def convert_dynamic_onnx_to_tensorrt(
    onnx_file_path, trt_file_path, min_shape, opt_shape, max_shape
):
    TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
    builder = trt.Builder(TRT_LOGGER)
    # Create a strongly typed network: tensor types come from the model itself
    network_flags = 1 << int(trt.NetworkDefinitionCreationFlag.STRONGLY_TYPED)
    network = builder.create_network(network_flags)
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # Parse the ONNX model
    with open(onnx_file_path, "rb") as onnx_file:
        if not parser.parse(onnx_file.read()):
            for error in range(parser.num_errors):
                TRT_LOGGER.log(
                    trt.Logger.ERROR, f"ONNX Parser Error: {parser.get_error(error)}"
                )
            raise RuntimeError("Failed to parse ONNX file.")

    config = builder.create_builder_config()
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)  # 1 GiB

    # Optimization profile covering the dynamic batch and spatial dimensions
    profile = builder.create_optimization_profile()
    profile.set_shape("images", min=min_shape, opt=opt_shape, max=max_shape)
    profile.set_shape(
        "orig_image_shapes",
        min=(min_shape[0], 2),
        opt=(opt_shape[0], 2),
        max=(max_shape[0], 2),
    )
    config.add_optimization_profile(profile)

    # Build the engine
    TRT_LOGGER.log(trt.Logger.INFO, "Building TensorRT engine with dynamic shapes...")
    engine = builder.build_engine_with_config(network, config)

    if engine is None:
        raise RuntimeError("Failed to build TensorRT engine")

    # Save the engine to file
    with open(trt_file_path, "wb") as trt_file:
        trt_file.write(engine.serialize())

    TRT_LOGGER.log(trt.Logger.INFO, f"TensorRT engine saved to {trt_file_path}")


if __name__ == "__main__":
    convert_dynamic_onnx_to_tensorrt(
        onnx_file_path="model.onnx",
        trt_file_path="model.trt",
        min_shape=(1, 3, 224, 224),
        opt_shape=(8, 3, 512, 512),
        max_shape=(16, 3, 1024, 1024),
    )

Hi @gil9103,
Please check the pointers below:

  • Identify unsupported nodes using the ONNX parser and check for errors on the nodes that fail during conversion; the first sketch after this list shows a quick way to inventory the ops in your graph.
  • Use debug tensors to track tensor values, types, and dimensions at runtime for better insight into where things go wrong (second sketch below).
  • Avoid dynamic loop operations if possible; TensorRT's support for loops with data-dependent trip counts is limited, and your error surfaces right after the NMS fusion pass runs on a node inside such a loop. Consider restructuring the model to eliminate the loop (third sketch below).
  • If your model includes NMS nodes, ensure they are expressed in a way compatible with TensorRT, or swap them for a plugin-based NMS layer (fourth sketch below).
  • Monitor the number of errors and their details using TensorRT's error-handling functions (as your script already does for the parser) to understand the root cause of conversion issues.
  • Validate that the inference outputs of ONNX and TensorRT agree within acceptable tolerances, adjusting the tolerances as necessary to account for minor floating-point discrepancies (last sketch below).
  • If you quantize the model, make sure your calibration dataset matches the inference input data to maintain accuracy.
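For the first pointer, a quick way to inventory the ops the parser and builder have to handle is to walk the ONNX graph directly. A minimal sketch using the onnx package, assuming the model.onnx path from your script (the Loop and NonMaxSuppression nodes you mentioned should show up here):

from collections import Counter

import onnx

model = onnx.load("model.onnx")

# Count every op type in the top-level graph.
ops = Counter(node.op_type for node in model.graph.node)
for op_type, count in sorted(ops.items()):
    print(f"{op_type}: {count}")

# Nodes inside the post-processing loop live in the Loop's "body" subgraph.
for node in model.graph.node:
    if node.op_type == "Loop":
        body = next(a.g for a in node.attribute if a.name == "body")
        print("Loop body ops:", sorted({n.op_type for n in body.node}))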
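For debug tensors: TensorRT 10 lets you mark tensors at build time (INetworkDefinition.mark_debug) and attach a debug listener to the execution context at runtime to receive their values. A minimal build-time sketch, assuming the network object from the script above and that the parser kept the ONNX node names (both assumptions; see the debug-tensor section of the TensorRT docs for the runtime IDebugListener side):

# Insert after parsing, before building the engine.
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if "NonMaxSuppression" in layer.name:
        for j in range(layer.num_outputs):
            # mark_debug returns False if the tensor cannot be marked.
            network.mark_debug(layer.get_output(j))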
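On removing the dynamic loop: if the scripted loop only exists to run NMS once per image, a common restructuring is to flatten the batch and make a single batched NMS call, which exports without a Loop node (the NonMaxSuppression node remains, but the data-dependent loop is gone). A hypothetical sketch; the tensor names and shapes are illustrative, not taken from your model:

import torch
from torchvision.ops import batched_nms


def postprocess(
    boxes: torch.Tensor,      # (N, 4) boxes from all images, flattened
    scores: torch.Tensor,     # (N,) confidence scores
    batch_idx: torch.Tensor,  # (N,) index of the image each box came from
    iou_threshold: float = 0.5,
) -> torch.Tensor:
    # batched_nms offsets boxes by their batch index so boxes from different
    # images never suppress each other -- one call covers the whole batch,
    # with no Python loop and hence no ONNX Loop node after export.
    return batched_nms(boxes, scores, batch_idx, iou_threshold)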
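If the plain NonMaxSuppression node keeps tripping the builder even without the loop, a workaround many detection models use is grafting TensorRT's EfficientNMS_TRT plugin onto the graph with onnx-graphsurgeon. A hedged sketch: the tensor names boxes_in / scores_in are placeholders for the actual inputs of your NMS stage, and the attribute values are illustrative defaults you should verify against the plugin's documentation:

import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model.onnx"))
tensors = graph.tensors()
boxes = tensors["boxes_in"]    # expected shape (batch, num_boxes, 4)
scores = tensors["scores_in"]  # expected shape (batch, num_boxes, num_classes)

# The plugin produces fixed-size outputs, which TensorRT handles much more
# gracefully than a data-dependent loop over NMS results.
outputs = [
    gs.Variable("num_detections", dtype=np.int32),
    gs.Variable("detection_boxes", dtype=np.float32),
    gs.Variable("detection_scores", dtype=np.float32),
    gs.Variable("detection_classes", dtype=np.int32),
]
graph.layer(
    op="EfficientNMS_TRT",
    inputs=[boxes, scores],
    outputs=outputs,
    attrs={
        "plugin_version": "1",
        "background_class": -1,   # no background class
        "max_output_boxes": 100,
        "score_threshold": 0.25,
        "iou_threshold": 0.5,
        "score_activation": False,
        "box_coding": 0,          # boxes given as corners
    },
)

# Re-point the graph outputs at the plugin; cleanup() drops the now-dead
# original post-processing nodes.
graph.outputs = outputs
graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "model_efficientnms.onnx")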
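To check ONNX-vs-TensorRT agreement once an engine builds, run the same inputs through ONNX Runtime and compare; tools like Polygraphy automate this, but a bare-bones sketch looks like the following. The input names and shapes come from your script; the int64 dtype for orig_image_shapes and the run_trt_engine helper are assumptions (plug in your own TensorRT runner):

import numpy as np
import onnxruntime as ort

# Reference outputs from ONNX Runtime on a random input batch.
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
images = np.random.rand(1, 3, 224, 224).astype(np.float32)
orig_shapes = np.array([[224, 224]], dtype=np.int64)  # dtype is an assumption
ort_outs = sess.run(None, {"images": images, "orig_image_shapes": orig_shapes})


def run_trt_engine(images, orig_shapes):
    # Hypothetical placeholder: deserialize model.trt, set input shapes,
    # copy inputs to the GPU, execute, and copy the outputs back.
    raise NotImplementedError


trt_outs = run_trt_engine(images, orig_shapes)
for ref, test in zip(ort_outs, trt_outs):
    # Loosen rtol/atol if reduced-precision kernels introduce small drift.
    np.testing.assert_allclose(ref, test, rtol=1e-3, atol=1e-3)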