Issue with width and height when exporting ONNX from RT-DETR

Hello, I have encountered a problem.

• Hardware
Ubuntu 22.04
TAO Toolkit 6.25.9
NVIDIA driver 535.183.01
RTX 4090
• Network Type (rtdetr)

The size I set during training is:

When I export the ONNX, I set:

Error report:

  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 203, in forward
    output = layer(output, src_mask=src_mask, pos_embed=pos_embed)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 172, in forward
    q = k = self.with_pos_embed(src, pos_embed)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 165, in with_pos_embed
    return tensor if pos_embed is None else tensor + pos_embed
                                            ~~~~~~~^~~~~~~~~~~
RuntimeError: The size of tensor a (527) must match the size of tensor b (400) at non-singleton dimension 1

I can only export successfully when I set 640x640. Does RT-DETR not support rectangular export? Also, many of the sizes I used during training cannot be exported; for example, setting 992x992 also results in an error when exporting.

Hello, could you please take a look?

@Morganh

Hello, can you see how to solve this?

What is the error when exporting 992x992?

Error executing job with overrides: [‘export.checkpoint=/results/run/act03/rtdetr_model_latest.pth’, ‘export.onnx_file=/results/run/act03/export/rtdetr_model_latest.onnx’, ‘results_dir=/results/run/act03/’]
Traceback (most recent call last):
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/core/decorators/workflow.py”, line 72, in _func
raise e
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/core/decorators/workflow.py”, line 51, in _func
runner(cfg, **kwargs)
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/scripts/export.py”, line 55, in main
run_export(cfg)
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/scripts/export.py”, line 182, in run_export
onnx_export.export_model(model, batch_size,
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/deformable_detr/utils/onnx_export.py”, line 74, in export_model
torch.onnx.export(model, dummy_input, onnx_file,
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/__init__.py”, line 383, in export
export(
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 495, in export
_export(
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 1428, in _export
graph, params_dict, torch_out = _model_to_graph(
^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 1053, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 937, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 844, in _trace_and_get_graph_from_model
trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/jit/_trace.py”, line 1498, in _get_trace_graph
outs = ONNXTracedModule(
^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/jit/_trace.py”, line 138, in forward
graph, _out = torch._C._create_graph_by_tracing(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/jit/_trace.py”, line 129, in wrapper
outs.append(self.inner(*trace_inputs))
^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/build_nn_model.py”, line 157, in forward
x = self.model(x, targets)
^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/rtdetr.py”, line 88, in forward
x, proj_feats = self.encoder(feats)
^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py”, line 339, in forward
memory = self.encoder[i](src_flatten, pos_embed=pos_embed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py”, line 203, in forward
output = layer(output, src_mask=src_mask, pos_embed=pos_embed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py”, line 172, in forward
q = k = self.with_pos_embed(src, pos_embed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py”, line 165, in with_pos_embed
return tensor if pos_embed is None else tensor + pos_embed
~~~~~~~^~~~~~~~~~~
RuntimeError: The size of tensor a (961) must match the size of tensor b (400) at non-singleton dimension 1

Could you please try more cases? How about exporting to 704x704, 800x800, or 960x960?

This is the 800x800 error; I am only including part of it:

  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 172, in forward
    q = k = self.with_pos_embed(src, pos_embed)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 165, in with_pos_embed
    return tensor if pos_embed is None else tensor + pos_embed
RuntimeError: The size of tensor a (625) must match the size of tensor b (400) at non-singleton dimension 1

I think the main cause is this: RuntimeError: The size of tensor a (625) must match the size of tensor b (400) at non-singleton dimension 1
800/32 = 25, 25*25 = 625
640/32 = 20, 20*20 = 400
No matter what export size is used, as long as it is not 640x640, an error is reported of the form "The size of tensor a ((export size/32)^2) must match the size of tensor b (400) at non-singleton dimension 1".
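
An illustrative sketch of that arithmetic (an editorial aside, not part of the TAO code; the only assumption is that the hybrid encoder builds its positional embedding on the stride-32 feature map fixed by eval_spatial_size, which defaults to 640x640):

# Number of encoder tokens on the stride-32 feature map. The pos_embed baked
# in at the default eval_spatial_size of 640x640 has 20*20 = 400 positions,
# so any other export resolution produces a mismatch against 400.
def stride32_tokens(height: int, width: int, stride: int = 32) -> int:
    return (height // stride) * (width // stride)

print(stride32_tokens(640, 640))  # 400 -> matches the default pos_embed
print(stride32_tokens(800, 800))  # 625 -> the 800x800 error above
print(stride32_tokens(992, 992))  # 961 -> the 992x992 error above
print(stride32_tokens(544, 992))  # 527 -> a rectangular size like this would explain the first error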

According to tao_pytorch_backend/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py at main · NVIDIA/tao_pytorch_backend · GitHub, please retry and check whether the below works.

Set eval_spatial_size as below. The order is (h, w).

eval_spatial_size:
  - 800
  - 800

Also set the export part to:

input_width: 800
input_height: 800

Training settings:

  augmentation:
    multi_scales:
      - 480
      - 512
      - 544
      - 576
      - 608
      - 640
      - 672
      - 704
      - 736
      - 768
      - 800
    train_spatial_size:
      - 800
      - 800
    eval_spatial_size:
      - 800
      - 800

Settings during export:

input_width: 800
input_height: 800

It still reports an error:

  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 203, in forward
    output = layer(output, src_mask=src_mask, pos_embed=pos_embed)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 172, in forward
    q = k = self.with_pos_embed(src, pos_embed)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 165, in with_pos_embed
    return tensor if pos_embed is None else tensor + pos_embed
                                            ~~~~~~~^~~~~~~~~~~
RuntimeError: The size of tensor a (625) must match the size of tensor b (400) at non-singleton dimension 1

These are the configuration files for training and exporting:

They are YAML files, and I can only upload files in .txt format, so I changed the extension.

export.txt (1.2 KB)

train-act.txt (3.1 KB)

Could you please check your trained pth model?

model = torch.load('your_model.pth')
print(model.encoder.pos_embed.shape)

It reports an error:

Traceback (most recent call last):
  File "d:\python\dataset\模型检查.py", line 4, in <module>
    print(model.encoder.pos_embed.shape)
AttributeError: 'dict' object has no attribute 'encoder'

I used the following code to print it instead:

import torch

ckpt = torch.load(r"C:\Users\tianx\Desktop\rtdetr_model_latest.pth", map_location="cpu")
print(ckpt.keys())
state_dict = ckpt['state_dict']
print(state_dict.keys())
print(state_dict['model.encoder.pos_embed'].shape)

There is no model.encoder.pos_embed key inside it.
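
As a side note, a quick way to search the checkpoint for any positional-embedding entries is to filter the state_dict keys (a sketch; the file name is the one mentioned above, and the exact key names depend on how the TAO checkpoint was saved):

import torch

# Load the checkpoint on the CPU and list every state_dict entry whose name
# looks like a positional embedding, together with its shape.
ckpt = torch.load("rtdetr_model_latest.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # fall back if the keys are top-level

for name, tensor in state_dict.items():
    if "pos_embed" in name:
        print(name, tuple(tensor.shape))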

Hi, I cannot reproduce the original issue; I can export to various ONNX sizes. Please refer to my steps.
1. $ docker run --runtime=nvidia -it --rm -v /lustre/fsw/portfolios/edgeai/users/morganh:/lustre/fsw/portfolios/edgeai/users/morganh nvcr.io/nvidia/tao/tao-toolkit:6.25.9-pyt /bin/bash
2. Run training.
root@batch-block4-1080:/lustre/fsw/portfolios/edgeai/users/morganh/forum_repro# rtdetr train -e /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/spec.yaml

dataset:
  augmentation:
    distortion_prob: 0.8
    eval_spatial_size:
    - 800
    - 800
    iou_crop_prob: 0.8
    multi_scales:
    - 480
    - 512
    - 544
    - 576
    - 608
    - 640
    - 672
    - 704
    - 736
    - 768
    - 800
    preserve_aspect_ratio: false
    train_spatial_size:
    - 800
    - 800
  batch_size: 4
  num_classes: 80
  remap_mscoco_category: true
  train_data_sources:
  - image_dir: /lustre/fsw/portfolios/edgeai/users/morganh/coco_3/coco_data/raw-data/val2017
    json_file: /lustre/fsw/portfolios/edgeai/users/morganh/coco_3/coco_data/raw-data/annotations/instances_val2017.json
  val_data_sources:
    image_dir: /lustre/fsw/portfolios/edgeai/users/morganh/coco_3/coco_data/raw-data/val2017
    json_file: /lustre/fsw/portfolios/edgeai/users/morganh/coco_3/coco_data/raw-data/annotations/instances_val2017.json
  workers: 6
export:
  batch_size: -1
  checkpoint: /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/results_no_code_change/train/rtdetr_model_latest.pth
  input_channel: 3
  input_height: 768
  input_width: 768
  on_cpu: false
  onnx_file: model.onnx
  opset_version: 12
model:
  act: silu
  alpha: 0.75
  aux_loss: true
  backbone: resnet_50
  backbone_names:
  - backbone.0
  bbox_loss_coef: 5.0
  dec_layers: 6
  depth_mult: 1
  dim_feedforward: 1024
  dn_number: 100
  dropout_ratio: 0.0
  enc_act: gelu
  enc_layers: 1
  eval_idx: -1
  expansion: 1.0
  feat_channels:
  - 256
  - 256
  - 256
  feat_strides:
  - 8
  - 16
  - 32
  gamma: 2.0
  giou_loss_coef: 2.0
  hidden_dim: 256
  linear_proj_names:
  - reference_points
  - sampling_offsets
  loss_types:
  - vfl
  - boxes
  nheads: 8
  num_feature_levels: 3
  num_queries: 300
  num_select: 300
  pe_temperature: 10000
  return_interm_indices:
  - 1
  - 2
  - 3
  train_backbone: false
  use_encoder_idx:
  - 2
  vfl_loss_coef: 1.0
train:
  activation_checkpoint: true
  checkpoint_interval: 1
  clip_grad_norm: 0.1
  distributed_strategy: ddp
  enable_ema: true
  gpu_ids:
  - 0
  num_epochs: 1
  num_gpus: 1
  num_nodes: 1
  optim:
    lr: 0.0001
    lr_backbone: 1.0e-05
    lr_decay: 0.1
    lr_scheduler: MultiStep
    lr_steps:
    - 40
    momentum: 0.9
    optimizer: AdamW
    weight_decay: 0.0001
  precision: fp32
  seed: 1234
  validation_interval: 1

  3. After training,
    $ mv /results /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/results_no_code_change
    $ rm -rf /results

  4. Export to 768x768 ONNX:

  augmentation:
    distortion_prob: 0.8
    eval_spatial_size:
    - 768
    - 768
    iou_crop_prob: 0.8
    multi_scales:
    - 480
    - 512
    - 544
    - 576
    - 608
    - 640
    - 672
    - 704
    - 736
    - 768
    - 800
    preserve_aspect_ratio: false
    train_spatial_size:
    - 768
    - 768
export:
  batch_size: -1
  checkpoint: /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/results_no_code_change/train/rtdetr_model_latest.pth
  input_channel: 3
  input_height: 768
  input_width: 768
  on_cpu: false
  onnx_file: model.onnx
  opset_version: 12

$ rm -rf /results
$ rm model.onnx
$ export CUDA_VISIBLE_DEVICES=0
$ rtdetr export -e /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/spec.yaml

  5. Export to 832x832 ONNX file:
  augmentation:
    distortion_prob: 0.8
    eval_spatial_size:
    - 832
    - 832
    iou_crop_prob: 0.8
    multi_scales:
    - 480
    - 512
    - 544
    - 576
    - 608
    - 640
    - 672
    - 704
    - 736
    - 768
    - 800
    preserve_aspect_ratio: false
    train_spatial_size:
    - 832
    - 832
export:
  batch_size: -1
  checkpoint: /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/results_no_code_change/train/rtdetr_model_latest.pth
  input_channel: 3
  input_height: 832
  input_width: 832
  on_cpu: false
  onnx_file: model.onnx
  opset_version: 12

$ rm -rf /results
$ rm model.onnx
$ export CUDA_VISIBLE_DEVICES=0
$ rtdetr export -e /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/spec.yaml
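
As an optional check (not part of the original steps), the exported file can be inspected with the onnx Python package to confirm it really carries the requested spatial size; the file path below matches the export spec, while the exact input names are an assumption:

import onnx

# Print the name and shape of every graph input of the exported model.
model = onnx.load("model.onnx")
for inp in model.graph.input:
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)  # expect the height/width set in the export spec, e.g. 832x832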

All right! The main issue was with the configuration file. When exporting, the configuration file needs to include the "train_spatial_size" setting from the "dataset" section; my export configuration file did not have this option, so it defaulted to 640x640. I can now export any size normally!

I then ran into another problem. After exporting the ONNX and converting it to an engine in DeepStream for inference, there is no problem with FP32 precision, but with FP16 precision there are no detection boxes. The following warnings appear during conversion:

WARNING: [TRT]: Detected layernorm nodes in FP16: /model/stages.1/stages.1.1/norm/ReduceMean_1, /model/stages.2/stages.2.6/norm/ReduceMean_1, /model/stages.1/stages.1.0/norm/ReduceMean_1, /model/stages.2/stages.2.0/norm/ReduceMean_1, /model/stages.2/stages.2.3/norm/ReduceMean_1, /model/downsample_layers.3/downsample_layers.3.0/ReduceMean_1, /model/downsample_layers.2/downsample_layers.2.0/ReduceMean_1, /model/stages.1/stages.1.0/norm/Sqrt, /model/stages.2/stages.2.7/norm/ReduceMean_1, /model/stages.3/stages.3.0/norm/ReduceMean_1, /model/downsample_layers.0/downsample_layers.0.1/Sqrt, /model/stages.0/stages.0.1/norm/Sqrt, /model/stages.0/stages.0.0/norm/Sqrt, /model/stages.3/stages.3.1/norm/ReduceMean_1, /model/stages.2/stages.2.1/norm/ReduceMean_1, /model/stages.2/stages.2.4/norm/ReduceMean_1, /model/downsample_layers.1/downsample_layers.1.0/Sqrt, /model/downsample_layers.0/downsample_layers.0.1/ReduceMean_1, /model/stages.0/stages.0.0/norm/ReduceMean_1, /model/downsample_layers.0/downsample_layers.0.1/Sub, /model/downsample_layers.0/downsample_layers.0.1/Pow, /model/downsample_layers.0/downsample_layers.0.1/Add, /model/downsample_layers.0/downsample_layers.0.1/Div, /model/downsample_layers.0/downsample_layers.0.1/Mul, /model/downsample_layers.0/downsample_layers.0.1/Add_1, /model/stages.0/stages.0.0/norm/Sub, /model/stages.0/stages.0.0/norm/Pow, /model/stages.0/stages.0.0/norm/Add, /model/stages.0/stages.0.0/norm/Div, /model/stages.0/stages.0.0/norm/Mul, /model/stages.0/stages.0.0/norm/Add_1, /model/stages.0/stages.0.1/norm/Sub, /model/stages.0/stages.0.1/norm/Pow, /model/stages.0/stages.0.1/norm/Add, /model/stages.0/stages.0.1/norm/Div, /model/stages.0/stages.0.1/norm/Mul, /model/stages.0/stages.0.1/norm/Add_1, /model/downsample_layers.1/downsample_layers.1.0/Sub, /model/downsample_layers.1/downsample_layers.1.0/Pow, /model/downsample_layers.1/downsample_layers.1.0/Add, /model/downsample_layers.1/downsample_layers.1.0/Div, /model/downsample_layers.1/downsample_layers.1.0/Mul, /model/downsample_layers.1/downsample_layers.1.0/Add_1, /model/stages.1/stages.1.0/norm/Sub, /model/stages.1/stages.1.0/norm/Pow, /model/stages.1/stages.1.0/norm/Add, /model/stages.1/stages.1.0/norm/Div, /model/stages.1/stages.1.0/norm/Mul, /model/stages.1/stages.1.0/norm/Add_1, /model/stages.1/stages.1.1/norm/Sub, /model/stages.1/stages.1.1/norm/Pow, /model/stages.1/stages.1.1/norm/Add, /model/stages.1/stages.1.1/norm/Sqrt, /model/stages.1/stages.1.1/norm/Div, /model/stages.1/stages.1.1/norm/Mul, /model/stages.1/stages.1.1/norm/Add_1, /model/downsample_layers.2/downsample_layers.2.0/Sub, /model/downsample_layers.2/downsample_layers.2.0/Pow, /model/downsample_layers.2/downsample_layers.2.0/Add, /model/downsample_layers.2/downsample_layers.2.0/Sqrt, /model/downsample_layers.2/downsample_layers.2.0/Div, /model/downsample_layers.2/downsample_layers.2.0/Mul, /model/downsample_layers.2/downsample_layers.2.0/Add_1, /model/stages.2/stages.2.0/norm/Sub, /model/stages.2/stages.2.0/norm/Pow, /model/stages.2/stages.2.0/norm/Add, /model/stages.2/stages.2.0/norm/Sqrt, /model/stages.2/stages.2.0/norm/Div, /model/stages.2/stages.2.0/norm/Mul, /model/stages.2/stages.2.0/norm/Add_1, /model/stages.2/stages.2.1/norm/Sub, /model/stages.2/stages.2.1/norm/Pow, /model/stages.2/stages.2.1/norm/Add, /model/stages.2/stages.2.1/norm/Sqrt, /model/stages.2/stages.2.1/norm/Div, /model/stages.2/stages.2.1/norm/Mul, /model/stages.2/stages.2.1/norm/Add_1, /model/stages.2/stages.2.2/norm/Sub, /model/stages.2/stages.2.2/norm/Pow, 
/model/stages.2/stages.2.2/norm/Add, /model/stages.2/stages.2.2/norm/Sqrt, /model/stages.2/stages.2.2/norm/Div, /model/stages.2/stages.2.2/norm/Mul, /model/stages.2/stages.2.2/norm/Add_1, /model/stages.2/stages.2.3/norm/Sub, /model/stages.2/stages.2.3/norm/Pow, /model/stages.2/stages.2.3/norm/Add, /model/stages.2/stages.2.3/norm/Sqrt, /model/stages.2/stages.2.3/norm/Div, /model/stages.2/stages.2.3/norm/Mul, /model/stages.2/stages.2.3/norm/Add_1, /model/stages.2/stages.2.4/norm/Sub, /model/stages.2/stages.2.4/norm/Pow, /model/stages.2/stages.2.4/norm/Add, /model/s
WARNING: [TRT]: Running layernorm after self-attention in FP16 may cause overflow. Exporting the model to the latest available ONNX opset (later than opset 17) to use the INormalizationLayer, or forcing layernorm layers to run in FP32 precision can help with preserving accuracy.
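
As an aside, the second remedy suggested in that warning (keeping the layernorm arithmetic in FP32 while the rest of the engine runs in FP16) can be sketched with the TensorRT Python API roughly as below; one could build the engine this way outside DeepStream and hand it to gst-nvinfer via model-engine-file. Matching layers by the substrings "/norm/" and "downsample_layers" is an assumption based on the layer names in the warning above, not a general rule, and fused layer names may differ after parsing:

import tensorrt as trt

# Sketch: build an FP16 engine but pin the layernorm-related layers to FP32.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:  # path to the exported ONNX (assumed)
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)  # honor per-layer precision

for i in range(network.num_layers):
    layer = network.get_layer(i)
    if "/norm/" in layer.name or "downsample_layers" in layer.name:
        layer.precision = trt.float32  # keep layernorm math in FP32 to avoid overflow

with open("model_fp16_lnfp32.engine", "wb") as f:
    f.write(builder.build_serialized_network(network, config))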

I set the opset to 18 for this, but there was an error when converting the engine:

ERROR: [TRT]: ModelImporter.cpp:768: While parsing node number 772 [TopK -> "/model/decoder/TopK_output_0"]:
ERROR: [TRT]: ModelImporter.cpp:769: --- Begin node ---
ERROR: [TRT]: ModelImporter.cpp:770: input: "/model/decoder/ReduceMax_output_0"
input: "/model/decoder/Reshape_9_output_0"
output: "/model/decoder/TopK_output_0"
output: "/model/decoder/TopK_output_1"
name: "/model/decoder/TopK"
op_type: "TopK"
attribute {
  name: "axis"
  i: 1
  type: INT
}
attribute {
  name: "largest"
  i: 1
  type: INT
}
attribute {
  name: "sorted"
  i: 1
  type: INT
}

ERROR: [TRT]: ModelImporter.cpp:771: --- End node ---
ERROR: [TRT]: ModelImporter.cpp:773: ERROR: onnx2trt_utils.cpp:342 In function convertAxis:
[8] Assertion failed: (axis >= 0 && axis <= nbDims) && "Axis must be in the range [0, nbDims]."
ERROR: Failed to parse onnx file
ERROR: failed to build network since parsing model errors.
ERROR: failed to build network.
0:00:13.074210041 3167867 0xaaaaf299f130 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2129> [UID = 1]: build engine file failed
0:00:13.473886176 3167867 0xaaaaf299f130 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2215> [UID = 1]: build backend context failed
0:00:13.473961026 3167867 0xaaaaf299f130 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1352> [UID = 1]: generate backend failed, check config file settings

Glad to know this info. Thanks.

Could you please create a new forum topic? Thanks.

Yes, I have already created one.
