Issue with width and height when exporting ONNX from RT-DETR

Hello, I have encountered a problem.

• Hardware
Ubuntu 22.04
TAO Toolkit 6.25.9
NVIDIA driver 535.183.01
RTX 4090
• Network Type (rtdetr)

The size I set during training is:

When I export the ONNX, I set:

Error report:

  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 203, in forward
    output = layer(output, src_mask=src_mask, pos_embed=pos_embed)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 172, in forward
    q = k = self.with_pos_embed(src, pos_embed)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 165, in with_pos_embed
    return tensor if pos_embed is None else tensor + pos_embed
                                            ~~~~~~~^~~~~~~~~~~
RuntimeError: The size of tensor a (527) must match the size of tensor b (400) at non-singleton dimension 1

I can only export successfully when I set 640x640. Does RT-DETR not support rectangular export? Also, many of the sizes I used during training cannot be exported; for example, setting 992x992 also results in an error when exporting.

Hello, could you please take a look?

@Morganh

Hello, can you see how to solve this?

What is the error when exporting 992x992?

Error executing job with overrides: [‘export.checkpoint=/results/run/act03/rtdetr_model_latest.pth’, ‘export.onnx_file=/results/run/act03/export/rtdetr_model_latest.onnx’, ‘results_dir=/results/run/act03/’]
Traceback (most recent call last):
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/core/decorators/workflow.py”, line 72, in _func
raise e
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/core/decorators/workflow.py”, line 51, in _func
runner(cfg, **kwargs)
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/scripts/export.py”, line 55, in main
run_export(cfg)
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/scripts/export.py”, line 182, in run_export
onnx_export.export_model(model, batch_size,
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/deformable_detr/utils/onnx_export.py”, line 74, in export_model
torch.onnx.export(model, dummy_input, onnx_file,
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/__init__.py”, line 383, in export
export(
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 495, in export
_export(
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 1428, in _export
graph, params_dict, torch_out = _model_to_graph(
^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 1053, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 937, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/onnx/utils.py”, line 844, in _trace_and_get_graph_from_model
trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/jit/_trace.py”, line 1498, in _get_trace_graph
outs = ONNXTracedModule(
^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/jit/_trace.py”, line 138, in forward
graph, _out = torch._C._create_graph_by_tracing(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/jit/_trace.py”, line 129, in wrapper
outs.append(self.inner(*trace_inputs))
^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/build_nn_model.py”, line 157, in forward
x = self.model(x, targets)
^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/rtdetr.py”, line 88, in forward
x, proj_feats = self.encoder(feats)
^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py”, line 339, in forward
memory = self.encoder[i](src_flatten, pos_embed=pos_embed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py”, line 203, in forward
output = layer(output, src_mask=src_mask, pos_embed=pos_embed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1740, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1751, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py”, line 1730, in _slow_forward
result = self.forward(*input, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py”, line 172, in forward
q = k = self.with_pos_embed(src, pos_embed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py”, line 165, in with_pos_embed
return tensor if pos_embed is None else tensor + pos_embed
~~~~~~~^~~~~~~~~~~
RuntimeError: The size of tensor a (961) must match the size of tensor b (400) at non-singleton dimension 1

Could you please try more cases? How about exporting to 704x704, 800x800, or 960x960?

This is the 800x800 error; I am only including part of it:

  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 172, in forward
    q = k = self.with_pos_embed(src, pos_embed)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 165, in with_pos_embed
    return tensor if pos_embed is None else tensor + pos_embed
RuntimeError: The size of tensor a (625) must match the size of tensor b (400) at non-singleton dimension 1

I think the main cause is this: RuntimeError: The size of tensor a (625) must match the size of tensor b (400) at non-singleton dimension 1
800/32 = 25, 25*25 = 625
640/32 = 20, 20*20 = 400
No matter what export size is used, as long as it is not 640x640, an error is reported of the form "The size of tensor a ((export size/32)^2) must match the size of tensor b (400) at non-singleton dimension 1".
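
An illustrative sketch of that arithmetic (an editorial aside, not part of the TAO code; the only assumption is that the hybrid encoder builds its positional embedding on the stride-32 feature map fixed by eval_spatial_size, which defaults to 640x640):

# Number of encoder tokens on the stride-32 feature map. The pos_embed baked
# in at the default eval_spatial_size of 640x640 has 20*20 = 400 positions,
# so any other export resolution produces a mismatch against 400.
def stride32_tokens(height: int, width: int, stride: int = 32) -> int:
    return (height // stride) * (width // stride)

print(stride32_tokens(640, 640))  # 400 -> matches the default pos_embed
print(stride32_tokens(800, 800))  # 625 -> the 800x800 error above
print(stride32_tokens(992, 992))  # 961 -> the 992x992 error above
print(stride32_tokens(544, 992))  # 527 -> a rectangular size like this would explain the first error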

According to tao_pytorch_backend/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py at main · NVIDIA/tao_pytorch_backend · GitHub, please retry and check whether the below works.

Set eval_spatial_size as below. The order is (h, w).

eval_spatial_size:
  - 800
  - 800

Also set the export part to:

input_width: 800
input_height: 800

Training settings:

  augmentation:
    multi_scales:
      - 480
      - 512
      - 544
      - 576
      - 608
      - 640
      - 672
      - 704
      - 736
      - 768
      - 800
    train_spatial_size:
      - 800
      - 800
    eval_spatial_size:
      - 800
      - 800

Settings during export:

input_width: 800
input_height: 800

It still reports an error:

  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 203, in forward
    output = layer(output, src_mask=src_mask, pos_embed=pos_embed)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1740, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1730, in _slow_forward
    result = self.forward(*input, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 172, in forward
    q = k = self.with_pos_embed(src, pos_embed)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nvidia_tao_pytorch/cv/rtdetr/model/hybrid_encoder.py", line 165, in with_pos_embed
    return tensor if pos_embed is None else tensor + pos_embed
                                            ~~~~~~~^~~~~~~~~~~
RuntimeError: The size of tensor a (625) must match the size of tensor b (400) at non-singleton dimension 1

These are the configuration files for training and exporting:

They are YAML files, and I can only upload files in .txt format, so I changed the extension.

export.txt (1.2 KB)

train-act.txt (3.1 KB)

Could you please check your trained pth model?

model = torch.load('your_model.pth')
print(model.encoder.pos_embed.shape)

It reports an error:

Traceback (most recent call last):
  File "d:\python\dataset\模型检查.py", line 4, in <module>
    print(model.encoder.pos_embed.shape)
AttributeError: 'dict' object has no attribute 'encoder'

I used the following code to print it instead:

import torch

ckpt = torch.load(r"C:\Users\tianx\Desktop\rtdetr_model_latest.pth", map_location="cpu")
print(ckpt.keys())
state_dict = ckpt['state_dict']
print(state_dict.keys())
print(state_dict['model.encoder.pos_embed'].shape)

There is no model.encoder.pos_embed key inside it.
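
As a side note, a quick way to search the checkpoint for any positional-embedding entries is to filter the state_dict keys (a sketch; the file name is the one mentioned above, and the exact key names depend on how the TAO checkpoint was saved):

import torch

# Load the checkpoint on the CPU and list every state_dict entry whose name
# looks like a positional embedding, together with its shape.
ckpt = torch.load("rtdetr_model_latest.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # fall back if the keys are top-level

for name, tensor in state_dict.items():
    if "pos_embed" in name:
        print(name, tuple(tensor.shape))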

Hi, I cannot reproduce the original issue; I can export to various ONNX sizes. Please refer to my steps.
1. $ docker run --runtime=nvidia -it --rm -v /lustre/fsw/portfolios/edgeai/users/morganh:/lustre/fsw/portfolios/edgeai/users/morganh nvcr.io/nvidia/tao/tao-toolkit:6.25.9-pyt /bin/bash
2. Run training.
root@batch-block4-1080:/lustre/fsw/portfolios/edgeai/users/morganh/forum_repro# rtdetr train -e /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/spec.yaml

dataset:
  augmentation:
    distortion_prob: 0.8
    eval_spatial_size:
    - 800
    - 800
    iou_crop_prob: 0.8
    multi_scales:
    - 480
    - 512
    - 544
    - 576
    - 608
    - 640
    - 672
    - 704
    - 736
    - 768
    - 800
    preserve_aspect_ratio: false
    train_spatial_size:
    - 800
    - 800
  batch_size: 4
  num_classes: 80
  remap_mscoco_category: true
  train_data_sources:
  - image_dir: /lustre/fsw/portfolios/edgeai/users/morganh/coco_3/coco_data/raw-data/val2017
    json_file: /lustre/fsw/portfolios/edgeai/users/morganh/coco_3/coco_data/raw-data/annotations/instances_val2017.json
  val_data_sources:
    image_dir: /lustre/fsw/portfolios/edgeai/users/morganh/coco_3/coco_data/raw-data/val2017
    json_file: /lustre/fsw/portfolios/edgeai/users/morganh/coco_3/coco_data/raw-data/annotations/instances_val2017.json
  workers: 6
export:
  batch_size: -1
  checkpoint: /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/results_no_code_change/train/rtdetr_model_latest.pth
  input_channel: 3
  input_height: 768
  input_width: 768
  on_cpu: false
  onnx_file: model.onnx
  opset_version: 12
model:
  act: silu
  alpha: 0.75
  aux_loss: true
  backbone: resnet_50
  backbone_names:
  - backbone.0
  bbox_loss_coef: 5.0
  dec_layers: 6
  depth_mult: 1
  dim_feedforward: 1024
  dn_number: 100
  dropout_ratio: 0.0
  enc_act: gelu
  enc_layers: 1
  eval_idx: -1
  expansion: 1.0
  feat_channels:
  - 256
  - 256
  - 256
  feat_strides:
  - 8
  - 16
  - 32
  gamma: 2.0
  giou_loss_coef: 2.0
  hidden_dim: 256
  linear_proj_names:
  - reference_points
  - sampling_offsets
  loss_types:
  - vfl
  - boxes
  nheads: 8
  num_feature_levels: 3
  num_queries: 300
  num_select: 300
  pe_temperature: 10000
  return_interm_indices:
  - 1
  - 2
  - 3
  train_backbone: false
  use_encoder_idx:
  - 2
  vfl_loss_coef: 1.0
train:
  activation_checkpoint: true
  checkpoint_interval: 1
  clip_grad_norm: 0.1
  distributed_strategy: ddp
  enable_ema: true
  gpu_ids:
  - 0
  num_epochs: 1
  num_gpus: 1
  num_nodes: 1
  optim:
    lr: 0.0001
    lr_backbone: 1.0e-05
    lr_decay: 0.1
    lr_scheduler: MultiStep
    lr_steps:
    - 40
    momentum: 0.9
    optimizer: AdamW
    weight_decay: 0.0001
  precision: fp32
  seed: 1234
  validation_interval: 1

  3. After training,
    $ mv /results /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/results_no_code_change
    $ rm -rf /results

  4. Export to 768x768 ONNX:

  augmentation:
    distortion_prob: 0.8
    eval_spatial_size:
    - 768
    - 768
    iou_crop_prob: 0.8
    multi_scales:
    - 480
    - 512
    - 544
    - 576
    - 608
    - 640
    - 672
    - 704
    - 736
    - 768
    - 800
    preserve_aspect_ratio: false
    train_spatial_size:
    - 768
    - 768
export:
  batch_size: -1
  checkpoint: /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/results_no_code_change/train/rtdetr_model_latest.pth
  input_channel: 3
  input_height: 768
  input_width: 768
  on_cpu: false
  onnx_file: model.onnx
  opset_version: 12

$ rm -rf /results
$ rm model.onnx
$ export CUDA_VISIBLE_DEVICES=0
$ rtdetr export -e /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/spec.yaml

  5. Export to 832x832 ONNX file:
  augmentation:
    distortion_prob: 0.8
    eval_spatial_size:
    - 832
    - 832
    iou_crop_prob: 0.8
    multi_scales:
    - 480
    - 512
    - 544
    - 576
    - 608
    - 640
    - 672
    - 704
    - 736
    - 768
    - 800
    preserve_aspect_ratio: false
    train_spatial_size:
    - 832
    - 832
export:
  batch_size: -1
  checkpoint: /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/results_no_code_change/train/rtdetr_model_latest.pth
  input_channel: 3
  input_height: 832
  input_width: 832
  on_cpu: false
  onnx_file: model.onnx
  opset_version: 12

$ rm -rf /results
$ rm model.onnx
$ export CUDA_VISIBLE_DEVICES=0
$ rtdetr export -e /lustre/fsw/portfolios/edgeai/users/morganh/forum_repro/spec.yaml
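
As an optional check (not part of the original steps), the exported file can be inspected with the onnx Python package to confirm it really carries the requested spatial size; the file path below matches the export spec, while the exact input names are an assumption:

import onnx

# Print the name and shape of every graph input of the exported model.
model = onnx.load("model.onnx")
for inp in model.graph.input:
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)  # expect the height/width set in the export spec, e.g. 832x832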

All right! The main issue was with the configuration file. When exporting, the configuration file needs to include the "train_spatial_size" setting from the "dataset" section; my export configuration file did not have this option, so it defaulted to 640x640. I can now export any size normally!

I then ran into another problem. After exporting the ONNX and converting it to an engine in DeepStream for inference, there is no problem with FP32 precision, but with FP16 precision there are no detection boxes. The following warnings appear during conversion:

WARNING: [TRT]: Detected layernorm nodes in FP16: /model/stages.1/stages.1.1/norm/ReduceMean_1, /model/stages.2/stages.2.6/norm/ReduceMean_1, /model/stages.1/stages.1.0/norm/ReduceMean_1, /model/stages.2/stages.2.0/norm/ReduceMean_1, /model/stages.2/stages.2.3/norm/ReduceMean_1, /model/downsample_layers.3/downsample_layers.3.0/ReduceMean_1, /model/downsample_layers.2/downsample_layers.2.0/ReduceMean_1, /model/stages.1/stages.1.0/norm/Sqrt, /model/stages.2/stages.2.7/norm/ReduceMean_1, /model/stages.3/stages.3.0/norm/ReduceMean_1, /model/downsample_layers.0/downsample_layers.0.1/Sqrt, /model/stages.0/stages.0.1/norm/Sqrt, /model/stages.0/stages.0.0/norm/Sqrt, /model/stages.3/stages.3.1/norm/ReduceMean_1, /model/stages.2/stages.2.1/norm/ReduceMean_1, /model/stages.2/stages.2.4/norm/ReduceMean_1, /model/downsample_layers.1/downsample_layers.1.0/Sqrt, /model/downsample_layers.0/downsample_layers.0.1/ReduceMean_1, /model/stages.0/stages.0.0/norm/ReduceMean_1, /model/downsample_layers.0/downsample_layers.0.1/Sub, /model/downsample_layers.0/downsample_layers.0.1/Pow, /model/downsample_layers.0/downsample_layers.0.1/Add, /model/downsample_layers.0/downsample_layers.0.1/Div, /model/downsample_layers.0/downsample_layers.0.1/Mul, /model/downsample_layers.0/downsample_layers.0.1/Add_1, /model/stages.0/stages.0.0/norm/Sub, /model/stages.0/stages.0.0/norm/Pow, /model/stages.0/stages.0.0/norm/Add, /model/stages.0/stages.0.0/norm/Div, /model/stages.0/stages.0.0/norm/Mul, /model/stages.0/stages.0.0/norm/Add_1, /model/stages.0/stages.0.1/norm/Sub, /model/stages.0/stages.0.1/norm/Pow, /model/stages.0/stages.0.1/norm/Add, /model/stages.0/stages.0.1/norm/Div, /model/stages.0/stages.0.1/norm/Mul, /model/stages.0/stages.0.1/norm/Add_1, /model/downsample_layers.1/downsample_layers.1.0/Sub, /model/downsample_layers.1/downsample_layers.1.0/Pow, /model/downsample_layers.1/downsample_layers.1.0/Add, /model/downsample_layers.1/downsample_layers.1.0/Div, /model/downsample_layers.1/downsample_layers.1.0/Mul, /model/downsample_layers.1/downsample_layers.1.0/Add_1, /model/stages.1/stages.1.0/norm/Sub, /model/stages.1/stages.1.0/norm/Pow, /model/stages.1/stages.1.0/norm/Add, /model/stages.1/stages.1.0/norm/Div, /model/stages.1/stages.1.0/norm/Mul, /model/stages.1/stages.1.0/norm/Add_1, /model/stages.1/stages.1.1/norm/Sub, /model/stages.1/stages.1.1/norm/Pow, /model/stages.1/stages.1.1/norm/Add, /model/stages.1/stages.1.1/norm/Sqrt, /model/stages.1/stages.1.1/norm/Div, /model/stages.1/stages.1.1/norm/Mul, /model/stages.1/stages.1.1/norm/Add_1, /model/downsample_layers.2/downsample_layers.2.0/Sub, /model/downsample_layers.2/downsample_layers.2.0/Pow, /model/downsample_layers.2/downsample_layers.2.0/Add, /model/downsample_layers.2/downsample_layers.2.0/Sqrt, /model/downsample_layers.2/downsample_layers.2.0/Div, /model/downsample_layers.2/downsample_layers.2.0/Mul, /model/downsample_layers.2/downsample_layers.2.0/Add_1, /model/stages.2/stages.2.0/norm/Sub, /model/stages.2/stages.2.0/norm/Pow, /model/stages.2/stages.2.0/norm/Add, /model/stages.2/stages.2.0/norm/Sqrt, /model/stages.2/stages.2.0/norm/Div, /model/stages.2/stages.2.0/norm/Mul, /model/stages.2/stages.2.0/norm/Add_1, /model/stages.2/stages.2.1/norm/Sub, /model/stages.2/stages.2.1/norm/Pow, /model/stages.2/stages.2.1/norm/Add, /model/stages.2/stages.2.1/norm/Sqrt, /model/stages.2/stages.2.1/norm/Div, /model/stages.2/stages.2.1/norm/Mul, /model/stages.2/stages.2.1/norm/Add_1, /model/stages.2/stages.2.2/norm/Sub, /model/stages.2/stages.2.2/norm/Pow, 
/model/stages.2/stages.2.2/norm/Add, /model/stages.2/stages.2.2/norm/Sqrt, /model/stages.2/stages.2.2/norm/Div, /model/stages.2/stages.2.2/norm/Mul, /model/stages.2/stages.2.2/norm/Add_1, /model/stages.2/stages.2.3/norm/Sub, /model/stages.2/stages.2.3/norm/Pow, /model/stages.2/stages.2.3/norm/Add, /model/stages.2/stages.2.3/norm/Sqrt, /model/stages.2/stages.2.3/norm/Div, /model/stages.2/stages.2.3/norm/Mul, /model/stages.2/stages.2.3/norm/Add_1, /model/stages.2/stages.2.4/norm/Sub, /model/stages.2/stages.2.4/norm/Pow, /model/stages.2/stages.2.4/norm/Add, /model/s
WARNING: [TRT]: Running layernorm after self-attention in FP16 may cause overflow. Exporting the model to the latest available ONNX opset (later than opset 17) to use the INormalizationLayer, or forcing layernorm layers to run in FP32 precision can help with preserving accuracy.
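
As an aside, the second remedy suggested in that warning (keeping the layernorm arithmetic in FP32 while the rest of the engine runs in FP16) can be sketched with the TensorRT Python API roughly as below; one could build the engine this way outside DeepStream and hand it to gst-nvinfer via model-engine-file. Matching layers by the substrings "/norm/" and "downsample_layers" is an assumption based on the layer names in the warning above, not a general rule, and fused layer names may differ after parsing:

import tensorrt as trt

# Sketch: build an FP16 engine but pin the layernorm-related layers to FP32.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:  # path to the exported ONNX (assumed)
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)  # honor per-layer precision

for i in range(network.num_layers):
    layer = network.get_layer(i)
    if "/norm/" in layer.name or "downsample_layers" in layer.name:
        layer.precision = trt.float32  # keep layernorm math in FP32 to avoid overflow

with open("model_fp16_lnfp32.engine", "wb") as f:
    f.write(builder.build_serialized_network(network, config))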

I set the opset to 18 for this, but there was an error when converting the engine:

ERROR: [TRT]: ModelImporter.cpp:768: While parsing node number 772 [TopK -> "/model/decoder/TopK_output_0"]:
ERROR: [TRT]: ModelImporter.cpp:769: --- Begin node ---
ERROR: [TRT]: ModelImporter.cpp:770: input: "/model/decoder/ReduceMax_output_0"
input: "/model/decoder/Reshape_9_output_0"
output: "/model/decoder/TopK_output_0"
output: "/model/decoder/TopK_output_1"
name: "/model/decoder/TopK"
op_type: "TopK"
attribute {
  name: "axis"
  i: 1
  type: INT
}
attribute {
  name: "largest"
  i: 1
  type: INT
}
attribute {
  name: "sorted"
  i: 1
  type: INT
}

ERROR: [TRT]: ModelImporter.cpp:771: --- End node ---
ERROR: [TRT]: ModelImporter.cpp:773: ERROR: onnx2trt_utils.cpp:342 In function convertAxis:
[8] Assertion failed: (axis >= 0 && axis <= nbDims) && "Axis must be in the range [0, nbDims]."
ERROR: Failed to parse onnx file
ERROR: failed to build network since parsing model errors.
ERROR: failed to build network.
0:00:13.074210041 3167867 0xaaaaf299f130 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2129> [UID = 1]: build engine file failed
0:00:13.473886176 3167867 0xaaaaf299f130 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2215> [UID = 1]: build backend context failed
0:00:13.473961026 3167867 0xaaaaf299f130 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1352> [UID = 1]: generate backend failed, check config file settings

Glad to know this info. Thanks.

Could you please create a new forum topic? Thanks.

Yes, I have already created one.
