RT-DETR model on DeepStream with FP16 precision produces no bounding boxes

Hello, I have encountered a problem.

Training platform:

• Hardware: RTX 4090
• OS: Ubuntu 22.04
• NVIDIA driver: 535.183.01
• TAO Toolkit: 6.25.9
• Network Type: rtdetr

Deployment platform:

• NVIDIA Jetson Orin Nano (8 GB RAM)

I exported the ONNX model and converted it to an engine with DeepStream for inference. At FP32 precision there is no problem, but at FP16 precision no detection boxes appear. The following warnings are printed during conversion:

WARNING: [TRT]: Detected layernorm nodes in FP16: /model/stages.1/stages.1.1/norm/ReduceMean_1, /model/stages.2/stages.2.6/norm/ReduceMean_1, /model/stages.1/stages.1.0/norm/ReduceMean_1, /model/stages.2/stages.2.0/norm/ReduceMean_1, /model/stages.2/stages.2.3/norm/ReduceMean_1, /model/downsample_layers.3/downsample_layers.3.0/ReduceMean_1, /model/downsample_layers.2/downsample_layers.2.0/ReduceMean_1, /model/stages.1/stages.1.0/norm/Sqrt, /model/stages.2/stages.2.7/norm/ReduceMean_1, /model/stages.3/stages.3.0/norm/ReduceMean_1, /model/downsample_layers.0/downsample_layers.0.1/Sqrt, /model/stages.0/stages.0.1/norm/Sqrt, /model/stages.0/stages.0.0/norm/Sqrt, /model/stages.3/stages.3.1/norm/ReduceMean_1, /model/stages.2/stages.2.1/norm/ReduceMean_1, /model/stages.2/stages.2.4/norm/ReduceMean_1, /model/downsample_layers.1/downsample_layers.1.0/Sqrt, /model/downsample_layers.0/downsample_layers.0.1/ReduceMean_1, /model/stages.0/stages.0.0/norm/ReduceMean_1, /model/downsample_layers.0/downsample_layers.0.1/Sub, /model/downsample_layers.0/downsample_layers.0.1/Pow, /model/downsample_layers.0/downsample_layers.0.1/Add, /model/downsample_layers.0/downsample_layers.0.1/Div, /model/downsample_layers.0/downsample_layers.0.1/Mul, /model/downsample_layers.0/downsample_layers.0.1/Add_1, /model/stages.0/stages.0.0/norm/Sub, /model/stages.0/stages.0.0/norm/Pow, /model/stages.0/stages.0.0/norm/Add, /model/stages.0/stages.0.0/norm/Div, /model/stages.0/stages.0.0/norm/Mul, /model/stages.0/stages.0.0/norm/Add_1, /model/stages.0/stages.0.1/norm/Sub, /model/stages.0/stages.0.1/norm/Pow, /model/stages.0/stages.0.1/norm/Add, /model/stages.0/stages.0.1/norm/Div, /model/stages.0/stages.0.1/norm/Mul, /model/stages.0/stages.0.1/norm/Add_1, /model/downsample_layers.1/downsample_layers.1.0/Sub, /model/downsample_layers.1/downsample_layers.1.0/Pow, /model/downsample_layers.1/downsample_layers.1.0/Add, /model/downsample_layers.1/downsample_layers.1.0/Div, 
/model/downsample_layers.1/downsample_layers.1.0/Mul, /model/downsample_layers.1/downsample_layers.1.0/Add_1, /model/stages.1/stages.1.0/norm/Sub, /model/stages.1/stages.1.0/norm/Pow, /model/stages.1/stages.1.0/norm/Add, /model/stages.1/stages.1.0/norm/Div, /model/stages.1/stages.1.0/norm/Mul, /model/stages.1/stages.1.0/norm/Add_1, /model/stages.1/stages.1.1/norm/Sub, /model/stages.1/stages.1.1/norm/Pow, /model/stages.1/stages.1.1/norm/Add, /model/stages.1/stages.1.1/norm/Sqrt, /model/stages.1/stages.1.1/norm/Div, /model/stages.1/stages.1.1/norm/Mul, /model/stages.1/stages.1.1/norm/Add_1, /model/downsample_layers.2/downsample_layers.2.0/Sub, /model/downsample_layers.2/downsample_layers.2.0/Pow, /model/downsample_layers.2/downsample_layers.2.0/Add, /model/downsample_layers.2/downsample_layers.2.0/Sqrt, /model/downsample_layers.2/downsample_layers.2.0/Div, /model/downsample_layers.2/downsample_layers.2.0/Mul, /model/downsample_layers.2/downsample_layers.2.0/Add_1, /model/stages.2/stages.2.0/norm/Sub, /model/stages.2/stages.2.0/norm/Pow, /model/stages.2/stages.2.0/norm/Add, /model/stages.2/stages.2.0/norm/Sqrt, /model/stages.2/stages.2.0/norm/Div, /model/stages.2/stages.2.0/norm/Mul, /model/stages.2/stages.2.0/norm/Add_1, /model/stages.2/stages.2.1/norm/Sub, /model/stages.2/stages.2.1/norm/Pow, /model/stages.2/stages.2.1/norm/Add, /model/stages.2/stages.2.1/norm/Sqrt, /model/stages.2/stages.2.1/norm/Div, /model/stages.2/stages.2.1/norm/Mul, /model/stages.2/stages.2.1/norm/Add_1, /model/stages.2/stages.2.2/norm/Sub, /model/stages.2/stages.2.2/norm/Pow, /model/stages.2/stages.2.2/norm/Add, /model/stages.2/stages.2.2/norm/Sqrt, /model/stages.2/stages.2.2/norm/Div, /model/stages.2/stages.2.2/norm/Mul, /model/stages.2/stages.2.2/norm/Add_1, /model/stages.2/stages.2.3/norm/Sub, /model/stages.2/stages.2.3/norm/Pow, /model/stages.2/stages.2.3/norm/Add, /model/stages.2/stages.2.3/norm/Sqrt, /model/stages.2/stages.2.3/norm/Div, /model/stages.2/stages.2.3/norm/Mul, 
/model/stages.2/stages.2.3/norm/Add_1, /model/stages.2/stages.2.4/norm/Sub, /model/stages.2/stages.2.4/norm/Pow, /model/stages.2/stages.2.4/norm/Add, /model/s
WARNING: [TRT]: Running layernorm after self-attention in FP16 may cause overflow. Exporting the model to the latest available ONNX opset (later than opset 17) to use the INormalizationLayer, or forcing layernorm layers to run in FP32 precision can help with preserving accuracy.
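As an aside, when the engine is built by DeepStream itself, the warning's advice about keeping layernorm in FP32 can (on recent DeepStream releases) be expressed with the Gst-nvinfer `layer-device-precisions` property. The fragment below is only a sketch and has not been verified on this setup; the two layer names are copied from the warning, and in practice every layernorm op listed there would need an entry:

```ini
# [property] section of the nvinfer config (sketch, unverified on this setup)
# network-mode=2 selects FP16; layer-device-precisions pins the named
# layers to FP32 on the GPU while the rest of the engine stays FP16.
network-mode=2
layer-device-precisions=/model/stages.0/stages.0.0/norm/ReduceMean_1:fp32:gpu;/model/stages.0/stages.0.0/norm/Sqrt:fp32:gpu
```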

I set the opset to 18 to address this, but the engine conversion then failed with an error:

ERROR: [TRT]: ModelImporter.cpp:768: While parsing node number 772 [TopK -> "/model/decoder/TopK_output_0"]:
ERROR: [TRT]: ModelImporter.cpp:769: --- Begin node ---
ERROR: [TRT]: ModelImporter.cpp:770: input: "/model/decoder/ReduceMax_output_0"
input: "/model/decoder/Reshape_9_output_0"
output: "/model/decoder/TopK_output_0"
output: "/model/decoder/TopK_output_1"
name: "/model/decoder/TopK"
op_type: "TopK"
attribute {
  name: "axis"
  i: 1
  type: INT
}
attribute {
  name: "largest"
  i: 1
  type: INT
}
attribute {
  name: "sorted"
  i: 1
  type: INT
}

ERROR: [TRT]: ModelImporter.cpp:771: --- End node ---
ERROR: [TRT]: ModelImporter.cpp:773: ERROR: onnx2trt_utils.cpp:342 In function convertAxis:
[8] Assertion failed: (axis >= 0 && axis <= nbDims) && "Axis must be in the range [0, nbDims]."
ERROR: Failed to parse onnx file
ERROR: failed to build network since parsing model errors.
ERROR: failed to build network.
0:00:13.074210041 3167867 0xaaaaf299f130 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2129> [UID = 1]: build engine file failed
0:00:13.473886176 3167867 0xaaaaf299f130 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2215> [UID = 1]: build backend context failed
0:00:13.473961026 3167867 0xaaaaf299f130 ERROR                nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1352> [UID = 1]: generate backend failed, check config file settings

@Morganh

Hello, I'm sorry to trouble you again. Thanks.

To narrow this down, please try running some experiments.

  1. Please try to export an ONNX file with a newer opset version, for example version 18. You can refer to my previous spec.yaml.
  2. Please try to generate the engine inside the tao-deploy docker instead of DeepStream.
    The tao-deploy docker can be found in GPU-optimized AI, Machine Learning, & HPC Software | NVIDIA NGC: nvcr.io/nvidia/tao/tao-toolkit:6.25.9-deploy
    Then, inside the tao-deploy docker, run trtexec to generate the TensorRT engine. Refer to TRTEXEC with RT-DETR — TAO Toolkit.
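As a sketch of step 2 (the paths are placeholders, not taken from a verified run), a trtexec invocation that builds an FP16 engine while forcing the layernorm ops from the earlier TRT warning back to FP32 could look like this:

```shell
# Inside the tao-deploy container; model paths are placeholders.
# --precisionConstraints=obey is required for --layerPrecisions to take effect.
# Only two layer names from the warning are shown here; in practice every
# layernorm op listed in the warning would need an entry.
trtexec \
  --onnx=/workspace/model.onnx \
  --saveEngine=/workspace/model_fp16.engine \
  --minShapes=inputs:1x3x640x640 \
  --optShapes=inputs:8x3x640x640 \
  --maxShapes=inputs:8x3x640x640 \
  --fp16 \
  --precisionConstraints=obey \
  --layerPrecisions=/model/stages.0/stages.0.0/norm/ReduceMean_1:fp32,/model/stages.0/stages.0.0/norm/Sqrt:fp32
```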

I tried setting the opset to 18 in the TAO docker to export the ONNX, then generated an engine at FP16 precision in the docker and tested it in the container, but ran into an error: no bounding box coordinates were available.

(I did not run this engine on the DeepStream platform because the container environment differs from the deployment environment.)

Starting rtdetr trt_inference.

Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
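The failure mode in these logs (class id intact, all remaining fields NaN) is characteristic of an FP16 overflow upstream in the network rather than a drawing bug. A small guard like the following, a hypothetical helper assuming each row is [class_id, score, x1, y1, x2, y2], would at least skip the bad rows instead of erroring per prediction, and makes the symptom easy to detect programmatically:

```python
import math

def drawable_predictions(preds):
    """Keep only predictions whose score and box fields are all finite.

    Assumes each prediction is [class_id, score, x1, y1, x2, y2],
    matching the '[ 2. nan nan nan nan nan]' rows in the log above.
    """
    return [p for p in preds if all(math.isfinite(v) for v in p[1:])]

preds = [
    [2.0] + [float("nan")] * 5,           # overflowed FP16 prediction
    [0.0, 0.91, 10.0, 20.0, 110.0, 220.0],  # normal prediction
]
kept = drawable_predictions(preds)
print(len(kept))  # 1
```

If every row is filtered out, the engine (not the drawing code) is the problem, which matches what the FP32/FP16 comparison below shows.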

Any change compared with the previous log?

The warning above appears when converting the engine on the DeepStream platform (opset <= 17). With opset 18, DeepStream reports an error instead:

(Same TopK parsing error log as posted above.)

Please share the full log when you use trtexec to generate the FP16 engine inside the nvcr.io/nvidia/tao/tao-toolkit:6.25.9-deploy docker. Thanks.

Okay, this is the complete log:

2025-10-14 15:13:03,966 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2025-10-14 15:13:04,080 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:6.25.9-deploy
2025-10-14 15:13:04,160 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 308: Printing tty value True
sys:1: UserWarning: 
'gen_trt_engine.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
/usr/local/lib/python3.12/dist-packages/nvidia_tao_deploy/cv/common/hydra/hydra_runner.py:99: UserWarning: 
'gen_trt_engine.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  _run_hydra(
/usr/local/lib/python3.12/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
Gen_trt_engine results will be saved at: /results/run/person-act/act4/gen_trt_engine/gen_trt_engine
Log file already exists at /results/run/person-act/act4/gen_trt_engine/gen_trt_engine/status.json
Starting rtdetr gen_trt_engine.
[10/14/2025-07:13:10] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +2755, GPU +446, now: CPU 3011, GPU 1224 (MiB)
Setting up QAT mode: False
[10/14/2025-07:13:10] [TRT] [I] Successfully created plugin: MultiscaleDeformableAttnPlugin_TRT
Parsing ONNX model
List inputs:
Input 0 -> inputs.
(3, 640, 640).
-1.
Network Description
Input 'inputs' with shape (-1, 3, 640, 640) and dtype DataType.FLOAT
Output 'pred_logits' with shape (-1, 100, 4) and dtype DataType.FLOAT
Output 'pred_boxes' with shape (-1, 100, 4) and dtype DataType.FLOAT
TensorRT engine build configurations:
  OptimizationProfile: 
    "inputs": (1, 3, 640, 640), (8, 3, 640, 640), (8, 3, 640, 640)
 
  BuilderFlag.FP16
  BuilderFlag.TF32
 
  Note: max representable value is 2,147,483,648 bytes or 2GB.
  MemoryPoolType.WORKSPACE = 2147483648 bytes
  MemoryPoolType.DLA_MANAGED_SRAM = 0 bytes
  MemoryPoolType.DLA_LOCAL_DRAM = 1073741824 bytes
  MemoryPoolType.DLA_GLOBAL_DRAM = 536870912 bytes
  MemoryPoolType.TACTIC_DRAM = 25393692672 bytes
  MemoryPoolType.TACTIC_SHARED_MEMORY = 1073741824 bytes
 
  Tactic Sources = 24
[10/14/2025-07:16:29] [TRT] [I] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 38 MiB, GPU 813 MiB
[...]quiring 439971328 bytes.
[...] FP32 precision, or exporting the model to use INormalizationLayer (available with ONNX opset >= 17) can help preserving accuracy.
Engine build finished successfully.
Gen_trt_engine finished successfully.
2025-10-14 07:16:31,277 - nvidia_tao_deploy.cv.common.entrypoint.entrypoint_hydra - WARNING - Telemetry data couldn't be sent, but the command ran successfully.
2025-10-14 07:16:31,278 - nvidia_tao_deploy.cv.common.entrypoint.entrypoint_hydra - WARNING - 'str' object has no attribute 'decode'
2025-10-14 07:16:31,278 - nvidia_tao_deploy.cv.common.entrypoint.entrypoint_hydra - INFO - Execution status: PASS
2025-10-14 15:16:32,074 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 371: Stopping container.

Then, still inside the tao-deploy docker, please try to run inference with this FP16 engine.
Please refer to the command and config in RT-DETR with TAO Deploy — TAO Toolkit.

Please note that when you run commands inside the tao-deploy docker, you do not need to prefix them with tao deploy, i.e., $ rtdetr inference xxx

I created a new container from the tao-toolkit:6.25.9-deploy image and ran inference with the FP16 engine. The log is as follows:

root@293ce86a9252:/usr/local/lib/python3.12/dist-packages/nvidia_tao_deploy/cv/rtdetr/scripts# python3 inference.py     --config-path /workspace/tao-experiments/rtdetr/specs     --config-name infer
sys:1: UserWarning: 
'infer' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
/usr/local/lib/python3.12/dist-packages/nvidia_tao_deploy/cv/common/hydra/hydra_runner.py:99: UserWarning: 
'infer' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/automatic_schema_matching for migration instructions.
  _run_hydra(
/usr/local/lib/python3.12/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
  ret = run_job(
Trt_inference results will be saved at: /workspace/tao-experiments/rtdetr/run/person-act/act4/infer
Log file already exists at /workspace/tao-experiments/rtdetr/run/person-act/act4/infer/status.json
Starting rtdetr trt_inference.
Producing predictions:   0%|                                                                                                                                      | 0/271 [00:00<?, ?it/s]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 0. nan nan nan nan nan]
Error drawing bbox for prediction [ 1. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 0. nan nan nan nan nan]
Error drawing bbox for prediction [ 1. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]

In the same way, inside the tao-deploy docker, please try to run inference with the FP32 engine. The result is as expected, right?

Yes, this is normal:

Trt_inference results will be saved at: /workspace/tao-experiments/rtdetr/run/person-act/act4/infer
Log file already exists at /workspace/tao-experiments/rtdetr/run/person-act/act4/infer/status.json
Starting rtdetr trt_inference.
Producing predictions:   4%|████▌                                                                                                                        | 10/271 [00:13<05:44,  1.32s/it]

OK.

For FP16, please generate a new ONNX with opset 17 and retest.
In nvcr.io/nvidia/tao/tao-toolkit:6.25.9-deploy, the TensorRT version is 10.8.0.43. Refer to Release Notes — NVIDIA TensorRT Documentation.

I suggest testing with opset 17 as well.

Same result:

Trt_inference results will be saved at: /workspace/tao-experiments/rtdetr/run/person-act/act4/infer
Log file already exists at /workspace/tao-experiments/rtdetr/run/person-act/act4/infer/status.json
Starting rtdetr trt_inference.
Producing predictions:   0%|                                            | 0/271 [00:00<?, ?it/s]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]
Error drawing bbox for prediction [ 0. nan nan nan nan nan]
Error drawing bbox for prediction [ 1. nan nan nan nan nan]
Error drawing bbox for prediction [ 2. nan nan nan nan nan]

Hello, I have worked around the issue for now. I found the ResNet-50 backbone (which you had posted) in other posts on the forum and trained RT-DETR at 544 x 960. I exported at FP16 precision in the container, and the test produced bounding boxes. Bounding boxes also appear on DeepStream, and the earlier warnings are gone.

Thanks for the info. Glad to know it is working now.
According to The issue of width and height when exporting onnx from rtdetr - #16 by 2295098451, you were using the convnextv2_nano backbone.

May I know your latest spec yaml file? Thanks.

Sure, this is a YAML file. Since I can only upload files in .txt format, I changed the extension.

train-act.txt (3.3 KB)

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.