How to run an ONNX-format EfficientDet-D0 model exported from TAO 6.0

• Hardware Platform (Jetson / GPU): dGPU
• DeepStream Version: 8.0
• TensorRT Version: as shipped with DS 8.0

I used the TAO nvcr.io/nvidia/tao/tao-toolkit:6.0.0-tf2 container to train, prune, and retrain an EfficientDet-D0 model, which was successful. I used the notebook tao-experiments/efficientdet_tf2/efficientdet.ipynb.

DONE (t=0.56s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.613
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.960
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.695
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.277
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.645
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.575
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.635
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.741
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.750
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.386
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.759
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.734
None

I was also able to successfully export the model to ONNX format.

Loaded saved model from /tmp/tmp7c44ew6u
Using tensorflow=2.17.0, onnx=1.17.0, tf2onnx=1.16.1/3dd772
Using opset <onnx, 13>
Computed 0 values for constant folding
Optimizing ONNX model
After optimization: BatchNormalization -54 (108->54), Cast -38 (48->10), Const -536 (1136->600), GlobalAveragePool +16 (0->16), Greater -1 (1->0), Identity -2 (2->0), Mul -2 (192->190), ReduceMean -16 (16->0), ReduceSum -1 (1->0), Reshape -70 (114->44), Shape -1 (7->6), Squeeze +13 (22->35), Transpose -765 (780->15), Unsqueeze -8 (28->20)
TF2ONNX graph created successfully
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Mul) bound to different types (tensor(int32) and tensor(float) in node (StatefulPartitionedCall/mul).
Found Conv node 'StatefulPartitionedCall/efficientdet-d0/stem_conv/Conv2D' as stem entry
Updating Resize node Resize__870 to [1. 1. 2. 2.]
Found Concat node 'StatefulPartitionedCall/concat' as the tip of class-predict
Found Concat node 'StatefulPartitionedCall/concat_1' as the tip of box-predict
Created NMS plugin 'EfficientNMS_TRT' with attributes: {'plugin_version': '1', 'background_class': -1, 'max_output_boxes': 100, 'score_threshold': 0.4, 'iou_threshold': 0.5, 'score_activation': True, 'box_coding': 1}
The exported model is saved at: /workspace/tao-experiments/efficientdet_tf2/export/efficientdet-d0.onnx
Export finished successfully.

When I check the model's input and output dimensions, I get this:

input: input shape: ['dim?', 512, 512, 3]
outputs: ['num_detections', 'detection_boxes', 'detection_scores', 'detection_classes']

which looks normal.

I ran evaluate and inference as well which were successful.

Trt_inference results will be saved at: /workspace/tao-experiments/efficientdet_tf2/export
Starting efficientdet_tf2 trt_inference.

Producing predictions: 100%|██████████| 314/314 [00:09<00:00, 34.18it/s]
Finished inference.
Trt_inference finished successfully.

All the bounding boxes were correctly produced.

My real problem starts when I proceed to deploy to DeepStream 8.0. DS expects the model input in NCHW format, while the ONNX model's input is NHWC, which keeps causing problems.
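To illustrate the layout mismatch: the same frame data is just indexed in a different axis order. A minimal numpy sketch (illustration only, not from the original post):

```python
import numpy as np

# NHWC frame as bound by the TAO EfficientDet export: (batch, H, W, C)
nhwc = np.zeros((1, 512, 512, 3), dtype=np.float32)

# The same data in NCHW order, which DeepStream assumes by default
nchw = np.transpose(nhwc, (0, 3, 1, 2))

print(nhwc.shape)  # (1, 512, 512, 3)
print(nchw.shape)  # (1, 3, 512, 512)
```

No manual transpose is actually needed in the pipeline; gst-nvinfer can preprocess for NHWC models directly when configured for that input order.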

I also tried trtexec to build the engine file, which was successful.

[10/01/2025-03:55:54] [I] Input binding for input with dimensions 1x512x512x3 is created.
[10/01/2025-03:55:54] [I] Output binding for num_detections with dimensions 1x1 is created.
[10/01/2025-03:55:54] [I] Output binding for detection_boxes with dimensions 1x100x4 is created.
[10/01/2025-03:55:54] [I] Output binding for detection_scores with dimensions 1x100 is created.
[10/01/2025-03:55:54] [I] Output binding for detection_classes with dimensions 1x100 is created.

&&&& PASSED TensorRT.trtexec [TensorRT v100900] [b34] # trtexec --onnx=assets/efficientdet-d0.onnx --saveEngine=assets/efficientdet-d0_b1_fp16.engine --fp16 --minShapes=input:1x512x512x3 --optShapes=input:1x512x512x3 --maxShapes=input:1x512x512x3

But this engine runs into the same issue.

Opening in BLOCKING MODE
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
0:00:00.685705062 268 0x56292d4d6fc0 WARN nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1260> [UID = 1]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
0:00:00.764122940 268 0x56292d4d6fc0 INFO nvinfer gstnvinfer.cpp:685:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2109> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/deepstream-nvdsanalytics-test/assets/efficientdet-d0_b1_fp16.engine
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:363 [Implicit Engine Info]: layers num: 0

0:00:00.764154641 268 0x56292d4d6fc0 INFO nvinfer gstnvinfer.cpp:685:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2212> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/deepstream-nvdsanalytics-test/assets/efficientdet-d0_b1_fp16.engine
0:00:00.764168807 268 0x56292d4d6fc0 ERROR nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::preparePreprocess() <nvdsinfer_context_impl.cpp:1051> [UID = 1]: RGB/BGR input format specified but network input channels is not 3
ERROR: nvdsinfer_context_impl.cpp:1377 Infer Context prepare preprocessing resource failed., nvinfer error:NVDSINFER_CONFIG_FAILED
0:00:00.773772980 268 0x56292d4d6fc0 WARN nvinfer gstnvinfer.cpp:918:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:00.773790994 268 0x56292d4d6fc0 WARN nvinfer gstnvinfer.cpp:918:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/deepstream-nvdsanalytics-test/assets/pgie_config.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: main:1539: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to create NvDsInferContext instance
Debug info: gstnvinfer.cpp(918): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/deepstream-nvdsanalytics-test/assets/pgie_config.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
[NvMultiObjectTracker] De-initialized
App run failed

Here is my pgie config and the ONNX file. Can you guide me through the steps?

efficientdet-d0.onnx.txt (14.8 MB)

pgie_config.txt (716 Bytes)

Can someone please take a look?

It seems your model's input layer is NHWC.

But you configured “network-input-order=0” in pgie_config.txt, which is wrong for this model. All gst-nvinfer parameters are explained in the Gst-nvinfer section of the DeepStream documentation; please configure them according to your model.
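For an NHWC model like this one, the relevant key is network-input-order. A minimal illustration (only the changed key shown; the rest of pgie_config.txt stays as posted):

```ini
[property]
# 0 = NCHW (default), 1 = NHWC; this model's input binding is 1x512x512x3
network-input-order=1
```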

I wrote a small parser and used an updated pgie file. The pipeline runs successfully but does not produce any detections.

deepstream-app -c ./assets/pipeline_config.txt
** WARN: <parse_source:725>: Unknown key ‘alert-interval’ for group [source0]
** WARN: <parse_source:725>: Unknown key ‘calc-interval’ for group [source0]
** WARN: <parse_source:725>: Unknown key ‘latitude’ for group [source0]
** WARN: <parse_source:725>: Unknown key ‘longitude’ for group [source0]
** WARN: <parse_source:725>: Unknown key ‘ppm’ for group [source0]
** WARN: <parse_source:725>: Unknown key ‘threshold’ for group [source0]
** WARN: <parse_tracker:1756>: Unknown key ‘enable-batch-process’ for group [tracker]
** WARN: <parse_tracker:1756>: Unknown key ‘enable-past-frame’ for group [tracker]
Unknown or legacy key specified ‘confidence-threshold’ for group [property]
Unknown or legacy key specified ‘nms-iou-threshold’ for group [property]
Warn: ‘threshold’ parameter has been deprecated. Use ‘pre-cluster-threshold’ instead.
Authorization required, but no authorization protocol specified

Opening in BLOCKING MODE
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
0:00:00.330060054 371 0x58fac3d21190 INFO nvinfer gstnvinfer.cpp:685:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2109> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/js-nvdsanalytics/assets/efficientdet-d0_b1_fp16.engine
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:363 [Implicit Engine Info]: layers num: 0

0:00:00.330099374 371 0x58fac3d21190 INFO nvinfer gstnvinfer.cpp:685:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2212> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/js-nvdsanalytics/assets/efficientdet-d0_b1_fp16.engine
0:00:00.333843926 371 0x58fac3d21190 INFO nvinfer gstnvinfer_impl.cpp:343:notifyLoadModelStatus:<primary_gie> [UID 1]: Load new model:/opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/js-nvdsanalytics/assets/pgie_config.txt sucessfully

Runtime commands:
h: Print this help
q: Quit

p: Pause
r: Resume

** INFO: <bus_callback:291>: Pipeline ready

Opening in BLOCKING MODE
** INFO: <bus_callback:277>: Pipeline running

nvstreammux: Successfully handled EOS for source_id=0
** INFO: <bus_callback:334>: Received EOS. Exiting …

Quitting
[NvMultiObjectTracker] De-initialized
App run successful

nvdsparse_efficientdet.txt (1.4 KB)

pgie_config.txt (1.8 KB)
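For reference, the usual parsing logic for the four EfficientNMS_TRT outputs can be sketched standalone (this does not reproduce the attached parser; the buffer contents below are fabricated, and the corner-format box assumption should be verified against your export):

```python
import numpy as np

def parse_efficient_nms(num_dets, boxes, scores, classes, score_threshold=0.4):
    # num_dets: (1,) int; boxes: (100, 4); scores: (100,); classes: (100,)
    dets = []
    for i in range(int(num_dets[0])):
        if scores[i] < score_threshold:
            continue
        # Assumed [x1, y1, x2, y2] in network-input pixels; verify against your export
        x1, y1, x2, y2 = boxes[i]
        dets.append({"left": float(x1), "top": float(y1),
                     "width": float(x2 - x1), "height": float(y2 - y1),
                     "class_id": int(classes[i]), "confidence": float(scores[i])})
    return dets

# Fabricated example: two filled slots out of the max 100
num = np.array([2])
boxes = np.zeros((100, 4), dtype=np.float32)
boxes[0] = [10, 20, 110, 220]
boxes[1] = [5, 5, 50, 50]
scores = np.zeros(100, dtype=np.float32)
scores[:2] = [0.9, 0.3]
classes = np.zeros(100, dtype=np.int32)

dets = parse_efficient_nms(num, boxes, scores, classes)
print(len(dets))  # 1 (second detection falls below the 0.4 threshold)
```

In the real pipeline the same loop runs inside the custom C++ bbox-parsing function, with the output rectangles scaled back to frame coordinates by DeepStream.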

Why did you customize the bbox parsing function? Is there any issue with “NvDsInferParseCustomEfficientDetTAO” in deepstream_tao_apps/configs/nvinfer/efficientdet_tao/pgie_d0_512_tao_config.txt at release/tao5.1_ds6.4ga · NVIDIA-AI-IOT/deepstream_tao_apps?

Thanks. This works. The documentation is very convoluted. :-)

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.