Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): dGPU
• DeepStream Version: 8.0
• JetPack Version (valid for Jetson only)
• TensorRT Version: As per DS 8.0
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type (questions, new requirements, bugs)
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. which plugin or sample application, and the function description.)
I used the TAO nvcr.io/nvidia/tao/tao-toolkit:6.0.0-tf2 container with the notebook tao-experiments/efficientdet_tf2/efficientdet.ipynb to train, prune, and retrain an EfficientDet-D0 model, which was successful.
DONE (t=0.56s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.613
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.960
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.695
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.277
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.645
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.575
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.635
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.741
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.750
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.386
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.759
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.734
I was also able to export the model to ONNX format successfully.
Loaded saved model from /tmp/tmp7c44ew6u
Using tensorflow=2.17.0, onnx=1.17.0, tf2onnx=1.16.1/3dd772
Using opset <onnx, 13>
Computed 0 values for constant folding
Optimizing ONNX model
After optimization: BatchNormalization -54 (108->54), Cast -38 (48->10), Const -536 (1136->600), GlobalAveragePool +16 (0->16), Greater -1 (1->0), Identity -2 (2->0), Mul -2 (192->190), ReduceMean -16 (16->0), ReduceSum -1 (1->0), Reshape -70 (114->44), Shape -1 (7->6), Squeeze +13 (22->35), Transpose -765 (780->15), Unsqueeze -8 (28->20)
TF2ONNX graph created successfully
[ONNXRuntimeError] : 1 : FAIL : Type Error: Type parameter (T) of Optype (Mul) bound to different types (tensor(int32) and tensor(float) in node (StatefulPartitionedCall/mul).
Found Conv node 'StatefulPartitionedCall/efficientdet-d0/stem_conv/Conv2D' as stem entry
Updating Resize node Resize__870 to [1. 1. 2. 2.]
Found Concat node 'StatefulPartitionedCall/concat' as the tip of class-predict
Found Concat node 'StatefulPartitionedCall/concat_1' as the tip of box-predict
Created NMS plugin 'EfficientNMS_TRT' with attributes: {'plugin_version': '1', 'background_class': -1, 'max_output_boxes': 100, 'score_threshold': 0.4, 'iou_threshold': 0.5, 'score_activation': True, 'box_coding': 1}
The exported model is saved at: /workspace/tao-experiments/efficientdet_tf2/export/efficientdet-d0.onnx
Export finished successfully.
When I check the model's input and output dimensions, I get this:
input: input shape: ['dim?', 512, 512, 3]
outputs: ['num_detections', 'detection_boxes', 'detection_scores', 'detection_classes']
which looks normal.
I also ran evaluate and inference, both of which were successful.
Trt_inference results will be saved at: /workspace/tao-experiments/efficientdet_tf2/export
Starting efficientdet_tf2 trt_inference.
Producing predictions: 100%|██████████| 314/314 [00:09<00:00, 34.18it/s]
Finished inference.
Trt_inference finished successfully.
All the bounding boxes were correctly produced.
My real problem starts when I deploy to DeepStream 8.0. DeepStream expects the model input in NCHW order, while the exported ONNX model is NHWC, which keeps giving me problems.
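From what I can tell from the Gst-nvinfer documentation, the plugin can consume NHWC models if the input order is declared explicitly in the config, though I am not sure this is the intended fix. This is the kind of [property] fragment I have been experimenting with (paths are from my setup; which keys are relevant is my assumption):

```ini
[property]
gpu-id=0
# Engine built by trtexec below; input binding is 1x512x512x3 (NHWC)
model-engine-file=assets/efficientdet-d0_b1_fp16.engine
# 0=NCHW (default), 1=NHWC -- my reading of the nvinfer docs
network-input-order=1
# 0=RGB; the TF2 EfficientDet was trained on RGB input
model-color-format=0
# 2=FP16, matching the trtexec build
network-mode=2
network-type=0
batch-size=1
```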
I also tried trtexec to build the engine file, which was successful.
[10/01/2025-03:55:54] [I] Input binding for input with dimensions 1x512x512x3 is created.
[10/01/2025-03:55:54] [I] Output binding for num_detections with dimensions 1x1 is created.
[10/01/2025-03:55:54] [I] Output binding for detection_boxes with dimensions 1x100x4 is created.
[10/01/2025-03:55:54] [I] Output binding for detection_scores with dimensions 1x100 is created.
[10/01/2025-03:55:54] [I] Output binding for detection_classes with dimensions 1x100 is created.
&&&& PASSED TensorRT.trtexec [TensorRT v100900] [b34] # trtexec --onnx=assets/efficientdet-d0.onnx --saveEngine=assets/efficientdet-d0_b1_fp16.engine --fp16 --minShapes=input:1x512x512x3 --optShapes=input:1x512x512x3 --maxShapes=input:1x512x512x3
But this engine also runs into the same issue.
Opening in BLOCKING MODE
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
0:00:00.685705062 268 0x56292d4d6fc0 WARN nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1260> [UID = 1]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
0:00:00.764122940 268 0x56292d4d6fc0 INFO nvinfer gstnvinfer.cpp:685:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2109> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/deepstream-nvdsanalytics-test/assets/efficientdet-d0_b1_fp16.engine
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:363 [Implicit Engine Info]: layers num: 0
0:00:00.764154641 268 0x56292d4d6fc0 INFO nvinfer gstnvinfer.cpp:685:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2212> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/deepstream-nvdsanalytics-test/assets/efficientdet-d0_b1_fp16.engine
0:00:00.764168807 268 0x56292d4d6fc0 ERROR nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::preparePreprocess() <nvdsinfer_context_impl.cpp:1051> [UID = 1]: RGB/BGR input format specified but network input channels is not 3
ERROR: nvdsinfer_context_impl.cpp:1377 Infer Context prepare preprocessing resource failed., nvinfer error:NVDSINFER_CONFIG_FAILED
0:00:00.773772980 268 0x56292d4d6fc0 WARN nvinfer gstnvinfer.cpp:918:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:00.773790994 268 0x56292d4d6fc0 WARN nvinfer gstnvinfer.cpp:918:gst_nvinfer_start:<primary_gie> error: Config file path: /opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/deepstream-nvdsanalytics-test/assets/pgie_config.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: main:1539: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to create NvDsInferContext instance
Debug info: gstnvinfer.cpp(918): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie:
Config file path: /opt/nvidia/deepstream/deepstream-8.0/sources/apps/sample_apps/deepstream-nvdsanalytics-test/assets/pgie_config.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
[NvMultiObjectTracker] De-initialized
App run failed
Here are my pgie config and the ONNX file. Can you guide me through the steps?
efficientdet-d0.onnx.txt (14.8 MB)
pgie_config.txt (716 Bytes)
