Problems exporting TAO ONNX model to Jetson

Please provide the following information when requesting support.

• Hardware (A100/Orin)
• Network Type (Detectnet_v2)

Hello, I am working on a project that consists of detecting multiple types of vehicles from traffic cameras. In this phase, I am trying to fine-tune detectnet_v2 (trafficcamnet) with a custom dataset. Following the tutorials, I have managed to train some models (unpruned, pruned, and quantized); the training curves look fine, and running “tao evaluate” and “tao inference” on the final weights yields good results.

The problems I am facing right now concern the export and deployment of these new models to the production environment. First I export an ONNX model with “tao export”. The ONNX model is able to find objects of interest in images; however, I do not know how to translate the raw bounding-box outputs into actual bounding boxes in image coordinates (this is not the main issue, since at this point I trust the ONNX model works, but any pointers on how to do this would be appreciated so I can fully trust this step is correct).
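For readers with the same question: DetectNet_v2 emits a per-class coverage heatmap plus per-class box offsets on a downsampled grid. A decoding sketch, assuming the typical TAO defaults of stride 16 and a bbox normalization of 35.0 (both are assumptions and should be checked against your training spec):

```python
# Hedged sketch: decode DetectNet_v2 raw outputs into image-space boxes.
# Assumed defaults (verify against your spec): stride 16, box_norm 35.0,
# 960x544 input -> 60x34 grid.
import numpy as np

STRIDE = 16        # downsampling factor: input size / grid size
BOX_NORM = 35.0    # bbox normalization used at training time (assumption)

def decode_detectnet_v2(cov, bbox, threshold=0.4):
    """cov:  (num_classes, H, W)   sigmoid coverage map
    bbox: (num_classes*4, H, W) normalized box offsets
    Returns a list of (class_id, confidence, x1, y1, x2, y2) in pixels."""
    num_classes, grid_h, grid_w = cov.shape
    # normalized grid-cell centers
    cx = (np.arange(grid_w) * STRIDE + 0.5) / BOX_NORM
    cy = (np.arange(grid_h) * STRIDE + 0.5) / BOX_NORM
    dets = []
    for c in range(num_classes):
        ys, xs = np.where(cov[c] > threshold)
        for y, x in zip(ys, xs):
            o1, o2, o3, o4 = bbox[4 * c:4 * c + 4, y, x]
            # offsets are relative to the cell center, scaled by BOX_NORM
            x1 = (cx[x] - o1) * BOX_NORM
            y1 = (cy[y] - o2) * BOX_NORM
            x2 = (cx[x] + o3) * BOX_NORM
            y2 = (cy[y] + o4) * BOX_NORM
            dets.append((c, float(cov[c, y, x]), x1, y1, x2, y2))
    return dets
```

A clustering/NMS pass (e.g. DBSCAN, as DeepStream does) would normally follow, since coverage fires on several neighboring cells per object.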

Then I copy the ONNX model to my Jetson, and here I run into multiple problems when I start detectNet (passing the labels and input/output names accordingly).

1 - The ONNX model has dynamic shapes (mainly the batch size), which results in an error. I was able to fix this by creating a new ONNX model with a fixed batch size of 1 (but I am not sure whether this causes other errors further down the pipeline). Can we do “tao export” without dynamic axes?
2 - detectNet cannot bind the inputs/outputs: even if I set the correct names in the detectNet initialization, it expects the default names “data” (for the input) and “coverage”/“bboxes” (for the outputs). Again, I was able to fix this by creating a new ONNX model from the original with renamed input/output layers, but I am not sure whether this causes other errors further down the pipeline.
3 - With fixes 1 and 2 I can run detectNet, but I get some weird detections (there are more classes than there should be), suggesting something went wrong in the process.

[tracker] dropped track -1 -> class=39 frames=0
[tracker] dropped track -1 -> class=46 frames=0
[tracker] dropped track -1 -> class=55 frames=0
[tracker] dropped track -1 -> class=8 frames=0
[tracker] dropped track -1 -> class=8 frames=0
[tracker] dropped track -1 -> class=40 frames=0
[tracker] dropped track -1 -> class=67 frames=0
[tracker] added track -1 -> class=24
[tracker] added track -1 -> class=46
[tracker] added track -1 -> class=55
[tracker] added track -1 -> class=39
[tracker] added track -1 -> class=8
[tracker] added track -1 -> class=8
[tracker] added track -1 -> class=40
[tracker] added track -1 -> class=62
[tracker] dropped track -1 -> class=24 frames=0
[tracker] dropped track -1 -> class=55 frames=0
[tracker] dropped track -1 -> class=39 frames=0
[tracker] dropped track -1 -> class=46 frames=0
[tracker] dropped track -1 -> class=8 frames=0

I am pretty sure the error is in the ONNX → TRT step (although an issue in generating the ONNX model cannot be 100% ruled out). I have tried using “trtexec” to generate a TRT engine from the ONNX model and passing that to detectNet, but I get the same result.
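The trtexec invocation looks roughly like this (a sketch; paths and the FP16 flag are assumptions). The shape flags matter for the original dynamic-batch export; I use the renamed input “data” here because the exported name “input_1:0” contains a colon, which collides with trtexec's `name:shape` syntax:

```shell
# Hedged sketch: build a TensorRT engine on the Jetson with trtexec.
# Paths are placeholders; --fp16 is optional. The shape flags pin the
# dynamic batch dimension to 1 via an optimization profile.
/usr/src/tensorrt/bin/trtexec \
    --onnx=trafficcamnet_adapted.onnx \
    --minShapes=data:1x3x544x960 \
    --optShapes=data:1x3x544x960 \
    --maxShapes=data:1x3x544x960 \
    --saveEngine=trafficcamnet_adapted.engine \
    --fp16
```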

I would appreciate help to solve this issue.

Did you refer to section “10. Model Export” of tao_tutorials/notebooks/tao_launcher_starter_kit/detectnet_v2/detectnet_v2.ipynb at main · NVIDIA/tao_tutorials · GitHub to export? Please use tf2onnx.

No, for the current detectnet_v2, export will only produce a dynamic batch size.

Can you open the ONNX file to double-check? The input name is input_1 and the output names are "output_cov/Sigmoid" and "output_bbox/BiasAdd". Refer to tao_tensorflow1_backend/nvidia_tao_tf1/cv/detectnet_v2/export/exporter.py at main · NVIDIA/tao_tensorflow1_backend · GitHub.

Please follow the above-mentioned notebook to generate a tensorrt engine and run inference to double-check.

Yes, this is an example of the command we use for exporting:

tao model detectnet_v2 export \
    -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.hdf5 \
    -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector_unpruned.onnx \
    -e $SPECS_DIR/detectnet_v2_train.txt \
    --gen_ds_config \
    --onnx_route tf2onnx \
    --verbose

Yes, I have a utility to explore ONNX models, and these are indeed the default names for the inputs and outputs. This is the information for the input/output layers of the default trafficcamnet model (downloaded as tlt, exported to ONNX with tao export, and after removing the dynamic batch size and fixing it to 1):

Model Inputs:
Name: input_1:0, Shape: [1, 3, 544, 960], Type: float32

Model Outputs:
Name: output_cov/Sigmoid:0, Shape: [1, 4, 34, 60], Type: float32
Name: output_bbox/BiasAdd:0, Shape: [1, 16, 34, 60], Type: float32

However, when I try to run the model with detectNet

net = detectNet(
    model=args.model,
    labels=args.labels,
    input_blob="input_1",
    output_cvg="output_cov/Sigmoid",
    output_bbox="output_bbox/BiasAdd",
    threshold=args.threshold,
)

I get this error

[TRT]    CUDA engine context initialized on device GPU:
[TRT]       -- layers       30
[TRT]       -- maxBatchSize 1
[TRT]       -- deviceMemory 26112000
[TRT]       -- bindings     3
[TRT]       binding 0
                -- index   0
                -- name    'input_1:0'
                -- type    FP32
                -- in/out  INPUT
                -- # dims  4
                -- dim #0  1
                -- dim #1  3
                -- dim #2  544
                -- dim #3  960
[TRT]       binding 1
                -- index   1
                -- name    'output_cov/Sigmoid:0'
                -- type    FP32
                -- in/out  OUTPUT
                -- # dims  4
                -- dim #0  1
                -- dim #1  4
                -- dim #2  34
                -- dim #3  60
[TRT]       binding 2
                -- index   2
                -- name    'output_bbox/BiasAdd:0'
                -- type    FP32
                -- in/out  OUTPUT
                -- # dims  4
                -- dim #0  1
                -- dim #1  16
                -- dim #2  34
                -- dim #3  60
[TRT]    
[TRT]    3: Cannot find binding of given name: data
[TRT]    failed to find requested input layer data in network
[TRT]    device GPU, failed to create resources for CUDA engine
[TRT]    failed to create TensorRT engine for models/trafficcamnet_adapted2.onnx, device GPU
[TRT]    detectNet -- failed to initialize.

The error is the same if I add “:0” at the end of the names.

It works if I manually change the input/output names to the default ones: “data” for the input and “coverage”/“bboxes” for the outputs.

Model Inputs:
Name: data, Shape: [1, 3, 544, 960], Type: float32

Model Outputs:
Name: coverage, Shape: [1, 4, 34, 60], Type: float32
Name: bboxes, Shape: [1, 16, 34, 60], Type: float32

Since jetson-inference uses TensorRT, I get the exact same errors in both cases (first the dynamic shapes, then the naming of the inputs/outputs, then the model does not detect anything correctly).

I am using the provided trafficcamnet model for debugging as well, not only our fine-tuned model. I would expect it to work, but as mentioned, the ONNX version of trafficcamnet does not work for us. It would be nice to have a reference notebook that showcases this process working end to end (running detectNet with trafficcamnet exported to ONNX from the tlt version).

Thanks for the help.

OK, got it. The trafficcamnet model was trained several years ago with a previous version of detectnet_v2. The input/output names may have changed.

Officially, there is no sample for running the ONNX file with onnxruntime. We suggest using a tensorrt engine.
BTW, for tlt → onnx, you can refer to this link to get the ONNX version of the trafficcamnet tlt model.

There has been no update from you for a while, so we are assuming this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.