SSD Mobilenet v2 on Jetson Nano too slow

Dear Guys,

I got my SSD MobileNet v2 working on the Jetson Nano, but unfortunately it runs very slowly (~2 FPS). I converted my frozen inference graph (.pb) to an ONNX model and call the session.run() function (see script).
I get an ONNX Runtime warning: "CUDA kernel not supported. Fallback to CPU execution provider for Op type: Conv node name: Conv1/BiasAdd", so it seems like the ONNX framework does not support this operation.
But does this alone cause the inference to be that slow?
solectrix_inference.py (1.5 KB)
inference.onnx (4.1 MB)
Is there any way to optimize the inference to get to around 20 FPS?
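In case it helps, here is a simplified sketch of the inference call (the input shape/dtype below are placeholders; the attached script is the real version):

```python
import numpy as np
import onnxruntime as ort

# Load the converted model (attached as inference.onnx)
session = ort.InferenceSession("inference.onnx")

# Placeholder pre-processed frame; the real script feeds camera images
frame = np.random.randint(0, 255, size=(1, 300, 300, 3), dtype=np.uint8)

# Passing None as the output list returns all model outputs
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: frame})
```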
Thank you and have a great day
Toni

edit: here is the console output of the tf2onnx converter:
convert_console_log.txt (22.1 KB)

It seems like onnxruntime ran it on the CPU and not the GPU, which may explain why it only ran at 2 FPS.
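You can verify which execution providers your onnxruntime build offers, and which ones the session actually ended up with, using something like this (a quick sketch using the attached model file):

```python
import onnxruntime as ort

# Providers compiled into this onnxruntime wheel; a GPU-enabled build
# for Jetson should list CUDAExecutionProvider here
print(ort.get_available_providers())

# Ask for CUDA first and fall back to CPU (supported on recent onnxruntime versions)
session = ort.InferenceSession(
    "inference.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# Providers the session is actually using
print(session.get_providers())
```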

To convert SSD-Mobilenet model from TensorFlow to TensorRT, see this project from @AastaLLL : https://github.com/AastaNV/TRT_object_detection

Also, I have SSD-Mobilenet working through ONNX, but it was trained with PyTorch (not TensorFlow) - https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-ssd.md
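For the PyTorch/ONNX route, the runtime side ends up looking roughly like this (a sketch assuming a recent jetson-inference build; the model and label paths are placeholders):

```python
import jetson.inference
import jetson.utils

# Load the ONNX SSD-Mobilenet exported from pytorch-ssd
# (the TensorRT engine is built and cached on the first run)
net = jetson.inference.detectNet(argv=[
    "--model=models/ssd-mobilenet.onnx",
    "--labels=models/labels.txt",
    "--input-blob=input_0",
    "--output-cvg=scores",
    "--output-bbox=boxes",
])

camera = jetson.utils.videoSource("csi://0")       # or "/dev/video0", a video file, ...
display = jetson.utils.videoOutput("display://0")

while display.IsStreaming():
    img = camera.Capture()
    detections = net.Detect(img)
    display.Render(img)
    display.SetStatus("SSD-Mobilenet | {:.0f} FPS".format(net.GetNetworkFPS()))
```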

It ran on the GPU… I logged the hardware stats with tegrastats: log_file.txt (2.6 KB)
While the inference call is running, I see 99% GPU utilization.
I should mention that my model runs in float32 and I haven't overclocked the Jetson Nano yet.

Trying to run inference on my model with a TensorRT engine throws errors that I could not resolve via GitHub or this forum:

It is missing a plugin that appears to be needed to run your TensorFlow model with TensorRT. Aside from providing that plugin, you could try using the TF-TRT library:

It should run the unsupported layers in TensorFlow, and the others in TensorRT.
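With a TF 1.x frozen graph (e.g. the NVIDIA TensorFlow build that ships with JetPack), the TF-TRT conversion looks roughly like this (a sketch; the output node names are the usual TF Object Detection API ones and should be checked against your graph):

```python
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Load the frozen inference graph
with tf.io.gfile.GFile("frozen_graph_infer.pb", "rb") as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())

# Replace TensorRT-compatible subgraphs with TRT engines; unsupported ops
# (e.g. the Where/NonZero part) stay in TensorFlow
converter = trt.TrtGraphConverter(
    input_graph_def=graph_def,
    nodes_blacklist=["detection_boxes", "detection_scores",
                     "detection_classes", "num_detections"],  # graph outputs
    precision_mode="FP16",
    is_dynamic_op=True,
    max_batch_size=1,
)
trt_graph = converter.convert()
```

The resulting trt_graph can then be imported with tf.import_graph_def and run in a normal TensorFlow session.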

Otherwise, I would circle back to the recommendations in my post above to get SSD-Mobilenet model to TensorRT.

Hello,
I was able to identify the specific operation that is not supported and is the root cause.


It seems the mentioned Where layer gets converted to a NonZero op.
Is there any way of dealing with this unsupported operation?
Maybe by replacing it before conversion with the TF API, or afterwards with the TensorRT API?
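For anyone who wants to check their own graph, the offending node can be located in the converted ONNX model with a quick script like this (sketch using the onnx Python package):

```python
import onnx

model = onnx.load("inference.onnx")

# The one-argument form of tf.where becomes an ONNX NonZero op after
# conversion, which TensorRT does not implement
for node in model.graph.node:
    if node.op_type in ("NonZero", "Where"):
        print(node.op_type, node.name, "inputs:", list(node.input))
```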
P.S. Here is my original model: frozen_graph_infer.pb (4.3 MB)
Thank you and have a great day!