I wanted to summarize developers' experience and share some tips about the TensorFlow Object Detection API on the TX2.
For now I am only covering what is and is not doable, with a focus on inference rather than training.
This is what I have understood from browsing the forum, my own experience, and other resources.
Let’s consider only the available pretrained frozen graphs.
The TF Object Detection API can be used for inference with the ssd_mobilenet_v1 network architecture at roughly 5-8 fps. The pretrained Faster R-CNN ResNet models seem to cause OOM errors (in my experience, all of them). Has anybody been able to run one of the Faster R-CNN ResNet models? If so, could you share some tips?
Does the conversion to TensorRT also have an impact on memory usage?
What I mean is the following: even if I cannot run a specific network architecture on the TX2 through the TF Object Detection API because of OOM errors, I could potentially train the model on a different, more powerful machine, export the trained graph to UFF format (through the Python API), and then transfer it to the TX2, where it can be imported with the C++ API for inference. Does that sound correct? Performance would certainly benefit from a TF->TensorRT conversion, but I am not sure about memory usage.
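To make the export step of that workflow concrete, here is a minimal sketch of the conversion on the more powerful machine. It assumes the `uff` Python package that ships with TensorRT is installed there; the file names and the output node name in the usage comment are hypothetical and depend on your model.

```python
import os


def export_frozen_graph_to_uff(frozen_pb_path, output_nodes, uff_path):
    """Convert a frozen TensorFlow graph (.pb) into a UFF file for TensorRT."""
    # Check the inputs first so mistakes surface before the heavy import.
    if not output_nodes:
        raise ValueError("at least one output node name is required")
    if not os.path.isfile(frozen_pb_path):
        raise FileNotFoundError(frozen_pb_path)

    # Imported lazily: the uff package is only needed on the machine
    # that performs the conversion (assumption: it is installed there).
    import uff

    uff.from_tensorflow_frozen_model(
        frozen_pb_path,
        output_nodes,
        output_filename=uff_path,
    )
    return uff_path


# Hypothetical usage; the node name below is a placeholder, not a real
# output name taken from any specific model:
# export_frozen_graph_to_uff("frozen_inference_graph.pb",
#                            ["my_output_node"], "model.uff")
```

The resulting `.uff` file is what would be copied to the TX2 and loaded there. Whether the conversion succeeds still depends on every layer in the graph being supported.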
I am considering this as an option because I noticed a FasterRcnnResnet50 network in jetson-inference DetectNet.
Thanks for your contribution!
Thanks for sharing.
We are also checking the TensorFlow Object Detection API.
We appreciate you sharing your experience with us.
Although TensorFlow runs ssd_mobilenet_v1 correctly in GPU mode, we find that GPU utilization is quite low.
Do you see this issue as well?
Could you share the tegrastats output while running inference with ssd_mobilenet_v1?
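For anyone collecting that data, a small sketch like the following can pull memory usage and GPU load out of a tegrastats line. It assumes the line contains the `RAM used/totalMB` and `GR3D_FREQ n%` fields; the exact layout varies between L4T releases, and the sample line below is made up for illustration, not a real measurement.

```python
import re

_RAM = re.compile(r"RAM (\d+)/(\d+)MB")
_GPU = re.compile(r"GR3D_FREQ (\d+)%")


def parse_tegrastats_line(line):
    """Extract RAM usage and GPU utilization from one tegrastats line."""
    ram = _RAM.search(line)
    gpu = _GPU.search(line)
    return {
        "ram_used_mb": int(ram.group(1)) if ram else None,
        "ram_total_mb": int(ram.group(2)) if ram else None,
        "gpu_util_pct": int(gpu.group(1)) if gpu else None,
    }


# Illustrative line only (assumed format, invented values).
sample = (
    "RAM 3456/7854MB (lfb 120x4MB) "
    "CPU [35%@2035,0%@2035,0%@2035,28%@2035,30%@2035,25%@2035] "
    "EMC_FREQ 12%@1600 GR3D_FREQ 23%@1300"
)
print(parse_tegrastats_line(sample))
```

Averaging `gpu_util_pct` over the lines logged during inference gives a rough utilization figure to compare between runs.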
For your second question:
1. The workflow is correct. The only concern is that we do not yet support the custom layer API for UFF users.
If your model contains an unsupported layer, there is no workaround (WAR) to run that layer with TensorRT.
2. TensorRT supports fp16 mode, which can cut memory usage roughly in half and should be very helpful for your use case.
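As a rough back-of-the-envelope check of the fp16 point, the sketch below compares weight storage at 4 bytes (fp32) and 2 bytes (fp16) per parameter. The parameter count is a hypothetical placeholder, and this only covers weights; activations and TensorRT workspace memory are not included.

```python
def weight_memory_mb(num_params, bytes_per_param):
    """Memory needed to store the weights alone, in MiB."""
    return num_params * bytes_per_param / (1024 ** 2)


# Hypothetical parameter count, chosen only for illustration.
num_params = 25_000_000

fp32_mb = weight_memory_mb(num_params, 4)  # 4 bytes per fp32 value
fp16_mb = weight_memory_mb(num_params, 2)  # 2 bytes per fp16 value

print(f"fp32 weights: {fp32_mb:.1f} MB")
print(f"fp16 weights: {fp16_mb:.1f} MB")
```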
I will soon be looking into the TensorFlow Object Detection API with TensorRT (for the TX2).
Some models of interest are:
Do you have any links specific to using the TensorFlow Object Detection API with TensorRT to get me started?
We recommend first checking whether the layers of your model are well supported by TensorRT.
We have listed the supported layers for the UFF parser and the TensorRT engine in detail here:
UFF parser: Developer Guide :: NVIDIA Deep Learning TensorRT Documentation
TensorRT engine: Developer Guide :: NVIDIA Deep Learning TensorRT Documentation
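One way to do that check is to compare the op types in your graph against the documented list. Here is a minimal sketch. The `supported` set below is a placeholder, not a copy of the official list, and should be filled in from the documentation above. The list of ops is written by hand here; with TensorFlow it could be collected from a loaded GraphDef with `[node.op for node in graph_def.node]`.

```python
def find_unsupported_ops(op_types, supported_ops):
    """Return the sorted op types that are not in supported_ops."""
    return sorted(set(op_types) - set(supported_ops))


# Placeholder set: replace with the ops listed in the TensorRT docs.
supported = {"Conv2D", "Relu", "BiasAdd", "MaxPool"}

# Hand-written stand-in for the op types found in a frozen graph.
ops_in_model = ["Conv2D", "Relu", "Conv2D", "NonMaxSuppressionV2", "Where"]

print(find_unsupported_ops(ops_in_model, supported))
```

An empty result would suggest the graph is worth trying to convert. Anything that is listed needs a plugin or a change to the model.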
Are there any new projects like https://github.com/NVIDIA-Jetson/tf_to_trt_image_classification, but for object detection models using TensorRT?
Regarding the performance of the Object Detection API:
I have been working on this topic for the last 4 months and also started at around 4 fps for MobileNet SSD.
Now I am able to achieve up to 30 fps with the same model on the Jetson. You can have a look at my GitHub repository and try it out (https://github.com/GustavZ/realtime_object_detection).
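Since fps numbers in this thread vary a lot, it helps to measure them the same way. The following is a generic timing sketch, not the method used in the linked repository. `dummy_infer` is a hypothetical stand-in for one inference call (for example, a `session.run` on a single frame).

```python
import time


def measure_fps(infer, num_frames=50, warmup=5):
    """Time num_frames calls of infer() and return frames per second."""
    # Warm-up calls are excluded, since the first runs are often slower.
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(num_frames):
        infer()
    elapsed = time.perf_counter() - start
    return num_frames / elapsed


def dummy_infer():
    # Hypothetical placeholder for a real inference call.
    time.sleep(0.01)


print(f"{measure_fps(dummy_infer, num_frames=20, warmup=2):.1f} fps")
```

Whether image capture, pre-processing and drawing are included in the timed loop changes the result considerably, so it is worth stating that when reporting numbers.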
What I am interested in now is making Mask R-CNN run on the Jetson. Has anybody got it working successfully? Maybe through compression/binarization techniques? Is it documented which Mask R-CNN layers are not supported by TensorRT? Probably a lot…
Anyway, it would be nice to hear about your experience.
Sorry, we are not familiar with Mask R-CNN.
But you can find the detailed list of layers supported by TensorRT 4 here:
Developer Guide :: NVIDIA Deep Learning TensorRT Documentation
Is there a Mask R-CNN sample available for TensorRT 4? I need to know how to create my config.py file to be used as a preprocessor. I am using the Matterport Mask R-CNN model as well.
Mask R-CNN is not among our official samples.
I suppose you will need some plugin implementations to make it work.
You can check this sample for details:
This repo (GitHub - NVIDIA-AI-IOT/tf_trt_models: TensorFlow models accelerated with NVIDIA TensorRT) is a good resource for optimizing TensorFlow classification/detection models with TensorRT. You can achieve up to 10-15 FPS on the Jetson TX2. However, I was not able to get Mask R-CNN working properly in a similar manner. It seems that the mask layer is not yet supported in TensorRT 4.