Convert Faster R-CNN TensorFlow model to TensorRT plan

I am trying to convert my TensorFlow-based Faster R-CNN (pretrained COCO model), which uses Inception V2 as the feature extractor, to a TensorRT plan.

I have gone through the Faster R-CNN example available under /usr/src/tensorrt/samples/sampleFasterRCNN.

I am trying to follow a similar approach for TensorFlow, but I am facing issues fusing the RPN and ROI layers in Python.

Using graph surgeon, I do not know how to create the RPROIFused layer, as there is no documentation or example describing the input and output parameters of TensorRT plugins for Python.

For example, the config for the NMS plugin in TensorRT Python is as follows:

import graphsurgeon as gs

NMS = gs.create_plugin_node(name="NMS", op="NMS_TRT",
                            shareLocation=1,
                            varianceEncodedInTarget=0,
                            backgroundLabelId=0,
                            confidenceThreshold=1e-8,
                            nmsThreshold=0.6,
                            topK=100,
                            keepTopK=100,
                            numClasses=91,
                            inputOrder=[1, 0, 2],
                            confSigmoid=1,
                            isNormalized=1,
                            scoreConverter="SIGMOID")

Similarly, I would like to know the details of how to configure the RPROIFused layer in Python.
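
My best guess so far, based on the parameters of createRPNROIPlugin in the C++ sample, is something like the sketch below, but I cannot confirm the op name or field names for the Python/UFF path; the values are just the ones from sampleFasterRCNN.

import graphsurgeon as gs

# Hypothetical plugin node for the fused RPN + ROIPooling layer.
# The op name "RPROI_TRT" and the field names below mirror the
# createRPNROIPlugin parameters from the C++ sample; whether the
# UFF parser accepts them in this form is unverified.
RPROI = gs.create_plugin_node(name="RPROIFused", op="RPROI_TRT",
                              poolingHeight=7,
                              poolingWidth=7,
                              featureStride=16,
                              preNmsTop=6000,
                              nmsMaxOut=300,
                              iouThreshold=0.7,
                              minBoxSize=16,
                              spatialScale=0.0625,
                              anchorsRatioCount=3,
                              anchorsScaleCount=3,
                              anchorsRatios=[0.5, 1.0, 2.0],
                              anchorsScales=[8.0, 16.0, 32.0])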

Hi,

The Faster R-CNN model is complex and will not work out of the box with TensorRT.
It contains many unsupported nodes and requires a lot of preprocessing to make the model work.

A better workaround is to use TF-TRT, which automatically falls back to the TensorFlow implementation for unsupported nodes.
For more information on TF-TRT, please check this tutorial:
https://github.com/NVIDIA-AI-IOT/tf_trt_models
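
Roughly, a TF-TRT conversion with the TF 1.x contrib API looks like the sketch below; the output node names and the memory/segment settings are placeholders you will need to adjust for your model.

import tensorflow as tf
import tensorflow.contrib.tensorrt as trt  # TF 1.x contrib TF-TRT

# Load the frozen Faster R-CNN graph.
with tf.gfile.GFile("frozen_inference_graph.pb", "rb") as f:
    frozen_graph = tf.GraphDef()
    frozen_graph.ParseFromString(f.read())

# Replace TensorRT-compatible subgraphs with TRT engine ops; the
# unsupported nodes stay in native TensorFlow.
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=["detection_boxes", "detection_scores",
             "detection_classes", "num_detections"],  # placeholder outputs
    max_batch_size=1,
    max_workspace_size_bytes=1 << 26,  # keep small on memory-limited boards
    precision_mode="FP16",
    minimum_segment_size=50)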

Thanks.

AastaLLL, TF-TRT is crashing on Jetson Nano. It looks like insufficient memory when using FP16.

Any ideas how to make this work?

Initially I assumed the pb -> uff -> engine route was the way to go, as the memory-heavy operations would happen on the x86 device.
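
For context, the kind of host-side conversion I had in mind is roughly the sketch below (file names and output node are placeholders, and the exact uff.from_tensorflow keyword arguments may differ between UFF versions):

import graphsurgeon as gs
import uff

# Runs on the x86 host: load the frozen TensorFlow graph, apply the
# graphsurgeon rewrites, and serialize the result to UFF. The engine
# itself still has to be built on the Nano, since TensorRT plans are
# specific to the target GPU and TensorRT version.
graph = gs.DynamicGraph("frozen_inference_graph.pb")  # placeholder path

# ... collapse_namespaces() / plugin-node rewrites would go here ...

uff.from_tensorflow(graph.as_graph_def(),
                    output_nodes=["NMS"],            # placeholder output node
                    output_filename="faster_rcnn.uff")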

Which feature extractor are you using as the backbone for the Faster R-CNN?

On Jetson Nano, Faster R-CNN with VGG16 (Caffe model) and Inception V2 (TensorFlow) backbones won't run due to insufficient memory, regardless of whether you use TensorRT or TensorFlow-TensorRT.

I trained Faster R-CNN after changing the feature extractor from VGG16 to GoogLeNet, converted it to a TensorRT plan, and got it running at 2 FPS (FP32 precision).

If you are using Faster R-CNN because you have to detect smaller objects, then use RetinaNet and optimize the model with TensorRT.

For optimizing RetinaNet, go through this link: https://github.com/NVIDIA/retinanet-examples (fast and accurate object detection with end-to-end GPU optimization).

If your object detection task does not involve smaller objects, it is better to use tiny-yolov3 or SSD.

Neither of them will work for me. Ideally I'd like to use the Inception V2 one, because this is what I deploy to my clients, but if I have to fall back to ResNet for the Nano, then so be it.

Also, I need FP16 due to the low FPS on FP32
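
Enabling FP16 itself is the easy part with the TensorRT 5/6-era Python builder on the Nano; a minimal sketch, with placeholder tensor names and input shape, looks like this:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

# Build an engine from a UFF file with FP16 enabled (TensorRT 5/6-style API).
with trt.Builder(TRT_LOGGER) as builder, \
        builder.create_network() as network, \
        trt.UffParser() as parser:
    builder.max_batch_size = 1
    builder.max_workspace_size = 1 << 28      # keep modest on the Nano
    if builder.platform_has_fast_fp16:
        builder.fp16_mode = True              # request FP16 kernels

    # Placeholder tensor names and input shape.
    parser.register_input("image_tensor", (3, 600, 1000))
    parser.register_output("NMS")
    parser.parse("faster_rcnn.uff", network)

    engine = builder.build_cuda_engine(network)
    with open("faster_rcnn_fp16.plan", "wb") as f:
        f.write(engine.serialize())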

Thanks, I will have a look at RetinaNet!

OK, so it looks like RetinaNet might be what I'm looking for. I like the fact that they do all the training, optimising, etc. in a single project. It should be very easy to pick up.