How to achieve a good performance of MaskRCNN on Jetson Nano

Hello, I’m using TLT with MaskRCNN to train a lane detection model, and I’d like to run this model on the Jetson Nano platform. However, I’m getting very low FPS (the best result was 2 FPS). I followed this tutorial to understand the training process: https://developer.nvidia.com/blog/training-instance-segmentation-models-using-maskrcnn-on-the-transfer-learning-toolkit/

I tried different backbone depths (resnet10, resnet18, and resnet50) and different resolutions, but the performance is still low.
I would like to know if there are other parameters in the spec file that could help me improve the performance of the MaskRCNN model on Jetson Nano. My spec file looks very much like the one in the link mentioned above.

I’m also using DeepStream SDK for inference. Any help would be great. Thanks in advance.

Hi @fredericolms,
Which image_size did you set in the training spec?
If you were using the value mentioned in https://developer.nvidia.com/blog/training-instance-segmentation-models-using-maskrcnn-on-the-transfer-learning-toolkit/, see its “Figure 4. Performance of the Mask R-CNN model with DeepStream”; the FPS reported there seems to match your result.
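For reference, the DeepStream-side settings that usually matter most on Nano are running the engine in FP16 (network-mode=2) and skipping inference on some frames via the nvinfer interval property. A minimal sketch of the relevant lines in a hypothetical nvinfer config (the file paths and model key below are placeholders, not taken from this thread):

```ini
[property]
# Placeholder path and key - replace with your own exported model
tlt-encoded-model=./models/mask_rcnn_lane.etlt
tlt-model-key=<your_ngc_key>
# 0=FP32, 1=INT8, 2=FP16; FP16 is the practical choice on Nano
network-mode=2
# Skip 1 frame between inferences (i.e., infer on every 2nd frame)
interval=1
```

Note that interval trades per-frame detection freshness for throughput, so whether it is acceptable depends on how quickly the lanes change in your video.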

Hello, thanks for the reply! I tried different image_size values. For the 2 FPS result I mentioned, I used an image_size of 256x256 with resnet50 as the backbone. I also used an image_size of 1344x832 (as in the tutorial link above) and the performance was 0.6 FPS on average. That made me think there might be a parameter or something to change in the spec file that would help improve performance. I have two questions:

1 - As you mentioned, Figure 4 shows the FPS on Jetson Nano. But the model trained in that tutorial has 91 classes, while I am training only one class, the “lane” class. Should I expect a difference in performance between a MaskRCNN model trained with 91 classes and one trained with a single class (which is my case)?

2 - If this is how the Jetson Nano performs with MaskRCNN models, is there a more lightweight segmentation model that I can train and still use DeepStream as my “platform” for inference?

Thank you in advance.

On your Nano, have you set max power mode and boosted the clocks?

$ sudo nvpmodel -m 0

$ jetson_clocks

Yes, I got those results while in max power mode and with jetson_clocks running.

Then I suggest you test the Mask R-CNN model mentioned in the blog. It is trained on one class.

The configuration file and label file for the model are provided in the SDK. These files can be used with the generated model as well as your own trained model. A sample Mask R-CNN model trained on a one-class dataset is provided on GitHub: NVIDIA-AI-IOT/deepstream_tao_apps - Sample apps to demonstrate how to deploy models trained with TAO on DeepStream

cd deepstream_tlt_apps/
wget https://nvidia.box.com/shared/static/8k0zpe9gq837wsr0acoy4oh3fdf476gq.zip -O models.zip
unzip models.zip
rm models.zip

Thank you for all the help, @Morganh! I tested this model on my Jetson Nano and got 0.53 FPS on average. So it seems this is the expected performance of MaskRCNN on Jetson Nano, right? Or am I doing something wrong here?
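In case it is useful to anyone reproducing this: the average here is just the reciprocal of the mean per-frame processing time. A tiny sketch (the frame times below are illustrative numbers chosen to land near ~0.53 FPS, not an actual log):

```python
# Average FPS from per-frame processing times.
# Illustrative latencies only (seconds per frame), not a real measurement log.
frame_times = [1.9, 1.85, 1.95, 1.88]

avg_time = sum(frame_times) / len(frame_times)   # mean seconds per frame
avg_fps = 1.0 / avg_time                         # FPS = 1 / latency
print(round(avg_fps, 2))  # → 0.53
```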

I’m afraid your result is similar to Figure 4.

Ok, thank you for the support!