I am using Dusty’s jetson-inference repository to create a real-time object detection program for a custom dataset.
Background info
Right now I am training and predicting on a PC with an NVIDIA graphics card and CUDA installed. The goal is to put a well-trained model (in ONNX format) on the Jetson Nano and perform inference with TensorRT optimization. However, the model is not predicting correctly…
I am using the VisDrone2019-DET dataset, which has 8629 images. I have transformed the data to conform to the VOC standard; this is what my dataset looks like:
VisDrone
├── Annotations
│ ├── 000001.xml
│ ├── 000002.xml
│ ├── 000003.xml
│ ├── 000004.xml
│ ├── 000005.xml
├── ImageSets
│ ├── Main
│ │ ├── test.txt
│ │ └── trainval.txt
├── JPEGImages
│ ├── 000001.jpg
│ ├── 000002.jpg
│ ├── 000003.jpg
│ ├── 000004.jpg
│ └── 000005.jpg
└── label.txt
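To rule out path problems, a quick consistency check over this layout can confirm that every image ID has both a .jpg and an .xml (a minimal sketch, assuming the dataset root is ./VisDrone; adjust the path to your setup):
from pathlib import Path

# Minimal consistency check of the VOC-style layout above
# (assumes the dataset root is ./VisDrone)
root = Path('VisDrone')
ids = (root / 'ImageSets/Main/trainval.txt').read_text().split()
missing = [i for i in ids
           if not (root / 'JPEGImages' / f'{i}.jpg').exists()
           or not (root / 'Annotations' / f'{i}.xml').exists()]
print(f'{len(ids)} ids listed, {len(missing)} missing an image or annotation')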
Example of the annotation files:
<annotation>
    <object>
        <name>person</name>
        <bndbox>
            <xmax>374</xmax>
            <xmin>160</xmin>
            <ymax>481</ymax>
            <ymin>217</ymin>
        </bndbox>
        <truncation>0</truncation>
        <occlusion>0</occlusion>
    </object>
    <object>
        <name>person</name>
        <bndbox>
            <xmax>20</xmax>
            <xmin>1</xmin>
            <ymax>98</ymax>
            <ymin>1</ymin>
        </bndbox>
        <truncation>0</truncation>
        <occlusion>0</occlusion>
    </object>
    <object>
        <name>car</name>
        <bndbox>
            <xmax>375</xmax>
            <xmin>310</xmin>
            <ymax>289</ymax>
            <ymin>165</ymin>
        </bndbox>
        <truncation>0</truncation>
        <occlusion>0</occlusion>
    </object>
</annotation>
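Since a bad conversion is a common cause of poor training, I also sanity-checked the boxes themselves. A minimal sketch (the glob path is an assumption; it only uses the fields shown above) that flags degenerate coordinates:
import glob
import xml.etree.ElementTree as ET

# Flag boxes with xmin >= xmax, ymin >= ymax, or negative coordinates
for path in glob.glob('VisDrone/Annotations/*.xml'):
    for obj in ET.parse(path).getroot().findall('object'):
        name = obj.find('name').text
        box = obj.find('bndbox')
        xmin, ymin = int(box.find('xmin').text), int(box.find('ymin').text)
        xmax, ymax = int(box.find('xmax').text), int(box.find('ymax').text)
        if xmin >= xmax or ymin >= ymax or xmin < 0 or ymin < 0:
            print(f'{path}: suspicious {name} box {(xmin, ymin, xmax, ymax)}')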
The data looks good to me, although I suspect it may be too hard for this choice of model. I wanted to post an annotated example image from the dataset, but the forum won’t allow me to add more than one embedded picture per topic…
To train the model, I am executing this:
python train_ssd.py --net mb2-ssd-lite --pretrained-ssd models/mb2-lite.pth --data ../../VisDrone --model-dir models/visdrone_model --dataset-type voc --epochs 100
I have played around a little with --batch-size and --learning-rate, but when testing different options with fewer epochs, the results are still pretty bad.
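For example, a variant like this (the --batch-size and --learning-rate values here are arbitrary starting points to experiment with, not known-good settings):
python train_ssd.py --net mb2-ssd-lite --pretrained-ssd models/mb2-lite.pth --data ../../VisDrone --model-dir models/visdrone_model --dataset-type voc --epochs 100 --batch-size 32 --learning-rate 0.01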
This is the average precision of the model (note that it is calculated on the training images themselves, so it should be an optimistic estimate):
python eval_ssd.py --net mb2-ssd-lite --trained_model models/visdrone_model/mb2-ssd-lite-Epoch-99-Loss-6.650372860745224.pth --dataset_type voc --dataset ../../VisDrone/ --label_file models/visdrone_model/labels.txt --eval_dir models/visdrone_model_eval
[...redacted...]
Average Precision Per-class:
car: 0.08366060023699787
motor: 0.0036423337837340614
person: 0.0023940388984447654
pedestrian: 0.002341346690480933
awning tricycle: 0.002820879803813936
tricycle: 0.0019855374143438924
bicycle: 0.0004966070280048979
truck: 0.053817137171613544
van: 0.01781248643689646
bus: 0.11825235193326702
Average Precision Across All Classes: 0.02872233193975974
To predict images with run_ssd_example.py, I made one change, since it gave the error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
That’s why I changed the line:
predictor = create_mobilenetv2_ssd_lite_predictor(net, candidate_size=200)
to:
import torch
predictor = create_mobilenetv2_ssd_lite_predictor(net, candidate_size=200, device=torch.device('cuda:0'))
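For context, the surrounding code in run_ssd_example.py now looks roughly like this (a sketch; the model and label paths are the ones from my training run above):
import torch
from vision.ssd.mobilenet_v2_ssd_lite import create_mobilenetv2_ssd_lite, create_mobilenetv2_ssd_lite_predictor

# Roughly the relevant part of run_ssd_example.py after the change
model_path = 'models/visdrone_model/mb2-ssd-lite-Epoch-99-Loss-6.650372860745224.pth'
label_path = 'models/visdrone_model/labels.txt'
class_names = [name.strip() for name in open(label_path).readlines()]

# Create the net and predictor on the same device to avoid the
# cuda:0 vs cpu mismatch
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
net = create_mobilenetv2_ssd_lite(len(class_names), is_test=True)
net.load(model_path)
predictor = create_mobilenetv2_ssd_lite_predictor(net, candidate_size=200, device=device)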
Then, using run_ssd_example.py, I predicted objects in the same image. The prediction looks really bad to me…
To me it looks like the training did some things right, but you can see that it is really not good enough.
Please note that this prediction is done with PyTorch, not with TensorRT. When I use detectnet with TensorRT, I get similar results though.
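For reference, the detectnet invocation I mean is along these lines (following the Hello AI World tutorial; the .onnx filename is whatever onnx_export.py produced, and the input/output file names are placeholders):
detectnet --model=models/visdrone_model/ssd-mobilenet.onnx --labels=models/visdrone_model/labels.txt --input-blob=input_0 --output-cvg=scores --output-bbox=boxes input.jpg output.jpg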
Now that I’ve flooded you with information, I have some questions for you:
- Any ideas for changing the batch size and learning rate (or other options) to make the training work better?
- As you can see, the precision and the predictions are really bad; any idea why? I thought maybe the dataset is too hard for SSD MobileNet V2 Lite. If so, how can I improve on this while keeping a real-time model?
Let me know if you could use any more information!