To train a customized DetectNet, ideally your training dataset would consist of several thousand annotated images. You should be able to detect pedestrian- or animal-sized objects like those in the tutorial without changing the detection sizes in the network prototxt, but for smaller objects such as faces or facial features, you may want to decrease those dimensions. You can consult the facenet model, which was trained on FDDB. I would recommend trying FDDB first, or this DIGITS pedestrian detection tutorial with KITTI.
I trained a DetectNet model to detect fish with roughly 3,000 images in the training set. During training, I got mAP as high as 78.6, and the resulting model seemed to work OK on new (previously unseen) images. Check out my blog posts if you are interested.
A rough rule of thumb is that you need about one training image per learned parameter in your network. If your network has a million parameters (many large convolution kernels with many output channels, and so on), that's likely to mean millions of images!
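If you want to see where your network stands relative to that rule of thumb, here is a minimal sketch that counts the learnable parameters in a Caffe model with pycaffe (assuming caffe is on your PYTHONPATH; the prototxt path below is just a placeholder for your own):

```python
import caffe

caffe.set_mode_cpu()
# Placeholder path -- point this at your own DetectNet prototxt.
net = caffe.Net('deploy.prototxt', caffe.TEST)

total = 0
for layer_name, blobs in net.params.items():
    # Each entry holds the learnable blobs for one layer
    # (e.g. weights and biases for a convolution layer).
    layer_count = sum(blob.data.size for blob in blobs)
    print('{:<25s} {:>12,d}'.format(layer_name, layer_count))
    total += layer_count

print('Total learnable parameters: {:,d}'.format(total))
```

Compare the total against the size of your training set to get a feel for how far into over-fitting territory you might be.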
You can of course get good results with fewer training images, but there is a significant risk of over-fitting in that case. Over-fitting is what is happening when training reaches a low loss but validation loss stays high: the network memorizes the training set instead of generalizing.
To reduce over-fitting, try adding a few dropout layers to your network during training, setting the dropout ratio in each such layer to perhaps 50% or so.
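For reference, here is a minimal sketch of what that looks like using pycaffe's NetSpec; the layer names and dimensions are made up for illustration, and you would splice the generated Dropout layer into your own train_val prototxt wherever it fits:

```python
from caffe import layers as L
from caffe import NetSpec

n = NetSpec()
# Hypothetical input shape and fully connected layer, just for illustration.
n.data = L.Input(shape=dict(dim=[1, 3, 384, 1248]))
n.fc6 = L.InnerProduct(n.data, num_output=512)
n.relu6 = L.ReLU(n.fc6, in_place=True)
# 50% dropout; Caffe only applies it during the TRAIN phase and rescales the
# surviving activations, so the layer is a pass-through at test time.
n.drop6 = L.Dropout(n.relu6, dropout_ratio=0.5, in_place=True)

# Emits the prototxt fragment you can paste into your network definition.
print(n.to_proto())
```

Using in_place=True keeps the layer from allocating an extra blob, which is the usual convention for ReLU and Dropout in Caffe networks.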