DetectNet Training with my own data set

Hi all,
I want to train my own SSD model to use in detectNet, using my own dataset of images. Since I already have the images, I don't want to use the live camera, but I still want to be able to create a labels.txt file and go through each image drawing bounding boxes like in the tutorial. Is there a way to do this? I don't see any instructions for using precollected data. Any help would be appreciated!

Hi @jpurpura37, you can use the CVAT tool from CVAT.org (it runs right in your browser) to annotate your existing data with bounding boxes. Then export your annotated dataset from CVAT in Pascal VOC format, add a labels.txt for it, and you should be good to go.

Once I did that, it created an annotations folder with XML files representing each annotation, and a labels.txt file as well. It does not have the actual images with the boxes overlaid on them. Can this format be used in the train_ssd.py script, or do I need to use other third-party software such as Roboflow to actually overlay the bounding boxes on the images?

CVAT generates some label/colors file, but I believe that is not the one you want. Make a labels.txt that has one class name per line, and that's it.
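For example, for a hypothetical dataset with two classes, the entire labels.txt would just be:

    cat
    dog

(those class names are only placeholders, use whatever classes you annotated in CVAT, spelled the same way)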

The annotated dataset doesn't have the ground-truth bounding boxes actually drawn on the images; they are just stored in the XML files. Once you train the model and run inference, you can save the predicted bounding boxes on a copy of the images. Maybe CVAT has a feature where it can save the images with the ground-truth annotations overlaid, but I haven't looked into that before.
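For reference, each XML file under Annotations/ stores the boxes in the standard Pascal VOC layout, roughly like this (the filename, image size, class name, and coordinates here are just made-up examples):

    <annotation>
        <filename>image001.jpg</filename>
        <size>
            <width>1280</width>
            <height>720</height>
            <depth>3</depth>
        </size>
        <object>
            <name>cat</name>
            <bndbox>
                <xmin>100</xmin>
                <ymin>150</ymin>
                <xmax>400</xmax>
                <ymax>500</ymax>
            </bndbox>
        </object>
    </annotation>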

Okay, so how do I train the model and run inference if I just have the XML files? It seems like in the tutorial, Open Images provides the pictures with the boxes already overlaid on them. I guess I'm confused about what the input to the train_ssd.py script is.

That is a visualization through the Open Images web viewer. When the data is actually downloaded, it comes separately as the original images and the annotations (in different files).

You must still have the original images from before you annotated them? If you annotated in CVAT and then downloaded in Pascal VOC format, the original images will be included in the dataset you downloaded from CVAT (under JPEGImages/).

For custom datasets, I recommend using the Pascal VOC format with train_ssd.py. And there, the original images are stored under JPEGImages/ and the annotations are under Annotations/. CVAT should do all this for you already, and all you have to do is make the labels.txt for it.
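So a complete dataset folder should look something like this (the dataset name is just an example):

    data/my-dataset/
        Annotations/        # one .xml file per image, containing the bounding boxes
        ImageSets/Main/     # text files listing which images belong to train/val/test
        JPEGImages/         # the original, unmodified images
        labels.txt          # one class name per line (you create this yourself)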


So when I converted to Pascal VOC format from CVAT and downloaded everything, this is what it gave me. Do I put this in my ssd/data directory and then run the train_ssd.py script on it? The ImageSets folder also doesn't have the actual images in it, just their file names.

Hmm, that is strange; it should have JPEGImages/ too. Do you have a folder somewhere else with all of your images stored in it? If so, can you create a JPEGImages/ folder in your dataset and put them in there? I wonder if there was an option when you exported from CVAT to also export the images, or some other setting that I can't recall.
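If you do have the originals saved somewhere, something like this should work from inside your exported dataset folder (the source path is just an example):

    mkdir JPEGImages
    cp /path/to/your/original/images/*.jpg JPEGImages/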

Yes, that is correct. You need to create a labels.txt inside your dataset (with one class name per line) and then run train_ssd.py like this:

https://github.com/dusty-nv/jetson-inference/blob/master/docs/pytorch-collect-detection.md#training-your-model
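From that page, the command looks roughly like this (substitute your own dataset and model directories):

    python3 train_ssd.py --dataset-type=voc --data=data/<YOUR-DATASET> --model-dir=models/<YOUR-MODEL>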
