Help with re-training a network

Hi, following on from my previous post Segnet: Help to prepare image for better recognition,
I am following @Dusty’s suggestion and having a crack at training my own model to include only the few classes I am interested in.
BUT I want to recycle the many images already classified in two existing datasets:
the Pascal VOC 2012 set for ‘cat’ and ‘person’, and the SUN RGB-D set for ‘floor’ and ‘person’.
To these I am adding 120 or so images of my own - special ‘floor’ ones.
Again following suggestions, I am working through the steps in https://www.highvoltagecode.com/post/edge-ai-semantic-segmentation-on-nvidia-jetson

Cutting to the chase: the training module there wants the image plus a Segmentation Class mask (a .png) for each image.
I can’t find the set of Segmentation Class PNG images that match these datasets anywhere. For the VOC set there is a SegmentationClass folder, but it seems to be a ‘taster’ (part of the original competition) and only contains about 2900 images, vs the 17125 JPEG images in the set.
There is also an ‘Annotations’ folder, but those XML files only specify rectangles.

The SUN set has nothing equivalent as an output, but its annotations do specify polygons.
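In principle I could rasterise those polygons into mask PNGs myself. A minimal sketch of the idea (the annotation structure and class IDs here are placeholders, not SUN’s actual format):

```python
# sketch: rasterize polygon annotations into a single-channel class-ID mask
# (the annotation layout and class IDs below are placeholders, not SUN's format)
from PIL import Image, ImageDraw

width, height = 640, 480
annotations = [
    # hypothetical polygon for 'floor', assigned class ID 1
    {"class_id": 1, "points": [(10, 380), (630, 380), (630, 470), (10, 470)]},
]

mask = Image.new("L", (width, height), 0)   # 0 = background
draw = ImageDraw.Draw(mask)
for ann in annotations:
    draw.polygon(ann["points"], fill=ann["class_id"])

mask.save("floor_mask.png")
```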

Any idea where I could get the datasets and the masks? Ten years after the competition finished, somebody must have saved them.

On a rainy Sunday afternoon I thought to regenerate the masks by simply pushing each image through segnet.py and saving the mask.
Great idea, except the masks that come out of that are not sharp if ‘linear’ filtering is used, or look like blocky Pac-Man figures if ‘point’ is used.
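Roughly what I was doing, as a sketch (assuming the jetson.inference Python bindings; a gray8 buffer should make net.Mask() write class IDs rather than colours, if I’ve read segnet.py right):

```python
# rough sketch of the rainy-afternoon script (assumes jetson-inference bindings)
import jetson.inference
import jetson.utils

net = jetson.inference.segNet("fcn-resnet18-voc-320x320")  # pretrained VOC model
img = jetson.utils.loadImage("2007_005608.jpg")

# single-channel buffer so the mask comes out as class IDs, not colors
mask = jetson.utils.cudaAllocMapped(width=img.width, height=img.height,
                                    format="gray8")

net.Process(img)
net.Mask(mask, filter_mode="linear")   # or "point"
jetson.utils.saveImage("2007_005608.png", mask)
```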
How can I get it to regenerate masks like the one below from the VOC set?
[attached: VOC image 2007_005608 and its ground-truth segmentation mask]


Thanks again
JC

Hi @jc5p, I’m not exactly sure, but I don’t believe that Pascal VOC includes segmentation ground-truth for all the images in the dataset. If you check the number of entries in ImageSets/Segmentation/trainval.txt, it matches the number of images in the SegmentationClass folder (2913), so I don’t think this is a mistake. I’m not sure whether previous years of the Pascal VOC dataset have additional segmented images that differ from VOC2012.
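You can verify the counts quickly with something like this (adjust root to wherever you extracted the dataset):

```python
# sanity check: segmentation split list vs masks on disk
import os

root = "VOCdevkit/VOC2012"  # path to your extracted VOC2012

with open(os.path.join(root, "ImageSets/Segmentation/trainval.txt")) as f:
    listed = [line.strip() for line in f if line.strip()]

masks = [p for p in os.listdir(os.path.join(root, "SegmentationClass"))
         if p.endswith(".png")]

print(len(listed), len(masks))  # both should be 2913
```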

For SUN RGB-D, there is a script here that re-maps the colors to Pascal VOC format: https://github.com/Onixaz/pytorch-segmentation/blob/master/datasets/sun_remap.py
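The general idea is just a per-color lookup over the mask pixels, something like the sketch below (this color_map is invented for illustration; the actual mapping is in sun_remap.py):

```python
# illustrative color remap: swap each source mask color for a VOC palette color
# (this color_map is invented; the real mapping lives in sun_remap.py)
import numpy as np
from PIL import Image

color_map = {
    (100, 200, 100): (128, 0, 0),   # hypothetical SUN color -> a VOC palette color
    (200, 100, 100): (192, 0, 0),   # another hypothetical pairing
}

mask = np.array(Image.open("sun_mask.png").convert("RGB"))
out = np.zeros_like(mask)           # anything unmapped becomes background (0,0,0)
for src, dst in color_map.items():
    out[(mask == src).all(axis=-1)] = dst

Image.fromarray(out).save("sun_mask_voc.png")
```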

I faintly recall using this repo to pre-process the SUN metadata, but it was a long time ago: https://github.com/ankurhanda/sunrgbd-meta-data

@Dusty thanks, that final link looks promising; I will check it out.

Any thoughts on getting a finer grain from the mask? Does net.Process have more parameters, for instance?
Thanks JC

If you run it with --filter-mode=linear it will perform bilinear filtering on the mask, but that is already the default.

The size of the output grid (which creates the mask) is proportional to the size of the input imagery the model was trained on. So the bigger the model, the higher-resolution the mask (however the runtime performance will also be lower). For example, the cityscapes-2048x2048 model has a bigger mask than cityscapes-1024x1024.
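For example, you can inspect the raw grid a given model produces, a sketch (assuming the net.GetGridSize() binding):

```python
# inspect the raw output grid that the mask gets upsampled from
import jetson.inference

net = jetson.inference.segNet("fcn-resnet18-voc-320x320")
grid_width, grid_height = net.GetGridSize()
print(grid_width, grid_height)  # bigger model -> bigger grid -> finer mask
```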
