SegNet: Help to prepare images for better recognition

Hi, I am starting to play with segnet and the SUN RGB-D model as a method of navigating my toy robot around the house.
The demos gave me great hope that ‘floor’ could be recognised, and that is a start to navigating around obstacles.

My test images, all taken by the robot and therefore close to the ground, have all failed with network=fcn-resnet18-sun.
From previous trials - non-Jetson and no AI fancy stuff - this was because of the light reflected off the tiles and, to a lesser extent, the grouting between the tiles (which was solved by blurring the image a bit).
The output of the example program also shows that the ‘floor’ is not detected right up to the front of the device.
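For anyone wanting to try the same trick, this is roughly the kind of pre-processing I had in mind for taming the reflections and grout lines before handing the frame to segnet. It is a plain NumPy sketch (a real pipeline would more likely use cv2.GaussianBlur from OpenCV); the kernel size and the highlight cap are guesses that would need tuning:

```python
import numpy as np

def box_blur(gray, k=3):
    """Mean filter over a k x k window (edge-replicated).
    Softens grout lines between tiles; stand-in for a proper Gaussian blur."""
    pad = k // 2
    p = np.pad(gray.astype(np.float32), pad, mode="edge")
    out = np.zeros(gray.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + gray.shape[0], dx:dx + gray.shape[1]]
    return (out / (k * k)).astype(gray.dtype)

def suppress_highlights(gray, cap=200):
    """Clip bright specular reflections off shiny tiles to a ceiling value.
    The cap of 200 is an untested guess."""
    return np.minimum(gray, cap)
```

Whether clipping highlights actually helps the SUN-trained model is an open question - the network was trained on un-doctored imagery, so any pre-processing shifts the input distribution as well.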

Does anybody have any suggestions about how the input image could be ‘doctored’ so that it is more compatible with fcn-resnet18-sun?
It would be a lot of work to create my own model - apart from the photographic effort every image would have to be edited to remove the rest of the room.

When it boils down to what I have envisaged, the only relevant classes are:

  • floor - where it can go
  • person - for other processing to see who
  • cat - to avoid or maybe to chase
  • anything else - an ‘obstacle’ to be avoided
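For what it is worth, collapsing the network's per-pixel class map down to those four categories could look something like this. The class IDs below are placeholders I made up, not the real IDs from the model - check the classes file shipped with fcn-resnet18-sun before using them:

```python
import numpy as np

# Placeholder class IDs -- look up the real ones in the model's class list.
FLOOR, PERSON, CAT = 2, 12, 8
KEEP = {FLOOR: 1, PERSON: 2, CAT: 3}  # everything else -> 0 (obstacle)

def remap_mask(class_mask):
    """Collapse a per-pixel class-ID mask into 4 navigation categories:
    0 = obstacle, 1 = floor, 2 = person, 3 = cat."""
    out = np.zeros_like(class_mask)
    for src, dst in KEEP.items():
        out[class_mask == src] = dst
    return out
```

This keeps the pre-trained model as-is and just post-processes its output, which is a lot cheaper than retraining.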



Do you get the output with the jetson-inference example as below:


Hi @jc5p, unfortunately I don’t have a great idea of how to transform your imagery such that it works better with the pre-trained model. However, if you wanted to explore training your own segmentation model, here is a tutorial about that:

@AastaLLL Yes, that is where the info for the test came from.
@dusty_nv Thanks for the pointer to how to train a segmentation model. As I suspected, it is a daunting and very time-consuming task. Which is why I was thinking of cheating!

Do you know if you can get access to the images from the pre-trained models and then reassemble them? For example, all the ‘cat’ and ‘dog’ images from the VOC set plus all the ‘floor’ and ‘person’ images from the SUN set, which could then be augmented by, in my case, extra ‘floor’ images?


Sure, here are links to these datasets:

I don’t think it would be as simple as only choosing the “floor” images, for example, because an image typically has several segmentation classes within it. What you may want/need to do is pre-process these datasets and only select the classes that you want to use for the mask images, and discard images that contain none of the classes you want.
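A minimal sketch of that filtering step, assuming the masks are per-pixel class-ID arrays. The IDs in WANTED are hypothetical and depend on each dataset's own label file:

```python
import numpy as np

WANTED = {3, 11}  # hypothetical dataset IDs for 'floor' and 'person'
OBSTACLE = 0      # collapsed label for every unwanted class

def filter_and_collapse(mask):
    """Return a collapsed mask keeping only the wanted classes,
    or None if the image has none of them and should be discarded."""
    present = WANTED.intersection(np.unique(mask).tolist())
    if not present:
        return None
    return np.where(np.isin(mask, list(WANTED)), mask, OBSTACLE)
```

You would run this over every mask in each dataset, dropping the image/mask pairs that come back None, before combining the surviving pairs into your training set.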


I think I understand what you are saying. I will make an attempt to get the datasets and pre-process them.
If I am not back in one year - send help. :)

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.