I’ve got my object detection working for the most part. Now I’d like to extend it to determine where in the image a detected object is located — for example, whether the object is on the driveway, the street, the yard, etc. I’m struggling to figure out a way of doing this.
The only approach that comes to mind is to label each pixel in some kind of structure and use the object’s coordinates to look up where it is. But labeling every pixel by hand would take a long time. Is there another approach, or an easy way to label each pixel?
Hi,
Maybe you can try semantic segmentation.
It classifies every pixel of an image into one of several classes, which should meet your requirement.
Below is an example with SegNet for your reference:
# Semantic Segmentation with SegNet
The next deep learning capability we'll cover in this tutorial is **semantic segmentation**. Semantic segmentation is based on image recognition, except the classifications occur at the pixel level as opposed to the entire image. This is accomplished by *convolutionalizing* a pre-trained image recognition backbone, which transforms the model into a [Fully Convolutional Network (FCN)](https://arxiv.org/abs/1605.06211) capable of per-pixel labeling. Especially useful for environmental perception, segmentation yields dense per-pixel classifications of many different potential objects per scene, including scene foregrounds and backgrounds.
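The idea of *convolutionalizing* a classifier can be illustrated numerically: a classification head that multiplies one pooled feature vector by a weight matrix can instead be applied at every spatial cell of the backbone's feature map, which is exactly a 1×1 convolution and yields a class map instead of a single label. The shapes, weights, and class names below are hypothetical, not taken from any real model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical backbone output: an 8x8 grid of 16-dim feature vectors
features = rng.standard_normal((8, 8, 16))

# Classifier weights for 4 hypothetical classes (street, driveway, yard, other).
# A plain image classifier applies W once to a pooled feature vector;
# "convolutionalizing" applies the same W at every spatial location (a 1x1 conv).
W = rng.standard_normal((16, 4))

# Per-image classification: global-average-pool, then one matrix multiply
pooled = features.mean(axis=(0, 1))        # shape (16,)
image_scores = pooled @ W                  # shape (4,)  -> one label per image

# Per-pixel (per-cell) classification: apply W at every grid cell
pixel_scores = features @ W                # shape (8, 8, 4)
class_map = pixel_scores.argmax(axis=-1)   # shape (8, 8) -> one label per cell

print(class_map.shape)  # (8, 8)
```

In a real FCN the coarse class map is then upsampled back to the input resolution to produce the per-pixel mask.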
<img src="https://github.com/dusty-nv/jetson-inference/raw/pytorch/docs/images/segmentation.jpg">
[`segNet`](../c/segNet.h) accepts as input the 2D image, and outputs a second image with the per-pixel classification mask overlay. Each pixel of the mask corresponds to the class of object that was classified. [`segNet`](../c/segNet.h) is available to use from [Python](https://rawgit.com/dusty-nv/jetson-inference/pytorch/docs/html/python/jetson.inference.html#segNet) and [C++](../c/segNet.h).
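Once you have such a per-pixel class mask, answering the original question is a small lookup: take the detection's bounding box and see which class dominates the pixels underneath it. Here is a minimal, self-contained sketch with a toy NumPy mask; the class IDs and the `region_under_box` helper are made up for illustration (a real mask would come from `segNet`, and its IDs depend on the model used):

```python
import numpy as np

# Hypothetical class IDs -- real IDs depend on the segmentation model
CLASSES = {0: "street", 1: "driveway", 2: "yard"}

def region_under_box(mask, box):
    """Return the majority class label under a detection bounding box.

    mask -- 2D array of per-pixel class IDs (same size as the image)
    box  -- (left, top, right, bottom) in the same coordinate space
    """
    left, top, right, bottom = box
    patch = mask[top:bottom, left:right]
    ids, counts = np.unique(patch, return_counts=True)
    return CLASSES[ids[counts.argmax()]]

# Toy 6x8 mask: left half street, right half yard, small driveway patch
mask = np.zeros((6, 8), dtype=int)
mask[:, 4:] = 2
mask[4:, 2:4] = 1

print(region_under_box(mask, (4, 0, 8, 6)))  # yard
print(region_under_box(mask, (0, 0, 2, 6)))  # street
```

Taking the majority class over the whole box is a simple heuristic; for objects that sit *on* a surface, sampling only the bottom rows of the box often works better.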
As examples of using the `segNet` class, we provide sample programs in C++ and Python:
- [`segnet.cpp`](../examples/segnet/segnet.cpp) (C++)
- [`segnet.py`](../python/examples/segnet.py) (Python)
These samples are able to segment images, videos, and camera feeds. For more info about the various types of input/output streams supported, see the [Camera Streaming and Multimedia](aux-streaming.md) page.
See [below](#pretrained-segmentation-models-available) for various pre-trained segmentation models available that use the FCN-ResNet18 network with realtime performance on Jetson. Models are provided for a variety of environments and subject matter, including urban cities, off-road trails, and indoor office spaces and homes.
Thanks.
This topic was automatically closed 14 days after the last reply (January 26, 2022). New replies are no longer allowed.