Free space detection using jetson inference segmentation

OK great, glad you got it working how you wanted it to.

One item to point out: if you wish to run this on a real-time stream, it will help performance to avoid allocating new CUDA buffers each frame. Instead, you can do something like this:

class_mask = None
class_mask_np = None

while True:   # your camera loop
    ret, frame = cap.read()
    # your other code here

    # allocate the class mask buffer once, on the first frame
    if class_mask is None:
        class_mask = jetson.utils.cudaAllocMapped(width=img_width, height=img_height, format="gray8")
        class_mask_np = jetson.utils.cudaToNumpy(class_mask)

This way, the CUDA memory is only allocated once. You also only need to call cudaToNumpy() once per buffer - the mapping is persistent. Any changes you make in numpy will show up in CUDA memory, and vice versa (because it's mapped to the same memory). Your other cudaToNumpy() calls only need to be done once as well.
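For example, inside the camera loop the pre-allocated buffer can then be reused every frame. A minimal sketch, assuming the segmentation network is already loaded as net, net.Process() has already been called for the current frame, and FREE_SPACE_CLASS_ID is a hypothetical ID for your drivable-area class (not a jetson-inference constant):

# after net.Process(cuda_frame) for the current frame:
net.Mask(class_mask, img_width, img_height)   # writes the per-pixel class IDs into class_mask
jetson.utils.cudaDeviceSynchronize()          # wait for the GPU before reading on the CPU

# class_mask_np is the persistent numpy view of the same memory - no new cudaToNumpy() needed
free_space = (class_mask_np == FREE_SPACE_CLASS_ID)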

Hello Dusty,

Thank you so much for your suggestion.
I updated the code as per your suggestion.
It is working like a charm for 'class_mask', but it is not working for 'cuda_frame'.
At the end of this code the img variable looks like a masked image. Instead, I want img to be the original image (with the CUDA optimization) so that I can use this img together with the class_mask output to create an overlay image using OpenCV. Right now both are masked images, so the final overlaid image is also a masked image.

Can I not use the CUDA optimization for cuda_frame? As suggested, I wanted to call cudaToNumpy() only once here for cuda_frame as well. I was able to get frames, but they were all masked.

Below is a snippet of the code I tried. It is an extension of the previously posted code.

  # Allocate buffer for cuda_frame
  if cuda_frame is None:
    cuda_frame = jetson.utils.cudaAllocMapped(width=img_width, height=img_height, format="rgba8")
    img = jetson.utils.cudaToNumpy(cuda_frame, img_width, img_height, 4)

  frame_rgba = cv2.cvtColor(frame, cv2.COLOR_BGR2RGBA)
  cuda_frame = jetson.utils.cudaFromNumpy(frame_rgba)

  # process the segmentation network
  net.Process(cuda_frame)
  num_classes = net.GetNumClasses()
  jetson.utils.cudaDeviceSynchronize()
  img = cv2.cvtColor(img, cv2.COLOR_RGBA2RGB).astype(np.uint8)
  img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR) 

  # Allocate buffer for mask
  if class_mask is None:
    class_mask = jetson.utils.cudaAllocMapped(width=img_width, height=img_height, format="rgb8")
    class_mask_np = jetson.utils.cudaToNumpy(class_mask)
 
  # get the class mask (each pixel contains the classID for itself)
  net.Mask(class_mask, img_width, img_height, format="rgb8")

Hello Dusty,

Apart from the above clarification, I would need another valuable input from your side.
If I can fix the above optimisation issue, then I am almost done with the free-space segmentation task.

As part of my project, I now have another task.

Now, for my robotic application, my task is to track a moving person on the free space detected above. I guess it has something to do with depth estimation, but I have no idea about it.

Since I am limited to using jetson-inference (it was already used for object detection and free-space segmentation), is there any way to track a moving object or person using jetson-inference?
OR
Is there any way I can combine jetson-inference with any other available tracking algorithm?

Any guidance or a GitHub link to explore would be really helpful for achieving my goal.

Thanks and Regards,
Udaykiran Patnaik.

I think you may not be able to use it for cuda_frame, because that comes from your cv2.VideoCapture(), which returns a new frame each time. You also don't need to allocate CUDA memory yourself for this. I think you may just go back to:

import cv2
import jetson.utils

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
frame_rgba = cv2.cvtColor(frame, cv2.COLOR_BGR2RGBA)   # OpenCV captures in BGR, convert to RGBA
cuda_frame = jetson.utils.cudaFromNumpy(frame_rgba)    # copies the frame into a new CUDA buffer

Or really, you could just use jetson.utils.videoSource() and it will already be in CUDA for you.
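As a rough sketch of that approach (assuming a V4L2 camera at /dev/video0 and the fcn-resnet18-cityscapes model - adjust both to your setup), the whole loop then stays in CUDA without any cv2 color conversions or cudaFromNumpy() copies:

import jetson.inference
import jetson.utils

net = jetson.inference.segNet("fcn-resnet18-cityscapes")
camera = jetson.utils.videoSource("v4l2:///dev/video0")   # or "csi://0" for a MIPI CSI camera

class_mask = None
class_mask_np = None

while True:
    img = camera.Capture()           # already a CUDA image, no conversion needed
    net.Process(img)

    # allocate the class mask once, using the capture dimensions
    if class_mask is None:
        class_mask = jetson.utils.cudaAllocMapped(width=img.width, height=img.height, format="gray8")
        class_mask_np = jetson.utils.cudaToNumpy(class_mask)

    net.Mask(class_mask, img.width, img.height)
    jetson.utils.cudaDeviceSynchronize()
    # class_mask_np now holds this frame's per-pixel class IDs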

Hello Dusty,

Thank you so much.
I understood.

I think this solves my free space detection problem as of now.

Thanks and Regards,
Udaykiran Patnaik.

I haven't done tracking before with jetson-inference, but VPI (Vision Programming Interface) has a tracking algorithm: VPI - Vision Programming Interface: KLT Bounding Box Tracker

VPI doesn't have a Python interface yet; that will be coming in a future version. DeepStream has tracking too.

Hello Dusty,
Thank you so much for your quick reply.
I understood.

Thanks and Regards,
Udaykiran Patnaik.

No problem - by the way, if you don't need temporal tracking, you could simply do it as you outlined in the other thread.

The VPI or DeepStream-based tracking would be useful if you wanted certainty that the bounding box belongs to the same person frame-to-frame (for example, if multiple people were in the camera frame).

"Tracking by detection" is essentially what was being referred to in the other thread - since these object detection DNNs are fairly accurate, they produce detection bounding boxes a good amount of the time (although there is noise). However, the object detection DNN doesn't know whether it is the same person in the bounding box - just that it is a person (any person).
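As a concrete illustration of tracking-by-detection, here is a rough sketch (assuming detectNet with the ssd-mobilenet-v2 COCO model, where "person" is one of the class labels, and a hypothetical V4L2 camera URI). It detects people every frame and naively follows the box whose center is closest to the previous one - this is not the VPI or DeepStream trackers, just the simple approach described above:

import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.videoSource("v4l2:///dev/video0")   # hypothetical camera URI

prev_center = None   # center of the tracked person's box from the previous frame

while True:
    img = camera.Capture()
    detections = net.Detect(img)

    # keep only 'person' detections
    people = [d for d in detections if net.GetClassDesc(d.ClassID) == "person"]
    if not people:
        continue

    if prev_center is None:
        # start with the most confident person
        target = max(people, key=lambda d: d.Confidence)
    else:
        # otherwise pick the detection closest to where the person was last frame
        target = min(people, key=lambda d: (d.Center[0] - prev_center[0]) ** 2 +
                                           (d.Center[1] - prev_center[1]) ** 2)

    prev_center = target.Center
    # target.Left/Top/Right/Bottom is the bounding box to follow over the free space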

Hello Dusty,

Thank you so much for your reply.
I understood. I will try what i have outlined in other thread.

Thanks and Regards,
Udaykiran Patnaik.