Instead of using the unet human segmentation package used in the nvblox examples, is it possible to use something like sam2 for segmenting humans?
Hello @haroonhidayath45,
Thanks for posting in the Isaac ROS forum!
Yes, you can build your own SAM2 ROS node that outputs a per-pixel human mask to feed into nvblox. While doing so, you’ll need to remove the original UNet segmentation node from the launch file (e.g., realsense_example.launch.py) and start your SAM2 node instead. Make sure to remap the SAM2 mask output (binary human mask image) to the topic that nvblox originally subscribes to, so the entire pipeline processes it correctly.
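To make the remapping concrete, here is a hedged sketch of the relevant part of a modified launch file. All package, executable, and topic names below are assumptions for illustration; check the actual mask topic nvblox subscribes to in your setup (e.g. with `ros2 topic list` or `rqt_graph`) before wiring it up:

```python
# Sketch of the SAM2 node entry in a modified launch file
# (e.g. realsense_example.launch.py), with the UNet node removed.
from launch import LaunchDescription
from launch_ros.actions import Node

def generate_launch_description():
    sam2_node = Node(
        package='my_sam2_segmentation',   # hypothetical package name
        executable='sam2_node',           # hypothetical executable name
        # Remap the SAM2 mask output onto the topic nvblox already
        # subscribes to (previously published by the UNet pipeline);
        # '/unet/raw_segmentation_mask' is an assumed topic name.
        remappings=[('/sam2/mask', '/unet/raw_segmentation_mask')],
    )
    return LaunchDescription([sam2_node])
```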
Are there any requirements other than image size for the mask image? From what I understand, it only requires a mask with humans as 255 and background as 0, right?
Yes, your understanding is basically correct, but there are a few strict requirements beyond just image size. The mask must:
- Be a `mono8` image with humans as non-zero (e.g. 255) and background as 0
- Have the same width and height as the color image that nvblox uses
- Use the same `frame_id` as that color image
- Have a timestamp (`header.stamp`) synchronized with the color image
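Putting those requirements together, a minimal conversion from a SAM2 output (typically a float probability map) to an nvblox-compatible mono8 mask could look like the sketch below; the 0.5 threshold is an assumption, and the header-copying step is noted in comments:

```python
import numpy as np

def sam2_to_nvblox_mask(prob_map: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Convert a SAM2 float mask (H x W, values in [0, 1]) into a
    mono8-style array: humans -> 255, background -> 0."""
    return np.where(prob_map > threshold, 255, 0).astype(np.uint8)

# In the ROS node, this array would be wrapped in a sensor_msgs/Image
# with encoding 'mono8', and header.stamp / header.frame_id copied
# verbatim from the incoming color image message so nvblox can
# associate the mask with the matching color frame.
prob = np.array([[0.9, 0.1],
                 [0.6, 0.4]])
mask = sam2_to_nvblox_mask(prob)
print(mask.dtype, mask.tolist())  # uint8 [[255, 0], [255, 0]]
```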
Good luck with your implementation!