Publish a ROS Image message from Camera and use in Isaac ROS Image Segmentation

Assuming one has the camera working with the Nvidia Jetson Nano developer kit, validated through:

gst-launch-1.0 nvarguscamerasrc ! \
  'video/x-raw(memory:NVMM), width=(int)1920, height=(int)1080, format=(string)NV12, framerate=(fraction)30/1' ! \
  nvvidconv ! video/x-raw, format=I420 ! x264enc ! \
  h264parse ! qtmux ! filesink location=output.mp4 -e

How should one go about publishing the camera data as a ROS message, e.g. in the sensor_msgs/Image format? I want to run Isaac ROS image segmentation on a live camera feed instead of playing the ROS bag of sample data.

The platform is the Nvidia Orin Nano Developer Kit, and the camera is the Raspberry Pi High Quality Camera (IMX477).

Secondly, how should one go about subscribing to the ROS camera message (instead of the ROS bag) for image segmentation, i.e. which commands should be modified or added from the quickstart guidelines?

Hi @hassaniqbal209, since you are using Isaac ROS, please see the Argus camera node: https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_argus_camera

There is also documentation on running the image segmentation node with MIPI CSI camera here: https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_image_segmentation/blob/main/docs/tutorial-nitros-graph.md
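
If the Argus node turns out not to work with your camera, another option is to wrap the GStreamer pipeline you already validated in a small rclpy publisher. A minimal sketch (not an official example; it assumes your OpenCV build has GStreamer support, and the topic name, resolution, and frame rate are placeholders to adapt):

import cv2
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

# CSI capture pipeline adapted from the gst-launch command above; appsink hands BGR frames to OpenCV.
PIPELINE = (
    "nvarguscamerasrc ! video/x-raw(memory:NVMM), width=1920, height=1080, "
    "format=NV12, framerate=30/1 ! nvvidconv ! video/x-raw, format=BGRx ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink drop=1"
)

class CsiCameraPublisher(Node):
    def __init__(self):
        super().__init__('csi_camera_publisher')
        self.pub = self.create_publisher(Image, 'image_raw', 10)
        self.bridge = CvBridge()
        self.cap = cv2.VideoCapture(PIPELINE, cv2.CAP_GSTREAMER)
        self.timer = self.create_timer(1.0 / 30.0, self.grab)

    def grab(self):
        ok, frame = self.cap.read()
        if not ok:
            self.get_logger().warn('no frame from camera')
            return
        msg = self.bridge.cv2_to_imgmsg(frame, encoding='bgr8')
        msg.header.stamp = self.get_clock().now().to_msg()
        msg.header.frame_id = 'camera'
        self.pub.publish(msg)

def main():
    rclpy.init()
    rclpy.spin(CsiCameraPublisher())

if __name__ == '__main__':
    main()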

If you have additional questions about Isaac ROS, I would recommend posting to the Isaac ROS forum.

Hi @dusty_nv ,

Thanks for your response. Can you please explain the difference between using Isaac ROS and, for example, this framework: https://github.com/dusty-nv/ros_deep_learning (deep learning inference nodes for ROS/ROS2 with support for NVIDIA Jetson and TensorRT)?

The eventual goal is to run Segment-Anything on the Nvidia Orin Nano Developer Kit on a live camera feed. I assumed Isaac ROS is the way to go, but if there is a simpler way to approach this problem, could you please elaborate on the steps?

Segment-Anything uses 1) an image encoder that is implemented in PyTorch and requires a GPU for efficient inference, and 2) a prompt encoder and mask decoder that can run directly in PyTorch or be converted to ONNX and run efficiently on CPU or GPU across a variety of platforms that support ONNX Runtime.
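
For reference, the usage we have in mind follows the pattern from the segment-anything README, roughly like this (the checkpoint file, input frame, and prompt point below are placeholders):

from segment_anything import sam_model_registry, SamPredictor
import cv2
import numpy as np

# Heavy image encoder runs in PyTorch on the GPU.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # placeholder checkpoint file
sam.to("cuda")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)  # placeholder frame
predictor.set_image(image)  # computes the image embedding (the expensive step)

# Lightweight prompt encoder + mask decoder; this part can also be exported to ONNX.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[640, 360]]),  # example foreground prompt point
    point_labels=np.array([1]),
    multimask_output=True,
)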

Appreciate your guidance a lot!

Isaac ROS is newer, more optimized, and provides more GPU-accelerated inferencing/processing nodes. It also integrates with Isaac Sim. Whereas ros_deep_learning is basically a ROS wrapper for jetson-inference.

I’m not sure what modifications would be required to the isaac_ros_unet segmentation node in order to run segment-anything, because it’s currently set up for U-Net. Until you have the segment-anything model exported to ONNX and running in TensorRT, you might just roll your own ROS node for it that uses PyTorch.
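
A skeleton for such a node could look something like the sketch below (topic names are arbitrary, and segment_fn is a placeholder for whatever PyTorch/segment-anything inference you plug in):

import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

class SegmentationNode(Node):
    def __init__(self, segment_fn):
        super().__init__('segmentation_node')
        self.bridge = CvBridge()
        self.segment_fn = segment_fn  # e.g. wraps SamPredictor from the snippet above
        self.sub = self.create_subscription(Image, 'image_raw', self.on_image, 10)
        self.pub = self.create_publisher(Image, 'segmentation_mask', 10)

    def on_image(self, msg):
        frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding='rgb8')
        mask = self.segment_fn(frame)  # expected to return an HxW uint8 mask
        out = self.bridge.cv2_to_imgmsg(mask, encoding='mono8')
        out.header = msg.header  # keep the camera timestamp and frame_id
        self.pub.publish(out)

def main():
    rclpy.init()
    # segment_fn below is a dummy threshold; plug in your PyTorch/SAM inference instead.
    node = SegmentationNode(segment_fn=lambda rgb: (rgb[..., 0] > 128).astype('uint8') * 255)
    rclpy.spin(node)

if __name__ == '__main__':
    main()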

@dusty_nv, thank you for the clarification.

Until you have the segment-anything model exported to ONNX and running in TensorRT, you might just roll your own ROS node for it that uses PyTorch.

  1. How does one publish camera data when using your above suggestion? If my understanding is correct, this option is different from Isaac ROS, implying I may not be able to use your earlier proposed solution of the Argus camera node? It also seems the cameras needed for our application aren’t compatible with the Argus camera node.

  2. To use the ONNX model, the image must first be pre-processed using SamPredictor (along with the exported ONNX model). Is there a way to optimize SamPredictor for the Orin Nano Developer Kit?

Thanks so much for the guidance.

The ROS camera publisher nodes are independent of the processing nodes. Camera nodes (such as argus_camera from Isaac ROS, video_source from ros_deep_learning, or v4l2_camera from ROS2) all publish the sensor_msgs/Image message type, which the processing nodes subscribe to. So you can connect them up however you need in your launch file since they use the same messages.
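
For example, a launch file might look roughly like this (a sketch only: v4l2_camera is used as the publisher here, and the segmentation package/executable/topic names are placeholders for whatever node you end up running):

from launch import LaunchDescription
from launch_ros.actions import Node

def generate_launch_description():
    return LaunchDescription([
        # Camera publisher: any node that outputs sensor_msgs/Image works here
        # (argus_camera, video_source, v4l2_camera, or your own publisher).
        Node(
            package='v4l2_camera',
            executable='v4l2_camera_node',
            name='camera',
        ),
        # Processing node: placeholder package/executable; remap its image input
        # to the topic the camera actually publishes on (here /image_raw).
        Node(
            package='my_segmentation_pkg',
            executable='segmentation_node',
            name='segmentation',
            remappings=[('image', '/image_raw')],
        ),
    ])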

Do you have a requirement to integrate with ROS as part of a bigger project, or are you just trying to capture live camera data and interface your DNNs with it? If the latter, you can do that in your own Python scripts without needing ROS. From Python, you can directly capture the video data using cv2.VideoCapture, GStreamer, or the jetson-inference libraries and pass it to your PyTorch/TensorRT code.
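
In that case the capture-to-model path can be as short as something like this sketch (the pipeline caps are placeholders and torch.nn.Identity just stands in for your actual model):

import cv2
import torch

# CSI capture without ROS: same kind of pipeline as the gst-launch test, ending in appsink.
pipeline = (
    "nvarguscamerasrc ! video/x-raw(memory:NVMM), width=1280, height=720, "
    "format=NV12, framerate=30/1 ! nvvidconv ! video/x-raw, format=BGRx ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink drop=1"
)
cap = cv2.VideoCapture(pipeline, cv2.CAP_GSTREAMER)

model = torch.nn.Identity().cuda().eval()  # placeholder: swap in your PyTorch/SAM model

while cap.isOpened():
    ok, frame_bgr = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    # HWC uint8 -> 1x3xHxW float tensor on the GPU
    x = torch.from_numpy(rgb).permute(2, 0, 1).unsqueeze(0).float().cuda() / 255.0
    with torch.no_grad():
        out = model(x)  # or predictor.set_image(rgb) / predictor.predict(...) for SAM

cap.release()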

I haven’t deployed segment-anything to Jetson, so I’m not sure, sorry. If there isn’t a way to export the entire model to ONNX and run it with TensorRT, you could try torch2trt; otherwise, use TensorRT for the parts that can be exported and run the other parts in PyTorch with the GPU enabled.
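
The torch2trt usage pattern is compact, roughly like this (based on its documented workflow; resnet18 here is only a stand-in for the exportable part of your model, and the input shape is whatever you plan to deploy with):

import torch
from torch2trt import torch2trt
from torchvision.models import resnet18

# Stand-in module: replace with the part of your network that converts cleanly;
# anything that fails to convert can stay as regular PyTorch on the GPU.
model = resnet18().eval().cuda()
x = torch.ones((1, 3, 224, 224)).cuda()  # example input with your deployment shape

model_trt = torch2trt(model, [x], fp16_mode=True)  # builds a TensorRT engine for this module

with torch.no_grad():
    y = model(x)
    y_trt = model_trt(x)
    print('max abs difference:', (y - y_trt).abs().max().item())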
