Isaac_ros_foundationpose for pallets

Hi,

I would like to estimate the pose of a pallet. I tried to run isaac_ros_foundationpose on a rosbag, with a 3D mesh and a texture image as input, but I was not able to obtain any output because isaac_ros_rtdetr with the sdetr_amr model did not detect my pallet. Can the model detect arbitrary pallets zero-shot, or is retraining required? If you could provide a rosbag containing pallets, I could also test my setup against it.

Another question: isaac_ros_foundationpose requires a depth image, an RGB image, and a segmentation mask as input. In my setup, the segmentation mask is published at a lower frequency than the depth and RGB images. Does isaac_ros_foundationpose take care of the time synchronization automatically?
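
In case the synchronization is not handled internally, here is a minimal sketch of how I would pair the three streams myself with message_filters. The depth and mask topic names are placeholders from my setup, and the slop value is just a guess:

import rclpy
from rclpy.node import Node
import message_filters
from sensor_msgs.msg import Image


class SyncProbe(Node):
    """Logs whenever an rgb/depth/mask triple falls within `slop` seconds."""

    def __init__(self):
        super().__init__('sync_probe')
        rgb = message_filters.Subscriber(self, Image, 'd455_1_rgb_image')
        depth = message_filters.Subscriber(self, Image, 'd455_1_depth_image')  # placeholder topic
        mask = message_filters.Subscriber(self, Image, 'segmentation_mask')    # placeholder topic
        # A generous slop lets the lower-rate mask pair with the nearest
        # rgb/depth stamps instead of requiring exact timestamp matches.
        self._sync = message_filters.ApproximateTimeSynchronizer(
            [rgb, depth, mask], queue_size=30, slop=0.1)
        self._sync.registerCallback(self.on_synced)

    def on_synced(self, rgb_msg, depth_msg, mask_msg):
        stamp = rgb_msg.header.stamp
        self.get_logger().info(f'synced triple at {stamp.sec}.{stamp.nanosec:09d}')


def main():
    rclpy.init()
    rclpy.spin(SyncProbe())


if __name__ == '__main__':
    main()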

Many Thanks,

I ran the sdetr_amr model on the r2b_storage bag from the r2b dataset 2023 | NVIDIA NGC, and the model was not able to detect the pallets. Please find the screenshots attached.
Screenshot 1:

Screenshot 2:

My launch command is:

ros2 launch isaac_ros_rtdetr.launch.py launch_fragments:=rtdetr interface_specs_file:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_rtdetr/quickstart_interface_specs.json engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/synthetica_detr/sdetr_amr.plan

The corresponding launch file is pasted below.

# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
# Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# SPDX-License-Identifier: Apache-2.0

import launch
from launch.actions import DeclareLaunchArgument
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import ComposableNodeContainer
from launch_ros.descriptions import ComposableNode

MODEL_INPUT_SIZE = 640  # RT-DETR models expect 640x640 encoded image size
MODEL_NUM_CHANNELS = 3  # RT-DETR models expect 3 image channels


def generate_launch_description():
    """Generate launch description for testing relevant nodes."""
    launch_args = [
        DeclareLaunchArgument(
            'model_file_path',
            default_value='',
            description='The absolute file path to the ONNX file'),
        DeclareLaunchArgument(
            'engine_file_path',
            default_value='',
            description='The absolute file path to the TensorRT engine file'),
        DeclareLaunchArgument(
            'input_image_width',
            default_value='640',
            description='The input image width'),
        DeclareLaunchArgument(
            'input_image_height',
            default_value='480',
            description='The input image height'),
        DeclareLaunchArgument(
            'input_tensor_names',
            default_value='["images", "orig_target_sizes"]',
            description='A list of tensor names to bound to the specified input binding names'),
        DeclareLaunchArgument(
            'input_binding_names',
            default_value='["images", "orig_target_sizes"]',
            description='A list of input tensor binding names (specified by model)'),
        DeclareLaunchArgument(
            'output_tensor_names',
            default_value='["labels", "boxes", "scores"]',
            description='A list of tensor names to bound to the specified output binding names'),
        DeclareLaunchArgument(
            'output_binding_names',
            default_value='["labels", "boxes", "scores"]',
            description='A list of output tensor binding names (specified by model)'),
        DeclareLaunchArgument(
            'verbose',
            default_value='False',
            description='Whether TensorRT should verbosely log or not'),
        DeclareLaunchArgument(
            'force_engine_update',
            default_value='False',
            description='Whether TensorRT should update the TensorRT engine file or not'),
    ]

    # Image Encoding parameters
    input_image_width = LaunchConfiguration('input_image_width')
    input_image_height = LaunchConfiguration('input_image_height')

    # TensorRT parameters
    model_file_path = LaunchConfiguration('model_file_path')
    engine_file_path = LaunchConfiguration('engine_file_path')
    input_tensor_names = LaunchConfiguration('input_tensor_names')
    input_binding_names = LaunchConfiguration('input_binding_names')
    output_tensor_names = LaunchConfiguration('output_tensor_names')
    output_binding_names = LaunchConfiguration('output_binding_names')
    verbose = LaunchConfiguration('verbose')
    force_engine_update = LaunchConfiguration('force_engine_update')

    resize_node = ComposableNode(
        name='resize_node',
        package='isaac_ros_image_proc',
        plugin='nvidia::isaac_ros::image_proc::ResizeNode',
        parameters=[{
            'input_width': input_image_width,
            'input_height': input_image_height,
            'output_width': MODEL_INPUT_SIZE,
            'output_height': MODEL_INPUT_SIZE,
            'keep_aspect_ratio': True,
            'encoding_desired': 'rgb8',
            'disable_padding': True
        }],
        remappings=[
            #('image', '/camera/color/image_flipped'),
            ('image', 'd455_1_rgb_image'),
            #('camera_info', '/camera/color/camera_info')
            ('camera_info', 'd455_1_rgb_camera_info')
        ]
    )

    pad_node = ComposableNode(
        name='pad_node',
        package='isaac_ros_image_proc',
        plugin='nvidia::isaac_ros::image_proc::PadNode',
        parameters=[{
            'output_image_width': MODEL_INPUT_SIZE,
            'output_image_height': MODEL_INPUT_SIZE,
            'padding_type': 'BOTTOM_RIGHT'
        }],
        remappings=[(
            'image', 'resize/image'
        )]
    )

    image_format_node = ComposableNode(
        name='image_format_node',
        package='isaac_ros_image_proc',
        plugin='nvidia::isaac_ros::image_proc::ImageFormatConverterNode',
        parameters=[{
                'encoding_desired': 'rgb8',
                'image_width': MODEL_INPUT_SIZE,
                'image_height': MODEL_INPUT_SIZE
        }],
        remappings=[
            ('image_raw', 'padded_image'),
            ('image', 'image_rgb')]
    )

    image_to_tensor_node = ComposableNode(
        name='image_to_tensor_node',
        package='isaac_ros_tensor_proc',
        plugin='nvidia::isaac_ros::dnn_inference::ImageToTensorNode',
        parameters=[{
            'scale': False,
            'tensor_name': 'image',
        }],
        remappings=[
            ('image', 'image_rgb'),
            ('tensor', 'normalized_tensor'),
        ]
    )

    interleave_to_planar_node = ComposableNode(
        name='interleaved_to_planar_node',
        package='isaac_ros_tensor_proc',
        plugin='nvidia::isaac_ros::dnn_inference::InterleavedToPlanarNode',
        parameters=[{
            'input_tensor_shape': [MODEL_INPUT_SIZE, MODEL_INPUT_SIZE, MODEL_NUM_CHANNELS]
        }],
        remappings=[
            ('interleaved_tensor', 'normalized_tensor')
        ]
    )

    reshape_node = ComposableNode(
        name='reshape_node',
        package='isaac_ros_tensor_proc',
        plugin='nvidia::isaac_ros::dnn_inference::ReshapeNode',
        parameters=[{
            'output_tensor_name': 'input_tensor',
            'input_tensor_shape': [MODEL_NUM_CHANNELS, MODEL_INPUT_SIZE, MODEL_INPUT_SIZE],
            'output_tensor_shape': [1, MODEL_NUM_CHANNELS, MODEL_INPUT_SIZE, MODEL_INPUT_SIZE]
        }],
        remappings=[
            ('tensor', 'planar_tensor')
        ],
    )

    rtdetr_preprocessor_node = ComposableNode(
        name='rtdetr_preprocessor',
        package='isaac_ros_rtdetr',
        plugin='nvidia::isaac_ros::rtdetr::RtDetrPreprocessorNode',
        remappings=[
            ('encoded_tensor', 'reshaped_tensor')
        ]
    )

    tensor_rt_node = ComposableNode(
        name='tensor_rt',
        package='isaac_ros_tensor_rt',
        plugin='nvidia::isaac_ros::dnn_inference::TensorRTNode',
        parameters=[{
            'model_file_path': model_file_path,
            'engine_file_path': engine_file_path,
            'output_binding_names': output_binding_names,
            'output_tensor_names': output_tensor_names,
            'input_tensor_names': input_tensor_names,
            'input_binding_names': input_binding_names,
            'verbose': verbose,
            'force_engine_update': force_engine_update
        }]
    )

    rtdetr_decoder_node = ComposableNode(
        name='rtdetr_decoder',
        package='isaac_ros_rtdetr',
        plugin='nvidia::isaac_ros::rtdetr::RtDetrDecoderNode',
        parameters=[{
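            # Detections scoring below this threshold are discarded; lowering
            # it can help when debugging missed detections.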
            'confidence_threshold': 0.5
        }]
    )

    container = ComposableNodeContainer(
        name='rtdetr_container',
        namespace='rtdetr_container',
        package='rclcpp_components',
        executable='component_container_mt',
        composable_node_descriptions=[
            resize_node, pad_node, image_format_node,
            image_to_tensor_node, interleave_to_planar_node, reshape_node,
            rtdetr_preprocessor_node, tensor_rt_node, rtdetr_decoder_node
        ],
        output='screen'
    )

    final_launch_description = launch_args + [container]
    return launch.LaunchDescription(final_launch_description)

Please provide pointers on why this might be the case.

Many Thanks

Thank you for your post,

I am still investigating how to set up your environment to reproduce this on my device.

Thanks @Raffaello

While you investigate that issue, I also wanted to ask about the following log message from the FoundationPose node. It appears that FoundationPose has started without any errors, but there is no pose output from the node.

Please note that this setup is running on a different custom rosbag.

Thanks.

Thank you for your message,

Looking at your logs, the issue may be related to the output frame rate. Isaac ROS FoundationPose requires more time to analyze the input stream, which could be causing the delay.
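
To confirm, you can measure the actual input rates while the pipeline is running, for example:

$ ros2 topic hz /d455_1_rgb_image
$ ros2 topic hz <your_depth_topic>
$ ros2 topic hz <your_segmentation_topic>

(The first topic name is taken from your launch file above; replace the placeholders with the topics from your setup.)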

Best,
Raffaello

Thanks @Raffaello

In this case, I’m trying to estimate the pose of a pallet using FoundationPose. The detection mask comes from my custom model, since SyntheticaDETR was not detecting it. Even though all three input images arrive at a similar frequency, there is no output from the FoundationPose GEM.

My 3D model of the pallet looks like this:

Below is my RViz view:

Below is my console output from the FoundationPose GEM:

Do you have any pointers on why this might be the case? The FoundationPose quickstart using your example rosbag worked just fine. Is there a specific setup required for the FoundationPose model to work?

I look forward to your response.

Many Thanks

@Raffaello

Hi,

Just an update: I was able to estimate the pose of the pallet by running the FoundationPose model directly on my input (by following the steps in the official repository).

For some reason, it is not working inside the Isaac ROS environment, and I’m not able to understand why.

A few pointers from your side might help. Please let me know if you need further information from me.

Thanks

Hi,

Thanks for your status update.
Could you please check whether your customized detector detects pallets for FoundationPose in ROS?

$ ros2 topic echo /detections_output

Best,
Todd

Hi @ToddT,

As a first step, I am directly publishing a static image containing the segmentation mask of the object. The FoundationPose node receives the RGB image, depth image, and segmentation mask from three topics, but I see no output from the node.
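
For completeness, the static mask is published along these lines (a rough sketch; the file path, topic name, frame id, and rate are specific to my setup). I restamp each message with the current time so it can line up with the live RGB and depth streams:

import cv2
import rclpy
from cv_bridge import CvBridge
from rclpy.node import Node
from sensor_msgs.msg import Image


class StaticMaskPublisher(Node):
    """Republishes one pre-rendered mask image with fresh timestamps."""

    def __init__(self):
        super().__init__('static_mask_publisher')
        self._bridge = CvBridge()
        # mono8 mask: 255 where the pallet is, 0 elsewhere (path is a placeholder).
        self._mask = cv2.imread('/tmp/pallet_mask.png', cv2.IMREAD_GRAYSCALE)
        self._pub = self.create_publisher(Image, 'segmentation_mask', 10)  # placeholder topic
        self._timer = self.create_timer(1.0 / 30.0, self.publish_mask)     # roughly the rgb rate

    def publish_mask(self):
        msg = self._bridge.cv2_to_imgmsg(self._mask, encoding='mono8')
        # Restamp so the mask can be associated with the live rgb/depth frames.
        msg.header.stamp = self.get_clock().now().to_msg()
        msg.header.frame_id = 'd455_1_color_optical_frame'  # placeholder frame id
        self._pub.publish(msg)


def main():
    rclpy.init()
    rclpy.spin(StaticMaskPublisher())


if __name__ == '__main__':
    main()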

Thanks,

Hi @amu459

As a first step, I am directly publishing a static image containing the segmentation mask of the object. The FoundationPose node receives the RGB image, depth image, and segmentation mask from three topics, but I see no output from the node.

Could you please share the launch file and the corresponding image so we can try this on our end?

Best,
Todd

Hello, I am facing the same problem you did: the rgb_image, depth_image, detection, and segmentation topics are all publishing normally, but the log says it is waiting for the receiver. Have you solved the problem?

Hi @Miahhhh,

Thank you for your post.
Could you please share your environment info?
Are you running FoundationPose on Orin or x86, and which camera module are you using?

Best,
-Todd

Thank you for your reply.
I am running on x86, Ubuntu 22.04, and the camera is a RealSense D435i.



The image and the log are shown below:

Hi @Miahhhh,

This warning message appears when the camera detection pipeline frame rate is low, but it doesn’t affect the function of FoundationPose as long as the segmentation topic is producing messages.
This can happen when using a RealSense camera, because it is a USB camera.

Are you using the .obj and .png files mentioned in the isaac_ros_foundationpose — isaac_ros_docs documentation and launching the pipeline with the command below?

ros2 launch isaac_ros_examples isaac_ros_examples.launch.py launch_fragments:=realsense_mono_rect_depth,foundationpose mesh_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/Mac_and_cheese_0_1/Mac_and_cheese_0_1.obj texture_path:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/Mac_and_cheese_0_1/materials/textures/baked_mesh_tex0.png score_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/score_trt_engine.plan refine_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/refine_trt_engine.plan rt_detr_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/synthetica_detr/sdetr_grasp.plan

Thanks,
-Todd

@ToddT I didn’t use the launch file you mentioned. I use YOLOv8 to detect the bottle for my own 3D mesh model, and I revised the foundationpose_realsense launch file to use YOLOv8 in place of the RT-DETR detection algorithm.
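
For reference, FoundationPose consumes a binary segmentation mask, so the YOLOv8 Detection2DArray output has to be converted into one somewhere in the pipeline. A minimal sketch of such a conversion node (all node, topic, and size values here are illustrative, and it assumes the vision_msgs layout from ROS 2 Humble):

import numpy as np
import rclpy
from cv_bridge import CvBridge
from rclpy.node import Node
from sensor_msgs.msg import Image
from vision_msgs.msg import Detection2DArray


class DetectionToMask(Node):
    """Rasterizes the first detection's bounding box into a mono8 mask."""

    def __init__(self):
        super().__init__('detection_to_mask')
        self._bridge = CvBridge()
        self._width, self._height = 640, 480  # illustrative camera resolution
        self._pub = self.create_publisher(Image, 'segmentation', 10)  # illustrative topic
        self.create_subscription(
            Detection2DArray, 'detections_output', self.on_detections, 10)

    def on_detections(self, msg):
        if not msg.detections:
            return  # nothing detected in this frame; publish no mask
        bbox = msg.detections[0].bbox
        # vision_msgs >= 4 (ROS 2 Humble) layout: bbox.center.position.{x,y}
        cx, cy = bbox.center.position.x, bbox.center.position.y
        x0 = max(int(cx - bbox.size_x / 2.0), 0)
        y0 = max(int(cy - bbox.size_y / 2.0), 0)
        x1 = min(int(cx + bbox.size_x / 2.0), self._width)
        y1 = min(int(cy + bbox.size_y / 2.0), self._height)
        mask = np.zeros((self._height, self._width), dtype=np.uint8)
        mask[y0:y1, x0:x1] = 255
        out = self._bridge.cv2_to_imgmsg(mask, encoding='mono8')
        out.header = msg.header  # keep the detection stamp for downstream sync
        self._pub.publish(out)


def main():
    rclpy.init()
    rclpy.spin(DetectionToMask())


if __name__ == '__main__':
    main()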
Thanks

@ToddT I don’t have the object on hand to test that ros2 launch command, so it detects nothing:

ros2 launch isaac_ros_examples isaac_ros_examples.launch.py launch_fragments:=realsense_mono_rect_depth,foundationpose mesh_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/Mac_and_cheese_0_1/Mac_and_cheese_0_1.obj texture_path:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/Mac_and_cheese_0_1/materials/textures/baked_mesh_tex0.png score_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/score_trt_engine.plan refine_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/refine_trt_engine.plan rt_detr_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/synthetica_detr/sdetr_grasp.plan

Hi @Miahhhh

I didn’t use the launch file you mentioned. I use YOLOv8 to detect the bottle for my own 3D mesh model,

Could you please share the obj and tex here so we can check on our side?

I don’t have the object on hand to test that ros2 launch command, so it detects nothing

We have a rosbag version of the example; could you please check whether that sample works on your side?

Thanks,
-Todd

@ToddT
Is the rosbag launch you mentioned this one?

ros2 launch isaac_ros_examples isaac_ros_examples.launch.py launch_fragments:=foundationpose interface_specs_file:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/quickstart_interface_specs.json mesh_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/Mustard/textured_simple.obj texture_path:=${ISAAC_ROS_WS}/isaac_ros_assets/isaac_ros_foundationpose/Mustard/texture_map.png score_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/score_trt_engine.plan refine_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/foundationpose/refine_trt_engine.plan rt_detr_engine_file_path:=${ISAAC_ROS_WS}/isaac_ros_assets/models/synthetica_detr/sdetr_grasp.plan

It works OK.

My obj and texture files are attached:
mesh.zip (29.8 MB)

Thanks

Hi @Miahhhh,

I modified isaac_ros_pose_estimation/isaac_ros_foundationpose/launch/isaac_ros_foundationpose_core.launch.py to use the YOLOv8 detector from the isaac_ros_yolov8 — isaac_ros_docs documentation.

The result is positive: it can estimate the pose with your mesh file.
Could you please check your launch file again?
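
For orientation, the change essentially swaps the RT-DETR decoder fragment for the YOLOv8 one while keeping the detections topic name that the downstream mask-generation step subscribes to. A trimmed, illustrative sketch only; the plugin and parameter names are assumed from isaac_ros_yolov8 and may differ in your release:

from launch_ros.descriptions import ComposableNode

# Illustrative fragment: replace the RT-DETR decoder with the YOLOv8 decoder
# while keeping the 'detections_output' topic that FoundationPose's
# detection-to-mask step consumes.
yolov8_decoder_node = ComposableNode(
    name='yolov8_decoder_node',
    package='isaac_ros_yolov8',
    plugin='nvidia::isaac_ros::yolov8::YoloV8DecoderNode',  # assumed plugin name
    parameters=[{
        'confidence_threshold': 0.25,  # illustrative values
        'nms_threshold': 0.45,
    }],
    # Publishing under the same topic name RT-DETR used means the rest of
    # the FoundationPose launch graph needs no further remapping.
    remappings=[('detections_output', 'detections_output')],
)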

Thanks,
-Todd