VSLAM / Latency Difference in Image Conversion: BGRA8 to Mono8 Conversion / Intra Process

Hello everyone,

I’ve been working on a Visual SLAM pipeline with ZED cameras and noticed a significant difference in latency when converting images from BGRA8 (the ZED camera’s output) to Mono8, depending on the node graph configuration. I tested both the provided example launch file and a custom launch file I created, and the results were distinct.

Here’s a summary of the setups I compared:


1. Using the Example Launch File (ZED Live):

  • In this setup, the provided example launch file from the ZED ROS2 package is used.
  • A single node is responsible for converting images directly from BGRA8 (the ZED camera’s output) to Mono8.
  • The node graph is streamlined, with minimal steps in the conversion process.

2. Custom Launch File (Configured):

I configured isaac_ros_visual_slam_core.launch.py as follows:

from typing import Any, Dict

from isaac_ros_examples import IsaacROSLaunchFragment
import launch
from launch.actions import DeclareLaunchArgument
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import ComposableNodeContainer
from launch_ros.descriptions import ComposableNode


class IsaacROSVisualSlamLaunchFragment(IsaacROSLaunchFragment):

    @staticmethod
    def get_composable_nodes(interface_specs: Dict[str, Any]) -> Dict[str, ComposableNode]:
        enable_image_denoising = LaunchConfiguration('enable_image_denoising')
        rectified_images = LaunchConfiguration('rectified_images')
        enable_slam_visualization = LaunchConfiguration('enable_slam_visualization')
        enable_landmarks_view = LaunchConfiguration('enable_landmarks_view')
        enable_observations_view = LaunchConfiguration('enable_observations_view')
        camera_optical_frames = LaunchConfiguration('camera_optical_frames')
        base_frame = LaunchConfiguration('base_frame')
        num_cameras = LaunchConfiguration('num_cameras')
        enable_imu_fusion = LaunchConfiguration('enable_imu_fusion')
        imu_frame = LaunchConfiguration('imu_frame')
        gyro_noise_density = LaunchConfiguration('gyro_noise_density')
        gyro_random_walk = LaunchConfiguration('gyro_random_walk')
        accel_noise_density = LaunchConfiguration('accel_noise_density')
        accel_random_walk = LaunchConfiguration('accel_random_walk')
        calibration_frequency = LaunchConfiguration('calibration_frequency')
        image_jitter_threshold_ms = LaunchConfiguration('image_jitter_threshold_ms')

        return {

            #  rgb8 converter nodes
            'image_format_converter_node_left_rgb': ComposableNode(
                package='isaac_ros_image_proc',
                plugin='nvidia::isaac_ros::image_proc::ImageFormatConverterNode',
                name='image_format_node_left_rgb',
                parameters=[{
                    'encoding_desired': 'rgb8',
                    'image_width': interface_specs['camera_resolution']['width'],
                    'image_height': interface_specs['camera_resolution']['height'],
                }],
                remappings=[
                    ('image_raw', '/zed_front/zed_node_0/left/image_rect_color'),
                    ('image', 'left/image_rect'),
                ]),
            'image_format_converter_node_right_rgb': ComposableNode(
                package='isaac_ros_image_proc',
                plugin='nvidia::isaac_ros::image_proc::ImageFormatConverterNode',
                name='image_format_node_right_rgb',
                parameters=[{
                    'encoding_desired': 'rgb8',
                    'image_width': interface_specs['camera_resolution']['width'],
                    'image_height': interface_specs['camera_resolution']['height'],
                }],
                remappings=[
                    ('image_raw', '/zed_front/zed_node_0/right/image_rect_color'),
                    ('image', 'right/image_rect'),
                ]),

            # mono8 converter nodes
            'image_format_converter_node_left_mono': ComposableNode(
                package='isaac_ros_image_proc',
                plugin='nvidia::isaac_ros::image_proc::ImageFormatConverterNode',
                name='image_format_node_left_mono',
                parameters=[{
                    'encoding_desired': 'mono8',
                    'image_width': interface_specs['camera_resolution']['width'],
                    'image_height': interface_specs['camera_resolution']['height'],
                }],
                remappings=[
                    ('image_raw', 'left/image_rect'),
                    ('image', 'left/image_rect_mono'),
                ]),
            'image_format_converter_node_right_mono': ComposableNode(
                package='isaac_ros_image_proc',
                plugin='nvidia::isaac_ros::image_proc::ImageFormatConverterNode',
                name='image_format_node_right_mono',
                parameters=[{
                    'encoding_desired': 'mono8',
                    'image_width': interface_specs['camera_resolution']['width'],
                    'image_height': interface_specs['camera_resolution']['height'],
                }],
                remappings=[
                    ('image_raw', 'right/image_rect'),
                    ('image', 'right/image_rect_mono'),
                ]),

            #  visual_slam_node
            'visual_slam_node': ComposableNode(
                name='visual_slam_node',
                package='isaac_ros_visual_slam',
                plugin='nvidia::isaac_ros::visual_slam::VisualSlamNode',
                parameters=[{
                    'enable_image_denoising': enable_image_denoising,
                    'rectified_images': rectified_images,
                    'enable_slam_visualization': enable_slam_visualization,
                    'enable_landmarks_view': enable_landmarks_view,
                    'enable_observations_view': enable_observations_view,
                    'base_frame': base_frame,
                    'enable_imu_fusion': enable_imu_fusion,
                    'imu_frame': imu_frame,
                    'gyro_noise_density': gyro_noise_density,
                    'gyro_random_walk': gyro_random_walk,
                    'accel_noise_density': accel_noise_density,
                    'accel_random_walk': accel_random_walk,
                    'calibration_frequency': calibration_frequency,
                    'image_jitter_threshold_ms': image_jitter_threshold_ms,
                }],
                remappings=[
                    ('/visual_slam/image_0', 'left/image_rect_mono'),
                    ('/visual_slam/camera_info_0', 'left/camera_info_rect'),
                    ('/visual_slam/image_1', 'right/image_rect_mono'),
                    ('/visual_slam/camera_info_1', 'right/camera_info_rect'),
                    ('visual_slam/imu', 'camera/imu'),
                ],
            )

        }


    @staticmethod
    def get_launch_actions(interface_specs: Dict[str, Any]) -> \
            Dict[str, launch.actions.OpaqueFunction]:
        return {
            'enable_image_denoising': DeclareLaunchArgument(
                'enable_image_denoising',
                default_value='False',
            ),
            'rectified_images': DeclareLaunchArgument(
                'rectified_images',
                default_value='False',
            ),
            'enable_slam_visualization': DeclareLaunchArgument(
                'enable_slam_visualization',
                default_value='True',
            ),
            'enable_landmarks_view': DeclareLaunchArgument(
                'enable_landmarks_view',
                default_value='True',
            ),
            'enable_observations_view': DeclareLaunchArgument(
                'enable_observations_view',
                default_value='True',
            ),
            'camera_optical_frames': DeclareLaunchArgument(
                'camera_optical_frames',
                default_value='[zed_front_left_camera_optical_frame, zed_front_right_camera_optical_frame]',
            ),
            'base_frame': DeclareLaunchArgument(
                'base_frame',
                default_value='base_link',
            ),
            'num_cameras': DeclareLaunchArgument(
                'num_cameras',
                default_value='2',
            ),
            'enable_imu_fusion': DeclareLaunchArgument(
                'enable_imu_fusion',
                default_value='True',
            ),
            'imu_frame': DeclareLaunchArgument(
                'imu_frame',
                default_value='zed_front_camera_center',
            ),
            'gyro_noise_density': DeclareLaunchArgument(
                'gyro_noise_density',
                default_value='0.000244',
            ),
            'gyro_random_walk': DeclareLaunchArgument(
                'gyro_random_walk',
                default_value='0.000019393',
            ),
            'accel_noise_density': DeclareLaunchArgument(
                'accel_noise_density',
                default_value='0.001862',
            ),
            'accel_random_walk': DeclareLaunchArgument(
                'accel_random_walk',
                default_value='0.003',
            ),
            'calibration_frequency': DeclareLaunchArgument(
                'calibration_frequency',
                default_value='200.0',
            ),
            'image_jitter_threshold_ms': DeclareLaunchArgument(
                'image_jitter_threshold_ms',
                default_value='34.0',
            )
        }

def generate_launch_description():
    """Launch file to bring up visual slam node standalone."""
    # Define interface_specs with required parameters
    interface_specs = {
        'camera_resolution': {
            'width': 640,  # Match the resolution of your camera
            'height': 360
        }
    }

    # Get launch actions (declare arguments)
    launch_actions = IsaacROSVisualSlamLaunchFragment.get_launch_actions(interface_specs)

    # Pass interface_specs to get_composable_nodes
    visual_slam_launch_container = ComposableNodeContainer(
        name='visual_slam_launch_container',
        namespace='',
        package='rclcpp_components',
        executable='component_container',
        composable_node_descriptions=IsaacROSVisualSlamLaunchFragment.get_composable_nodes(
            interface_specs  # Pass the dictionary here
        ).values(),
        output='screen',
    )

    # Include declared arguments in the launch description
    return launch.LaunchDescription(list(launch_actions.values()) + [visual_slam_launch_container])

  • In my custom configuration, I use two separate nodes for the image conversion:
    • The first node converts BGRA8 to RGB8.
    • The second node converts RGB8 to Mono8.
  • I haven’t yet enabled intra-process communication but would like to explore how this can be done to minimize latency.

I also tried feeding in BGRA8 images and converting them to Mono8 directly with

            'image_format_converter_node_left_rgb': ComposableNode(
                package='isaac_ros_image_proc',
                plugin='nvidia::isaac_ros::image_proc::ImageFormatConverterNode',
                name='image_format_node_left_rgb',
                parameters=[{
                    'encoding_desired': 'mono8',
                    'image_width': interface_specs['camera_resolution']['width'],
                    'image_height': interface_specs['camera_resolution']['height'],
                }],
                remappings=[
                    ('image_raw', '/zed_front/zed_node_0/left/image_rect_color'),
                    ('image', 'left/image_rect'),
                ]),
But it didn’t work; the node raised an error saying it expects rgb8 input encoding.
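For context, the bgra8-to-mono8 conversion itself is only a per-pixel weighted sum of the B, G, and R channels (alpha is dropped), so a single-pass conversion is clearly possible in principle. A minimal CPU sketch, assuming the standard ITU-R BT.601 luma weights (the GPU converter’s exact coefficients and rounding may differ):

```python
def bgra8_to_mono8(pixels):
    """Convert a flat bgra8 byte sequence to mono8 bytes.

    Assumes ITU-R BT.601 luma weights; the Isaac ROS GPU converter
    may use slightly different coefficients or rounding.
    """
    mono = bytearray()
    for i in range(0, len(pixels), 4):
        b, g, r = pixels[i], pixels[i + 1], pixels[i + 2]
        # Alpha (pixels[i + 3]) is discarded.
        y = int(round(0.299 * r + 0.587 * g + 0.114 * b))
        mono.append(min(y, 255))
    return bytes(mono)

# One white pixel and one pure-blue pixel, in bgra byte order:
print(list(bgra8_to_mono8(bytes([255, 255, 255, 255, 255, 0, 0, 255]))))  # → [255, 29]
```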


Key Observations from Testing:

  • Example Launch File:
    The conversion from BGRA8 to Mono8 in a single node is faster and has lower latency.
  • Custom Launch File:
    The two-node conversion process increases latency, and I suspect enabling intra-process communication could help reduce it.

Questions for the Community:

  1. How can I enable intra-process communication for my custom launch file in ROS2?
  2. Is it possible to replicate the single-node behavior (BGRA8 to Mono8) from the example launch file in a custom configuration?
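Regarding question 1, my understanding is that intra-process communication in ROS 2 requires all participating nodes to be loaded into the same ComposableNodeContainer, with the `use_intra_process_comms` flag passed per node via `extra_arguments`. A sketch against one of my converter nodes (topic and node names reused from the launch file above):

```python
from launch_ros.descriptions import ComposableNode

# Intra-process comms only applies between nodes loaded into the same
# container process; the flag is passed per node via extra_arguments.
image_format_converter_node_left_mono = ComposableNode(
    package='isaac_ros_image_proc',
    plugin='nvidia::isaac_ros::image_proc::ImageFormatConverterNode',
    name='image_format_node_left_mono',
    parameters=[{'encoding_desired': 'mono8'}],
    remappings=[
        ('image_raw', 'left/image_rect'),
        ('image', 'left/image_rect_mono'),
    ],
    extra_arguments=[{'use_intra_process_comms': True}],
)
```

Note that Isaac ROS nodes also use NITROS type adaptation for zero-copy GPU transport between nodes sharing a container, so it may be worth checking the Isaac ROS documentation on whether `use_intra_process_comms` adds anything on top of that in this pipeline.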

I’m hoping to streamline my setup for real-time processing and would greatly appreciate any insights or suggestions on optimizing the image conversion pipeline. Thanks in advance!

Following up on my earlier post about BGRA8 to Mono8 conversion, I discovered the ZED camera can directly output mono8 images. This raises a question:

Would using the ZED camera’s mono8 output reduce latency compared to using the isaac_ros_image_proc::ImageFormatConverterNode?

Key Points:

  • Skipping Conversion Nodes: Using mono8 output directly removes the need for conversion nodes, potentially lowering latency.
  • Isaac ROS Converters: These are GPU-optimized and might still be faster, even with the added node.
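If the gray output works, the converter nodes could be dropped entirely and visual_slam remapped straight onto the camera topics. The topic names below are placeholders based on my namespace; the exact gray-image topic depends on the zed-ros2-wrapper version, so verify them with `ros2 topic list` first:

```python
from launch_ros.descriptions import ComposableNode

# Hypothetical gray-image topic names -- check `ros2 topic list` for the
# names your zed-ros2-wrapper version actually publishes.
visual_slam_node = ComposableNode(
    name='visual_slam_node',
    package='isaac_ros_visual_slam',
    plugin='nvidia::isaac_ros::visual_slam::VisualSlamNode',
    remappings=[
        ('/visual_slam/image_0', '/zed_front/zed_node_0/left/image_rect_gray'),
        ('/visual_slam/camera_info_0', '/zed_front/zed_node_0/left/camera_info'),
        ('/visual_slam/image_1', '/zed_front/zed_node_0/right/image_rect_gray'),
        ('/visual_slam/camera_info_1', '/zed_front/zed_node_0/right/camera_info'),
    ],
)
```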

Has anyone tried using the ZED camera’s mono8 output in their Isaac ROS pipelines? Any insights on real-time performance trade-offs would be greatly appreciated!

Thanks!