Performance Issue of ESS Disparity Graph on RTX4090

HiddenNetwork · November 15, 2023, 8:57pm

Hi there,

I’m trying to take a side-by-side 30fps 720p stereo video feed, run through ESS Disparity, and then feed the output into NVblox to get a 3D reconstruction in realtime.

I’ve written a Python node using OpenCV that splits the side by side video into left/right images for input to ESS.

I’ve created a ROS2 launch file to launch the following disparity graph / pipeline:

gscam2 -> stereo_splitter -> Isaac ROS Rectification -> Isaac ROS Resize -> Isaac ROS ESS StereoDNN Disparity

We’ve ran this pipeline on a PC with an RTX4090, and are noticing intermittent throttling. Sometimes it works for a few minutes, and then permanently throttles. Other times, it temporarily throttles and then recovers.

I collected data using ros2 topic hz -w 30 /<topic_name> | grep average for each of the relevant topics, and graphed them below. Notice at ~50second mark, the FPS performance of all NITROS nodes dips, and eventually recovers.

Interestingly, the exact same launch file, settings, and input works perfectly on our laptops with an A2000 GPU. Notice in the graph below, there is no drop in FPS performance in any nodes, and all nodes are consistently > 20fps.

We’ve found that whether the left/right images are fed into Rectification, Resize, or Disparity node, directly, it exhibits throttling on the RTX 4090. For example, running only rectification nodes in Nitros as shown, also experiences an FPS drop:

gscam2 -> stereo_splitter -> Isaac ROS Rectification

We suspect that the data flow between the non-NITROS nodes (stereo_splitter) and the first NITROS-enabled node which accepts the left/right images is a performance bottleneck on the RTX4090.

We are using GPU driver 525.85 on the RTX4090 PC, manually installed through the .run file (as per the Isaac sim Recommendation), and believe the rest of the system is configured correctly on Ubuntu 22.04:

The PowerMizer settings in “Nvidia settings” are set to “Prefer Maximum Performance”.
We’ve also tried maxing the CPU clock frequency to performance mode with cpupower-gui
We’ve confirmed it’s not network latency issues.

We initially tried GPU driver version 535.129, but found the throttling performance was worse / started throttling earlier.

When we notice throttling, we also notice the GPU usage drops and CPU usage increases. When the FPS issue eventually recovers, the CPU/GPU usage goes back to normal.

We are confused why this works on our laptop, but not an RTX4090 PC. Could you shed some light on this issue? Do you see any issues with the way existing nodes are interacting with each other? Are there some things you can suggest to verify the RTX4090 is setup correctly, or some tools to help debug and solve this?

PC Specs:

RAM: 64GB
Processor: i9-13900KF
GPU: Nvidia RTX 4090
OS: Ubuntu 22.04.3
Isaac ROS: 2.0
ESS model: 3.0

 >> nvidia-smi
Mon Nov 13 17:40:23 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05    Driver Version: 525.85.05    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  Off |
| 31%   31C    P8    25W / 450W |    204MiB / 24564MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2048      G   /usr/lib/xorg/Xorg                 76MiB |
|    0   N/A  N/A      2182      G   /usr/bin/gnome-shell              126MiB |
+-----------------------------------------------------------------------------+

Laptop specs:

Lenovo Thinkpad P17 Gen2i
RAM: 32GB
Processor: i7-11800H
GPU: Nvidia RTX A2000 Mobile
OS: Ubuntu 22.04.3
Isaac ROS: 2.0
ESS model: 3.0

 >> nvidia-smi
Mon Nov 13 17:43:22 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX A2000 Laptop GPU    Off | 00000000:01:00.0  On |                  N/A |
| N/A   51C    P0              14W /  60W |    557MiB /  4096MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      3268      G   /usr/lib/xorg/Xorg                          185MiB |
|    0   N/A  N/A      3674      G   /usr/bin/gnome-shell                         75MiB |
|    0   N/A  N/A      4335      G   ...,WinRetrieveSuggestionsOnlyOnDemand       26MiB |
|    0   N/A  N/A      7422      G   .../jvm/java-11-openjdk-amd64/bin/java       62MiB |
|    0   N/A  N/A     42236      G   ...irefox/3358/usr/lib/firefox/firefox      194MiB |
+---------------------------------------------------------------------------------------+

Our Launch file:

"""Launch a Node with parameters and remappings."""

from launch import LaunchDescription
from launch_ros.actions import Node, ComposableNodeContainer
from launch_ros.descriptions import ComposableNode

NETWORK_WIDTH = 960
NETWORK_HEIGHT = 576

def generate_launch_description():

    gstreamer_node = Node(
        package='gscam2',
        executable='gscam_main',
        output='screen',
        name='gscam_publisher',
        parameters=[
            {
                "camera_info_url": "",  # Camera calibration information
                "gscam_config": "rtpbin name=rtpbin udpsrc caps=\"application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264\" port=5005 ! rtpbin.recv_rtp_sink_0 rtpbin. ! rtph264depay ! decodebin ! videoconvert ! video/x-raw,format=RGB",
                "preroll": False,
                "use_gst_timestamps": False,
            },
        ],
    )
    
    splitter_node = Node(
        package='custom_package',
        executable='stereo_splitter',
        parameters=[
            {'left_camera_file': '/workspaces/isaac_ros-dev/src/custom_package/config/dual_sensor_left.yaml',
             'right_camera_file': '/workspaces/isaac_ros-dev/src/custom_package/config/dual_sensor_right.yaml'}
        ]
    )

    disparity_visualizer_node = Node(
        package='custom_package',
        executable='disparity_visualizer',
    )
   
    left_rectify_node = ComposableNode(
        name='left_rectify_node',
        package='isaac_ros_image_proc',
        plugin='nvidia::isaac_ros::image_proc::RectifyNode',
        parameters=[{
            'output_width': 1280,
            'output_height': 720,
        }],
        remappings=[
            ('image_raw', '/left/image_raw'),
            ('camera_info', '/left/camera_info'),
            ('image_rect', '/left/image_rect'),
            ('camera_info_rect', '/left/camera_info_rect')
        ]
    )

    right_rectify_node = ComposableNode(
        name='right_rectify_node',
        package='isaac_ros_image_proc',
        plugin='nvidia::isaac_ros::image_proc::RectifyNode',
        parameters=[{
            'output_width': 1280,
            'output_height': 720,
        }],
        remappings=[
            ('image_raw', '/right/image_raw'),
            ('camera_info', '/right/camera_info'),
            ('image_rect', '/right/image_rect'),
            ('camera_info_rect', '/right/camera_info_rect')
        ]
    )

    left_image_resize_node = ComposableNode(
        name='left_image_resize_node',
        package='isaac_ros_image_proc',
        plugin='nvidia::isaac_ros::image_proc::ResizeNode',
        parameters=[{
                'output_width': NETWORK_WIDTH,
                'output_height': NETWORK_HEIGHT,
                'keep_aspect_ratio': False # Do not add black borders to image
        }],
        remappings=[
            ('camera_info', '/left/camera_info_rect'),
            ('image', '/left/image_rect'),
            ('resize/camera_info', '/left/camera_info_rect_resize'),
            ('resize/image', '/left/image_rect_resize')]
    )

    right_image_resize_node = ComposableNode(
        name='right_image_resize_node',
        package='isaac_ros_image_proc',
        plugin='nvidia::isaac_ros::image_proc::ResizeNode',
        parameters=[{
                'output_width': NETWORK_WIDTH,
                'output_height': NETWORK_HEIGHT,
                'keep_aspect_ratio': False # Do not add black borders to image
        }],
        remappings=[
            ('camera_info', '/right/camera_info_rect'),
            ('image', '/right/image_rect'),
            ('resize/camera_info', '/right/camera_info_rect_resize'),
            ('resize/image', '/right/image_rect_resize')]
    )

    disparity_node = ComposableNode(
        name='disparity',
        package='isaac_ros_ess',
        plugin='nvidia::isaac_ros::dnn_stereo_depth::ESSDisparityNode',
        parameters=[{
            'engine_file_path': "/workspaces/isaac_ros-dev/src/isaac_ros_dnn_stereo_depth/resources/ess.engine",
            "threshold": 0.0,
            "image_type": "RGB_U8",
            }],
        remappings=[
            ('left/camera_info', '/left/camera_info_rect_resize'),
            ('left/image_rect', '/left/image_rect_resize'),
            ('right/camera_info', '/right/camera_info_rect_resize'),
            ('right/image_rect', '/right/image_rect_resize')
        ]
    )

    nitros_container = ComposableNodeContainer(
        name='nitros_container',
        package='rclcpp_components',
        namespace='',
        executable='component_container_mt',
        composable_node_descriptions=[
            left_rectify_node, right_rectify_node,
            left_image_resize_node, right_image_resize_node,
            disparity_node
        ],
        output='screen'
    )

    return LaunchDescription([
                            gstreamer_node,
                            splitter_node,
                            nitros_container,
                            disparity_visualizer_node,
                            ])

stereo_splitter node:

import numpy as np
import cv2 as cv

import rclpy
import cv_bridge
from rclpy.node import Node
from sensor_msgs.msg import CameraInfo, Image

class StereoSplitter(Node):

    def __init__(self):
        super().__init__('stereo_splitter')
        self.encoding = 'rgb8'
        self._bridge = cv_bridge.CvBridge()
        self.read_calibration()
        self.setup_subscribers()
        self.setup_publishers()
    
    def yaml_to_CameraInfo(self, yaml_file: str):
        """
        Parse a yaml file containing camera calibration data (as produced by
        ROS or RTabMap) into a sensor_msgs/CameraInfo msg.

        Parameters
        ----------
        yaml_file : str
            Path to yaml file containing camera calibration data

        Returns
        -------
        camera_info_msg : sensor_msgs.msg.CameraInfo
            A sensor_msgs.msg.CameraInfo message containing the camera calibration data
        """
        camera_info_msg = CameraInfo()
        cv_file = cv.FileStorage()
        cv_file.open(yaml_file, cv.FileStorage_READ)
        camera_info_msg.width = int(cv_file.getNode('image_width').real())
        camera_info_msg.height = int(cv_file.getNode('image_height').real())
        camera_info_msg.k = cv_file.getNode('camera_matrix').mat().flatten().tolist()
        camera_info_msg.d = cv_file.getNode('distortion_coefficients').mat().flatten().tolist()
        camera_info_msg.r = cv_file.getNode('rectification_matrix').mat().flatten().tolist()
        camera_info_msg.p = cv_file.getNode('projection_matrix').mat().flatten().tolist()
        camera_info_msg.distortion_model = cv_file.getNode('distortion_model').string()
        cv_file.release()

        return camera_info_msg

    def read_calibration(self):
        self.declare_parameter('left_camera_file', '/workspaces/isaac_ros-dev/src/custom_package/config/dual_sensor_left.yaml')
        self.declare_parameter('right_camera_file', '/workspaces/isaac_ros-dev/src/custom_package/config/dual_sensor_right.yaml')
        left_cam_info_path = self.get_parameter('left_camera_file').get_parameter_value().string_value
        right_cam_info_path = self.get_parameter('right_camera_file').get_parameter_value().string_value
        self.left_camera_info = self.yaml_to_CameraInfo(left_cam_info_path)
        self.right_camera_info = self.yaml_to_CameraInfo(right_cam_info_path)

    def setup_subscribers(self):
        print('Setting up Subscribers...')
        self.subscription = self.create_subscription(
            Image,
            '/image_raw',
            self.stereo_img_recvd_callback,
            10)
        self.subscription #prevent unused variable warning

    def setup_publishers(self):
        print('Setting up Publishers...')
        self._img_left_pub = self.create_publisher(
            Image, '/left/image_raw', 10)
        self._img_right_pub = self.create_publisher(
            Image, '/right/image_raw', 10)
        self._camera_left_pub = self.create_publisher(
            CameraInfo, '/left/camera_info', 10)
        self._camera_right_pub = self.create_publisher(
            CameraInfo, '/right/camera_info', 10)

    def stereo_img_recvd_callback(self, image_msg):
        print('Received Stereo image!')
        split_images = self.split_stereo_image(self._bridge.imgmsg_to_cv2(image_msg))
        
        #Convert cv2 image to ros2 image message
        left_msg = self._bridge.cv2_to_imgmsg(np.array(split_images[0]), self.encoding)
        right_msg = self._bridge.cv2_to_imgmsg(np.array(split_images[1]), self.encoding)

        #Synchronize timestamps across all published topics
        current_timestamp = self.get_clock().now().to_msg()
        left_msg.header.stamp = current_timestamp
        right_msg.header.stamp = current_timestamp
        self.left_camera_info.header.stamp = current_timestamp
        self.right_camera_info.header.stamp = current_timestamp
        left_msg.header.frame_id = "stereocamera_left_optical_frame"
        right_msg.header.frame_id = "stereocamera_right_optical_frame"
        self.left_camera_info.header.frame_id = "stereocamera_left_optical_frame"
        self.right_camera_info.header.frame_id = "stereocamera_right_optical_frame"

        #Publish left/right images and left/right camera infos
        self._img_left_pub.publish(left_msg)
        self._camera_left_pub.publish(self.left_camera_info)
        self._img_right_pub.publish(right_msg)
        self._camera_right_pub.publish(self.right_camera_info)

    #Receives a cv2 image and returns two split cv2 images
    def split_stereo_image(self, stereoImage):
        print("Splitting Stereo image!")

        # Get height/width of original image
        height, width = stereoImage.shape[:2]
        horizontalCenter = int(width/2)

        # Generate sub images
        left_image = stereoImage[0:height, 0:horizontalCenter]
        right_image = stereoImage[0:height, horizontalCenter:width]  

        return left_image, right_image


def main(args=None):
    rclpy.init(args=args)
    stereo_splitter = StereoSplitter()
    rclpy.spin(stereo_splitter)

    #Cleanup
    stereo_splitter.destroy_node()
    rclpy.shutdown()

if __name__ == '__main__':
    main()

HiddenNetwork · November 24, 2023, 8:47pm

Hi,

I have also observed this issue on the laptop with an A2000 GPU. After running the above disparity pipeline for a while, it has reached a similar throttling state where both the rectification and disparity outputs are freezing/glitching.

As mentioned earlier, it seems the data flow between the non-NITROS nodes (stereo_splitter) and the first NITROS-enabled node which accepts the left/right images may be a performance bottleneck…

How could we fix the bottle necks in our pipeline?

Raffaello · November 30, 2023, 12:05pm

Upon reviewing your code and documentation, it appears that the Python node may become a bottleneck. Let’s take a step-by-step approach to safely arrive at this conclusion while ensuring optimal performance.

I suggest recording the output from your stereo_splitter and saving it in a rosbag file. When is done, you can run your rosbag and your new pipeline will be like this one:
rosbag -> Isaac ROS Rectification -> Isaac ROS Resize -> Isaac ROS ESS StereoDNN Disparity

From this point, you can check now the performance of the Isaac ROS ESS StereoDNN Disparity

About your last post, we know there is a bottleneck between not-NITROS and nitro packages, but to reproduce the same output of our benchmark, I suggest following this document:
https://nvidia-isaac-ros.github.io/concepts/benchmarking/index.html#benchmarking

Keep me posted,
Raffaello

HiddenNetwork · December 12, 2023, 4:00pm

Hi @Raffaello,

I have tried recording the output from stereo_splitter and saving it in a rosbag file, as you suggested.

However, when playing back the rosbag, the performance of the disparity pipeline is similarly poor. After running the above pipeline for a while (~1.5min) , it reaches a similar throttling state where both the rectification and disparity outputs are freezing/glitching.

Raffaello · December 13, 2023, 7:23pm

Have you tried our benchmark?

You can run our Isaac ROS benchmark. This repository provides benchmark scripts for our Isaac ROS packages.

You can test on your device the ESS Stereo DNN disparity node following this documentation
https://nvidia-isaac-ros.github.io/repositories_and_packages/isaac_ros_benchmark/isaac_ros_benchmark/index.html#quickstart

Instead of testing Apriltag, like the documentation shows, you can do all steps from points 1 to 4

5: Install Isaac ROS ess

sudo apt-get install -y ros-humble-isaac-ros-ess ros-humble-isaac-ros-test

6: Run these tests

launch_test src/isaac_ros_benchmark/scripts/isaac_ros_ess_graph.py

launch_test src/isaac_ros_benchmark/scripts/isaac_ros_ess_node.py

7: Read output

Check the output from the terminal.

Additionally, the final results and benchmark metadata (e.g., system information, benchmark configurations) are also exported as a JSON file.

Topic		Replies	Views
Noisy output of ESS StereoDNN Disparity Isaac ROS isaac-ros-dnn-stereo-disparity	6	1053	January 13, 2024
Incorrect Results for ESS Disparity Demo - Release 3.2 Isaac ROS isaac-ros-dnn-stereo-disparity	2	29	January 16, 2025
Rectify node not producing any output using USB Camera for Stereo Disparity Isaac ROS camera , usb , isaac-ros-image-pipeline , isaac-ros-dnn-stereo-disparity	1	311	April 12, 2024
Isaac_ros_ess Visualization Problem Isaac ROS camera	10	117	November 1, 2024
ESS Node not publishing anything Isaac ROS	2	25	July 17, 2024
Gstreamer reports Raspberry Pi camera streaming at 120fps when in reality it is only 60fps Jetson Nano camera , gstreamer	53	3797	July 19, 2022
Visionworks with ROS (blank disparity map) Jetson TX1	6	1684	October 18, 2021
Issue using camera Jetson Xavier NX camera	26	1786	November 21, 2022
Simple Gstream USB Camera Live View Jetson AGX Orin gstreamer	12	2753	July 25, 2023
Issues with the CSI cameras attached to the Xavier NX Jetson Xavier NX camera , hw , gstreamer , nvbugs	50	5989	October 18, 2021

Performance Issue of ESS Disparity Graph on RTX4090

Related topics