Hi there,
I’m trying to take a side-by-side 30fps 720p stereo video feed, run through ESS Disparity, and then feed the output into NVblox to get a 3D reconstruction in realtime.
I’ve written a Python node using OpenCV that splits the side by side video into left/right images for input to ESS.
I’ve created a ROS2 launch file to launch the following disparity graph / pipeline:
gscam2 -> stereo_splitter -> Isaac ROS Rectification -> Isaac ROS Resize -> Isaac ROS ESS StereoDNN Disparity
We’ve ran this pipeline on a PC with an RTX4090, and are noticing intermittent throttling. Sometimes it works for a few minutes, and then permanently throttles. Other times, it temporarily throttles and then recovers.
I collected data using ros2 topic hz -w 30 /<topic_name> | grep average
for each of the relevant topics, and graphed them below. Notice at ~50second mark, the FPS performance of all NITROS nodes dips, and eventually recovers.
Interestingly, the exact same launch file, settings, and input works perfectly on our laptops with an A2000 GPU. Notice in the graph below, there is no drop in FPS performance in any nodes, and all nodes are consistently > 20fps.
We’ve found that whether the left/right images are fed into Rectification, Resize, or Disparity node, directly, it exhibits throttling on the RTX 4090. For example, running only rectification nodes in Nitros as shown, also experiences an FPS drop:
gscam2 -> stereo_splitter -> Isaac ROS Rectification
We suspect that the data flow between the non-NITROS nodes (stereo_splitter
) and the first NITROS-enabled node which accepts the left/right images is a performance bottleneck on the RTX4090.
We are using GPU driver 525.85
on the RTX4090 PC, manually installed through the .run file (as per the Isaac sim Recommendation), and believe the rest of the system is configured correctly on Ubuntu 22.04:
- The PowerMizer settings in “Nvidia settings” are set to “Prefer Maximum Performance”.
- We’ve also tried maxing the CPU clock frequency to performance mode with cpupower-gui
- We’ve confirmed it’s not network latency issues.
We initially tried GPU driver version 535.129
, but found the throttling performance was worse / started throttling earlier.
When we notice throttling, we also notice the GPU usage drops and CPU usage increases. When the FPS issue eventually recovers, the CPU/GPU usage goes back to normal.
We are confused why this works on our laptop, but not an RTX4090 PC. Could you shed some light on this issue? Do you see any issues with the way existing nodes are interacting with each other? Are there some things you can suggest to verify the RTX4090 is setup correctly, or some tools to help debug and solve this?
PC Specs:
- RAM: 64GB
- Processor: i9-13900KF
- GPU: Nvidia RTX 4090
- OS: Ubuntu 22.04.3
- Isaac ROS: 2.0
- ESS model: 3.0
>> nvidia-smi
Mon Nov 13 17:40:23 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.05 Driver Version: 525.85.05 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | Off |
| 31% 31C P8 25W / 450W | 204MiB / 24564MiB | 2% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 2048 G /usr/lib/xorg/Xorg 76MiB |
| 0 N/A N/A 2182 G /usr/bin/gnome-shell 126MiB |
+-----------------------------------------------------------------------------+
Laptop specs:
- Lenovo Thinkpad P17 Gen2i
- RAM: 32GB
- Processor: i7-11800H
- GPU: Nvidia RTX A2000 Mobile
- OS: Ubuntu 22.04.3
- Isaac ROS: 2.0
- ESS model: 3.0
>> nvidia-smi
Mon Nov 13 17:43:22 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03 Driver Version: 535.129.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA RTX A2000 Laptop GPU Off | 00000000:01:00.0 On | N/A |
| N/A 51C P0 14W / 60W | 557MiB / 4096MiB | 2% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 0 N/A N/A 3268 G /usr/lib/xorg/Xorg 185MiB |
| 0 N/A N/A 3674 G /usr/bin/gnome-shell 75MiB |
| 0 N/A N/A 4335 G ...,WinRetrieveSuggestionsOnlyOnDemand 26MiB |
| 0 N/A N/A 7422 G .../jvm/java-11-openjdk-amd64/bin/java 62MiB |
| 0 N/A N/A 42236 G ...irefox/3358/usr/lib/firefox/firefox 194MiB |
+---------------------------------------------------------------------------------------+
Our Launch file:
"""Launch a Node with parameters and remappings."""
from launch import LaunchDescription
from launch_ros.actions import Node, ComposableNodeContainer
from launch_ros.descriptions import ComposableNode
NETWORK_WIDTH = 960
NETWORK_HEIGHT = 576
def generate_launch_description():
gstreamer_node = Node(
package='gscam2',
executable='gscam_main',
output='screen',
name='gscam_publisher',
parameters=[
{
"camera_info_url": "", # Camera calibration information
"gscam_config": "rtpbin name=rtpbin udpsrc caps=\"application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264\" port=5005 ! rtpbin.recv_rtp_sink_0 rtpbin. ! rtph264depay ! decodebin ! videoconvert ! video/x-raw,format=RGB",
"preroll": False,
"use_gst_timestamps": False,
},
],
)
splitter_node = Node(
package='custom_package',
executable='stereo_splitter',
parameters=[
{'left_camera_file': '/workspaces/isaac_ros-dev/src/custom_package/config/dual_sensor_left.yaml',
'right_camera_file': '/workspaces/isaac_ros-dev/src/custom_package/config/dual_sensor_right.yaml'}
]
)
disparity_visualizer_node = Node(
package='custom_package',
executable='disparity_visualizer',
)
left_rectify_node = ComposableNode(
name='left_rectify_node',
package='isaac_ros_image_proc',
plugin='nvidia::isaac_ros::image_proc::RectifyNode',
parameters=[{
'output_width': 1280,
'output_height': 720,
}],
remappings=[
('image_raw', '/left/image_raw'),
('camera_info', '/left/camera_info'),
('image_rect', '/left/image_rect'),
('camera_info_rect', '/left/camera_info_rect')
]
)
right_rectify_node = ComposableNode(
name='right_rectify_node',
package='isaac_ros_image_proc',
plugin='nvidia::isaac_ros::image_proc::RectifyNode',
parameters=[{
'output_width': 1280,
'output_height': 720,
}],
remappings=[
('image_raw', '/right/image_raw'),
('camera_info', '/right/camera_info'),
('image_rect', '/right/image_rect'),
('camera_info_rect', '/right/camera_info_rect')
]
)
left_image_resize_node = ComposableNode(
name='left_image_resize_node',
package='isaac_ros_image_proc',
plugin='nvidia::isaac_ros::image_proc::ResizeNode',
parameters=[{
'output_width': NETWORK_WIDTH,
'output_height': NETWORK_HEIGHT,
'keep_aspect_ratio': False # Do not add black borders to image
}],
remappings=[
('camera_info', '/left/camera_info_rect'),
('image', '/left/image_rect'),
('resize/camera_info', '/left/camera_info_rect_resize'),
('resize/image', '/left/image_rect_resize')]
)
right_image_resize_node = ComposableNode(
name='right_image_resize_node',
package='isaac_ros_image_proc',
plugin='nvidia::isaac_ros::image_proc::ResizeNode',
parameters=[{
'output_width': NETWORK_WIDTH,
'output_height': NETWORK_HEIGHT,
'keep_aspect_ratio': False # Do not add black borders to image
}],
remappings=[
('camera_info', '/right/camera_info_rect'),
('image', '/right/image_rect'),
('resize/camera_info', '/right/camera_info_rect_resize'),
('resize/image', '/right/image_rect_resize')]
)
disparity_node = ComposableNode(
name='disparity',
package='isaac_ros_ess',
plugin='nvidia::isaac_ros::dnn_stereo_depth::ESSDisparityNode',
parameters=[{
'engine_file_path': "/workspaces/isaac_ros-dev/src/isaac_ros_dnn_stereo_depth/resources/ess.engine",
"threshold": 0.0,
"image_type": "RGB_U8",
}],
remappings=[
('left/camera_info', '/left/camera_info_rect_resize'),
('left/image_rect', '/left/image_rect_resize'),
('right/camera_info', '/right/camera_info_rect_resize'),
('right/image_rect', '/right/image_rect_resize')
]
)
nitros_container = ComposableNodeContainer(
name='nitros_container',
package='rclcpp_components',
namespace='',
executable='component_container_mt',
composable_node_descriptions=[
left_rectify_node, right_rectify_node,
left_image_resize_node, right_image_resize_node,
disparity_node
],
output='screen'
)
return LaunchDescription([
gstreamer_node,
splitter_node,
nitros_container,
disparity_visualizer_node,
])
stereo_splitter node:
import numpy as np
import cv2 as cv
import rclpy
import cv_bridge
from rclpy.node import Node
from sensor_msgs.msg import CameraInfo, Image
class StereoSplitter(Node):
def __init__(self):
super().__init__('stereo_splitter')
self.encoding = 'rgb8'
self._bridge = cv_bridge.CvBridge()
self.read_calibration()
self.setup_subscribers()
self.setup_publishers()
def yaml_to_CameraInfo(self, yaml_file: str):
"""
Parse a yaml file containing camera calibration data (as produced by
ROS or RTabMap) into a sensor_msgs/CameraInfo msg.
Parameters
----------
yaml_file : str
Path to yaml file containing camera calibration data
Returns
-------
camera_info_msg : sensor_msgs.msg.CameraInfo
A sensor_msgs.msg.CameraInfo message containing the camera calibration data
"""
camera_info_msg = CameraInfo()
cv_file = cv.FileStorage()
cv_file.open(yaml_file, cv.FileStorage_READ)
camera_info_msg.width = int(cv_file.getNode('image_width').real())
camera_info_msg.height = int(cv_file.getNode('image_height').real())
camera_info_msg.k = cv_file.getNode('camera_matrix').mat().flatten().tolist()
camera_info_msg.d = cv_file.getNode('distortion_coefficients').mat().flatten().tolist()
camera_info_msg.r = cv_file.getNode('rectification_matrix').mat().flatten().tolist()
camera_info_msg.p = cv_file.getNode('projection_matrix').mat().flatten().tolist()
camera_info_msg.distortion_model = cv_file.getNode('distortion_model').string()
cv_file.release()
return camera_info_msg
def read_calibration(self):
self.declare_parameter('left_camera_file', '/workspaces/isaac_ros-dev/src/custom_package/config/dual_sensor_left.yaml')
self.declare_parameter('right_camera_file', '/workspaces/isaac_ros-dev/src/custom_package/config/dual_sensor_right.yaml')
left_cam_info_path = self.get_parameter('left_camera_file').get_parameter_value().string_value
right_cam_info_path = self.get_parameter('right_camera_file').get_parameter_value().string_value
self.left_camera_info = self.yaml_to_CameraInfo(left_cam_info_path)
self.right_camera_info = self.yaml_to_CameraInfo(right_cam_info_path)
def setup_subscribers(self):
print('Setting up Subscribers...')
self.subscription = self.create_subscription(
Image,
'/image_raw',
self.stereo_img_recvd_callback,
10)
self.subscription #prevent unused variable warning
def setup_publishers(self):
print('Setting up Publishers...')
self._img_left_pub = self.create_publisher(
Image, '/left/image_raw', 10)
self._img_right_pub = self.create_publisher(
Image, '/right/image_raw', 10)
self._camera_left_pub = self.create_publisher(
CameraInfo, '/left/camera_info', 10)
self._camera_right_pub = self.create_publisher(
CameraInfo, '/right/camera_info', 10)
def stereo_img_recvd_callback(self, image_msg):
print('Received Stereo image!')
split_images = self.split_stereo_image(self._bridge.imgmsg_to_cv2(image_msg))
#Convert cv2 image to ros2 image message
left_msg = self._bridge.cv2_to_imgmsg(np.array(split_images[0]), self.encoding)
right_msg = self._bridge.cv2_to_imgmsg(np.array(split_images[1]), self.encoding)
#Synchronize timestamps across all published topics
current_timestamp = self.get_clock().now().to_msg()
left_msg.header.stamp = current_timestamp
right_msg.header.stamp = current_timestamp
self.left_camera_info.header.stamp = current_timestamp
self.right_camera_info.header.stamp = current_timestamp
left_msg.header.frame_id = "stereocamera_left_optical_frame"
right_msg.header.frame_id = "stereocamera_right_optical_frame"
self.left_camera_info.header.frame_id = "stereocamera_left_optical_frame"
self.right_camera_info.header.frame_id = "stereocamera_right_optical_frame"
#Publish left/right images and left/right camera infos
self._img_left_pub.publish(left_msg)
self._camera_left_pub.publish(self.left_camera_info)
self._img_right_pub.publish(right_msg)
self._camera_right_pub.publish(self.right_camera_info)
#Receives a cv2 image and returns two split cv2 images
def split_stereo_image(self, stereoImage):
print("Splitting Stereo image!")
# Get height/width of original image
height, width = stereoImage.shape[:2]
horizontalCenter = int(width/2)
# Generate sub images
left_image = stereoImage[0:height, 0:horizontalCenter]
right_image = stereoImage[0:height, horizontalCenter:width]
return left_image, right_image
def main(args=None):
rclpy.init(args=args)
stereo_splitter = StereoSplitter()
rclpy.spin(stereo_splitter)
#Cleanup
stereo_splitter.destroy_node()
rclpy.shutdown()
if __name__ == '__main__':
main()