I’m writing a controller in a node on the ROS 2 side, which subscribes to the joint states and the TF tree, then publishes a joint effort command back to Isaac Sim to control the robot. The basic pipeline works as expected, but it seems that lag in the system may be preventing the controller from actually working.
I’m therefore attempting to measure the round-trip message lag between Isaac Sim and ROS 2, so I can understand what sort of performance hit this controller is experiencing, or whether something else is going on. I have an action graph that publishes a timestamp to ROS on a keyboard input, and a ROS node that subscribes to this message and immediately republishes the timestamp on another topic. Back in the action graph, I subscribe to that topic and subtract the received timestamp from the current time. I’m using the following action graph to do this:
However, something strange happens. If I insert a time.sleep(x) into the ROS time-relay node, the lag seems to change proportionally with x, when I would expect a constant offset. E.g., with x=0 I get ~0.01 s, with x=1.0 I get 0.83 s, and with x=5.0 I get 4.73 s… confusing!!
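For anyone who wants to watch the round trip purely from the ROS 2 side, here is a rough sketch of the measuring half (the real measurement in my setup happens inside the action graph; the topic name ros_clock and the use of the node's own clock are assumptions, and since this compares against the node's clock, the sim-time vs. wall-time drift discussed later in the thread will show up here too):

```python
import rclpy
from rclpy.node import Node
from rclpy.time import Time
from rosgraph_msgs.msg import Clock


class LatencyMonitor(Node):
    """Logs (now - stamp) for timestamps echoed back on ros_clock."""

    def __init__(self):
        super().__init__('latency_monitor')
        self.create_subscription(Clock, 'ros_clock', self.callback, 10)

    def callback(self, msg):
        sent = Time.from_msg(msg.clock)  # builtin_interfaces/Time -> rclpy Time
        lag_s = (self.get_clock().now() - sent).nanoseconds * 1e-9
        self.get_logger().info(f'round trip: {lag_s:.4f} s')


def main():
    rclpy.init()
    rclpy.spin(LatencyMonitor())


if __name__ == '__main__':
    main()
```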
Could you try using the latest version of Isaac Sim (Isaac Sim 4.0.0)? You can also set the publisher and subscriber QoS settings (in Isaac Sim) to a depth of 1 to ensure no additional messages get queued up, which may cause extra delays.
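For reference, on the rclpy side a depth-1 profile would look roughly like the sketch below (variable names are illustrative; the Isaac Sim side is configured through the properties of the ROS 2 publisher/subscriber OmniGraph nodes):

```python
from rclpy.qos import HistoryPolicy, QoSProfile, ReliabilityPolicy

# Keep only the most recent sample so stale timestamps cannot queue up.
shallow_qos = QoSProfile(
    history=HistoryPolicy.KEEP_LAST,
    depth=1,
    reliability=ReliabilityPolicy.RELIABLE,
)

# e.g. inside the relay node's __init__:
# self.subscription = self.create_subscription(
#     Clock, 'isaac_clock', self.clock_callback, shallow_qos)
# self.publisher = self.create_publisher(Clock, 'ros_clock', shallow_qos)
```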
I upgraded to Isaac Sim 4.0. Didn't seem to impact anything.
I failed to mention that I had other action graphs in my stage that also performed ROS activities. Even though they were disabled in the stage tree, it seems they were somehow still active. When I deleted those other action graphs to isolate just my ping test, the strange lag artifacts disappeared! Now when I sleep in my ROS node, I get the expected constant lag offset.
The lag appears to be pretty consistently 0.01666 seconds, for anyone interested.
I've attached the USD file and ROS node here for anyone interested in replicating this.
import rclpy
from rclpy.node import Node
from rosgraph_msgs.msg import Clock
import time


class ClockRelayNode(Node):
    """Echoes timestamps from Isaac Sim back to it, for round-trip lag measurement."""

    def __init__(self):
        super().__init__('clock_relay_node')
        # Subscriber: timestamps published by the Isaac Sim action graph
        self.subscription = self.create_subscription(
            Clock,
            'isaac_clock',
            self.clock_callback,
            10)
        # Publisher: relays the timestamp back to Isaac Sim
        self.publisher = self.create_publisher(Clock, 'ros_clock', 10)

    def clock_callback(self, msg):
        relay_msg = Clock()
        relay_msg.clock.sec = msg.clock.sec
        relay_msg.clock.nanosec = msg.clock.nanosec
        self.get_logger().info(f'Published timestamp: {relay_msg.clock}')
        time.sleep(1.0)  # artificial delay injected for the lag experiment
        self.publisher.publish(relay_msg)


def main(args=None):
    rclpy.init(args=args)
    node = ClockRelayNode()
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == '__main__':
    main()
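If it helps, I run this directly with `python3 clock_relay.py` after sourcing my ROS 2 workspace (assuming the script is saved as clock_relay.py); the `time.sleep(1.0)` is the injected delay to vary for the experiment.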
I tried running your example, however it seems I would also need your other script files, time_diff.py and clock_relay.py. If you could send those over too, that would be great!
In the default settings of the USD file, the physics rate = render rate = 60 FPS, i.e. a frame time of 1/60 s ≈ 0.0167 seconds. So the measured pub/sub latency is reasonable. If the physics rate is changed, I think we would get a different result.
@andrew.gillies I would ask you to take a look at that reference for a better understanding of how we step through each frame. The delay of 0.0167 s at a physics rate of 60 FPS makes sense, since you are receiving the data in the next frame that is ticked. What is expected is that with an x-second delay you should see around (x + 0.0167) s of total delay. As you increase the physics rate, the constant 0.0167 s component should shrink, since the subscriber is ticked at a higher rate.
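As a back-of-the-envelope check of that model (a sketch, not an official formula):

```python
def expected_round_trip(sleep_s, physics_hz=60.0):
    # One extra physics tick on top of the injected delay.
    return sleep_s + 1.0 / physics_hz

print(f"{expected_round_trip(0.0):.4f}")         # 0.0167 -- matches the measured 0.01666 s
print(f"{expected_round_trip(1.0):.4f}")         # 1.0167
print(f"{expected_round_trip(1.0, 240.0):.4f}")  # 1.0042 -- floor shrinks at higher rates
```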
The inconsistencies that you are noticing are likely due to the performance of the sim dropping slightly, so that simulation time runs slower than wall-clock time. Since your delay is based on the wall clock of an external node, discrepancies are expected.
To play around with changing the physics rates, you can take a look at this tutorial.
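As one illustration (my own sketch, not from the tutorial; module paths and defaults may vary across Isaac Sim versions), a standalone script can set the two rates through omni.isaac.core's SimulationContext:

```python
# Standalone Isaac Sim script (sketch; assumes the Isaac Sim 4.0 Python API).
from omni.isaac.kit import SimulationApp

simulation_app = SimulationApp({"headless": True})

from omni.isaac.core import SimulationContext  # must be imported after SimulationApp

# Physics at 240 Hz with rendering at 60 Hz: the ~1/60 s latency floor
# discussed above should shrink toward ~1/240 s.
sim = SimulationContext(physics_dt=1.0 / 240.0, rendering_dt=1.0 / 60.0)
sim.play()
for _ in range(600):
    sim.step(render=True)

simulation_app.close()
```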