I am evaluating Isaac ROS VSLAM capabilities and in order to do so, I am using Isaac Sim 4.5 to simulate image streams for Isaac ROS VSLAM. I am following the official tutorial from Isaac Ros Tutorial for Visual SLAM with Isaac Sim — isaac_ros_docs documentation.
My end goal is to use Isaac ROS VSLAM odometry together with RTAB-Map SLAM and Nav2.
So far I haven’t got good results from Isaac ROS VSLAM as I experience “jumps” of the robot on the map, and robot position is never recovered to the correct value. This happens while I navigate the robot in the scene using a joystick (see in the attached video what I mean by “jumps”).
I wasn’t able to convince Isaac Sim 4.5 to sustain the image-publish rate (≥ 30 Hz) with low-jitter timing (± 2 ms) required by Isaac ROS VSLAM.
Even on a g6e.12xlarge instance (4 × L40 GPUs, 192 GB GPU memory, 48 vCPUs, 384 GB RAM) the image topics average only ≈ 20 Hz.
Inter-frame jitter reaches 50–83 ms and spikes far higher, producing visible “jumps” in Isaac ROS VSLAM odometry. Hardware is not saturated—GPU usage stays under 50 % and CPU at ~30 % per core.
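For anyone who wants to reproduce the rate and jitter numbers above, here is a minimal sketch of how they could be computed from a list of image timestamps. It assumes the timestamps (in seconds) have already been collected, for example from the header stamp of each image message. The function name `interval_stats` and its parameters are hypothetical and not part of Isaac Sim or Isaac ROS; the defaults just mirror the 30 Hz and ± 2 ms figures quoted above.

```python
from statistics import mean


def interval_stats(stamps_s, target_hz=30.0, tolerance_ms=2.0):
    """Summarise publish rate and inter-frame jitter.

    stamps_s: frame timestamps in seconds, in arrival order (assumed input).
    target_hz / tolerance_ms: the rate and jitter budget to check against.
    """
    if len(stamps_s) < 2:
        raise ValueError("need at least two timestamps")

    # Time between consecutive frames, in milliseconds.
    intervals_ms = [(b - a) * 1000.0 for a, b in zip(stamps_s, stamps_s[1:])]

    # Ideal frame period for the target rate (33.3 ms at 30 Hz).
    period_ms = 1000.0 / target_hz

    # Intervals that deviate from the ideal period by more than the budget.
    bad = [d for d in intervals_ms if abs(d - period_ms) > tolerance_ms]

    return {
        "mean_hz": 1000.0 / mean(intervals_ms),
        "min_ms": min(intervals_ms),
        "max_ms": max(intervals_ms),
        "out_of_tolerance": len(bad),
        "total_intervals": len(intervals_ms),
    }
```

As a sanity check on the numbers: a 50 ms gap between frames corresponds to an instantaneous rate of 20 Hz and an 83 ms gap to roughly 12 Hz, so both are far outside a ± 2 ms window around the 33.3 ms period of a 30 Hz stream.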
Steps to Reproduce
Create a g6e.12xlarge EC2 instance using the NVIDIA Isaac Sim™ Development Workstation (Linux) AMI as described in these steps: AWS Deployment — Isaac Sim Documentation
Ok well first of all, you may have 4 GPUs but you are only using 1 of them. The WHITE card listed is in use. The GREY cards listed are not in use. This is because you are running in realtime mode at a low resolution, so we don’t allow the other cards to “kick in” until you give them something good to do. If you want to change that you have to go to your RENDER SETTINGS and look for the MGPU tab down at the bottom. It will have automatic tiling on. You can either increase the render resolution above 1440p or you can force them all to come on at a lower resolution.
Ok, thanks for that. I will try what you suggested and come back with what I found.
I saw that only one GPU is marked white, but when I inspected the GPU usage using nvtop, I saw that all 4 GPUs are used at around 25%, so I assumed that the white mark might mean something else.
I’ve increased the resolution of the viewport; as a result, all GPUs that are listed in the viewport are marked white (before I hit Play).
I’ve disabled automatic tiles and increased the number of tiles used, such that all GPUS that are listed in viewport are marked with white (before I hit Play).
BUT, when I hit ‘Play’ then only one gpu is marked white in viewport.
I checked GPU usage with nvtop, and according to nvtop all GPUS are used, even though after I hit play, according to the viewport only 1 GPU is used.
Yes I see what you mean. The fps when stopped is good and all cards are engaged.
However, when you hit play, the fps syncs to the actual fps of the stage, which by default is 30fps. That is very important. If you have specific need for exact timings, and you need a sync of 30fps, which you set in your stage, then the playhead respects that. Notice it drops to almost exact 30fps, give or take.
The other cards shut down because one card can easily maintain the 30fps. If you want more, then open your timeline and set it for 60fps, or even 120fps.
You do not have your timeline on or any of the normal timeline play controls. If you are “playing” something you need to have your timeline open and configure those settings at the bottom correctly. Just to be clear, the play button is to literally start playing a locked fps animation. It is not just there to run a simulation. Go to Window > Timeline
Does this scene have any physics in it? It might be cpu bound, meaning that when you play the animation, it has to run all the physics through the cpu and this is the bottleneck. Can you try sending me the file, so I can have a look? You want to “collect it”. Go to File > Collect and make a full local copy and zip it up. Then DM it to me.
What about a really basic scene. Something without physics in it.
Ok thanks. But that is what I am saying. That could be very heavy for Physics and your cpu is struggling. I want you to just load up a basic non animated, non physics scene. Go to your Extension Manager and load in the “Sample” Browsers. Load the “Astronaut” file. Look for a big difference between the fps when stopped and the fps when playing. Make sure you are in the REALTIME mode. Also make sure that ECO Mode in the Rendering Settings is OFF.
What you should see is no difference between playing and stopping, for scenes with no physics in them. For example I get 38fps on this scene regardless of whether I am playing the animation or not.
Also, if you are just learning Isaac Sim and starting off with a basic tutorial, why are you running it on a super powerful 4 x GPU machine on AWS? Seems a very expensive way to test out software. Just try it on your local machine, workstation, laptop etc. Do you have projects of your own to test out?
Finally, if you want to try that same test scene again, but this time test it with no physics at all, then load up the “Physics Debug” Extension from “Window > Physics > Debug” and then go to this button at the bottom that removes all of the physics from the scene
Ok I spoke to the engineers and I have confirmed that Physics can only use 1 GPU, regardless of the amount of GPUs you have. 1 GPU for solving physics, and as many GPUs you want to just play normal animation. So that scene must contain physics, and when you hit play, it is forced to switch to a single GPU. Then when you hit stop, all the GPUs kick back in.
So try:
a scene with no physics in it
the same scene but remove the physics
Try turning off “Simulations” here in the right click of the play button
I wonder what Viewport FPS you get if you open the ROS2 → Isaac ROS → Sample Scene from Robotics Examples. To open the scene you need to have ROS2 enabled, as documented in the steps to reproduce I referenced. That is:
open the ROS2 → Isaac ROS → Sample Scene from Robotics Examples .
Also, if you are just learning Isaac Sim and starting off with a basic tutorial, why are you running it on a super powerful 4 x GPU machine on AWS? Seems a very expensive way to test out software. Just try it on your local machine, workstation, laptop etc. Do you have projects of your own to test out?
I don’t have a local workstation where to run Isaac Sim.
I am renting from AWS. I started with an instance with 1 gpu as recommended by this install guide: g6e.2xlarge AWS Deployment — Isaac Sim Documentation
The issue also happens on the instance with 1 GPU.
I’ve upgraded to the next available instance that has more than 1 GPU (AWS offers only 4 GPUs or more) to see whether this issue is caused by a hardware limitation. But it looks like even with plenty of hardware, I still get <30 FPS on 4 GPUs.
What I am after is to use Isaac ROS VSLAM, which requires at least 30 FPS at a steady pace (no jitter).
It looks like Isaac Sim, or something further down the pipeline, prevents Isaac ROS VSLAM from ingesting a stream of images at a steady pace of at least 30 FPS.
Ok. Possible breakthrough, although we are confused here as to how this has been done. The video screen recording shows some yellow warnings. They really should mark these as RED because they are so critical. You have your Server set up wrong. Your GPUs are in ECC mode. That is not good. We cannot run with your GPUs in this mode. Part of the Isaac Sim and Omniverse Kit setup is that all GPUs must have ECC turned off.
I do not really understand whether Amazon failed to turn this off, or you did not realize. But this certainly needs to be set to OFF. Let’s try that and see.
Also I needed you to test the Astronaut scene but in REALTIME mode. Not Path Tracing mode. I can see it works fine in Path tracing mode. Although, not really, because according to what I can see you are getting horrible black horizontal lines through the viewport, correct?
Tried to disable ECC on GPU, but without success, the change is not persisted.
I’ve tried to disable ECC in the official AMI, but I was not able to do so.
Please see my logs here: trying-to-disable-ecc-log.txt
Once ECC is disabled using nvidia-smi -e 0, it asks for a reboot, and after the reboot the change is lost, so persistence does not work on the official AMI.
I’ve verified that AWS lets me disable ECC by using another AMI image, where the change persists after an instance reboot. This is the AMI I’ve used:
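In case it helps others check whether the ECC change survived a reboot, below is a rough sketch that reads the current and pending ECC mode per GPU through nvidia-smi's CSV query output. The helper names are hypothetical, and the sample text in the usage comment is illustrative, not taken from my logs. It assumes nvidia-smi is on the PATH and that the driver reports the `ecc.mode.current` and `ecc.mode.pending` query fields.

```python
import subprocess

# Query one CSV row per GPU: index, current ECC mode, pending ECC mode.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,ecc.mode.current,ecc.mode.pending",
    "--format=csv,noheader",
]


def parse_ecc_csv(text):
    """Parse 'index, current, pending' rows into a list of dicts."""
    rows = []
    for line in text.strip().splitlines():
        index, current, pending = [field.strip() for field in line.split(",")]
        rows.append({"index": int(index), "current": current, "pending": pending})
    return rows


def read_ecc_modes():
    """Run nvidia-smi and return the parsed ECC modes (needs an NVIDIA driver)."""
    result = subprocess.run(QUERY, check=True, capture_output=True, text=True)
    return parse_ecc_csv(result.stdout)


def gpus_with_ecc_on(rows):
    """Return the indices of GPUs whose current ECC mode is reported as Enabled."""
    return [row["index"] for row in rows if row["current"] == "Enabled"]


# Usage on the instance, after the reboot:
#   print(gpus_with_ecc_on(read_ecc_modes()))
# Usage on made-up sample text:
#   gpus_with_ecc_on(parse_ecc_csv("0, Enabled, Disabled\n1, Disabled, Disabled"))
```

A GPU whose current mode is still Enabled while its pending mode is Disabled has accepted the change but not yet applied it; if both read Enabled again after the reboot, the setting was not persisted.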