Every time I run Isaac Sim from Python, about 0.4 GB of system memory becomes persistently unavailable. This memory stays locked up even after the run completes (with no explanation for its use that I can find). The only way I have found to restore the memory is to reboot the machine. My only guess is that the memory is being leaked by the driver.
This occurs with code as simple as starting and then closing SimulationApp. It occurs both natively and when run from within a Singularity container (converted from the Docker image), where the host’s memory remains unavailable after the container is closed.
My environment:
Ubuntu 20.04.5 LTS
Isaac Sim 2022.2.0
Driver version 525.85.12
GPUs: GeForce RTX 2080 Ti & GeForce RTX 3090
Code to reproduce the memory leak:
#!/usr/bin/env python
import subprocess
from pathlib import Path


def run(python: Path):
    print_memory()
    for _ in range(30):
        subprocess.run(
            [
                str(python),
                "-c",
                "; ".join(
                    [
                        "from omni.isaac.kit import SimulationApp",
                        "app = SimulationApp({'headless': True})",
                        "app.close()",
                    ]
                ),
            ],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.STDOUT,
        )
        print_memory()


def print_memory():
    with open("/proc/meminfo", "r") as f:
        for line in f:
            if line.startswith("MemAvailable"):
                print(line.strip())
                return


for path in [
    Path("/isaac-sim"),
    Path.home() / ".local" / "share" / "ov" / "pkg" / "isaac_sim-2022.2.0",
]:
    if path.exists():
        isaac_sim_python = path / "python.sh"
        break
else:
    raise ValueError("Isaac sim path not found")

run(isaac_sim_python)
Yes, I have this issue with Composer / Create (I could only find it under the name “Create” in the Exchange). I repeatedly launched Create 2022.3.3 from the Omniverse Launcher, waited for everything to finish loading/rendering, closed it from the GUI, and then measured my available memory:
I ran a control without opening / closing anything and the available memory fluctuated by ±50MB / 30s vs the consistent ~200MB decrease per Create run.
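For anyone wanting to repeat the control measurement, here is a minimal sketch that samples MemAvailable at a fixed interval and reports the spread. The function names and the injectable `read`/`sleep` parameters are my own for illustration and are not part of the original script.

```python
import time
from typing import Callable, List


def read_mem_available_kb(path: str = "/proc/meminfo") -> int:
    """Return the MemAvailable value (in kB) reported by /proc/meminfo."""
    with open(path, "r") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1])
    raise RuntimeError("MemAvailable not found in " + path)


def sample_baseline(
    samples: int,
    interval_s: float,
    read: Callable[[], int] = read_mem_available_kb,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Take `samples` readings, waiting `interval_s` seconds between them."""
    readings = []
    for i in range(samples):
        if i > 0:
            sleep(interval_s)
        readings.append(read())
    return readings


def fluctuation_kb(readings: List[int]) -> int:
    """Spread between the highest and lowest reading, in kB."""
    return max(readings) - min(readings)
```

Calling `fluctuation_kb(sample_baseline(10, 30))` on an idle machine gives a rough noise floor to compare against the per-run decrease.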
@eric.langlois - We have a few follow-up questions:
Are you trying to run the simulation with both GPUs active, or with only one of them? We ask because the app can’t run with cross-architecture GPUs. We are trying to reproduce this internally.
It does not matter to me (at present) whether it runs on one or both of the GPUs. Both GPUs are active on the system and I have not been taking any action to specify GPU(s) for it to use, instead assuming that Isaac Sim will select a sensible allocation of computation to the GPUs that I might later try to tweak if it proved suboptimal.
Let me know if there is anything you would like me to try in terms of restricting the GPUs that it runs on, e.g. which environment variables would need to be set.
Thank you for the response. Can you please run the same script by isolating one GPU (CUDA_VISIBLE_DEVICES=0 python.sh …) at a time and provide the results?
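A possible way to script the per-GPU runs is sketched below. It only sets CUDA_VISIBLE_DEVICES in the child process environment, as suggested above; whether every part of Isaac Sim (for example the renderer) honors that variable is an assumption not confirmed in this thread. The helper names `env_for_gpu` and `run_per_gpu` are hypothetical.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional, Sequence


def env_for_gpu(
    gpu_index: int, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment that exposes only one GPU to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def run_per_gpu(command: Sequence[str], gpu_indices: Sequence[int]) -> Dict[int, int]:
    """Run `command` once per GPU index and collect the return codes."""
    results = {}
    for gpu_index in gpu_indices:
        completed = subprocess.run(
            list(command),
            env=env_for_gpu(gpu_index),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.STDOUT,
        )
        results[gpu_index] = completed.returncode
    return results
```

The command passed in would be the reproduction script launched through Isaac Sim's `python.sh`, with MemAvailable recorded before and after each call.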
Is there any update on this? On our shared machines, this causes a lot of issues with mandatory reboots, as system memory is “lost” after every run.
Our configuration is very similar to the OP's:
Ubuntu 20.04 / Kernel 5.15.0-73
Nvidia driver 525.105.17
Isaac Sim 2022.2.0
CPU AMD Ryzen 9 7950X
GPU: GeForce RTX 4090
We only have one GPU in this machine.
We noticed the following:
The amount of memory leaked seems to be proportional to the number of robots duplicated.
Checking /proc/meminfo, the leaked memory shows up as increases in the following categories:
DirectMap4k by ~1 GB per run with 8192 robots
AnonPages by ~0.5 GB per run
Inactive by ~1 GB per run or so
Take these numbers with a grain of salt - it doesn’t always behave exactly the same way. But the most apparent symptom is the ever-increasing DirectMap4k value, which is never released until reboot.
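To track these categories across runs, a small helper along these lines could snapshot /proc/meminfo before and after a run and report the per-field change. This is only a sketch; the function names and the default field list are illustrative choices, not something taken from Isaac Sim.

```python
from typing import Dict, Iterable


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field: numeric value}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts with no unit.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


def read_meminfo(path: str = "/proc/meminfo") -> Dict[str, int]:
    """Read and parse the live /proc/meminfo file."""
    with open(path, "r") as f:
        return parse_meminfo(f.read())


def meminfo_delta(
    before: Dict[str, int],
    after: Dict[str, int],
    fields: Iterable[str] = ("MemAvailable", "DirectMap4k", "AnonPages", "Inactive"),
) -> Dict[str, int]:
    """Return after - before for each requested field present in both snapshots."""
    return {f: after[f] - before[f] for f in fields if f in before and f in after}
```

Taking `before = read_meminfo()` ahead of a run and `after = read_meminfo()` once it has exited, `meminfo_delta(before, after)` shows which categories grew.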
Hi @pnm - I would suggest updating to the latest Isaac Sim 2022.2.1 release. Even with that version, the memory leak is a known issue that is currently being worked on and will hopefully be fixed in the next Isaac Sim release (~Aug/Sep timeframe).