Isaac Sim OS (driver?) persistent memory leak

Every time I run Isaac Sim from Python, about 0.4 GB of system memory becomes persistently unavailable. This memory stays locked up even after the run completes, and I can find no explanation for its use. The only way I have found to restore it is to reboot the machine. My only guess is that the memory is being leaked by the driver.

This occurs with code as simple as starting and then closing SimulationApp. It occurs both natively and when run from within a Singularity container (converted from the Docker image), where the host’s memory remains unavailable after the container is closed.

My environment:

  • Ubuntu 20.04.5 LTS
  • Isaac Sim 2022.2.0
  • Driver version 525.85.12
  • GPUs: GeForce RTX 2080 Ti & GeForce RTX 3090

Code to reproduce the memory leak:

#!/usr/bin/env python
import subprocess
from pathlib import Path


def run(python: Path):
    """Start and close SimulationApp 30 times, printing MemAvailable after each run."""
    print_memory()
    for _ in range(30):
        subprocess.run(
            [
                str(python),
                "-c",
                "; ".join(
                    [
                        "from omni.isaac.kit import SimulationApp",
                        "app = SimulationApp({'headless': True})",
                        "app.close()",
                    ]
                ),
            ],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.STDOUT,
        )
        print_memory()


def print_memory():
    """Print the MemAvailable line from /proc/meminfo."""
    with open("/proc/meminfo", "r") as f:
        for line in f:
            if line.startswith("MemAvailable"):
                print(line.strip())
                return


for path in [
    Path("/isaac-sim"),
    Path.home() / ".local" / "share" / "ov" / "pkg" / "isaac_sim-2022.2.0",
]:
    if path.exists():
        isaac_sim_python = path / "python.sh"
        break
else:
    raise ValueError("Isaac sim path not found")

run(isaac_sim_python)

Can you try the latest Isaac Sim and see if you still see the leak?

@eric.langlois Do you see the same leak when running Create/USD Composer?

We tested on 2022.2.1 with that script and still have this issue. I haven’t tested Create / Composer yet.

Hi @eric.langlois - Please let us know once you test the script with USD Composer (previously called Create).

Yes, I have this issue with Composer / Create (I could only find it under the name “Create” in the Exchange). I repeatedly launched Create 2022.3.3 from the Omniverse Launcher, waited for everything to finish loading/rendering, closed it from the GUI, and then measured my available memory:

2023-05-15 11:20:28.027608: MemAvailable:   57494696 kB
2023-05-15 11:20:54.181525: MemAvailable:   57242132 kB
2023-05-15 11:21:17.997116: MemAvailable:   56986692 kB
2023-05-15 11:21:39.749033: MemAvailable:   56732896 kB
2023-05-15 11:22:01.580873: MemAvailable:   56478768 kB
2023-05-15 11:22:24.588311: MemAvailable:   56259640 kB
2023-05-15 11:22:45.980537: MemAvailable:   55981292 kB
2023-05-15 11:23:05.515987: MemAvailable:   55827360 kB
2023-05-15 11:23:25.635777: MemAvailable:   55612920 kB

I ran a control without opening / closing anything and the available memory fluctuated by ±50MB / 30s vs the consistent ~200MB decrease per Create run.

Hi @eric.langlois - Thank you for checking this. I have raised this issue with the internal team and will keep you posted.

@eric.langlois - We have a few follow-up questions:

  1. Are you trying to run the simulation with both GPUs active, or with only one of them? We ask because the app can’t run across GPUs of different architectures. We are working to reproduce this internally.

It does not matter to me (at present) whether it runs on one or both of the GPUs. Both GPUs are active on the system and I have not been taking any action to specify GPU(s) for it to use, instead assuming that Isaac Sim will select a sensible allocation of computation to the GPUs that I might later try to tweak if it proved suboptimal.

Let me know if there is anything you would like me to try in terms of restricting the GPUs that it runs on, e.g. which environment variables would need to be set.

Thank you for the response. Can you please run the same script by isolating one GPU (CUDA_VISIBLE_DEVICES=0 python.sh …) at a time and provide the results?
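
For anyone who prefers to isolate a GPU from inside the reproduction script rather than on the command line, here is a minimal sketch. It assumes the only change needed is to pass a modified environment to subprocess.run; the helper name env_for_gpu is hypothetical, and the child process below is a stand-in that just echoes the variable rather than launching Isaac Sim.

```python
import os
import subprocess
import sys


def env_for_gpu(gpu_index: int) -> dict:
    """Return a copy of the current environment restricted to one CUDA device."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


if __name__ == "__main__":
    # Stand-in child process: prints the variable it was started with.
    # In the reproduction script, the same env= argument would be added
    # to the existing subprocess.run(...) call.
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
        env=env_for_gpu(0),
        capture_output=True,
        text=True,
        check=True,
    )
    print(result.stdout.strip())
```

Copying os.environ instead of building a fresh dict keeps PATH and the other variables that python.sh relies on.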

Halving the number of GPUs turns out to halve the leak rate: about 200MB / run with 1 GPU instead of 400MB / run with both.

> $HOME/.local/share/ov/pkg/isaac_sim-2022.2.1/python.sh test-isaac-sim.py
MemAvailable:   56858220 kB
MemAvailable:   56479864 kB
MemAvailable:   56091252 kB
MemAvailable:   55682688 kB
MemAvailable:   55301808 kB
> CUDA_VISIBLE_DEVICES=0 $HOME/.local/share/ov/pkg/isaac_sim-2022.2.1/python.sh test-isaac-sim.py
MemAvailable:   57844596 kB
MemAvailable:   57670288 kB
MemAvailable:   57467064 kB
MemAvailable:   57271088 kB
MemAvailable:   57077492 kB
> CUDA_VISIBLE_DEVICES=1 $HOME/.local/share/ov/pkg/isaac_sim-2022.2.1/python.sh test-isaac-sim.py
MemAvailable:   54886668 kB
MemAvailable:   54711112 kB
MemAvailable:   54514800 kB
MemAvailable:   54303724 kB
MemAvailable:   54116344 kB
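
The per-run figures can be checked directly from the three logs above. This is a small sketch that averages the drop in MemAvailable between consecutive printed lines, assuming each line after the first corresponds to one completed SimulationApp run; the function name mean_drop_kb is just for illustration.

```python
def mean_drop_kb(samples):
    """Average decrease in MemAvailable (kB) between consecutive samples."""
    return (samples[0] - samples[-1]) / (len(samples) - 1)


# MemAvailable values (kB) copied from the logs above.
runs = {
    "both GPUs": [56858220, 56479864, 56091252, 55682688, 55301808],
    "CUDA_VISIBLE_DEVICES=0": [57844596, 57670288, 57467064, 57271088, 57077492],
    "CUDA_VISIBLE_DEVICES=1": [54886668, 54711112, 54514800, 54303724, 54116344],
}

for label, samples in runs.items():
    print(f"{label}: {mean_drop_kb(samples) / 1024:.0f} MiB per run")
```

On these numbers the two-GPU case loses roughly 380 MiB per run and each single-GPU case a little under 190 MiB, which matches the "halved" observation.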

Hello,

Is there any update on this? On our shared machines, this causes a lot of issues with mandatory reboots, as system memory is “lost” after every run.

Our configuration is very similar to OP's:
Ubuntu 20.04 / Kernel 5.15.0-73
Nvidia driver 525.105.17
Isaac Sim 2022.2.0
CPU AMD Ryzen 9 7950X
GPU: GeForce RTX 4090
We only have one GPU in this machine.

We noticed the following:

  • The memory leaked seems to be proportional to the number of robots duplicated
  • Checking /proc/meminfo, the leak shows up as increases in the following fields:
    • DirectMap4k by ~1 GB per run with 8192 robots
    • AnonPages by ~0.5 GB per run
    • Inactive by ~1 GB per run or so

Take these numbers with a grain of salt: it doesn’t always behave exactly the same way. But most apparent is the ever-increasing DirectMap4k value, which is never released until reboot.
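
To track these fields more systematically, one option is to snapshot /proc/meminfo before and after a run and diff the fields of interest. The following is a rough sketch; the function names are hypothetical, and the default field list is simply the set mentioned in this thread.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: value} dict (values in kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the current /proc/meminfo."""
    with open(path) as f:
        return parse_meminfo(f.read())


def diff_meminfo(before, after,
                 names=("MemAvailable", "DirectMap4k", "AnonPages", "Inactive")):
    """Return after - before (kB) for each requested field present in both."""
    return {n: after[n] - before[n] for n in names if n in before and n in after}


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    before = read_meminfo()
    # ... run the Isaac Sim workload here ...
    after = read_meminfo()
    for name, delta in diff_meminfo(before, after).items():
        print(f"{name}: {delta:+d} kB")
```

Taking several snapshots across runs, rather than a single before/after pair, helps separate a steady increase from the normal fluctuation seen on an idle system.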

Hi @pnm - I would suggest updating to the latest release, Isaac Sim 2022.2.1. Even with that version, the memory leak is a known issue that is currently being worked on and will hopefully be fixed in the next Isaac Sim release (~Aug/Sep timeframe).