Crash with ERROR_OUT_OF_DEVICE_MEMORY during long-running synthetic data generation

Hello everyone,

I have a script which runs Isaac Sim with the following config:

  renderer: RayTracedLighting
  headless: True
  active_gpu: 0
  physics_gpu: 0
  width: 640
  height: 480
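
For reference, a minimal sketch of how a config like this is typically handed to `SimulationApp`. The names `CONFIG` and `launch_app` are placeholders, not the names in my script, and the import is deferred so the snippet only needs Isaac Sim when the function is actually called:

```python
# Launch settings from the post, as the dict that SimulationApp accepts.
CONFIG = {
    "renderer": "RayTracedLighting",
    "headless": True,
    "active_gpu": 0,
    "physics_gpu": 0,
    "width": 640,
    "height": 480,
}


def launch_app(config=CONFIG):
    """Start Isaac Sim with the given launch config (placeholder helper)."""
    # Imported lazily: this module is only available inside the Isaac Sim Python environment.
    from omni.isaac.kit import SimulationApp

    return SimulationApp(config)
```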

It then creates the following in order:

  • Environment
  • RTXLidar with a config
  • Render products and writer
  • Replicator

I record multiple point clouds in the environment at different positions, which are controlled through the Replicator API.
To record point clouds in a different location, I then change the environment as follows:

  • Delete the prim which contains the Environment
  • Load another Environment
  • detach() all the writers
  • destroy() all the render products
  • Delete the prim /Replicator using prims_utils
  • Register a new randomizer
  • Call the randomizer
  • Setup new render products
  • Setup new writers
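
The order of the steps above, as a minimal sketch. The function name and the callables passed in (`delete_prim`, `load_environment`, `register_and_call_randomizer`, `setup_outputs`) are placeholders for illustration, not the names in my script; only `detach()` and `destroy()` are the calls from the list:

```python
from typing import Callable, Iterable, List, Tuple


def switch_environment(
    env_prim_path: str,
    writers: Iterable,
    render_products: Iterable,
    delete_prim: Callable[[str], None],
    load_environment: Callable[[], None],
    register_and_call_randomizer: Callable[[], None],
    setup_outputs: Callable[[], Tuple[List, List]],
) -> Tuple[List, List]:
    """Tear down the current environment and its outputs, then rebuild them."""
    delete_prim(env_prim_path)      # delete the prim which contains the environment
    load_environment()              # load another environment
    for writer in writers:          # detach all the writers
        writer.detach()
    for render_product in render_products:  # destroy all the render products
        render_product.destroy()
    delete_prim("/Replicator")      # delete the /Replicator prim
    register_and_call_randomizer()  # register a new randomizer and call it
    # Set up new render products and writers; returned as (render_products, writers).
    return setup_outputs()
```

The old writers and render products are passed in explicitly so that nothing keeps a reference to them after the switch; the caller replaces its own lists with the returned ones.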

The replicator orchestrator is run on every timeline event with:

self._timeline.pause()
self._timeline_sub.unsubscribe()
rep.orchestrator.step(rt_subframes=self._rt_subframes, pause_timeline=False)
# _setup_next_frame checks if the environment should be changed or lidar position has to be changed etc.
self._setup_next_frame()
self._timeline.play()
self._timeline_sub = self._timeline.get_timeline_event_stream().create_subscription_to_pop_by_type(
    int(omni.timeline.TimelineEventType.CURRENT_TIME_TICKED),
    self._on_timeline_event,
)

This script works when I generate a relatively low number of point clouds, i.e. around 7k to 10k.
It crashes with the following error on the console when I try to generate around 15k point clouds with ground truth.
Error Log:

2024-05-09 20:57:36 [81,491,159ms] [Error] [gpu.foundation.plugin] Unable to allocate buffer
2024-05-09 20:57:36 [81,491,159ms] [Error] [gpu.foundation.plugin] Buffer creation failed for the device: 0.
2024-05-09 20:57:36 [81,491,159ms] [Error] [rtx.scenedb.plugin] Failed to allocate upload buffer for table upload: size: 256
2024-05-09 20:57:36 [81,491,159ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2024-05-09 20:57:36 [81,491,159ms] [Error] [carb.graphics-vulkan.plugin] vkAllocateMemory failed for flags: 0.
2024-05-09 20:57:36 [81,491,159ms] [Error] [gpu.foundation.plugin] Unable to allocate buffer
2024-05-09 20:57:36 [81,491,159ms] [Error] [gpu.foundation.plugin] Buffer creation failed for the device: 0.
2024-05-09 20:57:36 [81,491,159ms] [Error] [rtx.scenedb.plugin] Failed to allocate upload buffer for table upload: size: 269312
2024-05-09 20:57:36 [81,491,159ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2024-05-09 20:57:36 [81,491,159ms] [Error] [carb.graphics-vulkan.plugin] vkAllocateMemory failed for flags: 0.
2024-05-09 20:57:36 [81,491,159ms] [Error] [gpu.foundation.plugin] Unable to allocate buffer
2024-05-09 20:57:36 [81,491,159ms] [Error] [gpu.foundation.plugin] Buffer creation failed for the device: 0.
2024-05-09 20:57:36 [81,491,159ms] [Error] [rtx.scenedb.plugin] Failed to allocate upload buffer for table upload: size: 202240
2024-05-09 20:57:36 [81,491,174ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2024-05-09 20:57:36 [81,491,174ms] [Error] [carb.graphics-vulkan.plugin] vkAllocateMemory failed for flags: 0.
2024-05-09 20:57:36 [81,491,174ms] [Error] [gpu.foundation.plugin] Unable to allocate buffer
2024-05-09 20:57:36 [81,491,174ms] [Error] [gpu.foundation.plugin] Buffer creation failed for the device: 0.
2024-05-09 20:57:36 [81,491,174ms] [Error] [rtx.scenedb.plugin] Failed to allocate upload buffer for table upload: size: 256
2024-05-09 20:57:36 [81,491,174ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2024-05-09 20:57:36 [81,491,174ms] [Error] [carb.graphics-vulkan.plugin] vkAllocateMemory failed for flags: 0.
2024-05-09 20:57:36 [81,491,174ms] [Error] [gpu.foundation.plugin] Unable to allocate buffer
2024-05-09 20:57:36 [81,491,174ms] [Error] [gpu.foundation.plugin] Buffer creation failed for the device: 0.
2024-05-09 20:57:36 [81,491,174ms] [Error] [rtx.scenedb.plugin] Failed to allocate upload buffer for table upload: size: 269312
2024-05-09 20:57:36 [81,491,174ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2024-05-09 20:57:36 [81,491,174ms] [Error] [carb.graphics-vulkan.plugin] vkAllocateMemory failed for flags: 0.
2024-05-09 20:57:36 [81,491,174ms] [Error] [gpu.foundation.plugin] Unable to allocate buffer
2024-05-09 20:57:36 [81,491,174ms] [Error] [gpu.foundation.plugin] Buffer creation failed for the device: 0.
2024-05-09 20:57:36 [81,491,174ms] [Error] [rtx.scenedb.plugin] Failed to allocate upload buffer for table upload: size: 202240
2024-05-09 20:57:36 [81,491,187ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_INITIALIZATION_FAILED
2024-05-09 20:57:36 [81,491,187ms] [Error] [carb.graphics-vulkan.plugin] vkGetMemoryFdKHR failed.
2024-05-09 20:57:36 [81,491,187ms] [Error] [gpu.foundation.plugin] Cannot create shared handle for resource!
Fatal Python error: Segmentation fault

Thread 0x000071f9abfff640 (most recent call first):
  File "/isaac-sim/extscache/omni.replicator.core-1.10.20+105.1.lx64.r.cp310/omni/replicator/core/scripts/backends/io_queue.py", line 174 in worker
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 953 in run
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x000071f9b0bfc640 (most recent call first):
  File "/isaac-sim/extscache/omni.replicator.core-1.10.20+105.1.lx64.r.cp310/omni/replicator/core/scripts/backends/io_queue.py", line 174 in worker
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 953 in run
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x000071f9b13fd640 (most recent call first):
  File "/isaac-sim/extscache/omni.replicator.core-1.10.20+105.1.lx64.r.cp310/omni/replicator/core/scripts/backends/io_queue.py", line 174 in worker
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 953 in run
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x000071f9b1bfe640 (most recent call first):
  File "/isaac-sim/extscache/omni.replicator.core-1.10.20+105.1.lx64.r.cp310/omni/replicator/core/scripts/backends/io_queue.py", line 174 in worker
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 953 in run
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 973 in _bootstrap

Thread 0x000071f9fffff640 (most recent call first):
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 324 in wait
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 607 in wait
  File "/isaac-sim/kit/python/lib/python3.10/site-packages/tqdm/_monitor.py", line 60 in run
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
  File "/isaac-sim/kit/python/lib/python3.10/threading.py", line 973 in _bootstrap

Current thread 0x0000720b0b75eb80 (most recent call first):
  File "/isaac-sim/extscache/omni.replicator.core-1.10.20+105.1.lx64.r.cp310/omni/replicator/core/scripts/orchestrator.py", line 1259 in step
  File "/root/ws/simulation.py", line 390 in _run_sdg
  File "/root/ws/simulation.py", line 149 in _on_timeline_event
  File "/isaac-sim/exts/omni.isaac.kit/omni/isaac/kit/simulation_app.py", line 423 in update
  File "/root/ws/simulation.py", line 416 in <module>

Extension modules: yaml._yaml, box.exceptions, box.converters, box.box, box.box_list, box.config_box, box.from_file, box.shorthand_box, psutil._psutil_linux, psutil._psutil_posix, pydantic.typing, pydantic.utils, pydantic.class_validators, pydantic.color, pydantic.datetime_parse, pydantic.validators, pydantic.networks, pydantic.types, pydantic.json, pydantic.main, pydantic.dataclasses, pydantic.env_settings, numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, omni.mdl.pymdlsdk._pymdlsdk, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg._cythonized_array_utils, scipy.linalg._flinalg, scipy.linalg._solve_toeplitz, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_lapack, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial.transform._rotation, PIL._imaging, PIL._imagingft, numpy.linalg.lapack_lite, 
scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, osqp._osqp, multidict._multidict, yarl._quoting_c, aiohttp._helpers, aiohttp._http_writer, aiohttp._http_parser, aiohttp._websocket, cchardet._cchardet, _cffi_backend, frozenlist._frozenlist, scipy.io.matlab._mio_utils, scipy.io.matlab._streams, scipy.io.matlab._mio5_utils (total: 98)
/isaac-sim/python.sh: line 41:    16 Segmentation fault      (core dumped) $python_exe "$@" $args
There was an error running python
Simulation finished
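
Since the crash only shows up on the longer runs, one thing I can do to check whether GPU memory grows with every environment switch is to log it from `nvidia-smi`. A minimal sketch, assuming `nvidia-smi` is on the PATH inside the container; the function names are placeholders and this is not part of my current script:

```python
import subprocess
from typing import List

# Ask nvidia-smi for used memory only, one bare number (MiB) per GPU.
QUERY = ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"]


def parse_memory_used(output: str) -> List[int]:
    """Parse one integer (MiB) per GPU from the CSV output, skipping blank lines."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def gpu_memory_used_mib(gpu_index: int = 0) -> int:
    """Return the memory currently used on the given GPU, in MiB."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_used(result.stdout)[gpu_index]
```

Calling `gpu_memory_used_mib(0)` before and after each environment change and printing the difference should show whether the used memory keeps climbing until the allocation fails.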

Any help would be really appreciated. Thank you in advance :)