VkResult: ERROR_OUT_OF_DEVICE_MEMORY

quentin.deyna · November 16, 2023, 2:16pm

Hi,

We encountered an error after upgrading to the 2023 release:

2023-11-16 13:59:01 [284,910ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2023-11-16 13:59:01 [284,910ms] [Error] [carb.graphics-vulkan.plugin] vkAllocateMemory failed for flags: 0.
2023-11-16 13:59:01 [284,910ms] [Error] [gpu.foundation.plugin] Unable to allocate buffer
2023-11-16 13:59:01 [284,910ms] [Error] [gpu.foundation.plugin] Buffer creation failed for the device: 0.
2023-11-16 13:59:01 [284,910ms] [Error] [gpu.foundation.plugin] Failed to update params for RenderOp 453
2023-11-16 13:59:01 [284,910ms] [Error] [gpu.foundation.plugin] Failed to update params for RenderOp Cached PT ClearAll. Will not execute this or subsequent RenderGraph operations. Aborting RenderGraph execution
2023-11-16 13:59:01 [284,910ms] [Error] [carb.scenerenderer-rtx.plugin] Failed to execute RenderGraph on device 0. Error Code: 7
2023-11-16 13:59:02 [285,034ms] [Error] [carb.graphics-vulkan.plugin] VkResult: ERROR_OUT_OF_DEVICE_MEMORY
2023-11-16 13:59:02 [285,034ms] [Error] [carb.graphics-vulkan.plugin] vkAllocateMemory failed for flags: 2.
2023-11-16 13:59:02 [285,034ms] [Error] [gpu.foundation.plugin] Texture creation failed for the device: 0.

This warning sometimes appears as well:

2023-11-16 14:12:58 [573,059ms] [Warning] [carb] Client omni.ui has acquired [carb::svg::Svg v0.1] 100 times. Consider accessing this interface with carb::getCachedInterface() (Performance warning)
2023-11-16 14:12:58 [573,060ms] [Warning] [carb] Client omni.ui has acquired [omni::kit::renderer::IRenderer v1.9] 100 times. Consider accessing this interface with carb::getCachedInterface() (Performance warning)

It happens when we run our workspace, which includes 5 cameras rendering and publishing ROS2 RGB + PCD.
With one camera it works, but with all five it crashes.

On the 2022 release it works without any problem.

Here is our setup: Ubuntu 22.04 with 64 GB of RAM
|---------------------------------------------------------------------------------------------|
| Driver Version: 525.85.05 | Graphics API: Vulkan
| GPU | Name | Active | LDA | GPU Memory | Vendor-ID | LUID |
|---------------------------------------------------------------------------------------------|
| 0 | NVIDIA GeForce RTX 4070 Ti | Yes: 0 | | 12528 MB | 10de | 0 |
|---------------------------------------------------------------------------------------------|
| 1 | Intel(R) Graphics (RPL-S) | | | 48048 MB | 8086 | 0 |

We will try a workaround by enabling/disabling the ROS2 publishers “on the fly”, but I’m wondering whether there is an issue somewhere with the driver or anything else.

Thanks in advance and have a good day!

ksavevska · March 14, 2024, 10:07am

Hi @quentin.deyna

Were you able to resolve this issue? I’m encountering the same problem, so I’m curious if you have any updates on the error.

Thanks!

rthaker · March 19, 2024, 6:16pm

Hi @ksavevska - What Isaac Sim version are you using?

quentin.deyna · March 19, 2024, 9:03pm

Sorry @ksavevska, I didn’t see your question :/

I haven’t exactly resolved the issue; I’m still pondering what it was referring to, but I’ve made some improvements.

I noticed a lack of vRAM even with an RTX4080. When running Isaac with 3xRGB and Depths ROS2 topics, consuming around 5 or 6 Gbits, alongside some greedy AI algorithms, the simulation crashes when the cameras’ views are generated and published, especially when starting the other algorithms.

With some optimization and utilizing another PC to run the AI parts in parallel, the performance improved. I’ve also reduced the number of simulated cameras, which sufficed for testing purposes.

One thing I plan to try is publishing 5 rgbd camera viewports at periodic timestamps with a small offset between each, instead of publishing at every frame.
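The staggered publishing idea above can be sketched in a few lines of Python. This is only an illustration of the scheduling logic, not Isaac Sim or ROS2 API code: the function name `cameras_to_publish` and the `period` parameter are hypothetical, and hooking the result into the actual publishers is left out.

```python
def cameras_to_publish(frame: int, num_cameras: int = 5, period: int = 10) -> list[int]:
    """Return the indices of the cameras that should publish on this frame.

    Each camera publishes once every `period` frames, and camera i is
    shifted by i * offset frames so the cameras do not all publish together.
    All names here are hypothetical; this is scheduling logic only.
    """
    # Spread the cameras evenly across one period (at least one frame apart).
    offset = max(1, period // num_cameras)
    return [
        cam
        for cam in range(num_cameras)
        if frame % period == (cam * offset) % period
    ]
```

With the defaults, camera 0 publishes on frames 0, 10, 20, and so on, camera 1 on frames 2, 12, 22, and so on, and the odd frames carry no camera traffic at all.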

Additionally, another pain point is that Rviz2 and rqt consume a lot of power and kill the fps, depending on what you’re logging.

Hope you’re not too stuck on this :/

ksavevska · March 20, 2024, 12:44pm

Thank you, @quentin.deyna, for your response. I’ll try to optimize the code and observe the results.
However, when examining the system metrics, I noticed that no more than 40% of the vRAM is being utilized (during PPO training with sb3 on a custom humanoid robot). I therefore assume that the issue may not be due to a lack of vRAM.
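One way to check the reading directly, for example right before and after the error appears, is to query `nvidia-smi` from a script. The sketch below assumes `nvidia-smi` is on the PATH and that the memory fields are reported as plain numbers; the helper names `parse_memory_csv` and `read_gpu_memory` are made up for this example.

```python
import subprocess

# nvidia-smi query that prints one CSV line per GPU, values in MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text: str) -> dict[int, tuple[int, int]]:
    """Parse lines such as '0, 2048, 24576' into {gpu_index: (used, total)}."""
    usage = {}
    for line in text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        usage[int(index)] = (int(used), int(total))
    return usage


def read_gpu_memory() -> dict[int, tuple[int, int]]:
    """Run nvidia-smi (assumed to be installed) and return the parsed usage."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Calling `read_gpu_memory()` in a loop while the simulation starts up would show whether usage briefly climbs close to the total before the allocation fails.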

ksavevska · March 20, 2024, 12:49pm

Hi @rthaker,

I am using 2023.1.1 in a docker container with a 535.86.05 driver.

edward.schneeweiss · June 4, 2024, 9:30pm

@rthaker

I am having a similar issue with vRAM in the exact same docker container. My custom extension works just fine when running in the regular Isaac Sim application, but when I try doing exactly the same thing from within the docker container, it reports VkResult: ERROR_OUT_OF_DEVICE_MEMORY even though I have only used 2 GB of my 24 GB available.

Any help would be greatly appreciated!

edward.schneeweiss · June 7, 2024, 5:42pm

The issue seems to be caused by an Optix-related error, and the solution was to pass not only “--gpus all” to docker but also “--runtime=nvidia”. I’m curious as to why this is required: from reading the docs, specifying “--runtime=nvidia” seems to be outdated, and in fact most GPU operations work without it. This feels like a niche issue.
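For reference, here is a small sketch of how the two flags fit into a `docker run` invocation, built as an argument list in Python. The image tag is an assumption based on the 2023.1.1 version mentioned earlier in the thread, the helper name is made up, and any other options your setup needs (volumes, environment variables, entrypoint) are deliberately left out.

```python
import shlex


def isaac_sim_docker_command(image: str, use_nvidia_runtime: bool = True) -> list[str]:
    """Build a docker run argument list with the GPU flags discussed above."""
    cmd = ["docker", "run", "--rm", "-it", "--gpus", "all"]
    if use_nvidia_runtime:
        # The extra flag that resolved the error in this case.
        cmd.append("--runtime=nvidia")
    cmd.append(image)
    return cmd


# Assumed image tag; replace it with the image you actually use.
IMAGE = "nvcr.io/nvidia/isaac-sim:2023.1.1"
print(shlex.join(isaac_sim_docker_command(IMAGE)))
```

The list can be passed to `subprocess.run` as is, or the printed line can be pasted into a shell.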

Hammad_M · June 7, 2024, 6:00pm

@edward.schneeweiss according to the latest docs, using the runtime flag is still recommended:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/sample-workload.html

Is it possible that you have a second, integrated GPU (such as an Intel one)? In that case, specifying the runtime might be needed.

rob91 · January 10, 2025, 3:22pm

Hi everyone, I have the same problem using the official docker image. Even after adding the flag --runtime=nvidia, the error remains.

VickNV · January 10, 2025, 4:39pm

Could you please create a new topic and include a link to this one? Thank you.
