Hi everyone,
I’ve encountered an issue with cudaMalloc on my Jetson TX2 NX (4 GB model) running Tegra 32.7.4, and I’m hoping to get some insights from the community.
Issue:
When free system RAM drops to roughly 700 MB, cudaMalloc begins to fail and the failing allocation triggers the OOM killer. Interestingly, standard malloc calls still succeed at that point, so the total amount of free memory does not appear to be the limit. I suspect the real cause is a lack of contiguous memory available for the GPU allocation.
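For reference, a pattern along the lines below reproduces what I am seeing. This is a minimal sketch rather than my actual application; the 64 MB chunk size and the loop bound are arbitrary placeholders.

```c
// Build with: nvcc repro.cu -o repro   (file name is arbitrary)
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    size_t free_b = 0, total_b = 0;
    cudaMemGetInfo(&free_b, &total_b);
    printf("before: free = %zu MB, total = %zu MB\n", free_b >> 20, total_b >> 20);

    const size_t chunk = 64UL << 20;   // 64 MB per allocation (placeholder size)
    void *ptrs[256] = {0};
    int n = 0;
    for (; n < 256; ++n) {
        cudaError_t err = cudaMalloc(&ptrs[n], chunk);
        if (err != cudaSuccess) {
            // In my case this starts failing while the system still reports
            // roughly 700 MB of free RAM.
            fprintf(stderr, "cudaMalloc #%d failed: %s\n", n, cudaGetErrorString(err));
            break;
        }
    }

    cudaMemGetInfo(&free_b, &total_b);
    printf("after %d allocations: free = %zu MB\n", n, free_b >> 20);

    for (int i = 0; i < n; ++i) {
        cudaFree(ptrs[i]);
    }
    return 0;
}
```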
Workaround:
After some experimentation, I found that disabling the “vpr-carveout” (the VPR carveout, a memory region reserved for protected video content) appears to resolve the issue. With that carveout disabled, cudaMalloc can use all of the available RAM.
Questions:
- Is it advisable to completely disable vpr-carveout on the TX2 NX with Tegra 32.7.4? Are there any potential risks or side effects to consider?
- Are there alternative solutions or best practices to address this issue without disabling vpr-carveout? For example, would pre-allocating one large block at startup and sub-allocating from it, as sketched after this list, be a reasonable approach?
- Has anyone experienced similar behavior with cudaMalloc on the TX2 NX or other 4GB Jetson devices, particularly those running Tegra 32.7.4?
- Is this behavior specific to the TX2 NX with 4GB RAM and Tegra 32.7.4, or is it a more widespread issue across Jetson platforms and Tegra versions?
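To make the second question more concrete: one alternative I am considering is allocating the GPU memory my application needs once at startup, before memory pressure builds up, and then sub-allocating from that single block. A rough sketch of what I mean (DevicePool, pool_init, and pool_alloc are hypothetical names, not an existing API):

```c
#include <cuda_runtime.h>
#include <stdio.h>

// Hypothetical bump-pointer pool over one large device allocation made early,
// so that no further cudaMalloc calls are needed once memory gets tight.
typedef struct {
    char  *base;   // start of the device allocation
    size_t size;   // total pool size in bytes
    size_t used;   // bytes handed out so far
} DevicePool;

static int pool_init(DevicePool *p, size_t bytes) {
    cudaError_t err = cudaMalloc((void **)&p->base, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "pool_init: %s\n", cudaGetErrorString(err));
        return -1;
    }
    p->size = bytes;
    p->used = 0;
    return 0;
}

// Hands out a 256-byte-aligned sub-buffer; returns NULL when the pool is full.
static void *pool_alloc(DevicePool *p, size_t bytes) {
    size_t aligned = (bytes + 255) & ~(size_t)255;
    if (p->used + aligned > p->size) return NULL;
    void *ptr = p->base + p->used;
    p->used += aligned;
    return ptr;
}
```

I don’t know whether this actually avoids the carveout/contiguity problem on Tegra or simply moves the failure to startup, so I would welcome opinions on that as well.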
I appreciate any insights, explanations, or alternative solutions you can provide. Thank you in advance for your assistance.