Hi,
I’m running PyTorch-based YOLO inference on a Jetson Orin Nano Super, and I frequently get the following errors (not every run; they appear at random):
NvMapMemAllocInternalTagged: 1075072515 error 12
NvMapMemHandleAlloc: error 0
Error : NVML_SUCCESS == r INTERNAL ASSERT FAILED at "/opt/pytorch/pytorch/c10/cuda/CUDACachingAllocator.cpp":838, please report a bug to PyTorch.
I tried the following, but the issue still occurs:
- with torch.no_grad() during inference
- os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'
- Full cleanup using torch.cuda.empty_cache(), gc.collect(), and reloading the model
The error isn’t always caught by try/except and sometimes crashes the process.
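For reference, a simplified sketch of how these pieces fit together in my loop (Ultralytics-style API, placeholder weights path, and a single-frame call used here for brevity; the real code is more involved):

```python
import gc
import os

# Must be set before the first CUDA allocation for it to take effect
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'

import torch
from ultralytics import YOLO  # assuming the Ultralytics API here

MODEL_PATH = 'weights.pt'  # placeholder

model = YOLO(MODEL_PATH)

def full_cleanup():
    # "Full cleanup" from the list above: drop the model, free caches, reload
    global model
    del model
    gc.collect()
    torch.cuda.empty_cache()
    model = YOLO(MODEL_PATH)

def process_frame(frame):
    try:
        with torch.no_grad():
            # tracking mode, one frame at a time
            return model.track(frame, persist=True, verbose=False)
    except RuntimeError as err:
        # This handler is not always reached -- sometimes the process dies
        # inside the allocator before Python raises anything
        print(f'Inference failed: {err}')
        full_cleanup()
        return None
```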
Setup:
- Jetson Orin Nano
- JetPack 6.2.1
- PyTorch (from NVIDIA SDK): 2.5.0a0+872d972e41.nv24.08
- Model: YOLO (tracking mode)
Questions:
- Is this a PyTorch issue or a Jetson memory allocator issue (NvMapMemAlloc)?
- Is there any known fix or configuration to prevent this intermittent error?
Thanks in advance for any suggestions.