Jetson Xavier NX 8GB and unified RAM management

Using JetPack 5.1.6 (version 35.6.1) with torch and CUDA, and with the swap expanded to 16GB, I get the following about 90% of the time:

Loading TTS tacotron-internal model, it takes a while, please be patient…
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
TTS tts_models/en/ljspeech/tacotron2-DDC Loaded!
Loading ZeroShot knnvc model, it takes a while, please be patient…
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
NvMapMemAllocInternalTagged: 1074810371 error 12
NvMapMemHandleAlloc: error 0
_load_api() error: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 6.67 GiB of which 324.95 MiB is free. Of the allocated memory 414.54 MiB is allocated by PyTorch, and 13.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
init() error: _load_engine_zs() error: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 6.67 GiB of which 324.95 MiB is free. Of the allocated memory 414.54 MiB is allocated by PyTorch, and 13.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
Traceback (most recent call last):
  File "/home/workmin/repos/ebook2audiobook/lib/classes/tts_engines/common/utils.py", line 383, in _load_engine_zs
    engine_zs = self._load_api(self.tts_zs_key, default_vc_model, device)
  File "/home/workmin/repos/ebook2audiobook/lib/classes/tts_engines/common/utils.py", line 287, in _load_api
    engine = TTSEngine(model_path).to(device)
  File "/home/workmin/repos/ebook2audiobook/python_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1174, in to
    return self._apply(convert)
  File "/home/workmin/repos/ebook2audiobook/python_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  File "/home/workmin/repos/ebook2audiobook/python_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  File "/home/workmin/repos/ebook2audiobook/python_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 780, in _apply
    module._apply(fn)
  [Previous line repeated 5 more times]
  File "/home/workmin/repos/ebook2audiobook/python_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 805, in _apply
    param_applied = fn(param)
  File "/home/workmin/repos/ebook2audiobook/python_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1160, in convert
    return t.to(
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 6.67 GiB of which 324.95 MiB is free. Of the allocated memory 414.54 MiB is allocated by PyTorch, and 13.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/workmin/repos/ebook2audiobook/lib/classes/tts_engines/tacotron.py", line 58, in __init__
    self.engine_zs = self._load_engine_zs(self.device)
  File "/home/workmin/repos/ebook2audiobook/lib/classes/tts_engines/common/utils.py", line 390, in _load_engine_zs
    raise ValueError(error)
ValueError: _load_engine_zs() error: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 6.67 GiB of which 324.95 MiB is free. Of the allocated memory 414.54 MiB is allocated by PyTorch, and 13.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/workmin/repos/ebook2audiobook/lib/core.py", line 2197, in convert_chapters2audio
    tts_manager = TTSManager(session)
  File "/home/workmin/repos/ebook2audiobook/lib/classes/tts_manager.py", line 18, in __init__
    self.engine = engine_cls(session)
  File "/home/workmin/repos/ebook2audiobook/lib/classes/tts_engines/tacotron.py", line 61, in __init__
    raise ValueError(error)
ValueError: init() error: _load_engine_zs() error: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 6.67 GiB of which 324.95 MiB is free. Of the allocated memory 414.54 MiB is allocated by PyTorch, and 13.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
Caught DependencyError: init() error: _load_engine_zs() error: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 6.67 GiB of which 324.95 MiB is free. Of the allocated memory 414.54 MiB is allocated by PyTorch, and 13.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management
convert_chapters2audio() error: init() error: _load_engine_zs() error: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 6.67 GiB of which 324.95 MiB is free. Of the allocated memory 414.54 MiB is allocated by PyTorch, and 13.46 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management

I’m using the same Python app on an old Windows laptop with only 500MB free, and it runs slowly, but it runs. Why not on the Jetson Xavier NX?
Thanks.

Hi,

Based on the error, you are running out of memory.
Please note that the swap memory cannot be accessed via the GPU.
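
To see this distinction on the device, here is a minimal sketch that reads the standard Linux `/proc/meminfo` fields. It assumes that, because GPU allocations on Jetson come out of the same physical DRAM as the CPU, `MemAvailable` is the figure that roughly bounds what CUDA can get, while `SwapFree` does not count toward it. The helper names `parse_meminfo` and `gpu_relevant_memory` are hypothetical, not part of any NVIDIA or PyTorch API.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo text into {field name: integer value (kB for sizes)}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def gpu_relevant_memory(text: str) -> dict:
    """Separate physical memory (assumed usable by the GPU) from swap (not usable)."""
    info = parse_meminfo(text)
    return {
        "available_mib": info.get("MemAvailable", 0) / 1024,
        "swap_free_mib": info.get("SwapFree", 0) / 1024,
    }


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        report = gpu_relevant_memory(meminfo.read_text())
        print(f"Physical memory available: {report['available_mib']:.0f} MiB")
        print(f"Swap free (not GPU-accessible): {report['swap_free_mib']:.0f} MiB")
```

Running it just before the model load gives a rough idea of whether the allocation can succeed; a large swap figure will not help if the first number is small.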

Thanks.

Hi,

Yes, I know it’s memory exhaustion, but the Jetson runs nothing but the OS, with the graphical interface disabled, and the free memory is around 5 to 6GB. Since it’s shared memory, is there at least a way to swap the missing memory to shm or something else? Or to prioritize the process and shut down idle processes?

Hi,

The error is CUDA out of memory, so the root cause is that there is not enough memory for the GPU.
You can try monitoring the system with tegrastats to confirm this.
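
As a rough illustration of that check, the sketch below pulls the RAM and SWAP figures out of a tegrastats line. It assumes the line contains fields shaped like `RAM used/totalMB` and `SWAP used/totalMB`; the exact layout can differ between releases, the sample values in the comment are made up, and `parse_tegrastats_line` is a hypothetical helper name.

```python
import re

# Assumed field shapes, e.g. "RAM 6100/6833MB (lfb 2x4MB) SWAP 512/16384MB (cached 0MB)".
RAM_RE = re.compile(r"\bRAM (\d+)/(\d+)MB")
SWAP_RE = re.compile(r"\bSWAP (\d+)/(\d+)MB")


def parse_tegrastats_line(line: str):
    """Return RAM/SWAP usage in MB from one tegrastats line, or None if no RAM field."""
    ram = RAM_RE.search(line)
    if ram is None:
        return None
    used, total = int(ram.group(1)), int(ram.group(2))
    result = {
        "ram_used_mb": used,
        "ram_total_mb": total,
        "ram_free_mb": total - used,
    }
    swap = SWAP_RE.search(line)
    if swap is not None:
        result["swap_used_mb"] = int(swap.group(1))
        result["swap_total_mb"] = int(swap.group(2))
    return result
```

Watching `ram_free_mb` while the second model loads should show whether physical memory is what runs out, regardless of how much swap is still free.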

Thanks.

Hi,

I know it’s an out-of-memory issue, thanks. My question is: since the Jetson GPU’s memory is unified with the OS RAM, why does it have no escape route to delegate the missing memory to some kind of swap, the way other machines do, rather than brutally crashing the app? Do we need to develop that ourselves?

Hi,

The swap memory cannot be used by the GPU so it doesn’t increase the available memory for the GPU.
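
The numbers in the error message itself support this. The following is a small sketch, not an official tool, that parses the PyTorch out-of-memory text from the log above and works out how much memory is in use outside this process's PyTorch allocator. The function name `oom_breakdown` is hypothetical, and since the reported 6.67 GiB capacity is rounded, the result is only approximate.

```python
import re

OOM_RE = re.compile(
    r"total capacity of ([\d.]+) GiB of which ([\d.]+) MiB is free\. "
    r"Of the allocated memory ([\d.]+) MiB is allocated by PyTorch, "
    r"and ([\d.]+) MiB is reserved by PyTorch but unallocated"
)

# Message text copied from the log in this thread.
MESSAGE = (
    "CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity "
    "of 6.67 GiB of which 324.95 MiB is free. Of the allocated memory 414.54 MiB "
    "is allocated by PyTorch, and 13.46 MiB is reserved by PyTorch but unallocated."
)


def oom_breakdown(message: str) -> dict:
    """Split the figures in a PyTorch CUDA OOM message into PyTorch vs. everything else."""
    m = OOM_RE.search(message)
    if m is None:
        raise ValueError("message does not match the expected OOM format")
    total_mib = float(m.group(1)) * 1024
    free_mib = float(m.group(2))
    pytorch_mib = float(m.group(3)) + float(m.group(4))
    return {
        "total_mib": total_mib,
        "free_mib": free_mib,
        "pytorch_mib": pytorch_mib,
        # Memory in use that this process's PyTorch allocator does not hold.
        "outside_pytorch_mib": total_mib - free_mib - pytorch_mib,
    }


if __name__ == "__main__":
    for key, value in oom_breakdown(MESSAGE).items():
        print(f"{key}: {value:.2f}")
```

For this log, PyTorch holds about 428 MiB while roughly 6077 MiB is in use elsewhere, which on a unified-memory device plausibly includes ordinary CPU-side RAM usage. The reserved-but-unallocated part is only 13.46 MiB, so the `expandable_segments` hint in the message, which targets fragmentation, seems unlikely to be the fix here.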

Another related fix addresses the following PSIRT issue (although it was added after r35.6.2).

Thanks.