Hi,
I’m experiencing OOM (Out of Memory) errors when trying to run nvidia/canary-1b-v2 on a Jetson AGX Orin 64GB.
The strange part is that, according to jtop monitoring, GPU memory isn’t actually being fully utilized when these errors occur.
Here’s the minimal example from the model card along with the provided short audio sample:
wget https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav

from nemo.collections.asr.models import ASRModel

# Load the pretrained model and transcribe the downloaded sample
asr_ast_model = ASRModel.from_pretrained(model_name="nvidia/canary-1b-v2")
output = asr_ast_model.transcribe(['2086-149220-0033.wav'], source_lang='en', target_lang='en')
print(output[0].text)
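In case it helps with reproduction, the same call can also be made with an explicitly capped batch size. This is only a sketch; I'm assuming transcribe() accepts batch_size for this model the way other NeMo ASR models do:

from nemo.collections.asr.models import ASRModel

asr_ast_model = ASRModel.from_pretrained(model_name="nvidia/canary-1b-v2")

# Sketch: cap the batch size to the minimum. batch_size is assumed to be
# accepted here as it is by other NeMo ASR transcribe() implementations.
output = asr_ast_model.transcribe(
    ['2086-149220-0033.wav'],
    source_lang='en',
    target_lang='en',
    batch_size=1,
)
print(output[0].text)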
Tested on Jetson AGX Orin using two different base images:
- nvcr.io/nvidia/pytorch:25.10-py3-igpu
- dustynv/pytorch:2.7-r36.4.0-cu128-24.04
Error messages:
return inputs.to(device, non_blocking=non_blocking)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA driver error: out of memory
or
x = torch.cat((x[:, 0].unsqueeze(1), x[:, 1:] - self.preemph * x[:, :-1]), dim=1)
~~~^~~~~~~~~~
RuntimeError: CUDA driver error: out of memory
What’s puzzling is that jtop clearly shows GPU memory not being fully consumed at the time of failure — suggesting it might not be a real out-of-memory condition.
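To cross-check jtop, the CUDA runtime's own view can be printed right before the transcribe() call. A minimal sketch, assuming the current device is the Orin iGPU:

import torch

# What the CUDA runtime (not jtop) reports as free/total on the current device
free_b, total_b = torch.cuda.mem_get_info()
print(f"runtime free/total: {free_b / 1e9:.2f} / {total_b / 1e9:.2f} GB")

# What PyTorch's caching allocator has actually claimed
print(f"allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
print(f"reserved:  {torch.cuda.memory_reserved() / 1e9:.2f} GB")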
For comparison:
- nvidia/canary-1b-flash works perfectly fine on the same setup.
- canary-1b-v2 runs on an A100 GPU with a 30-minute audio file, using around 20 GB of GPU memory, without issues.
This seems similar to the issue reported here: Nemo > Canary 1B > RuntimeError: CUDA driver error: out of memory.
I also tried the approach mentioned there — limiting container RAM and extending swap. Specifically:
- Docker memory limit: 500 MB
- Swap size: 8 GB
However, with that limit the Python kernel was killed by the OOM killer before any GPU inference could even start.
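The only other allocator-side knob I'm aware of is PYTORCH_CUDA_ALLOC_CONF; I haven't verified whether expandable_segments is honoured by the iGPU/unified-memory builds, so please treat this purely as an untested sketch:

import os

# Must be set before CUDA is initialised; untested on the Jetson iGPU builds.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

from nemo.collections.asr.models import ASRModel

asr_ast_model = ASRModel.from_pretrained(model_name="nvidia/canary-1b-v2")
output = asr_ast_model.transcribe(['2086-149220-0033.wav'], source_lang='en', target_lang='en')
print(output[0].text)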
Any insights into what might be causing this or how to correctly run canary-1b-v2 on Jetson AGX Orin would be greatly appreciated.