CUDA0 Buffer Error

My Jetson Orin Nano Dev Kit Super (8 GB) has become unstable since updating to JetPack 6.2 (L4T 36.4.x). On the older JetPack it could run large 8B models without issues, but now even small 1B or 3B models fail to load. When this happens, the system also freezes and sometimes reboots.

The main error is:

error loading model: unable to allocate CUDA0 buffer

Sometimes kernel logs also show:

nvidia-modeset: ERROR: GPU:0: Failed to allocate 2743000 KBPS Iso and 4294967295 KBPS Dram

nvidia-modeset: ERROR: GPU:0: Unexpectedly failed to lock to max DRAM pre-modeset!

NVRM nvAssertOkFailedNoLog: Assertion failed … @ kern_disp.c:1161

I’m running this in Docker with NVIDIA runtime using the dustynv/ollama:0.6.8-r36.4-cu126-22.04 container. The problem occurs even when running very small LLMs like llama3.2:1b.

Command used:

sudo docker run -d --name ollama --runtime nvidia -e OLLAMA_MAX_LOADED_MODELS=1 -e OLLAMA_NUM_PARALLEL=1 -e OLLAMA_CONTEXT_LENGTH=1024 -p 11434:11434 -v ollama:/data dustynv/ollama:0.6.8-r36.4-cu126-22.04 ollama serve

echo "hi" | sudo docker exec -i ollama ollama run llama3.2:1b
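To capture the full runner error and confirm the container actually sees the NVIDIA runtime, the standard Docker commands can help (a sketch; the container name `ollama` matches the run command above):

```shell
# Follow the ollama server logs; the full llama runner failure appears here
sudo docker logs -f ollama

# Check that the "nvidia" runtime is registered with the Docker daemon
sudo docker info --format '{{json .Runtimes}}' | grep -o '"nvidia"'
```

If the second command prints nothing, the `--runtime nvidia` flag is silently falling back and the container never had GPU access in the first place.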

I’ve tried MAXN_SUPER and 15W modes, jetson_clocks, stopping gdm3 and nvargus-daemon, adding swap, and even clean reinstalling JetPack and L4T packages. The behavior doesn’t change.
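Since the failures seem tied to a specific 36.4.x revision, it is worth confirming exactly which L4T release is installed before and after any reinstall. A minimal sketch (the release file is standard on JetPack images, though its fields vary slightly between releases):

```shell
# The first line of this file identifies the L4T release and revision
head -n 1 /etc/nv_tegra_release
# e.g.: # R36 (release), REVISION: 4.7, GCID: ..., BOARD: generic, ...

# Extract a plain "36.4.7"-style string for scripting
sed -n 's/^# R\([0-9]*\) (release), REVISION: \([0-9.]*\),.*/\1.\2/p' /etc/nv_tegra_release
```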

It looks like CUDA memory allocation or GPU carveout is broken in 36.4.x, possibly related to the new display driver.

Is this a known regression? Which JetPack version is currently stable for CUDA inference on the Orin Nano Super?

Hi,

There is no known regression in CUDA memory allocation.
Which version did you use previously? Could you also share the JetPack version and container of your stable environment, so we can check it further?

Thanks.

got the same issue.

What happened to me:

  1. Ollama with llama3.2:3b worked fine on r36.4.3

  2. Ran Ubuntu software updates → upgraded to r36.4.7 (auto-updated after booting up)

  3. After update: Ollama broke with "unable to allocate CUDA0 buffer"

Something about the update seems to break GPU memory allocation, I think.
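To see exactly which L4T packages the unattended upgrade pulled in, the regular dpkg/apt logs can be queried (a sketch; the paths are the stock Debian/Ubuntu locations):

```shell
# Installed L4T package versions right now
dpkg -l | awk '/nvidia-l4t/ {print $2, $3}'

# When, and to which versions, apt upgraded them (history logs rotate, hence the glob)
zgrep -h 'nvidia-l4t' /var/log/apt/history.log* 2>/dev/null | head
```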

Hi,

Thanks a lot for reporting the CUDA issue.

We also got a very similar report on the topic 347862:

However, we cannot reproduce the CUDA error with the steps shared in that topic (upgrade to r36.4.7 from r36.4.4).
Could you also share the steps that reproduce the issue on your side?
We would like to give it a try as well.

Thanks.

Same message: "unable to allocate CUDA0 buffer". There are already so many threads about this issue; why is there still no solution?

Hi,

We are checking this issue internally.
Will keep you updated on the latest status.

Thanks.

I can confirm that updating Ubuntu via apt update/upgrade breaks (in my case) the Stable Diffusion WebUI Jetson container. The directly installed Ollama program no longer works either.

My temporary workaround: a fresh install of JetPack 6.2.1, then installing the Stable Diffusion WebUI Jetson container and Ollama directly (bash install). NOT UPDATING UBUNTU keeps both programs functioning.

It’s annoying but it works.
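If a fresh 6.2.1 install works, the breaking upgrade can be kept out by pinning the L4T packages so a routine `apt upgrade` skips them (a sketch, assuming the stock Debian packaging JetPack uses):

```shell
# Hold every installed nvidia-l4t-* package so apt won't upgrade it
dpkg-query -W -f='${Package}\n' | grep '^nvidia-l4t-' | xargs -r sudo apt-mark hold

# Verify the holds took effect
apt-mark showhold | grep '^nvidia-l4t-'
```

This is less drastic than disabling Ubuntu updates entirely: security updates still flow, only the L4T/BSP packages stay frozen, and `sudo apt-mark unhold` reverses it later.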

The same thing happened to me. I can’t even load llama3.2:1b, let alone run bigger models.

Hi, all

We are checking on this issue.
Sorry for the inconvenience, and hope to share more information with you soon.

Thanks.

I also have this issue with a Jetson Orin Nano Super after updating the system:

Error: 500 Internal Server Error: llama runner process has terminated: error loading model: unable to allocate CUDA0 buffer
llama_model_load_from_file_impl: failed to load model

I have the same issue:

k33g@k33g-jetson:~$ ollama run qwen2.5:1.5b
pulling manifest 
pulling 183715c43589: 100% ▕██████████████████▏ 986 MB
pulling 66b9ea09bd5b: 100% ▕██████████████████▏   68 B
pulling eb4402837c78: 100% ▕██████████████████▏ 1.5 KB
pulling 832dd9e00a68: 100% ▕██████████████████▏  11 KB
pulling 377ac4d7aeef: 100% ▕██████████████████▏  487 B
verifying sha256 digest 
writing manifest 
success 
Error: 500 Internal Server Error: llama runner process has terminated: error loading model: unable to allocate CUDA0 buffer
llama_model_load_from_file_impl: failed to load model

So the Jetson isn’t usable.

I’m using:

Ubuntu 22.04.5 LTS on Jetson Orin Nano 8GB RAM + JetPack 6

I installed Ollama with this command:

curl -fsSL https://ollama.com/install.sh | sh
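With the script install, Ollama runs as a systemd service named `ollama`, so the runner's full error output lands in the journal rather than the terminal (a sketch; the service name comes from the official installer):

```shell
# Last 50 lines of the ollama service log
journalctl -u ollama --no-pager -n 50

# Or filter for just the allocation failures
journalctl -u ollama --no-pager | grep -F 'unable to allocate CUDA0 buffer'
```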

Hi, both

The "unable to allocate CUDA0 buffer" error is a known issue related to the r36.4.7 update.
Please find more information on the topic below:

Thanks.