"unable to allocate CUDA0 buffer" after Updating Ubuntu Packages

AastaLLL · November 10, 2025, 6:46am

Thanks for testing. Here are the results from our own tests, for your reference:

r36.4.4: allocating a 3G/4G/5G/6G CUDA buffer works, but allocating 7G triggers a system hang and reboot.
r36.4.7: allocating 4G occasionally fails (depending on the cache size).

Dropping the cache, or testing on a cleanly rebooted system, allows CUDA allocations of up to 4GB.
This might help in some use cases (e.g., llama3.2:3b).

$ sudo su
# sync && echo 3 > /proc/sys/vm/drop_caches
# exit
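
To see whether dropping the cache is likely to matter before loading a model, a minimal sketch along these lines can report the current free memory and page cache size. The helper names `meminfo_kib` and `has_free_gib` are hypothetical, and using `MemFree` as a rough proxy for what a CUDA allocation can obtain without reclaiming cache is an assumption, not something confirmed by our tests.

```shell
#!/bin/sh
# meminfo_kib FIELD [FILE]: print the value (in KiB) of one field from a
# meminfo-style file. Hypothetical helper; FILE defaults to /proc/meminfo.
meminfo_kib() {
    awk -v f="$1:" '$1 == f { print $2 }' "${2:-/proc/meminfo}"
}

# has_free_gib NEEDED_GIB [FILE]: succeed if MemFree is at least NEEDED_GIB.
# Assumption: MemFree approximates memory available without reclaiming cache.
has_free_gib() {
    needed_kib=$(( $1 * 1024 * 1024 ))
    free_kib=$(meminfo_kib MemFree "$2")
    free_kib=${free_kib:-0}   # treat a missing field as zero
    [ "$free_kib" -ge "$needed_kib" ]
}

echo "MemFree: $(meminfo_kib MemFree) kB"
echo "Cached:  $(meminfo_kib Cached) kB"

if has_free_gib 4; then
    echo "MemFree already covers a 4 GiB buffer"
else
    echo "MemFree is below 4 GiB; dropping the cache first may help"
fi
```

Running it before and after the `drop_caches` command shows how much of the memory was held by the page cache.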

Please note that our internal team is actively working on this issue.
We will keep this topic updated.

Thanks.
