"unable to allocate CUDA0 buffer" after Updating Ubuntu Packages

They had to figure out how to fix it without reintroducing the problem the breaking change was meant to solve. They patched the kernel to close a potential security hole (a buffer overrun in memory allocation), and in doing so crippled the primary AI workload people buy the device for. Now they have to integrate the new fix into mainline without requiring users to downgrade, apply a complex kernel patch, or run in sluggish CPU-only mode.

well summarised.

Some sort of gesture to the buyers would be appropriate even after the fix finally arrives.

If possible, a memory upgrade option would be appreciated. Somebody has already tried this themselves by desoldering the 8 GB memory modules, swapping them for 16 GB ones, and flashing a modified firmware.

An official path to do this would unlock the full power of this board!

Thank you for putting the instructions in a step-by-step way. I was able to do this and get Ollama working without needing to reflash the Jetson.

Super helpful!

Hi,

We need more time for the r35 changes.
Will let you know once it is ready.

Thanks.

Absolute legend. Patching and compiling a kernel wasn’t on my list of things to learn, but this was the perfect how-to for it.

Was running r36.4.7 and this worked perfectly to get llama3 running in a container. Thanks much!

Hi all,

Please ignore this message if you are using JetPack 6 / r36.4.7.
If you are on JetPack 5 / r35.6.3, please find the fix information below.

Thanks.

fix_kernel_auto.sh.txt (8.5 KB)

Attached is a bash script file that performs all of the steps for the patch above. It includes some error handling: if a step fails, run the script again and it will pick up where it left off. Use at your own risk. Just rename it to get rid of the .txt extension, make it executable (chmod +x), and run it.
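For anyone curious how a resumable script like this can be structured, here is a minimal sketch of the stamp-file pattern: each step records a marker file on success, so a rerun skips completed steps. The step names and state directory below are illustrative, not the actual internals of fix_kernel_auto.sh.

```shell
#!/usr/bin/env bash
# Minimal resume-on-rerun sketch: each step drops a stamp file on
# success; already-completed steps are skipped on the next run.
set -euo pipefail

STATE_DIR="${STATE_DIR:-./fix-state}"
mkdir -p "$STATE_DIR"

run_step() {
    local name="$1"; shift
    if [ -f "$STATE_DIR/$name.done" ]; then
        echo "Skipping '$name' (already done)"
        return 0
    fi
    echo "Running '$name'..."
    "$@"
    touch "$STATE_DIR/$name.done"
}

# Illustrative steps only; the real script downloads, extracts,
# patches, and compiles the kernel source.
run_step download echo "downloading sources"
run_step extract  echo "extracting archive"
```

If a step's command fails, the stamp file is never written, so the next invocation retries exactly that step.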

I tested it and my issues were fixed perfectly.

ggilman’s script worked flawlessly. thanx dude.

I have a fresh install with the latest software and it still has this issue. When can we expect a simple fix delivered through an update? I appreciate that customers went through the process to patch this themselves, but when will NVIDIA fix the issue? Right now this device cannot run Ollama or any of the STT models I can use on every other machine I have. It runs YOLO, but that is about it.

I know just enough to be dangerous… I renamed your txt file to .sh, used chmod +x to make it executable, cd'd to the directory, then tried to run it with "./fix_kernel_auto.sh". I'm getting the error "/bin/bash^M: bad interpreter: No such file or directory". I tried moving the script to a few different locations, as well as running it with sudo, with no luck… Can you provide a quick step-by-step for a noob?

Never mind, I found this post: https://stackoverflow.com/questions/14219092/bash-script-bin-bashm-bad-interpreter-no-such-file-or-directory and ran "sed -i -e 's/\r$//' fix_kernel_auto.sh", and the script ran under sudo after that! Thanks!
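For future readers hitting the same ^M error: the script was saved with Windows (CRLF) line endings, and `file` can confirm the symptom before and after the sed fix. The demo file below just makes the snippet self-contained; run the sed command on your actual script instead.

```shell
# Create a demo file with CRLF endings to reproduce the symptom;
# for a real script, skip this line and check the file itself.
printf '#!/bin/bash\r\necho hello\r\n' > demo.sh

# `file` reports "with CRLF line terminators" when endings are wrong.
file demo.sh

# Strip the trailing carriage returns in place (same fix as above).
sed -i -e 's/\r$//' demo.sh
file demo.sh
```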

If you use r36 then what? Is the fix already released and available?

Here is more information that worked for me:
The “Memory Bug” Restoration Guide

This process aligns the operating system, kernel, and driver modules to a state where large model context windows (like 16k) can be allocated successfully.


Phase 1: (Host PC) Pre-Flight Check

This is to be done on a separate Ubuntu computer with SDK Manager installed and JetPack 6.2.1 downloaded.

Before touching the Jetson, you must definitively confirm the host has the “Golden Version” (36.4.4). DO NOT proceed if the host is at 4.7, as the kernel patch source code is not yet available for that revision.

  1. Run the Definitive Version Check:

    Bash

    cat ~/nvidia/nvidia_sdk/JetPack_6.2.1_Linux_JETSON_ORIN_NANO_TARGETS/Linux_for_Tegra/rootfs/etc/nv_tegra_release
    
    
  2. Verify the Output:

    • Success: It must say # R36 (release), REVISION: 4.4.

    • Fail: If it says 4.7 or is missing, you are flashing a broken base.
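If you script your own workflow, the same check can be turned into a guard that refuses to continue on the wrong revision. Here is a sketch; it parses a demo line so it runs anywhere, and RELEASE_FILE is a variable introduced here so you can point it at the real nv_tegra_release path from the check above.

```shell
# Demo stand-in for the real nv_tegra_release file; on the host, set
# RELEASE_FILE to the actual path used in the check above instead.
printf '# R36 (release), REVISION: 4.4\n' > demo_release
RELEASE_FILE="${RELEASE_FILE:-demo_release}"

# Pull out the "REVISION: x.y" value and compare it to the known-good 4.4.
rev=$(grep -o 'REVISION: [0-9.]*' "$RELEASE_FILE" | awk '{print $2}')
if [ "$rev" = "4.4" ]; then
    echo "OK: revision $rev"
else
    echo "STOP: revision is $rev, expected 4.4" >&2
    # exit 1  # uncomment inside a larger script to abort here
fi
```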


Phase 2: The Clean Slate (Flash)

Once the host is verified as 4.4, you must flash the Nano to ensure it is not running the broken 4.7 update.

  1. Recovery Mode: Put the Nano in recovery mode (Jumper on pins 9-10).

  2. Execute Flash: From the Linux_for_Tegra folder on the host PC (Agent1 in this setup):

    Bash

    sudo ./flash.sh jetson-orin-nano-devkit internal
    
    
  3. Verify Flash Log: The very top of the log must confirm: # R36 , REVISION: 4.4.


Phase 3: Preparation on Nano 2

Before running the fix, you must clear “ghost” data and corrupted downloads from previous attempts.

  1. Wipe the Directory:

    Bash

    sudo rm -rf ~/jetson-kernel-fix
    
    

    This forces the script to actually download the 300MB+ 4.4 source code instead of skipping the step.

  2. Check Script URL: Open fix_kernel_auto.sh and ensure the URL_SOURCE points to the v4.4 link.
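A quick grep can double-check that step. URL_SOURCE is the variable name referenced above; the demo file and its URL are made up here purely so the snippet runs anywhere, so run the grep on the real fix_kernel_auto.sh instead.

```shell
# Demo stand-in for fix_kernel_auto.sh with a made-up URL; run the
# grep against the real script.
printf 'URL_SOURCE="https://example.com/kernel_src_v4.4.tbz2"\n' > demo_fix.sh

# Warn if the source URL does not look like a v4.4 link.
if grep -Eq 'URL_SOURCE=.*4\.4' demo_fix.sh; then
    echo "URL_SOURCE looks like a v4.4 link"
else
    echo "WARNING: URL_SOURCE does not mention 4.4" >&2
fi
```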


Phase 4: The Kernel Patch (The “Fix”)

This phase recompiles the kernel memory management logic to allow contiguous CUDA buffer allocations.

  1. Run the Script:

    Bash

    sudo ./fix_kernel_auto.sh
    
    
  2. Wait for Completion: Extraction and compilation will take 1 to 2 hours.

  3. Reboot: Once the script displays SUCCESS!, reboot the Nano.


Phase 5: Locking the System (Prevention)

This is the most critical step to prevent a future sudo apt upgrade from overwriting your fix.

  1. Pin the Kernel Packages: Run this on the Nano immediately after rebooting:

    Bash

    sudo apt-mark hold nvidia-l4t-kernel nvidia-l4t-kernel-oot-modules
    
    

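A sanity check for the hold: after running apt-mark hold, both packages should appear in the output of apt-mark showhold. The sketch below checks a captured demo list so it can run anywhere; on the Nano, replace the demo assignment with held=$(apt-mark showhold).

```shell
# Demo output; on the Nano use instead:  held=$(apt-mark showhold)
held='nvidia-l4t-kernel
nvidia-l4t-kernel-oot-modules'

# Verify each kernel package appears as an exact line in the held list.
missing=0
for pkg in nvidia-l4t-kernel nvidia-l4t-kernel-oot-modules; do
    if ! printf '%s\n' "$held" | grep -qx "$pkg"; then
        echo "NOT held: $pkg (a future apt upgrade could replace it)" >&2
        missing=1
    fi
done
[ "$missing" -eq 0 ] && echo "Both kernel packages are held."
```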
Verification

Confirm the fix by running ollama with a model that would not load before the fix.

Hi,

To be clear:
The patches for r36: "unable to allocate CUDA0 buffer" after Updating Ubuntu Packages - #169 by AastaLLL
The patches for r35: Jetson 35.6.3 - #26 by AastaLLL

The public release with the fix for r36 has not been released yet (planned in Q1).
Thanks.

Will there be also a public release for r35?

Thank you. This could be the best solution until the official release arrives.

Following @tlgraf84’s steps:

Applied the patch on 36.4.7 (the board is a Jetson Orin Nano Super).

Before this patch, the NeMo toolkit with the Parakeet ASR model was giving a memory allocation error, but now it freezes the system. In an older release it used to work fine. Is there anything I am missing?

I notice it takes a LONG time to load models on the Orin Nano in Llama.cpp, even small ones. The ‘freeze’ may be just that; long model load time. On my Thor, I can load Llama 70b Q4, which is 42 Gigs, in under 30 seconds, on Orin Nano it takes tens of minutes to load Llama 3B. HTH.