I received a recommendation to upgrade Ubuntu, clicked Update, and upgraded L4T from 35.6.2 to 35.6.3.
After the upgrade I recompiled llama.cpp and found that inference produced a “memory allocation” error.
Investigating revealed that internal changes in L4T 35.6.3 affected GPU memory allocation, causing llama.cpp to be unable to obtain enough VRAM.
I then used apt --allow-downgrades to downgrade L4T back to 35.6.2, recompiled, and everything worked fine.
I cannot provide further details, as I only pasted the commands that ChatGPT gave me.
Hi,
We have seen similar issues when upgrading from r36.4.4 to r36.4.7.
The upgrade might change the glibc version (we found this in the r36 case).
Could you remove all the llama.cpp logs, caches, and models, then try again to see if it helps?
Thanks.
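For anyone comparing the two releases, a minimal sketch for recording the installed L4T release and the glibc version before and after the upgrade. It assumes the release string is in /etc/nv_tegra_release in the usual "# R35 (release), REVISION: 6.2, ..." form. The function names parse_l4t_release and report are hypothetical helpers for illustration, not part of any NVIDIA or llama.cpp tooling.

```python
import platform
import re
from pathlib import Path
from typing import Optional

# Assumption: Jetson Linux (L4T) records its release in this file.
RELEASE_FILE = Path("/etc/nv_tegra_release")


def parse_l4t_release(text: str) -> Optional[str]:
    """Return e.g. '35.6.2' from nv_tegra_release content, or None if not found."""
    match = re.search(r"R(\d+)\s*\(release\),\s*REVISION:\s*([\d.]+)", text)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def report() -> dict:
    """Collect the L4T release (None when the file is absent) and the C library version."""
    l4t = None
    if RELEASE_FILE.exists():
        l4t = parse_l4t_release(RELEASE_FILE.read_text())
    # platform.libc_ver() inspects the running Python binary; it returns
    # empty strings when the C library cannot be identified.
    libc_name, libc_version = platform.libc_ver()
    return {"l4t": l4t, "libc": f"{libc_name} {libc_version}".strip()}


if __name__ == "__main__":
    print(report())
```

Running it once on each release and comparing the two dictionaries shows whether the glibc version actually changed with the upgrade.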
I re-cloned llama.cpp, but it didn’t make a difference.
My OS version is still 20.04 LTS. Do I need to update it?
There has been no update from you for a while, so we assume this is no longer an issue.
Hence, we are closing this topic. If you need further support, please open a new one.
Thanks
Hi,
Do you see the error when using r35.6.2?
Could you share the complete output logs with us so we can know more about the error?
Thanks.