How to run a local LLM with CUDA 10.2 support

Hi, I recently bought a Jetson Nano Development Kit and tried running local models for text generation on it. For example, Ollama works, but without CUDA support, it’s slower than on a Raspberry Pi! The Jetson Nano costs more than a typical Raspberry Pi, but without CUDA support, it feels like a total waste of money.

Is there a way to run these models with CUDA 10.2 support?


Unfortunately, we don’t have experience running LLMs on the Jetson Nano with CUDA 10.2.
However, we have a generative AI tutorial for the Orin series (including Orin Nano).

If this is an option for you, please give it a try.


None of these solutions for Orin support CUDA 10.2, so on the Jetson Nano it is still slower than on a Raspberry Pi 5.


If you want to get CUDA support for Ollama, please try our new device.

@aniolekx if you follow this thread, Jetson support appears to be in ollama dating back to Nano / CUDA 10.2:

You may need to compile it from source. If you run into problems, please file an issue with the upstream ollama repo, which maintains the project.
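
Before attempting a source build, it may help to confirm which CUDA toolkit the board actually exposes. Below is a minimal sketch, not an official tool: the function names are made up for illustration, and it assumes `nvcc` is either on PATH or in `/usr/local/cuda/bin`, and that its `--version` output contains a line like `release 10.2`.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_cuda_release(nvcc_output: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def detect_cuda_release() -> Optional[Tuple[int, int]]:
    """Run nvcc if it can be found and return its CUDA release, else None."""
    # Assumption: nvcc is on PATH or in the usual /usr/local/cuda/bin location.
    nvcc = shutil.which("nvcc") or shutil.which("nvcc", path="/usr/local/cuda/bin")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # works on the older Python 3.6 as well
    )
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    release = detect_cuda_release()
    if release is None:
        print("nvcc not found; the CUDA toolkit may not be installed or on PATH")
    else:
        print("CUDA release %d.%d" % release)
```

If this does not report a CUDA release, a source build is unlikely to pick up GPU support, so it is worth sorting that out before filing an issue upstream.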

