Ollama run Gives: Error-GGML_ASSERT: /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:60: !"CUDA error"

king.nk.8271 · May 9, 2024, 7:07am

I am using an NVIDIA Jetson AGX Orin with JetPack 6.0 DP. nvidia-smi shows N/A for the NVIDIA driver. After installing the torch build below, torch.cuda.is_available() returned True:
https://developer.download.nvidia.com/compute/redist/jp/v60dp/pytorch/torch-2.3.0a0+6ddf5cf85e.nv24.04.14026654-cp310-cp310-linux_aarch64.whl
So I installed Ollama and pulled small models (llama3, and also phi3), but running them fails with:
Error: timed out waiting for llama runner to start: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED
current device: 0, in function ggml_cuda_mul_mat_batched_cublas at /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:1848
cublasGemmBatchedEx(ctx.cublas_handle(), CUBLAS_OP_T, CUBLAS_OP_N, ne01, ne11, ne10, alpha, (const void **) (ptrs_src.get() + 0*ne23), CUDA_R_16F, nb01/nb00, (const void **) (ptrs_src.get() + 1*ne23), CUDA_R_16F, nb11/nb10, beta, ( void **) (ptrs_dst.get() + 0*ne23), cu_data_type, ne01, ne23, cu_compute_type, CUBLAS_GEMM_DEFAULT_TENSOR_OP)
GGML_ASSERT: /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:60: !"CUDA error"
My Ollama version is 0.1.33.
sudo journalctl -u ollama.service.txt (14.3 KB)
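
For anyone triaging a similar log, here is a minimal Python sketch that pulls the CUDA status, device index, failing function, and source location out of the error text quoted above. The helper name parse_cuda_error and the regular expressions are assumptions based only on the two log lines in this post; other Ollama or llama.cpp versions may format the message differently.

```python
import re

# Sample text copied from the error quoted in this post.
LOG = """Error: timed out waiting for llama runner to start: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED
current device: 0, in function ggml_cuda_mul_mat_batched_cublas at /go/src/github.com/ollama/ollama/llm/llama.cpp/ggml-cuda.cu:1848
"""


def parse_cuda_error(text):
    """Hypothetical helper: extract the key fields of a ggml CUDA error.

    Returns a dict, or None if the text does not match the assumed format.
    """
    status = re.search(r"CUDA error: (\w+)", text)
    where = re.search(
        r"current device: (\d+), in function (\w+) at (\S+):(\d+)", text
    )
    if not (status and where):
        return None
    return {
        "status": status.group(1),
        "device": int(where.group(1)),
        "function": where.group(2),
        "file": where.group(3),
        "line": int(where.group(4)),
    }


info = parse_cuda_error(LOG)
print(info)
```

The same function can be run over the attached journalctl output to check whether every failure comes from the same cuBLAS call.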

AastaLLL · May 10, 2024, 6:27am

Hi,

We have an Ollama container that is built on JetPack 6 DP.
Is this an option for you?

If you prefer to install it locally, please check the discussion below for enabling Ollama on Jetson:

Thanks.

king.nk.8271 · May 15, 2024, 7:43am

@AastaLLL Thanks, it's working now.

system · June 5, 2024, 6:44am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
