I’ve run Ollama on a Mac with reasonable results, but was looking to speed it up by moving it to a freshly purchased Jetson AGX Thor. I’ve followed a guide on jetson-ai-lab.com and have it running. Specifically, I’m using the following docker command line:
docker run --runtime nvidia --gpus all -it -v ${HOME}/ollama-data:/data ghcr.io/nvidia-ai-iot/ollama:r38.2.arm64-sbsa-cu130-24.04
Occasionally the ollama client will receive nonsense data back (usually rendered as random non-Latin characters), and when I look at the kernel log I see:
Do you have any timeframe for when a fix might be prepared? Are there any workarounds? Is there anything I can do to mitigate the problem? And lastly, are other LLM “engines” affected by this problem?
Obviously running inside Docker would probably be better, but since (at least at the moment) ollama is the only application running on my Jetson, that’s fine.
Since switching to running ollama 0.12.9 outside of Docker (following the bug report I linked above), I’ve not seen that kernel message, nor any unexpected output characters from ollama. I’ve not tried the latest ollama release.
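For what it’s worth, ollama run natively still exposes the same HTTP API it does in the container (by default on http://127.0.0.1:11434), so nothing client-side needs to change. A minimal sketch of building a request body for its /api/generate endpoint; the model name "llama3.2" is just illustrative, substitute whatever model you actually pulled:

```python
import json

# Sketch of a request body for ollama's /api/generate endpoint, which a
# natively-running ollama serves on http://127.0.0.1:11434 by default.
# The model name below is illustrative, not prescriptive.

def build_generate_request(model: str, prompt: str) -> str:
    """Return the JSON body for a non-streaming /api/generate call."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False})

print(build_generate_request("llama3.2", "Why is the sky blue?"))
```

You can POST that body to the endpoint with curl (`curl http://127.0.0.1:11434/api/generate -d '...'`) to sanity-check the native install the same way you were exercising the container.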
I have for Ollama, yeah, and am very, very happy. But I’ve still not had any luck whatsoever running anything else in Docker. E.g.:
jetson-containers run $(autotag stable-diffusion-webui)
Results in:
...
Python 3.10.12 (main, Feb 4 2025, 14:57:36) [GCC 11.4.0]
Version: v1.10.1
Commit hash: 82a973c04367123ae98bd9abdf80d9eda9b910e2
Traceback (most recent call last):
  File "/opt/stable-diffusion-webui/launch.py", line 48, in <module>
    main()
  File "/opt/stable-diffusion-webui/launch.py", line 39, in main
    prepare_environment()
  File "/opt/stable-diffusion-webui/modules/launch_utils.py", line 387, in prepare_environment
    raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
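That error means the PyTorch build inside that container can’t see a CUDA device. A quick diagnostic sketch you could run inside the container to confirm (standard torch API, nothing Jetson-specific):

```python
# Diagnostic sketch: report whether torch is importable and whether it can
# actually see a CUDA device, run from inside the failing container.

def cuda_status() -> str:
    """Return a short human-readable summary of torch/CUDA availability."""
    try:
        import torch  # may be missing, or a CPU-only build, in a broken image
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        return f"cuda available: {torch.cuda.get_device_name(0)}"
    return "torch installed, but no CUDA device visible"

print(cuda_status())
```

If it reports no CUDA device, note that the `--skip-torch-cuda-test` flag the traceback suggests only suppresses the check (you’d fall back to CPU inference); the real fix is an image whose torch build matches the Thor’s CUDA stack.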