Which container - the Jetson container or the NVIDIA container?

I barely know anything about Docker, beyond having successfully downgraded to v27 for my Jetson Orin Nano SDK. No error messages so far, but I need guidance on the recommended one, please.

The instructions at https://www.jetson-ai-lab.com/tutorial_openwebui.html worked fine. I am able to download Ollama-compliant models, but the query responses are returned as a paragraph’s worth of gray bars with no actual text; then, several minutes later (in some instances), the bars morph into the proper text of the answer. What is the error on my part? Thanks.

Regards.
P.S.
Sorry for the two-part question, but perhaps I messed up somewhere in deploying Open WebUI with the pre-built Docker container. Thanks for your understanding.

Hi,

Docker v28.0.1+ works on Jetson, so you don’t need to downgrade anymore.

Do you hit this issue consistently?
It’s expected that the first test will be slower, as it might need to download the model or initialize the environment.
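A quick way to verify both points (a sketch; the container name open-webui is an assumption based on the tutorial, so adjust it to whatever name you used):

```bash
# Confirm the installed Docker version; v28.0.1 or newer should work on Jetson
docker --version

# Follow the container logs to see whether it is still downloading or initializing
docker logs -f open-webui
```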

Thanks.

Hello @AastaLLL,

Allow me to back up a little on the Docker situation before I answer your specific question, because that might shed light on some self-inflicted issues at my end.

After receiving my Nano SDK (from Seeed in late January; I mention this specifically because there may be some doubts about this SKU based on constrained-supply threads in this forum, but the box clearly stated SDK):

  • Flashed the microSD card, booted, and completed the standard Ubuntu install
  • Rebooted the obligatory 3 times (including applying JetPack 6)
  • Happy to see MAXN SUPER
  • Couldn’t run Docker v28; read the relevant Jetson guides and successfully deployed Docker v27; none the wiser, I have stuck with v27 (and thanks to the Jetson developer team for guiding me into using Docker after all these years of challenges on other single-board computers)
  • Ollama is running as a service outside Docker; it runs rather well in CLI mode, too
  • Open WebUI is running in a container using instructions from https://www.jetson-ai-lab.com/tutorial_openwebui.html
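For reference, the Open WebUI container was launched along the lines of the tutorial. A minimal sketch (the image tag, volume name, and OLLAMA_BASE_URL value follow Open WebUI’s published defaults; adjust to your setup):

```bash
# Run Open WebUI on the host network so it can reach the Ollama service
# listening on 127.0.0.1:11434; the UI is then served on port 8080
docker run -d --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```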

Sorry for the long-winded preamble, but back to your original question:

  • Yes, the issue is fully repeatable
  • My limited knowledge tells me that the Ollama response itself is fast, judging by how quickly the gray bars (standing in for text) fill the response paragraph; see the timing sketch after this list
  • Is the rendering (by JavaScript?) the issue?
  • After an extended lapse of time (over 5 minutes, guessing here), partial text is revealed; after even more time, more text is revealed. All responses are correct (the queries are about dimensions of objects in the solar system) and include math formulae, but even simple text-only responses show the same delays
  • Since I was in headless mode, I suspected that DNS (or other network latency) could be an issue, because I had made some network changes and I have Pi-hole/Unbound/OPNsense in the resolution path; however, local DNS queries are never pushed out to the WAN
  • So I turned to local access directly on the Jetson (using the default 127.0.0.1:8080 address), but the manifestation did not change: gray bars instead of text, and then, after several minutes, a few lines reveal themselves
  • My simple conclusion is that the access method is not the issue; it is something to do with Open WebUI. I am not sure this is a bug, so I have refrained from posting an issue at their GitHub site - will do if you advise as such
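As promised above, one way I could separate backend latency from UI rendering (a sketch on my part, assuming Ollama’s default port 11434; the model name and prompt are illustrative):

```bash
# Time a raw, non-streaming generation against the local Ollama API.
# If this returns quickly while Open WebUI still shows gray bars,
# the delay is in the UI rendering, not in the model itself.
time curl -s http://127.0.0.1:11434/api/generate \
  -d '{"model": "gemma:latest", "prompt": "What is the diameter of Mars?", "stream": false}'
```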

Bottom line: I don’t have any specific Docker version dependencies. I simply want some UI on the Jetson for self-paced LLM learning. There is no arm64/aarch64 build of MSTY.

Thanks again.

Regards.

Hi,

Do you mean that each time a query is sent, the same behavior occurs (gray bars -> (some minutes) -> proper text)?
If so, would you mind sharing a video or picture that captures the issue you are facing?

Thanks.

Sure, let me get my ducks in a row and I’ll upload some screenshots (since the issue is mostly static) with approximate latency numbers. Thanks.

Regards.

Hello @AastaLLL,

I feel that it is related to my choice of models, which is perfectly understandable.

llama3.2 (3.2b?) has no issue; it even seems to look ahead before I hit the Send key. On the other hand, gemma:latest (9b) continues to display the issue (see attached screenshot) for approximately a minute on this query before displaying the response. Other queries hang for much longer. I didn’t check the parameter size of either the llama or the gemma model (I just chose the first one in the list; the numbers in parentheses above should be correct).
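For anyone wondering how to check the sizes I glossed over, the Ollama CLI can report them (a sketch; the model tag is just the one I happened to pull):

```bash
# List locally pulled models with their on-disk sizes
ollama list

# Show details, including the parameter count, for a specific model
ollama show gemma:latest
```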

Please consider the issue resolved. I will simply have to be careful about my choice of models (especially on the Nano). Now that I understand I have to try out different models and compile my own list of usable ones, I am satisfied with the status quo. I appreciate your nudge in the right direction.

Regards.

P.S.
llama does display gray bars, but it is over in a flash.

One other observation:

There is no issue with gemma when running the same query through Ollama directly on the Nano. The issue manifests only when using Open WebUI.
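For completeness, the direct run that showed no issue looked like this (a sketch; the prompt is illustrative):

```bash
# Same model, same query, but through the Ollama CLI instead of Open WebUI
ollama run gemma:latest "What is the diameter of Mars?"
```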

Regards.

Hi,

Yes, the issue is related to the model size.
We usually recommend trying models smaller than 4B on the Orin Nano.

Gemma 9B is too big for the Orin Nano.
But there is a Gemma 2B model; could you give it a try?
Is the Gemma you ran with Ollama also the 9B version?
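Pulling the smaller variant looks like this (a sketch; gemma:2b is the 2B tag in the Ollama model library, and the prompt is illustrative):

```bash
# Pull and run the 2B Gemma variant, which fits the Orin Nano's memory budget better
ollama pull gemma:2b
ollama run gemma:2b "What is the diameter of Mars?"
```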

You can find our testing of different models in the link below:

Thanks.

OK. Thanks for the heads-up on the Models page. That is a great list; it will take some time to step through (given all the quantization granularity, too).

The gemma3:4b didn’t show any improvement. As you have indicated, I will have to go down a size or two. I will continue to step lower but I have sufficient guidance from you on what steps I need to perform. Thanks.