I barely know anything about Docker, except that I successfully downgraded to v27 for my Jetson Orin Nano SDK. No error messages so far, but I would appreciate guidance on the recommended version, please.
The instructions at https://www.jetson-ai-lab.com/tutorial_openwebui.html worked fine. I am able to download Ollama-compliant models, but the query responses are returned as a paragraph’s worth of gray bars with no actual text; then, several minutes later (in some instances), the bars morph into proper text that represents the answers. What is the error on my part? Thanks.
Regards.
P.S.
Sorry for the two-part question, but perhaps I messed up somewhere in deploying Open WebUI with the pre-built Docker container (roughly the command sketched below). Thanks for your understanding.
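For reference, a minimal sketch of the kind of command involved, taken from the generic Open WebUI docs rather than my exact shell history; the jetson-ai-lab tutorial may use different flags or a jetson-containers wrapper, and the volume and container names here are illustrative:

```
# Run the pre-built Open WebUI image with host networking so it can reach
# the native Ollama service on 127.0.0.1:11434; the UI then listens on port 8080.
docker run -d --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
```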
Allow me to back up a little bit on the Docker situation before I answer your specific question because that might shed some light on some self-inflicted issues at my end.
After receiving my Nano SDK (from Seeed in late Jan; I mention this specifically because there may be some doubts about this SKU based on the constrained-supply threads in this forum, but the box clearly stated SDK):
Flashed the microSD card, booted, and completed the standard Ubuntu install
Rebooted the obligatory 3 times (including applying JetPack 6)
Happy to see MAXN SUPER
Couldn’t run Docker v28; read the relevant Jetson guides and successfully deployed Docker v27; none the wiser, so I have stuck with v27 (pinning sketch after this list; thanks to the Jetson developer team for guiding me into using Docker after all these years of challenges on other single-board computers)
Ollama is running as a service outside Docker; runs rather well in CLI mode too
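For anyone else landing here, a minimal sketch of checking and holding the Docker version, assuming Docker was installed from Docker’s apt repository; the 27.x version string is a placeholder you would take from the madison output:

```
# Confirm what is currently installed
docker --version

# See which docker-ce builds the configured apt repo offers
apt-cache madison docker-ce

# Install a specific 27.x build (placeholder version string) and hold it
# so routine upgrades don't pull in v28 again
sudo apt-get install docker-ce=<27.x-version> docker-ce-cli=<27.x-version>
sudo apt-mark hold docker-ce docker-ce-cli

# Ollama runs as a native systemd service outside Docker
systemctl status ollama
```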
Sorry for this long-winded preamble, but now back to your original question:
Yes, the issue is fully repeatable
My limited knowledge tells me that the Ollama response itself is fast, judging by how quickly the gray bars (versus the text) appear in the response paragraph
Is the rendering (by JavaScript?) the issue?
After an extended lapse of time (5+ minutes, guessing here), partial text is revealed; after even more time, more text is revealed. All responses are correct (the queries are about dimensions of objects in the solar system) and include math formulae, but even simple text-only responses display the same delays
Since I was in headless mode, I thought that DNS (or other network latency) could be an issue (I have Pi-hole/Unbound/OPNsense in the resolution path, and I had to make some network changes), but local DNS queries are never pushed out to the WAN
So I turned to local access directly on the Jetson (using the default 127.0.0.1:8080 address), but the manifestation did not change: gray bars instead of text, and then, after several minutes, a few lines reveal themselves (sanity checks sketched after this list)
So my simple conclusion is that the access method is not the issue but something to do with Open WebUI; I’m not sure that this is a bug, so I have refrained from posting an issue at their GitHub site. I will do so if you advise as such
Bottom line, I don’t have any specific Docker version dependencies. I simply want to use some UI on the Jetson for self-paced LLM learning. There is no arm64/aarch64 build of MSTY.
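A minimal set of sanity checks along these lines separates network latency from UI rendering; the Ollama port and API call below are the documented defaults, and the model name is just an example:

```
# Time the UI endpoint locally, bypassing DNS entirely
curl -s -o /dev/null -w 'UI total: %{time_total}s\n' http://127.0.0.1:8080/

# Time a generation against Ollama directly, bypassing Open WebUI;
# if this returns quickly, the model is fine and the delay is in the UI
time curl -s http://127.0.0.1:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "hi", "stream": false}'
```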
Do you mean that each time the query is sent, the same behavior occurs (gray bars -> some minutes -> proper text)?
If so, would you mind sharing a video or picture that captures the issue you are facing?
I feel that it is related to my choice of models, which is perfectly understandable.
llama3.2 (3.2b?) has no issue; it even seems to look ahead before I hit the Send key. On the other hand, gemma:latest (9b) continues to display the issue (see attached screenshot) for approximately a minute or so for this query, then displays the response. Other queries hang for much longer. I didn’t check the parameter size of either the llama or the gemma model (I just chose the first one in the list; the numbers in parentheses above should be correct).
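For completeness, the parameter count and quantization can be read straight from the Ollama CLI, so the sizes above don’t have to be guessed; these are standard Ollama commands:

```
# List downloaded models with their on-disk sizes
ollama list

# Print architecture, parameter count, and quantization for a given model
ollama show llama3.2
ollama show gemma
```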
Please consider the issue resolved. I will simply have to be careful about the choice of models (especially on the Nano). Now that I understand that I have to try out different models and compile my own list of usable ones, I am satisfied with the status quo. I appreciate your nudge in the right direction.
Regards.
P.S.
llama does display gray bars, but it is over in a flash.
OK. Thanks for the heads-up on the Models page. That is a great list. It will take some time to step through (given all the quantization granularity, too).
The gemma3:4b didn’t show any improvement. As you indicated, I will have to go down a size or two. I will continue to step lower, but I have sufficient guidance from you on what steps I need to perform. Thanks.
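For reference, stepping down looks like this with the Ollama CLI; the smaller tags are assumed from the public Ollama model library and worth double-checking there:

```
# Pull smaller variants of the same model families
ollama pull gemma3:1b
ollama pull llama3.2:1b

# Then select the smaller model in Open WebUI's model dropdown and re-test
```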