While following the instructions in the “Build and Deploy a Multi-Agent Chatbot” playbook, I got to Step 5 and opened http://localhost:3000/ in a browser. I clicked on the basic “chat” sample prompt “Hey Spark! Can you draft an email asking a product manager in distributed systems to a coffee chat?”
It just says “Thinking…” at the top, and no response shows up even after 30 minutes. An animated three-dot indicator appears below the message for a few seconds and then goes away. Memory usage climbs to about 69 GB on the initial prompt, but nothing is returned.
Sending the same prompt again appears to do nothing, and CPU utilization remains low. No errors appear in the browser console either.
Another observation: GPU utilization in the DGX Dashboard goes from 8% to 80% and then alternates up and down from there. It is now resting at 8-11%, but still no response is returned.
It appears all the containers are running fine:
CONTAINER ID   NAMES               STATUS
5ca725cb11b0   frontend            Up 36 minutes
5afeb08b9140   milvus-standalone   Up 36 minutes (healthy)
26a325d6cadb   backend             Up 36 minutes
a6cb5c0440d2   qwen2.5-vl          Up 36 minutes (healthy)
8e1040bd32a6   deepseek-coder      Up 36 minutes (healthy)
d6dce37fb817   qwen3-embedding     Up 36 minutes (healthy)
af0988a4c7ec   milvus-minio        Up 36 minutes (healthy)
70f95a4aca45   postgres            Up 36 minutes (healthy)
081cc8704dd9   milvus-etcd         Up 36 minutes (healthy)
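One thing that stands out in the listing is that not every container reports a health status. The following is a minimal Python sketch, not part of the playbook, that parses the pasted status text and lists the containers with no health marker. The function name `containers_without_health` is hypothetical, and the script only reads the text above; it does not call Docker.

```python
import re

# Status table as pasted in this post (text only; this sketch does not call Docker).
DOCKER_PS = """\
CONTAINER ID NAMES STATUS
5ca725cb11b0 frontend Up 36 minutes
5afeb08b9140 milvus-standalone Up 36 minutes (healthy)
26a325d6cadb backend Up 36 minutes
a6cb5c0440d2 qwen2.5-vl Up 36 minutes (healthy)
8e1040bd32a6 deepseek-coder Up 36 minutes (healthy)
d6dce37fb817 qwen3-embedding Up 36 minutes (healthy)
af0988a4c7ec milvus-minio Up 36 minutes (healthy)
70f95a4aca45 postgres Up 36 minutes (healthy)
081cc8704dd9 milvus-etcd Up 36 minutes (healthy)
"""

# Health markers that Docker appends to the STATUS column when a health check is defined.
HEALTH_MARKER = re.compile(r"\((healthy|unhealthy|health: starting)\)")


def containers_without_health(table: str) -> list[str]:
    """Return names of containers whose STATUS carries no health marker."""
    names = []
    for line in table.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(maxsplit=2)  # id, name, rest of line is the status
        if len(parts) < 3:
            continue
        _container_id, name, status = parts
        if not HEALTH_MARKER.search(status):
            names.append(name)
    return names


print(containers_without_health(DOCKER_PS))
```

For the listing above this flags `frontend` and `backend`. That only means those two containers define no health check, so “Up” alone does not confirm they are working; their logs (for example via `docker logs backend`) are probably the next place to look.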
Any thoughts on how to get this working?