I got to a working local sandbox on Jetson AGX Thor and wanted to share what actually moved the needle for me.
Environment
- Platform: Jetson AGX Thor
- Kernel: Linux wbrain 6.8.12-tegra #1 SMP PREEMPT Tue Dec 30 15:40:41 PST 2025 aarch64
- JetPack: 7.1-b107
- L4T / BSP: R38.4.0
- CUDA: 13.0
- Driver: 580.00
- Ollama: 0.18.2
- NemoClaw: v0.1.0
- OpenShell cluster image: openshell/cluster (GitHub package)
I would suggest fixing things in this order.
- Host bridge / netfilter fix first
This was the deepest blocker on my system.
Before the fix:
- br_netfilter not loaded
- overlay not loaded
- /proc/sys/net/bridge missing
- net.bridge.bridge-nf-call-iptables unavailable
- net.bridge.bridge-nf-call-ip6tables unavailable
That caused this exact failure pattern:
- direct pod IPs like 10.42.x.x worked
- ClusterIP VIPs like 10.43.x.x timed out
- CoreDNS pod IP worked directly
- kube-dns service IP failed
- sandboxes could not reliably resolve or reach internal services
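If you want to confirm you are hitting the same split, the pair of checks below shows it directly. The IPs are placeholders; substitute the CoreDNS pod IP and the kube-dns ClusterIP that kubectl reports on your cluster.
openshell doctor exec -- kubectl -n kube-system get svc kube-dns
openshell doctor exec -- kubectl -n kube-system get pods -o wide | grep coredns
dig +time=2 +tries=1 @10.42.0.5 kubernetes.default.svc.cluster.local    # pod IP: answers
dig +time=2 +tries=1 @10.43.0.10 kubernetes.default.svc.cluster.local   # ClusterIP VIP: times out before the fix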
Host fix:
sudo modprobe br_netfilter
sudo modprobe overlay
echo 'br_netfilter' | sudo tee /etc/modules-load.d/k8s.conf >/dev/null
echo 'overlay' | sudo tee -a /etc/modules-load.d/k8s.conf >/dev/null
sudo tee /etc/sysctl.d/99-kubernetes-cri.conf >/dev/null <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
Host verification:
lsmod | egrep 'br_netfilter|overlay'
sysctl -a 2>/dev/null | egrep 'net.bridge.bridge-nf-call-(iptables|ip6tables)'
ls -l /proc/sys/net/bridge
Expected:
- br_netfilter loaded
- overlay loaded
- bridge sysctls present and set to 1
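A compact way to check all three at once: sysctl prints key = value pairs, and a "cannot stat" error here means br_netfilter is still not loaded.
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
Expected:
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1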
- Download and use a local copy of nemoclaw.sh second
I downloaded a local copy of nemoclaw.sh instead of repeatedly using curl | bash:
curl -fsSL https://www.nvidia.com/nemoclaw.sh -o ~/nemoclaw.sh
chmod +x ~/nemoclaw.sh
I did this for two reasons:
1. to inspect the installer flow locally
2. to patch it for Thor local setup
In my case, I uncommented the install_or_upgrade_ollama call so that Ollama was installed before onboarding. That was necessary to get the local Ollama path working reliably on my system.
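For reference, the patch is a one-liner. A sketch of how to find and flip it, assuming the call site is commented out with a leading # (check the grep output first, since the script layout may differ between versions):
grep -n 'install_or_upgrade_ollama' ~/nemoclaw.sh
sed -i 's/^#[[:space:]]*install_or_upgrade_ollama/install_or_upgrade_ollama/' ~/nemoclaw.sh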
- Gateway hard reset + GPU injection third
On my system I had to manually apply the NVIDIA device plugin so that Kubernetes inside OpenShell advertised the GPU correctly.
These steps are not required unless you want to hard-reset a gateway outside of this context:
gateway hard reset + GPU injection:
#openshell gateway destroy
#openshell gateway start
While nemoclaw was running, right after it reported that it had started a gateway and was waiting for a sandbox name, I switched to another terminal and ran:
openshell doctor exec -- kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml
Important:
wait about 10 seconds before running the verification command below so the pod has time to initialize.
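If you would rather not guess at the timing, you can wait on the DaemonSet rollout instead. The DaemonSet name below is taken from the upstream v0.17.0 manifest; verify it with kubectl get ds -n kube-system if the command errors:
openshell doctor exec -- kubectl -n kube-system rollout status ds/nvidia-device-plugin-daemonset --timeout=60s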
Verify GPU is visible to k8s:
openshell doctor exec -- kubectl get nodes -o custom-columns=NAME:.metadata.name,GPU:.status.allocatable.'nvidia\.com/gpu'
Expected:
NAME          GPU
<node-name>   1
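If the GPU column shows <none> instead, check whether the plugin pod actually started; the label selector here matches the upstream manifest:
openshell doctor exec -- kubectl -n kube-system get pods -l name=nvidia-device-plugin-ds
openshell doctor exec -- kubectl -n kube-system logs -l name=nvidia-device-plugin-ds --tail=20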
- Ollama host bind fix fourth
Once I got farther into onboarding, local Ollama worked only after making sure it listened on all interfaces, not just localhost.
Start Ollama correctly:
export OLLAMA_HOST=0.0.0.0:11434
nohup ollama serve >/tmp/ollama.log 2>&1 &
Verify:
curl http://localhost:11434/api/tags
ss -lntp | grep 11434
I needed port 11434 listening on 0.0.0.0 (all interfaces), not just 127.0.0.1; otherwise the sandbox could not reach it via host.openshell.internal.
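One caveat: the nohup approach above does not survive a reboot. If Ollama was installed via the official installer it runs as a systemd service, and the bind address can be made persistent with a drop-in override (standard systemd/Ollama practice, not something nemoclaw.sh sets up for you):
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf >/dev/null <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama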
What finally worked
I ended up with:
- sandbox: thor-titan
- provider: ollama-local
- model: nemotron-3-nano:30b
Verification from my system:
- nemoclaw thor-titan status showed Provider: ollama-local
- ollama ps showed the model loaded and using 100% GPU
- nvidia-smi showed /usr/local/bin/ollama using GPU memory
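For completeness, those three checks map to:
nemoclaw thor-titan status   # Provider: ollama-local
ollama ps                    # model loaded, 100% GPU
nvidia-smi                   # /usr/local/bin/ollama holding GPU memory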
Important caution
nemoclaw.sh behaved like a bootstrap / onboarding script, not a resumable installer. Rerunning it re-touched gateway / sandbox state.
Once I got to a working sandbox, I stopped rerunning the installer and only used:
- nemoclaw thor-titan status
- nemoclaw thor-titan connect
- nemoclaw thor-titan logs --follow
Bottom line
The biggest fix for me was not the model selection or the cloud/API side.
The biggest fix was the host kernel networking state:
- br_netfilter
- overlay
- bridge sysctls
Without that, ClusterIP routing inside the OpenShell k3s environment was broken, and the rest of the stack behaved unpredictably.