OpenClaw + Ollama hybrid + ClawMobile architecture

cher012410 · April 8, 2026, 1:20am

I’ve been refining a local AI agent architecture designed for real-world software engineering. After some trial and error with Docker isolation and memory bottlenecks, I wanted to share the specific hybrid setup that’s actually proving usable.

The Architecture: Decoupling Reasoning from Retrieval

The core philosophy here is separating the “Cognition” from the “Memory” to solve the latency issues caused by loading massive context into a large model every time it runs.

The AI Gateway (Docker): This container runs the heavy reasoning engine— llama-proxy.py + llama.cpp serving the Qwen 3.5 35B-A3B model.
The AI Agent (OpenClaw) & Orchestrator (Docker): A separate containerized environment for the workspace, project files, and the Orchestrator logic.
The “Second Brain” (Host-Side): I run a secondary embedding model (like nomic-embed-text) via Ollama on the host, integrated with the QMD search engine.

By using the host-side embedding model for data retrieval, I avoid forcing the 35B Qwen model to ingest all memory data on every request, which significantly improves responsiveness.
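A minimal sketch of that retrieval step, assuming an Ollama server on its default port 11434 with nomic-embed-text already pulled. The `embed` helper calls Ollama's `/api/embeddings` endpoint; the `top_k` ranking is plain cosine similarity and only stands in for whatever QMD does internally, which this post does not describe. The function names and the in-memory `memory` list are hypothetical.

```python
import json
import math
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embeddings"  # assumed default host/port


def embed(text, model="nomic-embed-text", url=OLLAMA_URL):
    """Ask the host-side Ollama server for one embedding vector."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["embedding"]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, memory, k=3):
    """Return the k (score, text) pairs most similar to the query vector.

    `memory` is a list of (text, vector) pairs that were embedded ahead of time.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in memory]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

Only the few snippets returned by `top_k(embed(question), memory)` would then be placed in the prompt for the 35B model, instead of the whole memory store.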

Claw-Hybrid-Platform/
├── ai-agent/                         # Logic Layer: OpenClaw
│   ├── ai-agent-logs/                # Persistent runtime logs
│   │   └── gateway.log               # Logs for OpenClaw gateway activity
│   └── Dockerfile                    # Environment for agent logic
├── ai-gateway/                       # Computing Layer: LLM & Reasoning Proxy
│   ├── logs/
│   │   └── llama-server.log          # Raw output from llama-server
│   ├── proxy/
│   │   ├── config.yaml               # Gateway/Proxy configurations
│   │   └── llama-proxy.py            # OpenAI-compatible API wrapper for llama.cpp
│   └── Dockerfile                    # Environment for agent gateway
├── llama.cpp/                        # Source: Cloned llama.cpp for local build
├── models/                           # Model Storage: Save .gguf files here
├── persistfolder/                    # Persistent Data (The “Soul” of the Agent)
│   ├── config/                       # Holds openclaw.json and global settings
│   ├── memory/                       # Memory storage (main.sqlite)
│   └── workspace/                    # Projects, AGENT.md, MEMORY.md, daily log…etc
└── docker-compose.yml                # Bridges Host GPU & Docker Containers

The 40GB Memory Challenge

Even on high-end hardware like the NVIDIA DGX Spark (120GB), memory management is the primary hurdle.

Baseline: The system and browser sit at ~5GB.
The Agent Stack: Spinning up OpenClaw adds another ~25GB, bringing the total to ~30GB.
Full Orchestration: Once the Orchestrator, Redis, and Celery workers are live, the stack reaches 32-40GB before it is even under full load.

This high memory footprint is exactly why the hybrid Docker/Host split is necessary—it keeps the reasoning engine isolated while letting the retrieval engine run lean on the host.
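The arithmetic behind those figures, as a small sketch. The 2-10GB orchestration range is derived here by subtracting the ~30GB agent stack from the 32-40GB total; the function name is hypothetical.

```python
TOTAL_GB = 120          # memory figure quoted in this post for the DGX Spark
BASELINE_GB = 5         # system + browser
AGENT_STACK_GB = 25     # added by spinning up OpenClaw
ORCHESTRATION_GB = (2, 10)  # low/high range added by Orchestrator, Redis, Celery


def stack_usage(baseline_gb, agent_gb, extra_range_gb):
    """Return the (low, high) total footprint in GB once everything is running."""
    low_extra, high_extra = extra_range_gb
    base = baseline_gb + agent_gb
    return base + low_extra, base + high_extra


low, high = stack_usage(BASELINE_GB, AGENT_STACK_GB, ORCHESTRATION_GB)
headroom = (TOTAL_GB - high, TOTAL_GB - low)  # memory left for context and load
print(f"stack: {low}-{high} GB, headroom: {headroom[0]}-{headroom[1]} GB")
```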

The Workflow: Orchestrator + ClawMobile

The real value comes from the synergy between the Orchestrator and the ClawMobile app. It changes the dev process from “babysitting a terminal” to “asynchronous management.”

Orchestrator: Handles the heavy lifting—task queuing, multi-phase development, and background execution.

Task Management: A FastAPI backend with Celery and Redis handles asynchronous task queues, allowing for multi-phase development workflows (create, test, deploy).

Monitoring: Provides real-time WebSocket log streams and tool-tracking to audit every operation performed by the AI agents.
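To make the multi-phase idea concrete, here is a standard-library sketch of a background worker that runs the create, test, deploy phases in order and records a status per task. It deliberately uses `queue` and `threading` in place of Celery and Redis so it runs without extra services; it is not the Orchestrator's actual code, and the handler functions and status names are hypothetical.

```python
import queue
import threading

PHASES = ("create", "test", "deploy")  # phase names taken from the post


def run_task(task_id, handlers, status, log):
    """Run the phases in order; stop at the first phase that raises."""
    status[task_id] = "in_progress"
    for phase in PHASES:
        try:
            handlers[phase](task_id)
            log.append((task_id, phase, "ok"))
        except Exception as exc:  # any phase failure marks the whole task failed
            log.append((task_id, phase, f"error: {exc}"))
            status[task_id] = "failed"
            return
    status[task_id] = "completed"


def worker(tasks, handlers, status, log):
    """Background worker: pull task ids until a None sentinel arrives."""
    while True:
        task_id = tasks.get()
        if task_id is None:
            break
        run_task(task_id, handlers, status, log)


def run_all(task_ids, handlers):
    """Queue the tasks, process them on one background thread, return results."""
    tasks, status, log = queue.Queue(), {}, []
    for task_id in task_ids:
        status[task_id] = "pending"
        tasks.put(task_id)
    tasks.put(None)  # sentinel that tells the worker to stop
    thread = threading.Thread(target=worker, args=(tasks, handlers, status, log))
    thread.start()
    thread.join()
    return status, log
```

The `status` dictionary is the kind of data a dashboard would poll, and the `log` list is what a WebSocket stream would push line by line.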

ClawMobile: Since it speaks the OpenClaw Gateway protocol, I can stay connected to the DGX Spark from anywhere. I can check the dashboard to see which tasks are in progress, failed, or completed in the Orchestrator background and provide real-time feedback to the agent while I’m away from my desk.

Secure Remote Access: Connects via Tailscale or LAN using Ed25519 authentication.

Mobile Supervision: Allows the user to track Orchestrator tasks and provide real-time feedback to OpenClaw, ensuring continuous improvement without being tethered to a laptop.

Resource Architecture Table

Layer	Components	Placement	Memory Impact
Cognition	Qwen 3.5 (35B), OpenClaw Gateway	Docker (AI-Gateway)	~25GB
Retrieval	Ollama, Nomic-Embed, QMD v2	Host Machine	Low (Optimized)
Management	FastAPI, Redis, Celery, SQLite	Docker (AI-Agent/Orchestrator)	~2-10GB+
Interface	Kotlin Android App / OpenClaw Dashboard or Orchestrator Frontend	Mobile Device / Browser	N/A / 500MB-1GB

Key Takeaways for Developers

Don’t over-contextualize the LLM: Use a secondary, smaller embedding model for RAG/Retrieval to keep your main reasoning model fast.
Persistence: Use Docker Compose to mount host volumes for /workspace and /memory so your agent’s “soul” survives a reboot.
Infrastructure: Even with 120GB of unified memory, efficiency matters. Separating the Gateway from the Agent allows for better resource allocation.

This setup moves away from “AI as a chatbot” and toward “AI as an autonomous background process” that you manage via mobile.

Tips
In the OpenClaw chat box, tag your message with keywords such as [gemini], [autoresearch], or [think] to enable additional functions.

Reference:

Digital_David · April 8, 2026, 1:04pm

Hi, I’m currently struggling with the same issues and was on the same path yesterday with a dual-model setup.

For the model side, have you tried https://www.byterover.dev/? It has native OpenClaw integration with up to 92% recall accuracy, and it is all .json / .md files in a hierarchical tree instead of an SQL or vector DB.

Since this seems to be a dedicated “claw” machine, as mine is as well, is there a reason to run OpenClaw in Docker instead of running it as a native application with auto-restart on reboot? I’ve not experienced a “loss of soul” issue yet, even when I’ve hard-crashed the DGX.

cher012410 · April 8, 2026, 1:43pm

I haven’t used ByteRover yet. I think ByteRover and Milla-Jovovich/Mempalace are both good memory-management approaches. The Hermes Agent self-improving AI agent framework is also worth considering.

QMD solves the problem of “delivery” (retrieval speed), while ByteRover solves the problem of “warehousing and logistics management” (structuring and lifecycle of memory).

==============================================

What are the advantages and disadvantages of running OpenClaw on Docker?

✅ Advantages (Pros)

1. File System “Safety Box” Mechanism

Preventing Accidental Deletion: In Docker, you can mount volumes read-only or mount only specific directories such as /workspace. Even if the agent executes rm -rf /, it destroys only the container’s filesystem and whatever is mounted writable into it, not the rest of your DGX host.

Malicious Code Isolation: If the agent browses the web, downloads and executes a malicious script, the script will be trapped in the container’s Linux environment, unable to easily gain root privileges on the host.

2. Resource Constraints

You can use Docker’s --memory and --cpus limits to prevent OpenClaw from accidentally exhausting the DGX’s memory or CPU when handling complex tasks, which could crash the host or drop your SSH connection.
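The two advantages above can be combined in one launch command. The sketch below only builds the argument list for `docker run`; the image name, the limit values, and the function name are hypothetical placeholders, not the settings used in this setup.

```python
import os
import shlex


def docker_run_args(image, workspace, memory="32g", cpus="8"):
    """Build an argv list for `docker run` with isolation and resource caps."""
    return [
        "docker", "run", "--rm",
        "--read-only",          # container root filesystem is read-only
        "--memory", memory,     # hard memory cap for the container
        "--cpus", cpus,         # CPU quota, expressed in cores
        "-v", f"{os.path.abspath(workspace)}:/workspace",  # only writable host path
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(docker_run_args("openclaw-agent:local", "./workspace")))
```

Anything under the mounted /workspace is still writable by the agent, so it can still be deleted; the container boundary protects the rest of the host, not the mounted project files. A real agent would probably also need a writable scratch directory such as /tmp.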

❌ Disadvantages and Challenges (Cons)

1. Network Overhead

Problem: The current architecture relies on host.docker.internal:11434 for communication with the Ollama server on the host.

Impact: Large amounts of embedding data and token streams transmitted between the container and the host pass through Docker’s virtual network bridge layer (docker0), resulting in additional latency. For AI dialogues that prioritize “real-time” performance, this will be slightly slower than native execution.
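Rather than guessing at the size of that overhead, it can be measured. This is a rough sketch that times plain GET requests against the Ollama port; the function names are hypothetical, and it measures HTTP round trips only, not embedding or generation time.

```python
import statistics
import time
import urllib.request


def time_requests(url, repeats=20, timeout=5.0):
    """Return round-trip times in milliseconds for `repeats` GET requests."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples_ms):
    """Median and worst-case latency; the median resists one-off spikes."""
    return {"median_ms": statistics.median(samples_ms), "max_ms": max(samples_ms)}
```

Running `summarize(time_requests("http://host.docker.internal:11434/"))` inside the container and `summarize(time_requests("http://localhost:11434/"))` on the host gives two numbers to compare.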

2. Permission Issues

Pain Point: This is the most common problem encountered with Docker. When the agent generates code as root inside the container, you may not be able to modify these files on the host using your own account.

Solution: You need to explicitly specify user: "${UID}:${GID}" in Docker Compose; otherwise, the development experience will become very cumbersome.
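A quick way to check whether the permission problem has already happened is to scan the mounted workspace for files owned by someone else. This is a sketch for Linux hosts; the function name is hypothetical.

```python
import os


def foreign_owned(root, uid=None):
    """List paths under `root` whose owner differs from `uid` (default: current user)."""
    uid = os.getuid() if uid is None else uid
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid != uid:  # lstat: do not follow symlinks
                mismatched.append(path)
    return sorted(mismatched)
```

An empty result from `foreign_owned("persistfolder/workspace")` means the container is writing files as your own user. Note that bash does not export UID by default and does not define GID at all, so both values usually have to be exported or placed in an .env file before Compose can substitute them.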

Digital_David · April 8, 2026, 3:31pm

I come from the pre-virtualization era, when bare-metal installations stayed up for a few years without a single reboot. So learning the pros and cons and the potential security issues is important… I have cron jobs set up to back up multiple directories to a remote server nightly, which gives me enough history to revert by a day if needed.

Wouldn’t using the agent’s --sandbox flag give the same security?

cher012410 · April 8, 2026, 5:40pm

That is a great question. While the --sandbox flag in OpenClaw is a strong first line of defense, it operates at a different “layer” than Docker.

The --sandbox flag: This is a Software-level restriction. It usually tells the agent’s internal code-executor (like a Python or Node.js runtime) to restrict its file-system access to a specific path. If there is a bug in the sandbox implementation or an “escape” vulnerability in the language runtime itself, the agent can still break out and see your host files.

Besides, the --sandbox flag doesn’t help with system bloat or dependency hell. If an agent installs 50 random npm packages or pip libraries to solve a task, it litters your host system with files. In Docker, you just delete the container and start fresh. Your nightly cron backups are perfect for your data, but Docker protects your environment from getting messy.
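To illustrate what a software-level path restriction looks like, here is a minimal sketch of the kind of check such a sandbox might perform. It is not OpenClaw’s actual --sandbox implementation, which I have not inspected; the function name is hypothetical.

```python
import os


def is_inside(root, candidate):
    """True if `candidate` resolves (symlinks and '..' included) to a path under `root`."""
    root_real = os.path.realpath(root)
    # An absolute candidate replaces root_real in join(), so it is checked as-is.
    cand_real = os.path.realpath(os.path.join(root_real, candidate))
    return os.path.commonpath([root_real, cand_real]) == root_real
```

The whole guarantee lives in those few lines. A version that compared raw path strings instead of resolving symlinks first would let `link/passwd` through whenever `link` points at /etc, which is the kind of implementation bug that a container boundary does not depend on.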

Digital_David · April 8, 2026, 8:39pm

Understandable.

Would you mind sharing a redacted version of your openclaw.json and some of the commands used to set up your dockers? If you don’t want to share here, PM me. Would be happy to share what I’ve done as well.

cher012410 · April 8, 2026, 9:28pm

Hi David

In the “Claw_Setup” project linked under Reference, I added openclaw.json as a sample file. As long as you set the file paths, ports, file/model names, and API key correctly, it works well.
