Full NVIDIA Enterprise RAG Blueprint on DGX Spark (ARM64) — First Deployment, 11 Bug Fixes Documented
NVIDIA’s stated position as of February 2026 was that ARM64 container support for the Enterprise RAG Blueprint components was “not supported, on roadmap.” Rather than wait, I worked around this by building all Python services from source in native ARM64 venvs while running NIM microservices and infrastructure components in ARM64-compatible containers.
This post documents the approach, the bugs encountered and patched, and the validated end result. Posting because this path doesn’t exist anywhere in public documentation and someone will attempt this after me.
Hardware / OS
- DGX Spark, GB10 Superchip, 128GB unified LPDDR5x, ARM64/aarch64
- DGX OS 7.x
Approach
Official blueprint containers are x86 only. The workaround:
- rag-server, ingestor-server, NV-Ingest API, and NV-Ingest Ray pipeline — built from source in native Python venvs
- NIM microservices (embed-1b-v2, rerank-1b-v2, table-structure-v1, graphic-elements-v1) — run as ARM64-compatible NIM containers
- Infrastructure (Milvus, MinIO, etcd, Redis) — Docker containers with ARM64-compatible images
- LLM: Nemotron-Nano-9B-v2-NVFP4 via TRT-LLM, served in Docker on port 8355
Running Stack
| Component | Port | How it runs |
|---|---|---|
| Nemotron-Nano-9B-v2-NVFP4 | 8355 | Docker / trtllm-serve |
| nemotron-embed-1b-v2 | 8010 | NIM container |
| nemotron-rerank-1b-v2 | 8011 | NIM container |
| nim-table-structure | 8370 (HTTP) / 8371 (gRPC) | NIM container |
| nim-graphic-elements | 8376 (HTTP) / 8377 (gRPC) | NIM container |
| rag-server | 8181 | Native Python venv |
| ingestor-server | 8182 | Native Python venv |
| NV-Ingest API | 7770 | Native Python venv |
| NV-Ingest Pipeline | Ray (internal) | Native Python |
| Milvus | 19531 | Docker |
| MinIO | 9012 (API) / 9013 (console) | Docker |
| etcd | 2381 | Docker |
| Redis | 6380 | Docker |
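The table above can be turned into a quick reachability check. This is a minimal sketch, assuming every service listens on 127.0.0.1 (the table lists ports only, not bind addresses). The `STACK` mapping and the function names are illustrative and not part of the blueprint. It only tests that a TCP connection succeeds, so it says nothing about whether a service is actually healthy.

```python
import socket

# Ports copied from the table above; the host is an assumption.
HOST = "127.0.0.1"
STACK = {
    "Nemotron-Nano-9B-v2-NVFP4": [8355],
    "nemotron-embed-1b-v2": [8010],
    "nemotron-rerank-1b-v2": [8011],
    "nim-table-structure": [8370, 8371],
    "nim-graphic-elements": [8376, 8377],
    "rag-server": [8181],
    "ingestor-server": [8182],
    "NV-Ingest API": [7770],
    "Milvus": [19531],
    "MinIO": [9012, 9013],
    "etcd": [2381],
    "Redis": [6380],
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_stack(host: str = HOST, stack: dict = STACK) -> dict:
    """Map each component name to {port: reachable} for its listed ports."""
    return {
        name: {port: port_open(host, port) for port in ports}
        for name, ports in stack.items()
    }


if __name__ == "__main__":
    for name, ports in check_stack().items():
        status = ", ".join(
            f"{port} {'up' if ok else 'DOWN'}" for port, ok in ports.items()
        )
        print(f"{name:28s} {status}")
```

Because the check works at the TCP level, it does not depend on any HTTP health route being implemented by the service behind the port.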
Bug Fixes Required (11 total)
1. langchain-milvus 0.3.3 + pymilvus 2.6.12 connection alias incompatibility
   MilvusClient creates a random connection alias that the ORM Collection() class cannot find. Fix: patch langchain_milvus/vectorstores/milvus.py line 250 to register the alias with the connections singleton before Collection() is called.
2. ENABLE_REDIS_BACKEND default is False
   With Redis backend disabled, ingestion tasks queue in-memory with no worker consuming them. Ingestion silently stalls. Fix: set ENABLE_REDIS_BACKEND=True in the NV-Ingest environment.
3. Embedding endpoint template substitution broken
   NV-Ingest’s $EMBEDDING_NIM_ENDPOINT env var substitution fails silently in default_pipeline_impl.py. Fix: hardcode http://127.0.0.1:8010/v1 directly in the file.
4. YOLOX env var name mismatch
   Pipeline reads YOLOX_GRPC_ENDPOINT, not YOLOX_PAGE_ELEMENTS_GRPC_ENDPOINT. Fix: set the correct var name. Secondary issue: page-elements NIM crashes under memory pressure (~95% unified memory utilization). Workaround: point YOLOX_GRPC_ENDPOINT at the table-structure NIM gRPC endpoint instead.

5-11. Additional fixes documented. Happy to share the full registry as a follow-up reply if useful to the community.
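The two environment-variable fixes above (ENABLE_REDIS_BACKEND and the YOLOX variable name) both fail silently, so a small preflight check before launching NV-Ingest may help catch them. This is a rough sketch, not NV-Ingest code: the `preflight` helper is hypothetical, and treating only a case-insensitive "true" as enabled is an assumption about how the flag is parsed.

```python
import os
from typing import Mapping


def preflight(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems found; an empty list means both checks pass."""
    problems = []

    # Redis backend fix: the flag must be switched on explicitly.
    # Assumption: only "true" (any case) counts as enabled.
    if env.get("ENABLE_REDIS_BACKEND", "").strip().lower() != "true":
        problems.append("ENABLE_REDIS_BACKEND is not enabled; ingestion may stall")

    # YOLOX fix: the pipeline reads YOLOX_GRPC_ENDPOINT, not the longer name.
    if not env.get("YOLOX_GRPC_ENDPOINT"):
        if env.get("YOLOX_PAGE_ELEMENTS_GRPC_ENDPOINT"):
            problems.append(
                "YOLOX_PAGE_ELEMENTS_GRPC_ENDPOINT is set but the pipeline "
                "reads YOLOX_GRPC_ENDPOINT"
            )
        else:
            problems.append("YOLOX_GRPC_ENDPOINT is not set")

    return problems


if __name__ == "__main__":
    # Check the current shell environment and print one line per finding.
    for line in preflight(os.environ) or ["environment looks OK"]:
        print(line)
```

Running it in the same shell that starts the NV-Ingest pipeline checks the environment the pipeline would actually inherit.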
Validation
- 25 real construction documents ingested (invoices, POs, submittals)
- 55 vector entities created in Milvus
- Natural language queries return accurate, grounded, cited answers against source documents
- Full 22-stage NV-Ingest Ray pipeline confirmed functional end-to-end (OCR, layout detection, table extraction, text splitting, embedding)
Known Issues
- page-elements NIM unstable at high memory load — workaround above holds
- LLM health check returns 404 (cosmetic — trtllm-serve does not implement /v1/health/ready)
- Swap exhaustion (~15 GiB) under full pipeline load — clears on reboot
- NV-Ingest source builds currently in /tmp — migration to persistent storage needed
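Since two of these issues are tied to memory pressure, it can help to watch RAM and swap while the pipeline runs. The sketch below reads /proc/meminfo and reports the fraction of memory in use and the swap consumed. It assumes /proc/meminfo is a reasonable proxy for unified memory pressure on the GB10, which has not been verified here; the function names are illustrative, and the 0.95 threshold simply mirrors the ~95% figure mentioned earlier.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo text into {field: integer value} (kB for size fields)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def memory_pressure(info: dict[str, int]) -> tuple[float, float]:
    """Return (fraction of RAM in use, swap used in GiB)."""
    used_fraction = 1.0 - info["MemAvailable"] / info["MemTotal"]
    swap_used_gib = (info["SwapTotal"] - info["SwapFree"]) / (1024 ** 2)
    return used_fraction, swap_used_gib


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # Linux only
        used, swap = memory_pressure(parse_meminfo(meminfo.read_text()))
        # Threshold is an assumption taken from the ~95% figure in the post.
        flag = "  <-- near the reported crash level" if used >= 0.95 else ""
        print(f"RAM in use: {used:.0%}, swap used: {swap:.1f} GiB{flag}")
```

Running this periodically during ingestion would show how close the system gets to the point where the page-elements NIM became unstable.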
Question for NVIDIA
Is there a timeline for official ARM64 container support for the Enterprise RAG Blueprint components? Specifically: rag-server, ingestor-server, and NV-Ingest. The NIM containers and infrastructure images are already ARM64-compatible — the gap is the Python service containers. A roadmap or confirmed target release would be useful for anyone building on Spark.
Happy to share the full 11-bug fix registry, start script, and source build instructions if there’s interest. This path is repeatable — it just isn’t documented anywhere yet.
Hi Tylerdegagne, please share the fix registry and the source build instructions. I appreciate your post here; at least I am not the only one bumping into these hurdles.