Hi everyone,
I’m testing the NVIDIA RAG Blueprint on a DGX Spark (GB10, ARM64) and ran into a consistent architecture issue with the “control plane” containers. The NIMs and Milvus run fine, but the ingestor and RAG server containers exit with exec format error, which looks like an x86_64 vs ARM64 problem.
Environment
- Hardware: NVIDIA DGX Spark (GB10 Grace Blackwell Superchip, ARM64, 128 GB unified memory)
- Host OS: DGX OS / Ubuntu 22.04 aarch64
- Driver:
  - `nvidia-smi` inside `nvcr.io/nvidia/cuda:12.4.0-base-ubuntu22.04` reports:
    - Driver Version: 580.95.05
    - CUDA Version: 13.0
    - GPU: NVIDIA GB10
- Docker: using NVIDIA Container Toolkit; `docker run --rm --gpus all nvcr.io/nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi` works as expected.
- RAG Blueprint: `NVIDIA-AI-Blueprints/rag` (Compose files under `deploy/compose`)
What works
Using the provided Compose files, the following containers work correctly on DGX Spark:
- NIMs (ARM64 / multiarch):
  - `nvcr.io/nim/nvidia/llama-3.3-nemotron-super-49b-v1.5:1.13.1` as `nim-llm-ms`
    - `NIM_MODEL_PROFILE` set to the detected compatible profile: `d491076c9c5fbbc0f5ec92916ee84050c3b51a1c93055a64de8d0f31a22ed209`
  - `nvcr.io/nim/nvidia/llama-3.2-nv-embedqa-1b-v2:1.10.0` as `nemoretriever-embedding-ms`
  - `nvcr.io/nim/nvidia/llama-3.2-nv-rerankqa-1b-v2:1.8.0` as `nemoretriever-ranking-ms`
  - `nvcr.io/nim/nvidia/nemoretriever-page-elements-v2:1.5.0` as `page-elements`
  - `nvcr.io/nim/nvidia/nemoretriever-graphic-elements-v1:1.5.0` as `graphic-elements`
  - `nvcr.io/nim/nvidia/nemoretriever-table-structure-v1:1.5.0` as `table-structure`

  All of these start and stay up; the embedding and rerank NIMs are healthy, and the LLM NIM loads successfully with the selected profile on GPU 0.
- Vector DB stack:
  - `milvusdb/milvus:v2.6.2-gpu` as `milvus-standalone`
  - `minio/minio:RELEASE.2025-09-07T16-13-09Z` as `milvus-minio`
  - `quay.io/coreos/etcd:v3.6.5` as `milvus-etcd`

  Milvus is reachable on 19530, MinIO and etcd are healthy.
- Redis: `redis/redis-stack:7.2.0-v18` is up and accepting connections.
So the heavy GPU services (NIMs) and the DB/storage layer are working fine on ARM64.
What fails
The following containers exit immediately on DGX Spark with exec format error:
- `nvcr.io/nvidia/blueprint/ingestor-server:2.3.0` (service `ingestor-server`, port 8082)
- `nvcr.io/nvidia/nemo-microservices/nv-ingest:25.9.0` (service `nv-ingest-ms-runtime`)
- `nvcr.io/nvidia/blueprint/rag-server:2.3.0` (service `rag-server`, port 8081)
- `nvcr.io/nvidia/blueprint/rag-frontend:2.3.0` (service `rag-frontend`, port 3000)
The NIMs and Milvus are running and reachable, but since ingestor-server and rag-server exit immediately with exec format error, ports 8081 and 8082 are not listening and the Blueprint’s REST endpoints/UI are not usable.
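For anyone who wants to reproduce the "not listening" observation, here is a minimal sketch of a TCP probe. The port numbers come from the lists above; the assumption that every service is published on `localhost` is mine, and the helper names `is_listening` and `probe` are made up for illustration.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def probe(host: str, ports: dict) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in ports.items()}


# Ports as listed in this post; adjust if your Compose files publish others.
BLUEPRINT_PORTS = {
    "rag-server": 8081,
    "ingestor-server": 8082,
    "rag-frontend": 3000,
    "milvus": 19530,
}
```

Calling `probe("localhost", BLUEPRINT_PORTS)` returns a dict of booleans per service. On my setup I would expect only the Milvus entry to come back true, but that depends on how the ports are published.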
Given the error pattern, it looks like:
- These Blueprint containers (`ingestor-server`, `nv-ingest`, `rag-server`, `rag-frontend`) are built for linux/amd64 only,
- and are being run on an aarch64 host (DGX Spark / GB10), which results in the `exec format error` as soon as Docker tries to run the binary.
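The suspicion above can be checked by comparing the host architecture with the architecture recorded in each pulled image. A rough sketch, assuming the images are already pulled locally and the `docker` CLI is on the PATH; the helper names are hypothetical, and the alias table only covers the two architectures relevant here.

```python
import platform
import subprocess

# Map `uname -m` style names to the architecture names Docker reports.
_ARCH_ALIASES = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(name: str) -> str:
    """Normalize an architecture string, e.g. 'aarch64' -> 'arm64'."""
    key = name.strip().lower()
    return _ARCH_ALIASES.get(key, key)


def host_arch() -> str:
    """Architecture of the machine running this script."""
    return normalize_arch(platform.machine())


def image_arch(image: str) -> str:
    """Architecture of a locally pulled image, via `docker image inspect`."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Architecture}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return normalize_arch(result.stdout)


def is_mismatch(host: str, image: str) -> bool:
    """True if the image architecture differs from the host architecture."""
    return normalize_arch(host) != normalize_arch(image)
```

For example, `is_mismatch(host_arch(), image_arch("nvcr.io/nvidia/blueprint/rag-server:2.3.0"))` returning true on the DGX Spark would be consistent with the amd64-only explanation.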
My question(s)
- Are there ARM64-native (or multi-arch) builds planned for:
  - `nvcr.io/nvidia/blueprint/ingestor-server`
  - `nvcr.io/nvidia/nemo-microservices/nv-ingest`
  - `nvcr.io/nvidia/blueprint/rag-server`
  - `nvcr.io/nvidia/blueprint/rag-frontend`

  specifically for DGX Spark / GB10?
- If yes, is there any ETA or roadmap you can share (e.g. target RAG Blueprint version or NIM/Blueprint release where ARM64 support for these containers will be available)?
- In the meantime, is there an officially recommended workaround for DGX Spark:
  - e.g. running only NIMs + Milvus on DGX Spark and hosting `rag-server`/`ingestor-server` on a separate x86_64 node,
  - or using a different supported stack for ARM64 to orchestrate the NIMs for RAG?
- Is there any ongoing work or branch (GitHub / internal) that explicitly targets RAG Blueprint on DGX Spark / ARM64, which early adopters can follow/test?
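Related to the first question: whether a given tag is published as multi-arch can be read from the registry with `docker manifest inspect`. Below is a sketch of how the output could be parsed. It assumes the manifest-list / OCI-index JSON layout (a top-level `manifests` array with a `platform` object per entry) and that you are logged in to `nvcr.io`; the function names are hypothetical.

```python
import json
import subprocess


def platforms_from_manifest(doc: dict) -> list[str]:
    """Collect 'os/arch' strings from a manifest list or OCI index.

    Returns an empty list for a single-platform manifest, which carries
    no `manifests` array and therefore no platform entries here.
    """
    found = set()
    for entry in doc.get("manifests", []):
        plat = entry.get("platform") or {}
        os_name = plat.get("os", "unknown")
        arch = plat.get("architecture", "unknown")
        # Attestation entries are typically tagged unknown/unknown; skip them.
        if "unknown" in (os_name, arch):
            continue
        found.add(f"{os_name}/{arch}")
    return sorted(found)


def inspect_platforms(image: str) -> list[str]:
    """Run `docker manifest inspect` and list the published platforms."""
    result = subprocess.run(
        ["docker", "manifest", "inspect", image],
        check=True,
        capture_output=True,
        text=True,
    )
    return platforms_from_manifest(json.loads(result.stdout))
```

If `inspect_platforms(...)` for one of the four images above does not include `linux/arm64`, that tag has no ARM64 variant in the registry. An empty list means the tag is a single-platform manifest, in which case the architecture has to be read from the pulled image instead.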
My main goal is to run the full NVIDIA RAG Blueprint (including ingest and RAG server) on DGX Spark, since the hardware is clearly capable and the NIMs already run well on this system. At the moment, the only blocker is the `exec format error` that these Python-based Blueprint containers hit on the ARM64 host.
Any guidance, roadmap hints, or pointers to ARM64-ready images (even experimental) would be very helpful.
Thanks in advance!