I am EXTREMELY disappointed with the current state of DGX Spark

fred_dev · April 10, 2026, 12:41pm

Feel free to dump this into a gist or something, but I have to leave for an appointment. Here’s what I documented:

# ComfyUI Distributed — Dual-Node Setup

Short tutorial for adding a worker node to a ComfyUI master using the
[ComfyUI-Distributed](https://github.com/robertvoy/ComfyUI-Distributed) plugin.

Throughout this guide, replace the placeholders with your own values:

- `<user>` — the Linux username on both nodes (assumed to be the same)
- `<IP address node 1>` — routable IP of the master node
- `<IP address node 2>` — routable IP of the worker node
- `<worker-id>` — a short slug for the worker, e.g. `worker1`
- `<Worker Name>` — a human-readable label shown in the UI

## Topology

| Role   | Host                              | Port | Notes                          |
| ------ | --------------------------------- | ---- | ------------------------------ |
| Master | node 1 — `<IP address node 1>`    | 8188 | UI lives here, queues the work |
| Worker | node 2 — `<IP address node 2>`    | 8188 | Headless, returns results      |

Both nodes run an identical ComfyUI install. The master's plugin talks to the
worker over HTTP and pulls image bytes back when the worker finishes.

## 1. Prereqs (both nodes)

Install ComfyUI the same way in the same path on each node — this matters:
the master references workflows and model filenames by string, so anything
referenced must resolve identically on the worker.

```bash
# Run on BOTH nodes
python3 -m venv ~/comfyui-env   # plain `python` is often absent on Ubuntu-based systems
source ~/comfyui-env/bin/activate
git clone https://github.com/comfyanonymous/ComfyUI.git ~/ComfyUI
cd ~/ComfyUI
pip install -r requirements.txt
```

Keep `~/ComfyUI/models/` in sync between the two machines. rsync is the
simplest way:

```bash
# From node 1 → node 2
rsync -avh --progress ~/ComfyUI/models/ <user>@<IP address node 2>:~/ComfyUI/models/
```
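
To confirm the two `models/` trees really match after the rsync, one option is to build a manifest on each node and compare them. The sketch below is in Python, since the venv already provides it. `model_manifest` and `diff_manifests` are hypothetical helper names, not part of ComfyUI or the plugin, and the comparison covers only relative paths and file sizes, not checksums.

```python
import os


def model_manifest(models_dir):
    """Map each file's path (relative to models_dir) to its size in bytes."""
    manifest = {}
    for root, _dirs, files in os.walk(models_dir):
        for name in files:
            full_path = os.path.join(root, name)
            rel_path = os.path.relpath(full_path, models_dir)
            manifest[rel_path] = os.path.getsize(full_path)
    return manifest


def diff_manifests(master, worker):
    """Return (missing_on_worker, size_mismatch) as sorted lists of paths."""
    missing = sorted(p for p in master if p not in worker)
    mismatched = sorted(
        p for p in master if p in worker and master[p] != worker[p]
    )
    return missing, mismatched
```

One way to use it: call `model_manifest(os.path.expanduser("~/ComfyUI/models"))` on each node, dump the result with `json.dump`, copy one file across, and pass both dictionaries to `diff_manifests`. Anything in the first list needs another rsync pass; anything in the second was probably interrupted mid-transfer.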

## 2. Install the Distributed plugin (both nodes)

```bash
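# Run on BOTH nodes, inside the same venv ComfyUI uses
source ~/comfyui-env/bin/activate
cd ~/ComfyUI/custom_nodes
git clone https://github.com/robertvoy/ComfyUI-Distributed.git
cd ComfyUI-Distributed
# Install the plugin's dependencies only if it ships a requirements file
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi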
```

## 3. Launch ComfyUI on both nodes

Both nodes need `--listen 0.0.0.0`: the master must reach the worker to post
jobs, and the worker must reach the master to upload results.
`--enable-cors-header` keeps the master's web UI from being blocked when it
streams worker previews.

```bash
# Run on BOTH nodes
source ~/comfyui-env/bin/activate
cd ~/ComfyUI
python main.py --listen 0.0.0.0 --port 8188 --enable-cors-header
```

For production, background them with `nohup ... &` or a systemd unit.

## 4. Register the worker on the master

Edit `~/ComfyUI/custom_nodes/ComfyUI-Distributed/gpu_config.json` **on the
master only**:

```json
{
  "master": {
    "host": "<IP address node 1>",
    "cuda_device": 0
  },
  "workers": [
    {
      "id": "<worker-id>",
      "name": "<Worker Name>",
      "type": "remote",
      "host": "<IP address node 2>",
      "port": 8188,
      "enabled": true
    }
  ],
  "settings": {
    "auto_launch_workers": false,
    "stop_workers_on_master_exit": false,
    "master_delegate_only": false
  }
}
```

Key fields:
- `master.host` — a routable IP of the master (not `localhost`). Workers must
  be able to reach it to upload result images back.
- `workers[].host` / `port` — where the master will POST jobs. Must match the
  `--listen` / `--port` of the worker's ComfyUI.
- `master_delegate_only: true` — if you want node 1 to only coordinate and
  never run a job itself. Leave `false` to use both GPUs.
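
A quick sanity check on the file can catch the most common mistakes before a restart. This is a minimal sketch that looks only at the fields shown in this guide. `check_config` is a hypothetical helper, not something the plugin provides, and the plugin may accept or require fields this check knows nothing about.

```python
import json

# Values for master.host that a remote worker cannot route to.
LOOPBACK_HOSTS = {"", "localhost", "127.0.0.1", "::1"}


def check_config(cfg):
    """Return a list of human-readable problems found in a parsed config."""
    problems = []

    master_host = cfg.get("master", {}).get("host", "")
    if master_host in LOOPBACK_HOSTS:
        problems.append("master.host must be a routable IP, not loopback or empty")

    seen_ids = set()
    for worker in cfg.get("workers", []):
        worker_id = worker.get("id", "<no id>")
        if worker_id in seen_ids:
            problems.append(f"duplicate worker id: {worker_id}")
        seen_ids.add(worker_id)

        if not worker.get("enabled", False):
            continue  # disabled workers are not contacted, so skip host/port checks
        if not worker.get("host"):
            problems.append(f"worker {worker_id}: missing host")
        if not isinstance(worker.get("port"), int):
            problems.append(f"worker {worker_id}: port must be an integer")

    return problems


# Example use on the master (path taken from this guide):
#   with open("gpu_config.json") as f:
#       for problem in check_config(json.load(f)):
#           print(problem)
```

An empty list means the file passed these particular checks, nothing more. A JSON syntax error surfaces earlier, as an exception from `json.load`.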

Restart the master's ComfyUI so the plugin re-reads the config.

## 5. Use it in a workflow

Inside the ComfyUI UI on the master, open the **Distributed** side panel
(added by the plugin). Your worker should appear as `<Worker Name>` with a
green status dot. If it's red, check that the worker URL is reachable:
`curl http://<IP address node 2>:8188/system_stats`.
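
The same reachability check can be scripted, for example to wait for the worker before queueing a batch. A minimal sketch using only the standard library follows. `system_stats_url` and `is_reachable` are hypothetical names, and the only assumption about the endpoint is that `/system_stats` answers with JSON when ComfyUI is up.

```python
import http.client
import json
import urllib.error
import urllib.request


def system_stats_url(host, port=8188):
    """Build the URL that the curl check above hits."""
    return f"http://{host}:{port}/system_stats"


def is_reachable(host, port=8188, timeout=5.0):
    """Return True if the node answers /system_stats with a JSON body."""
    try:
        with urllib.request.urlopen(system_stats_url(host, port), timeout=timeout) as resp:
            json.load(resp)  # raises ValueError if the body is not JSON
            return True
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers refused connections, timeouts, HTTP errors, and bad bodies.
        return False
```

Run it from the master against `<IP address node 2>`. A `False` here points at networking or the worker's `--listen` setting rather than at the plugin.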

Then make two changes to the workflow itself:

1. Feed the `KSampler`'s seed input from a **Distributed Seed** node instead
   of a fixed seed value.
2. Insert a **Distributed Collector** between the VAE decode and the
   `SaveImage` / `PreviewImage` node.

When you hit **Queue**, the Distributed Seed hands each enabled worker a
different seed, each machine renders its own image in parallel, and the
Collector gathers them all back on the master. You'll see N images come out
of one queue run, where N = 1 master + enabled workers.
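
The arithmetic for N can be written down against the config from step 4. This is a sketch under two assumptions that are not verified against the plugin: that the `enabled` flags in `gpu_config.json` match what the side panel shows, and that `master_delegate_only: true` removes the master's own image from the count, as the field description suggests. `expected_image_count` is a hypothetical helper.

```python
def expected_image_count(cfg):
    """Images expected from one queue run: enabled workers, plus the master
    unless it is configured to delegate only."""
    enabled_workers = sum(
        1 for worker in cfg.get("workers", []) if worker.get("enabled", False)
    )
    delegate_only = cfg.get("settings", {}).get("master_delegate_only", False)
    return enabled_workers + (0 if delegate_only else 1)
```

With the example config (one enabled worker, `master_delegate_only: false`) this gives 2, matching the two-node setup described here.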

The `distributed-txt2img.json` and `cyberrealistic-full-distributed.json`
files in this folder are ready-to-load examples of that wiring.

## 6. Driving it from the API

For scripted batches, skip `/prompt` and POST to the plugin's endpoint:

```bash
curl -X POST http://<IP address node 1>:8188/distributed/queue \
  -H "Content-Type: application/json" \
  -d '{
        "prompt": <workflow JSON>,
        "enabled_worker_ids": ["<worker-id>"]
      }'
```

Results for both master and worker come back in the master's
`/history` / output folder as if they were all produced locally.
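
For batches, building the request in Python avoids pasting a large workflow into a shell string. The sketch below mirrors the curl call above, with the same endpoint and the same two payload keys. `build_queue_request` and `submit` are hypothetical helpers. It assumes the workflow is the API-format JSON export that `/prompt` accepts, and it makes no assumption about the shape of the response.

```python
import json
import urllib.request


def build_queue_request(master_host, workflow, worker_ids, port=8188):
    """Build a POST request for the plugin's /distributed/queue endpoint."""
    body = json.dumps(
        {"prompt": workflow, "enabled_worker_ids": list(worker_ids)}
    ).encode("utf-8")
    return urllib.request.Request(
        f"http://{master_host}:{port}/distributed/queue",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request, timeout=30.0):
    """Send the request; return (HTTP status, raw response text)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


# Example use (file name and IDs are placeholders):
#   with open("workflow_api.json") as f:
#       workflow = json.load(f)
#   status, text = submit(
#       build_queue_request("<IP address node 1>", workflow, ["<worker-id>"])
#   )
```

Keeping request construction separate from sending makes it easy to inspect or log the payload before anything goes over the wire.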

## Troubleshooting

- **Worker shows red in UI** — the plugin can't reach the worker host/port.
  `curl` the worker's `/system_stats` from the master. If that works but the
  plugin still fails, restart the master ComfyUI so it re-reads
  `gpu_config.json`.
- **Worker runs job but master shows "missing image"** — the worker couldn't
  reach `master.host` to upload its result. Set `master.host` to an IP the
  worker can actually route to, not `localhost` or `127.0.0.1`.
- **"model not found" on worker only** — the master referenced a checkpoint
  that only exists on node 1. Rsync `models/` again.
- **Only node 1 ever runs jobs** — you're hitting `/prompt` instead of
  `/distributed/queue`, or the workflow is missing the Distributed Seed /
  Collector nodes.
