What to run on 8 Sparks / GB10?

Weekend incoming. 8 Sparks (GB10 - 4x Lenovo, 4x Asus) lined up as a cluster. Looking for ideas on what to run. Any specific models you want to see benchmarks for?

PS: Yes, I know it's a little bit messy ;) But I promise it will look good soon. I'm still waiting for my rack mounts.

mimo 2.5 pro and glm5.1

@ma.bu before you launch any model, you might want to space those Sparks out. Under load the mighty Spark gets very hot. There have been many overheat shutdown cases reported already. Search the forum.

Or get back to us when your Sparks die on you unexpectedly!

Could you run latency tests between two of the nodes in that setup?

Node 1:

ib_write_lat -d rocep1s0f0 -i 1 -p 13000 -F

Node 2:

ib_write_lat -d rocep1s0f0 -i 1 -p 13000 -F <Node-1_IP_Address>

Node 1:

ib_read_lat -d rocep1s0f0 -i 1 -p 13001 -F

Node 2:

ib_read_lat -d rocep1s0f0 -i 1 -p 13001 -F <Node-1_IP_Address>
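
If it helps with copy/paste across several node pairs, here is a small sketch that only prints the server/client command pair for each test; it does not run anything. The function name `lat_cmds` and the `NODE1_IP` variable are made up for illustration, and the flags are just the ones from the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the server/client command pair for one
# perftest latency tool. Flags mirror the commands quoted above.
lat_cmds() {
  local tool="$1" dev="$2" port="$3" server_ip="$4"
  printf 'server: %s -d %s -i 1 -p %s -F\n' "$tool" "$dev" "$port"
  printf 'client: %s -d %s -i 1 -p %s -F %s\n' "$tool" "$dev" "$port" "$server_ip"
}

# NODE1_IP is an assumed variable; it falls back to the placeholder used above.
NODE1_IP="${NODE1_IP:-<Node-1_IP_Address>}"

lat_cmds ib_write_lat rocep1s0f0 13000 "$NODE1_IP"
lat_cmds ib_read_lat  rocep1s0f0 13001 "$NODE1_IP"
```

Run the "server" line on Node 1 first, then the "client" line on Node 2.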

Cool. One idea is to contribute benchmarks here: https://spark-arena.com
We don’t have benchmarks for Deepseek v4 and nvidia/Kimi-K2.6-NVFP4 yet.

https://x.com/spark_arena/status/2055367735463538717?s=20 @ma.bu

Can you run NVIDIA-Nemotron-3-Super-120B-A12B-BF16 with a 1M-token context? Something like this:

docker exec -it $VLLM_CONTAINER /bin/bash -c "\
CUDA_VISIBLE_DEVICES=0 \
VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 \
vllm serve nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 \
--tensor-parallel-size 8 \
--max-model-len 1048576 \
--dtype bfloat16 \
--distributed-executor-backend ray \
--enforce-eager \
--enable-auto-tool-choice \
--tool-call-parser mistral \
--host 0.0.0.0 \
--port 8000 \
--swap-space 0 \
--trust-remote-code"

We ran the RDMA latency tests between ai-002 (10.0.0.3) and ai-003 (10.0.0.5) on rocep1s0f0.

ib_write_lat: avg 2.49 us, 99% 2.59 us, 99.9% 3.47 us
ib_read_lat: avg 4.86 us, 99% 5.02 us, 99.9% 6.37 us

Both tests completed successfully.

Thank you :) I have no idea why people think there could be a fire soon. It's so safe here in Austria, even with my hardware mess.

Kimi K2.6 done, Deepseek v4 will follow soon.

Actually no issues so far, and the Sparks reach 90 °C at most. But yes, they will be spaced out soon.

Thank you, perfect.

GLM 5.1 is already on Spark Arena: zai-org/GLM-5.1-FP8 - Spark Arena Benchmark

MiMo 2.5 will follow soon.