How do I run Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled on vllm community docker?

dbsci · March 13, 2026, 10:17pm

Performance-wise you’re probably better off running a quantized version; however, to answer your original question, here is a recipe for use with sparkrun that builds on top of @eugr’s vllm docker repo.

sparkrun run @sparkrun-testing/jackrong-qwen3.5-27b-claude4.6-distill-vllm

The @sparkrun-testing/ prefix is required for “hidden” registries. I do that so that I can deploy recipes for particular use without them being part of the default tab completion, etc.

You can check out the recipe file at: sparkrun-recipe-registry/testing/recipes/qwen3.5/exotic/jackrong-qwen3.5-27b-claude4.6-distill-vllm.yaml (main branch of dbotwinick/sparkrun-recipe-registry on GitHub)

I tested that it ran and I was seeing ~4.5 tok/s, so not terribly impressive on performance with single node tensor parallel, but it’s interesting to see this new wave of opus distillation models! (Note that a 27B dense model at BF16 would have a theoretical peak throughput of ~5.1 tok/s on a single Spark.)
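For anyone wondering where a figure like ~5.1 tok/s comes from, here is a rough back-of-the-envelope sketch. It assumes decode is purely memory-bandwidth bound (every weight is read once per generated token) and that a Spark has about 273 GB/s of memory bandwidth; both are my assumptions for illustration, and the function name is made up for this example. It ignores KV cache reads and any overhead, so treat it as an upper bound, not a prediction.

```python
def peak_decode_tok_per_s(params_billion: float,
                          bytes_per_param: float,
                          bandwidth_gb_s: float) -> float:
    """Upper bound on single-stream decode speed for a dense model.

    Assumes each generated token requires streaming all weights
    from memory exactly once (bandwidth-bound decode).
    """
    weight_gb = params_billion * bytes_per_param  # e.g. 27B * 2 bytes = 54 GB
    return bandwidth_gb_s / weight_gb


# Assumption: ~273 GB/s memory bandwidth on a single Spark.
SPARK_BANDWIDTH_GB_S = 273.0

peak = peak_decode_tok_per_s(params_billion=27,
                             bytes_per_param=2,  # BF16 = 2 bytes per weight
                             bandwidth_gb_s=SPARK_BANDWIDTH_GB_S)
measured = 4.5  # tok/s reported above

print(f"theoretical peak: ~{peak:.1f} tok/s")
print(f"measured is {measured / peak:.0%} of that peak")
```

Under those assumptions the measured ~4.5 tok/s is already close to the ceiling, which is why a quant (fewer bytes per weight) is the more promising route to higher throughput than tuning the BF16 setup.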

You can learn more about how to install sparkrun in the forum thread “Sparkrun - central command with tab completion for launching inference on Spark Clusters” or check out the docs at https://sparkrun.dev. sparkrun is designed to make it easier to run models, and we’re working to make it easier to find recipes and understand baseline performance at spark-arena.com.
