This seems to be the smallest model (392 GiB) I can find, but I couldn't get DeepGEMM to work. Has anyone successfully run GLM 5.1?