Decode is memory bandwidth-limited.
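To put a rough number on that: if every generated token has to stream the active weights from memory once, memory bandwidth divided by bytes read per token gives a ceiling on single-stream decode speed. The sketch below is a back-of-envelope estimate under that assumption only. The function name and the example figures are hypothetical, not measurements of any particular machine or model, and it ignores KV-cache reads, batching, and speculative decoding.

```python
def decode_tokens_per_second_bound(
    bandwidth_gb_s: float,
    active_params_billion: float,
    bytes_per_param: float,
) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumes each token requires reading all active weights once and that
    weight traffic dominates (KV-cache reads and compute are ignored).
    """
    bytes_per_token = active_params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical figures for illustration only:
# 100 GB/s of bandwidth, 10B active parameters, 1 byte per parameter.
print(decode_tokens_per_second_bound(100.0, 10.0, 1.0))  # 10.0
```

Under this model, halving the bytes per parameter (a more aggressive quant) doubles the ceiling, while extra compute leaves it unchanged.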