Someone posted this: Gemma 4 26B-A4B MoE running at 45-60 tok/s on DGX Spark

I think this approach is worth a try:

1 54.01 t/s 54.06 t/s 43.23 t/s 50.43 t/s
2 69.13 t/s 62.47 t/s 51.53 t/s 61.04 t/s
4 138.75 t/s 141.14 t/s 140.10 t/s 140.00 t/s
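
Side note on reading those numbers: the last column looks like the mean of the three before it. A quick sketch to check, assuming the leading 1 / 2 / 4 is just a row label (the post doesn't say what it stands for) and that the three middle values are separate measurements:

```python
# Figures copied from the rows above, in t/s.
# Assumption: the dict key is the unlabeled first column of each row;
# the post does not state what it means.
rows = {
    1: [54.01, 54.06, 43.23, 50.43],
    2: [69.13, 62.47, 51.53, 61.04],
    4: [138.75, 141.14, 140.10, 140.00],
}


def mean_of_first_three(values):
    """Return the mean of the first three t/s figures, rounded to 2 decimals."""
    return round(sum(values[:3]) / 3, 2)


for label, values in rows.items():
    # Print the recomputed mean next to the fourth column for comparison.
    print(label, mean_of_first_three(values), values[3])
```

If the two printed values agree on every row, the fourth column is an average and not a fourth measurement.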

Can confirm, I get around 50 tok/s pp

Yup, runs here at 45-60 tok/s

The community vLLM container by @eugr works fine for this model too (there is a recipe):