Google Gemma 4 - Will it work on DGX Spark?

Sure. It might take a few hours until support pops up across all the inference servers.

I already tried vLLM with the latest Transformers v5.5.0 (which is required), but I failed:

llama.cpp has already added support: