Good day. Having lived through the DGX Spark model size issue, I would like some clarification of the "up to 1T parameter" model size claim for the DGX Station.
At FP4, a 1T model would require ~500-600 GB plus overhead such as KV cache. This listing does not seem to describe a unified memory model, so the only memory a model of this size might fit in is the 496 GB of LPDDR5X RAM, but inference there would be about as fast as on a Mac Studio, given the 396 GB/s limitation. And to get this size into GPU memory would need 2 additional A6000's with 96 GB each.
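For anyone who wants to check my numbers, here is a rough back-of-the-envelope sketch of the arithmetic above. The function names are my own, the 4.5 bits per parameter figure is an assumption (4-bit values plus one 8-bit scale per block of 16 weights), and the speed estimate assumes a dense model where every weight is read once per generated token. A mixture-of-experts model would only read its active parameters, so treat this as a ceiling for the dense case, not a benchmark.

```python
def weight_gb(params: float, bits_per_param: float) -> float:
    """Weight footprint in decimal GB (1 GB = 1e9 bytes), excluding KV cache."""
    return params * bits_per_param / 8 / 1e9


def decode_tokens_per_s(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper bound on decode speed when memory bandwidth is the bottleneck."""
    return bandwidth_gb_s / gb_read_per_token


PARAMS = 1e12            # the advertised 1T parameters
LPDDR5X_GB = 496         # CPU-side memory, as quoted in the listing
LPDDR5X_GB_S = 396       # bandwidth figure quoted above

plain_fp4 = weight_gb(PARAMS, 4.0)     # raw 4-bit weights only
with_scales = weight_gb(PARAMS, 4.5)   # assumption: block scales add ~0.5 bit/param

print(f"FP4 weights only:     {plain_fp4:.1f} GB")
print(f"FP4 + block scales:   {with_scales:.1f} GB")
print(f"Fits in {LPDDR5X_GB} GB LPDDR5X: {plain_fp4 <= LPDDR5X_GB}")
print(f"Dense decode ceiling: {decode_tokens_per_s(LPDDR5X_GB_S, plain_fp4):.2f} tok/s")
```

Even before KV cache, the raw 4-bit weights alone come out slightly above the 496 GB figure, which is why I am asking how the claim is meant to be read.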
Am I missing something here in the advertising?
I do pray that the Blackwell GPU is a standard SM120 and not a custom, throttled-down or otherwise modified new model that vLLM and others will need to write custom code to support.
Lastly, if you would like a 3rd party, independent evaluation of a unit, let me know :-)