DGX Spark PyTorch LLM training throughput up to 8x slower than expected

Please check out our guide to benchmarking different models with different backends: DGX Spark Performance FAQ