What tests could be run to ensure your DGX Spark is not defective?

Are there any specific tests or playbooks that can be run to verify that the Spark is not defective, crashing unexpectedly, overheating, etc., as some users appear to be reporting (e.g., "DGX Spark. low fan speed, high temps, device very hot")? I am assuming the DGX Spark is meant to be capable of extended periods of training/fine-tuning?

Are there specific metrics in DGX Dashboard, or other available metrics, that confirm the unit is behaving normally?

If you suspect hardware issues, please reach out to NVIDIA DGX Spark Support, where my colleagues can walk you through diagnostic tools to confirm your unit is healthy.
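For anyone who wants to collect their own numbers to bring to a support conversation, here is a minimal sketch of a logger that polls `nvidia-smi` while a training or fine-tuning job runs. It is not an official diagnostic and it makes no claim about what temperature counts as normal. The helper names (`parse_sample`, `read_sample`, `monitor`, `peak_temperature`) and the polling intervals are made up for illustration; the only external dependency is the `nvidia-smi` CLI with its `--query-gpu` and `--format=csv,noheader,nounits` options, and it is an assumption that your unit reports all three queried fields.

```python
import subprocess
import time

# Standard nvidia-smi --query-gpu properties and the (hypothetical) keys we store them under.
FIELDS = ["temperature.gpu", "power.draw", "utilization.gpu"]
KEYS = ["temp_c", "power_w", "util_pct"]


def parse_sample(line):
    """Parse one CSV line of nvidia-smi output; fields that are not numeric become None."""
    values = {}
    for key, raw in zip(KEYS, line.split(",")):
        try:
            values[key] = float(raw.strip())
        except ValueError:
            # e.g. "[N/A]" when the driver does not report the field
            values[key] = None
    return values


def read_sample():
    """Run nvidia-smi once and return the parsed reading for the first GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_sample(result.stdout.strip().splitlines()[0])


def peak_temperature(samples):
    """Return the highest temperature seen, or None if no sample had a reading."""
    temps = [s.get("temp_c") for s in samples if s.get("temp_c") is not None]
    return max(temps) if temps else None


def monitor(duration_s, interval_s):
    """Poll nvidia-smi every interval_s seconds for duration_s seconds and print each sample."""
    samples = []
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        sample = read_sample()
        samples.append(sample)
        print(sample)
        time.sleep(interval_s)
    print("peak temperature (C):", peak_temperature(samples))
    return samples
```

Calling something like `monitor(600, 5)` in a second terminal while the workload runs gives a ten-minute log of temperature, power draw, and utilization that can be compared between runs or shared with support.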
