Check your logs to confirm you're actually crashing from overheating and not from OOM. I repasted and put down new thermal pads (all Thermal Grizzly Kryonaut) and was still crashing. It turned out I was crashing from both heat and OOM, which made it hard to isolate one from the other. After the repaste, the remaining crashes were OOM, even though I was using vetted community recipes for my cluster. I changed my swap size and, yes (sadly), set some GPU clock limits, and I haven't crashed since. A moderator posted about it here.
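If it helps, here is a rough sketch of how you could sort kernel log lines into "OOM" versus "thermal" buckets to see which one you're hitting. The OOM keywords match messages the Linux OOM killer commonly prints. The thermal keywords are generic guesses on my part and may not match what your box actually logs, so treat the pattern lists, the function names, and the sample lines as assumptions to adjust.

```python
import re
from collections import Counter

# Assumed keyword lists. The OOM ones reflect common Linux OOM-killer messages;
# the thermal ones are generic guesses and may need tuning for a given device.
OOM_PATTERNS = [r"out of memory", r"oom-killer", r"oom_reaper"]
THERMAL_PATTERNS = [r"thermal", r"overheat", r"critical temperature", r"throttl"]


def classify_line(line):
    """Return 'oom', 'thermal', or None for a single log line."""
    text = line.lower()
    if any(re.search(p, text) for p in OOM_PATTERNS):
        return "oom"
    if any(re.search(p, text) for p in THERMAL_PATTERNS):
        return "thermal"
    return None


def summarize(lines):
    """Count how many lines fall into each bucket."""
    counts = Counter()
    for line in lines:
        label = classify_line(line)
        if label is not None:
            counts[label] += 1
    return counts


# Made-up sample lines for illustration only, not real output from any machine.
sample = [
    "kernel: Out of memory: Killed process 4242 (python3)",
    "kernel: python3 invoked oom-killer",
    "kernel: thermal thermal_zone0: critical temperature reached",
    "kernel: usb 1-1: new high-speed USB device",
]
print(dict(summarize(sample)))
```

To run it against a real crash, you could feed `summarize` the kernel log from the previous boot, for example the output of `journalctl -k -b -1` (this only works if your journal is persistent across reboots). A hard thermal power-off may leave nothing in the log at all, so an empty result doesn't rule out heat.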
fishnotphish