I have two Titan X (Pascal) cards used for machine learning, probabilistic fiber tracking, and other processes that run for days or weeks at a time. Every time I check nvidia-smi on that system, I notice GPU1 is much hotter than GPU0.
Tue Nov  8 11:01:15 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.57                 Driver Version: 367.57                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN X (Pascal)    Off  | 0000:03:00.0     Off |                  N/A |
| 39%   66C    P2    62W / 250W |   9112MiB / 12189MiB |     22%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN X (Pascal)    Off  | 0000:04:00.0      On |                  N/A |
| 59%   85C    P2   115W / 250W |   9249MiB / 12189MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
Given the comparable workloads, I don't really understand why this would be. We have another system running GTX 980s that, when running the same program, behaves as follows:
Tue Nov  8 10:55:31 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980     Off  | 0000:02:00.0     Off |                  N/A |
|  0%   31C    P0    39W / 180W |      0MiB /  4037MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 980     Off  | 0000:03:00.0     Off |                  N/A |
| 33%   56C    P2    66W / 180W |   1861MiB /  4037MiB |     42%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 980     Off  | 0000:83:00.0     Off |                  N/A |
| 33%   56C    P2    76W / 180W |   1866MiB /  4037MiB |      9%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 980     Off  | 0000:84:00.0     Off |                  N/A |
| 32%   54C    P2    72W / 180W |   1881MiB /  4037MiB |     43%      Default |
+-------------------------------+----------------------+----------------------+
Note the similar temps across the currently used GPUs. Is there something I should be concerned about with respect to the Titan X above running that much hotter than the other card?
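In case it helps anyone looking at the same question: rather than eyeballing nvidia-smi snapshots, the per-GPU fields can be pulled with nvidia-smi's CSV query mode and compared programmatically. Below is a minimal sketch of that idea; the field list and the helper names (`parse_gpu_csv`, `temperature_spread`, `sample_gpus`) are my own choices, not anything from the original post.

```python
import subprocess

# Fields queried from nvidia-smi; these are standard --query-gpu properties.
QUERY = "index,temperature.gpu,fan.speed,power.draw,utilization.gpu"


def parse_gpu_csv(csv_text):
    """Parse `nvidia-smi --query-gpu=... --format=csv,noheader,nounits`
    output into one dict per GPU, keyed by the queried field names."""
    fields = QUERY.split(",")
    rows = []
    for line in csv_text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        rows.append(dict(zip(fields, values)))
    return rows


def temperature_spread(rows):
    """Difference in degrees C between the hottest and coolest GPU."""
    temps = [int(r["temperature.gpu"]) for r in rows]
    return max(temps) - min(temps)


def sample_gpus():
    """Query the driver once. Requires nvidia-smi on the PATH."""
    out = subprocess.check_output(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_gpu_csv(out)
```

Calling `sample_gpus()` in a loop (or running `nvidia-smi -l 5` directly) lets you log temperature, fan speed, and power draw side by side over a long job, which makes it easier to see whether the gap tracks power draw or stays constant regardless of load.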
Thank you,
Keith