I have a problem with my second GPU, and Googling has brought me here. In short: my second water-cooled GTX 1080 is lost by Ubuntu (or at least nvidia-smi marks it as lost) after anywhere from a few seconds to a few minutes on the Ubuntu desktop.
What I have already done:
- reseated both cards
- replugged the PCIe power cables (I found that the bottom one was not plugged in all the way)
The card is not overheating, as nvidia-smi is reporting around 20 degrees Celsius. I was wondering if something else (like the memory or the power delivery phases) could be overheating.
I believe this is not a novel problem, but I have no idea what else to tell you, so please let me know if you need more information. I went through the nvidia-bug-report.log, but I could not find anything interesting on my own. Any help would be appreciated.
bug-report.gz (252.9 KB)
Edit: my machine did run for a while (several days or weeks) with the power cable of the second card not plugged in all the way. I did not notice when the second GPU failed, since I wasn't using it during this period.