I removed an old GT 720 long ago and have rebooted many times since, but it still shows up in nvidia-smi (see below):
How do I get this fixed?
| NVIDIA-SMI 410.48 Driver Version: 410.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GT 710 Off | 00000000:01:00.0 N/A | N/A |
| 50% 41C P8 N/A / N/A | 11MiB / 980MiB | N/A Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 1080 Off | 00000000:02:00.0 Off | N/A |
| 0% 41C P8 11W / 180W | 4405MiB / 8119MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 Not Supported |
| 1 3712 C /home/german/anaconda3/bin/python 4395MiB |
You said you’ve removed a GT 720, but what is showing is a GT 710.
Anyway I’m quite confident that if nvidia-smi reports those 2 GPUs, you actually have those 2 GPUs in your system.
You are correct, sorry about the confusion. There was an old third GPU that was removed, and that one is gone.
I am having the same problem. Here is the output from nvidia-smi:
david@dachshund:~$ nvidia-smi
Wed Oct 16 21:35:30 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26 Driver Version: 430.26 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GT 710 Off | 00000000:01:00.0 N/A | N/A |
| 40% 50C P0 N/A / N/A | 491MiB / 1993MiB | N/A Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce RTX 208… Off | 00000000:02:00.0 Off | N/A |
| 41% 40C P8 7W / 260W | 1MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GeForce RTX 208… Off | 00000000:04:00.0 Off | N/A |
| 40% 34C P8 17W / 260W | 1MiB / 11019MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 Not Supported |
+-----------------------------------------------------------------------------+
I removed the GT 710 some time ago. It is definitely NOT in the machine. The machine has been rebooted many times. I tried to reset nvidia-smi but I get:
david@dachshund:~$ nvidia-smi -r
GPU Reset couldn’t run because GPU 00000000:01:00.0 is the primary GPU.
This doesn’t happen all the time. Sometimes I see just the two RTX 2080 Ti cards and not the GT 710.
The GT 710 was originally used to drive the monitors, but since I removed it, one of the RTX 2080 Ti cards has been running the displays.
There are only 2 possible explanations:
- You have a GT 710 in that machine
- nvidia-smi has a bug in it
If you reboot a machine, there are no steps you can take, and no steps you should have to take, to get nvidia-smi to report the GPU complement correctly. Other factors could cause GPUs to disappear (e.g. power, overheating, etc.), but there are no factors that can cause a GPU to appear in nvidia-smi when it is actually not present, except a bug in nvidia-smi.
If it’s your opinion that nvidia-smi has a bug, then you can report it by following the sticky post instructions at the top of the CUDA programming sub-forum. I wouldn’t expect much traction on a bug, though. Most bugs can only be pursued if NVIDIA is able to reproduce the issue. I can assure you I have never heard of such a report, and I am doubtful that you would be able to provide instructions to reproduce the issue. You should be aware that you will likely be asked for exactly that if you file a bug.
Since the above is a questionable errand (in my view), you might want to power down that machine, crack the case, and take a peek. I’m sure you consider it unlikely, so feel free to ignore all this. Do as you wish.
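Before opening the case, one software-side cross-check is to compare the GPUs nvidia-smi lists against the NVIDIA devices the kernel sees on the PCI bus. The following is only a minimal Python sketch, assuming `nvidia-smi --query-gpu=index,name,pci.bus_id --format=csv,noheader` and `lspci -D` are both available on the machine. The function names are hypothetical helpers, and matching on the string "nvidia" in the lspci output is an assumption about how the devices are labelled.

```python
import subprocess


def normalize_bus_id(bus_id):
    """Reduce a PCI address to 'dddd:bb:dd.f' in lowercase.

    nvidia-smi prints an 8-digit domain (00000000:01:00.0) while
    lspci -D prints a 4-digit one (0000:01:00.0).
    """
    domain, rest = bus_id.strip().split(":", 1)
    return (domain[-4:] + ":" + rest).lower()


def parse_smi_csv(smi_text):
    """Parse 'index, name, pci.bus_id' CSV lines from nvidia-smi."""
    gpus = []
    for line in smi_text.splitlines():
        if not line.strip():
            continue
        index, rest = line.split(",", 1)
        name, bus_id = rest.rsplit(",", 1)
        gpus.append({
            "index": int(index),
            "name": name.strip(),
            "bus_id": normalize_bus_id(bus_id),
        })
    return gpus


def nvidia_pci_ids(lspci_text):
    """Collect PCI addresses of lines mentioning NVIDIA (assumed label)."""
    ids = set()
    for line in lspci_text.splitlines():
        if "nvidia" in line.lower():
            ids.add(normalize_bus_id(line.split()[0]))
    return ids


def not_on_pci_bus(smi_text, lspci_text):
    """Return GPUs that nvidia-smi lists but lspci does not."""
    on_bus = nvidia_pci_ids(lspci_text)
    return [g for g in parse_smi_csv(smi_text) if g["bus_id"] not in on_bus]


def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

To use it, something like `not_on_pci_bus(run(["nvidia-smi", "--query-gpu=index,name,pci.bus_id", "--format=csv,noheader"]), run(["lspci", "-D"]))` should return an empty list on a healthy system. If the GT 710 at 00000000:01:00.0 shows up in both tools, the card is almost certainly physically installed.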
I must have a bug in my head. I forgot that I replaced the card after I had taken it out.
I apologize for wasting your time.
Thanks