When I typed the command, I got this error:
Unable to determine the device handle for GPU 0000:02:00.0: Unknown Error was returned.
I then ran nvidia-debugdump --list. Here is the result:
Found 2 NVIDIA devices
Device ID: 0
Device name: NVIDIA TITAN X (Pascal) (*PrimaryCard)
GPU internal ID: 0324416077500
Detailed info is in the bug report:
nvidia-bug-report.log (2.2 MB)
I don’t know how to approach this problem, so I am asking for help.
OS: Linux version 4.15.0-142-generic
GPU: 2x NVIDIA TITAN X
nvidia-bug-report.log (3.5 MB)
Hello, I have the same problem. My NVIDIA A2000 is not working with what I believe is the latest driver (520.56.06). I am running Linux kernel 5.15.0-53 with generic headers…
On my side, nvidia-debugdump --list says the following:
~$ nvidia-debugdump --list
Found 1 NVIDIA devices
Error: nvmlDeviceGetHandleByIndex(): Not Found
FAILED to get details on GPU (0x0): Not Found
Also, I get this output from nvidia-smi -L:
~$ nvidia-smi -L
Unable to determine the device handle for gpu 0000:01:00.0: Not Found
Hello again, I kept searching the forums and I think you can check this thread: Nvidia-smi outputs “No devices were found” on Ubuntu 22.04 + driver 520 - #2 by generix
On my side, I switched the driver to a non-“open kernel” version and restarted my machine.
nvidia-smi works again and I can use tools such as
Hope this helps!
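In case it helps others check which driver flavor is actually loaded before and after switching, here is a minimal Python sketch. It assumes the loaded module reports itself in /proc/driver/nvidia/version on a line starting with "NVRM version:", and that the open kernel modules include the words "Open Kernel Module" on that line; both are assumptions about the driver's output format, so please verify on your own machine. The function names module_flavor and loaded_module_flavor are made up for this example.

```python
from pathlib import Path


def module_flavor(version_text):
    """Classify the text of /proc/driver/nvidia/version.

    Returns "open", "proprietary", or None when no NVRM line is found.
    ASSUMPTION: the open kernel modules put "Open Kernel Module" on the
    "NVRM version:" line; the proprietary module does not.
    """
    for line in version_text.splitlines():
        if line.startswith("NVRM version:"):
            return "open" if "Open Kernel Module" in line else "proprietary"
    return None


def loaded_module_flavor(path="/proc/driver/nvidia/version"):
    """Read the version file if it exists; None means no NVIDIA module is loaded."""
    version_file = Path(path)
    if not version_file.exists():
        return None
    return module_flavor(version_file.read_text())
```

Calling loaded_module_flavor() on the affected machine should tell you whether the driver swap took effect after the reboot.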
NVRM: Xid (PCI:0000:02:00): 79, pid=1160, GPU has fallen off the bus.
[15028104.848929] pcieport 0000:00:02.0: AER: Multiple Corrected error received: id=0010
[15028104.848952] pcieport 0000:00:02.0: can't find device of ID0010
[15028104.848955] pcieport 0000:00:02.0: AER: Corrected error received: id=0010
[15028104.848961] pcieport 0000:00:02.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0010(Receiver ID)
[15028104.848966] pcieport 0000:00:02.0: device [8086:2f04] error status/mask=00000040/00002000
[15028104.848972] pcieport 0000:00:02.0: [ 6] Bad TLP
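For anyone who needs to pull Xid events like the one above out of a long kernel log, here is a minimal Python sketch. The regular expression is modeled only on the single NVRM line quoted in this thread, so other driver versions may format the line differently. The names parse_xid and find_xids are hypothetical helpers, not part of any NVIDIA tool.

```python
import re

# Modeled on the line quoted above:
# "NVRM: Xid (PCI:0000:02:00): 79, pid=1160, GPU has fallen off the bus."
# ASSUMPTION: other Xid lines follow the same layout; the pid field is optional.
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+),"
    r"(?: pid=(?P<pid>[^,]+),)?\s*(?P<msg>.*)"
)


def parse_xid(line):
    """Return a dict describing one Xid line, or None if the line is not an Xid event."""
    match = XID_RE.search(line)
    if match is None:
        return None
    return {
        "pci": match.group("pci"),
        "xid": int(match.group("xid")),
        "pid": match.group("pid"),
        "message": match.group("msg").strip(),
    }


def find_xids(log_text):
    """Collect every Xid event found in a block of kernel log text."""
    events = []
    for line in log_text.splitlines():
        event = parse_xid(line)
        if event is not None:
            events.append(event)
    return events
```

You could feed it saved dmesg output, for example find_xids(open("dmesg.txt").read()), where dmesg.txt is a placeholder file name. The AER and pcieport lines are skipped because they do not match the pattern.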
Please reboot. If the GPU still doesn’t show up, it’s probably broken; check whether it works in another system.