Hello, as in many similar posts, I am getting the error in the title when running nvidia-smi.
However, my issue seems to be different from many of the others: I am not getting a PCIe bus error like this guy, for example.
I have attached my nvidia-bug-report.log.gz file, along with some further info, below.
johannes@itk-System-Product-Name:~$ nvidia-smi
Unable to determine the device handle for GPU0000:67:00.0: Unknown Error
johannes@itk-System-Product-Name:~$ nvidia-debugdump --list
Found 3 NVIDIA devices
Device ID: 0
Device name: NVIDIA GeForce RTX 3090 (*PrimaryCard)
GPU internal ID: GPU-4349ddf2-d56e-2429-ae92-0ba3ec2d2d3e
Error: nvmlDeviceGetHandleByIndex(): Unknown Error
FAILED to get details on GPU (0x1): Unknown Error
johannes@itk-System-Product-Name:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal
nvidia-bug-report.log.gz (1.7 MB)