Hello,
I am using an A100 GPU on an Ubuntu 20.04 LTS server, and I have installed NVIDIA driver 510.47.03 for it.
I followed this official document to install the driver:
NVIDIA HGX A100 Software User Guide
However, after I finished the installation, running “nvidia-smi” fails with “NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.”
In addition, the nvidia-fabricmanager service fails to start.
Ubuntu 20.04 LTS server
Tesla A100
510.47.03 driver
CUDA Toolkit 11.6
Bug report: nvidia-bug-report.log.gz (28.6 MB)
I don’t know why my bug report is so large. The decompressed file is about 800 MB.
Any help is greatly appreciated, thanks!