Kernel: 5.4.17-2136.335.4.el8uek.x86_64
Distro: Oracle Linux 8.10
Platform: AWS EC2, p2.8xlarge
NVIDIA hardware: GK210GL [Tesla K80]
What I have tried so far:
sudo yum -y install pciutils;
lspci | grep -i nvidia;
sudo yum -y install kernel-uek-devel;
sudo dnf -y module install nvidia-driver:latest-dkms;
sudo reboot;
After the reboot, running nvidia-smi failed for the first time with:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
I then tried:
sudo dnf -y module remove nvidia-driver:latest-dkms;
sudo yum -y remove kernel-uek-devel;
sudo yum -y update;
sudo yum -y install kernel-uek-devel;
sudo dnf -y module install nvidia-driver:latest-dkms;
sudo reboot;
The nvidia-smi failure still persisted after this second reboot.
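In case it helps with triage, here is a rough sketch of further checks that could narrow this down. I have not included their output here. It assumes the usual cause of this error, namely that the DKMS module was not built or not loaded for the running kernel, which may not be what is happening on this instance. The helper name nvidia_module_state is hypothetical and only wraps a grep over lsmod output.

```shell
#!/usr/bin/env bash
# Sketch only: nvidia_module_state is a hypothetical helper, not part of any NVIDIA tooling.

# Prints "loaded" if lsmod-style text on stdin lists the nvidia module, else "not loaded".
nvidia_module_state() {
    if grep -q '^nvidia[[:space:]]'; then
        echo "loaded"
    else
        echo "not loaded"
    fi
}

kernel="$(uname -r)"
echo "Running kernel: ${kernel}"

# DKMS can only build against headers that match the running kernel.
if command -v rpm >/dev/null 2>&1; then
    rpm -q "kernel-uek-devel-${kernel}" || true
fi

# Shows whether the nvidia module was built and installed for this kernel.
if command -v dkms >/dev/null 2>&1; then
    dkms status || true
fi

echo "nvidia module: $(lsmod 2>/dev/null | nvidia_module_state)"
```

If the kernel-uek-devel query reports that the package is not installed, or dkms status shows no nvidia entry for the running kernel, the module build itself would be the first thing to look at.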
Output of gcc --version:
gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22.0.1)
Output of dnf list installed \*nvidia\*:
dnf-plugin-nvidia.noarch 2.2-1.el8 @artifactory-nvidia
kmod-nvidia-latest-dkms.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
libnvidia-cfg.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
libnvidia-fbc.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
libnvidia-ml.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-driver.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-driver-cuda.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-driver-cuda-libs.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-driver-libs.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-kmod-common.noarch 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-libXNVCtrl.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-libXNVCtrl-devel.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-modprobe.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-persistenced.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-settings.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
nvidia-xconfig.x86_64 3:560.35.03-1.el8 @artifactory-nvidia
Here is the bug report file:
nvidia-bug-report.log.gz (58.1 KB)