Hi,
We are using multiple GPUs across our three machines:
Machine 1 - 4x GPU: NVIDIA Quadro RTX 6000 24GB
Machine 2 - 4x GPU: MSI RTX 2080 Ti Gaming X Trio 11GB GDDR6 352-bit Triple Fan
Machine 3 - 2x GPU: NVIDIA Tesla P100
We are using Ubuntu Server 20.04 LTS (Legacy version). We downloaded the graphics card drivers suggested by NVIDIA:
Machine 1 - 4x GPU: NVIDIA Quadro RTX 6000 24GB: https://us.download.nvidia.com/XFree86/Linux-x86_64/535.104.05/NVIDIA-Linux-x86_64-535.104.05.run
Machine 2 - 4x GPU: MSI RTX 2080 Ti Gaming X Trio 11GB GDDR6 352-bit Triple Fan: https://us.download.nvidia.com/XFree86/Linux-x86_64/535.104.05/NVIDIA-Linux-x86_64-535.104.05.run
Machine 3 - 2x GPU: NVIDIA Tesla P100: https://us.download.nvidia.com/tesla/515.105.01/NVIDIA-Linux-x86_64-515.105.01.run
After installing the driver, when we installed CUDA 11.7, the machine's OS got corrupted. If we skip the graphics driver and install CUDA directly, the machines restart themselves and do not show any graphics card. What should we do? We want to create a Kubernetes cluster.
Please help us.
Hello @manojkumarhr and welcome to the NVIDIA developer forums.
I highly recommend that you start again from fresh system installations and read the detailed installation instructions for CUDA. The recommended way is to install the GPU drivers as part of the CUDA toolkit installation, not before it. The documentation has a very detailed description of the process and, when followed exactly, will lead to a properly running CUDA installation.
If you want to change the GPU driver afterwards, you will find notes in that regard in the CUDA documentation as well.
And if that still fails for you, we do have a dedicated CUDA forum category exactly for the purpose of helping with setup problems.
I hope this helps!
Thanks!
@MarkusHoHo Thank you very much, I will be in your debt for my lifetime.
Thank you very much for replying to me. We are trying to install any driver that is compatible with the NVIDIA Tesla P100 and the NVIDIA Titan RTX on Ubuntu 20.04, but the latest drivers are not supported. Please help me. We have been stuck for the last few days and have not been able to get the logs since Tesla P100
@MarkusHoHo For your reference, I am attaching the system logs from installing the drivers on Ubuntu 20.04 for the Titan RTX graphics cards. Please help me figure out how I can install these drivers.
nvidia-bug-report.log.gz (217.0 KB)
This is in your log:
ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.
WARNING: One or more modprobe configuration files to disable Nouveau are already present at: /usr/lib/modprobe.d/nvidia-installer-disable-nouveau.conf, /etc/modprobe.d/nvidia-installer-disable-nouveau.conf. Please be sure you have rebooted your system since these files were written. If you have rebooted, then Nouveau may be enabled for other reasons, such as being included in the system initial ramdisk or in your X configuration file. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.
ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
Nouveau is the open-source GPU driver, which you should most definitely disable. Read the above text carefully and also check the Linux driver README for suggestions specific to the Nouveau driver.
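As a quick way to see where you stand before re-running the installer, here is a minimal Python sketch that checks whether the Nouveau kernel module is currently loaded and whether the two modprobe files named in the warning above exist. The helper names (module_loaded, report) are made up for illustration, and it assumes a Linux system where /proc/modules lists one loaded module per line with the module name as the first field.

```python
from pathlib import Path

# Paths copied from the installer warning quoted above.
BLACKLIST_FILES = [
    Path("/usr/lib/modprobe.d/nvidia-installer-disable-nouveau.conf"),
    Path("/etc/modprobe.d/nvidia-installer-disable-nouveau.conf"),
]


def module_loaded(proc_modules_text: str, name: str = "nouveau") -> bool:
    """Return True if `name` is the first field of any line in /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def report() -> None:
    """Print whether Nouveau is loaded and whether the blacklist files exist."""
    modules = Path("/proc/modules")
    # Fall back to an empty string on systems without /proc/modules.
    text = modules.read_text() if modules.exists() else ""
    print("nouveau loaded:", module_loaded(text))
    for path in BLACKLIST_FILES:
        print(f"{path}: {'present' if path.exists() else 'missing'}")


if __name__ == "__main__":
    report()
```

If this still reports Nouveau as loaded after a reboot even though the files are present, the warning's hint about the initial ramdisk is the likely place to look. On Ubuntu the initramfs is usually regenerated with `sudo update-initramfs -u` followed by another reboot, but please confirm that against the driver README for your setup.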