Hello NVIDIA Community,
I’m encountering some unusual behavior with my NVIDIA driver and CUDA installation on RHEL 8, and I’m hoping someone can provide some insight or guidance.
Current Setup
- Operating System: RHEL 8
- GPU: NVIDIA A40
Issue Description
1) I installed the recommended NVIDIA driver after selecting my product and OS on the Driver Details | NVIDIA page. After the installation succeeded, I ran nvidia-smi and got:
NVIDIA-SMI 535.183.06 Driver Version: 560.28.03 CUDA Version: 12.2
I’m confused by the discrepancy between the NVIDIA-SMI version (535.183.06) and the Driver Version (560.28.03). Do the NVIDIA-SMI version and the Driver Version have to be the same?
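To make the comparison concrete, here is a minimal shell sketch that pulls both version strings out of the nvidia-smi header line and reports whether they agree. The function name compare_smi_header is hypothetical (it is not part of any NVIDIA tool), and the parsing assumes the header layout shown above.

```shell
#!/bin/sh
# Read nvidia-smi output on stdin, extract the NVIDIA-SMI and Driver versions
# from the header line, and report whether they match.
# compare_smi_header is a hypothetical helper name used for illustration.
compare_smi_header() {
    # First line containing the banner; fail if there is none.
    csh_header=$(grep -m1 'NVIDIA-SMI') || { echo "no NVIDIA-SMI header found"; return 2; }
    csh_smi=$(printf '%s\n' "$csh_header" | sed -E 's/.*NVIDIA-SMI +([0-9.]+).*/\1/')
    csh_drv=$(printf '%s\n' "$csh_header" | sed -E 's/.*Driver Version: +([0-9.]+).*/\1/')
    echo "nvidia-smi binary: $csh_smi"
    echo "driver reported:   $csh_drv"
    if [ "$csh_smi" = "$csh_drv" ]; then
        echo "match"
    else
        echo "MISMATCH"
        return 1
    fi
}

# Usage on a machine with the driver installed:
#   nvidia-smi | compare_smi_header
```

On the output above this would report 535.183.06 for the nvidia-smi binary and 560.28.03 for the driver, i.e. a mismatch.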
2) Since the driver version was reported as 560, I attempted to install CUDA 12.6 using the following commands from the NVIDIA website:
wget https://developer.download.nvidia.com/compute/cuda/12.6.0/local_installers/cuda-repo-rhel8-12-6-local-12.6.0_560.28.03-1.x86_64.rpm
sudo rpm -i cuda-repo-rhel8-12-6-local-12.6.0_560.28.03-1.x86_64.rpm
sudo dnf clean all
sudo dnf -y install cuda-toolkit-12-6
The installation was successful, but I then got the "unsupported display driver / cuda driver combination" error when I tried to run a simple CUDA test application. I assumed the driver was actually 535 and installed CUDA 12.2 instead. That installation also succeeded, but I got the same error again.
3) When I try to compile and run a CUDA program, I get an error suggesting an incompatibility:
unsupported display driver / cuda driver combination
4) Questions
Is the discrepancy between NVIDIA-SMI version and Driver Version normal? If not, what might be causing this?
Could this discrepancy be the reason for the CUDA compilation error?
How can I resolve this issue and successfully compile and run CUDA programs?
Is there a way to ensure that the NVIDIA-SMI version and Driver Version are the same?
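One further check that might help narrow this down, offered as a sketch: the loaded kernel module reports its own version in /proc/driver/nvidia/version, which can be compared against what nvidia-smi prints. The helper name kernel_module_version is hypothetical, and the parsing assumes the usual "NVRM version: ... Kernel Module  <version>  <build date>" line layout.

```shell
#!/bin/sh
# Extract the kernel module version from text in the format of
# /proc/driver/nvidia/version, read from stdin.
# kernel_module_version is a hypothetical helper name used for illustration.
kernel_module_version() {
    # Take the NVRM line, then the first dotted number on it (the version).
    grep -m1 'NVRM version' | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n1
}

# Usage on a machine with the driver loaded:
#   kernel_module_version < /proc/driver/nvidia/version
#   rpm -qa | grep -i -E 'nvidia|cuda' | sort   # list installed driver and CUDA packages
```

If the kernel module version and the installed user-space packages differ, that would suggest leftovers from an earlier driver install rather than a problem with the CUDA toolkit itself.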
Snapshots:
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0
$ nvidia-smi
Wed Aug 21 12:43:23 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.06             Driver Version: 560.28.03    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A40                     Off | 00000000:17:00.0 Off |                    0 |
|  0%   33C    P0              80W / 300W |      0MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A40                     Off | 00000000:65:00.0 Off |                    0 |
|  0%   33C    P0              78W / 300W |      0MiB / 46068MiB |      0%      Default |
|                                         |                      |                  N/A |
Error message when running a simple CUDA test program:
$ ./cuda_check
Failed to launch kernel (error code system has unsupported display driver / cuda driver combination)!