Hi,
I’m trying to better understand the CUDA Enhanced Compatibility introduced in CUDA 11.1:
CUDA has relaxed the minimum driver version check and thus no longer requires a driver upgrade with minor releases of the CUDA Toolkit.
I have driver 455.45.01 installed on the host. According to CUDA Enhanced Compatibility, this driver version is compatible with any CUDA 11.x version.
Indeed, the CUDA 11.2 Docker image runs and nvidia-smi reports the expected versions:
$ docker run --gpus all --entrypoint nvidia-smi nvidia/cuda:11.2.2-devel-ubuntu20.04
Tue Jun 1 15:52:37 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.45.01    Driver Version: 455.45.01    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 106...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   48C    P5    16W / 140W |      0MiB /  6076MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
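As a sanity check on the driver claim, here is a minimal sketch that compares the installed driver version against the minimum I believe the CUDA compatibility documentation lists for the 11.x family on Linux (450.80.02; please correct me if that number is wrong). The helper name driver_at_least is made up for illustration, and it relies on GNU sort -V:

```shell
# Hypothetical helper: succeeds if version $1 >= version $2 (needs GNU sort -V).
driver_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Assumption: 450.80.02 is the minimum Linux driver for CUDA 11.x
# minor-version compatibility.
if driver_at_least 455.45.01 450.80.02; then
    echo "driver is new enough for CUDA 11.x"
else
    echo "driver is too old for CUDA 11.x"
fi
```

With my driver this takes the first branch, so the driver version alone should not be the problem.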
However, when I try to run the deviceQuery sample executable as shown in example 3.1, it fails to detect the GPU. Here’s how to reproduce:
- Run the CUDA 11.2 development image:
docker run --gpus all -it nvidia/cuda:11.2.2-devel-ubuntu20.04
- Download, build, and run deviceQuery:
apt-get update
apt-get install -y wget
wget https://github.com/NVIDIA/cuda-samples/archive/refs/tags/v11.2.tar.gz
tar xf v11.2.tar.gz
cd cuda-samples-11.2/Samples/deviceQuery/
make
./deviceQuery
- The executable does not detect the GPU:
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 100
-> no CUDA-capable device is detected
Result = FAIL
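For reference, here is a small sketch that decodes the numeric code printed above. The helper name cuda_error_name is hypothetical, and the values are the cudaError_t constants as I understand them from the CUDA 11.x runtime headers, so treat the table as an assumption rather than an authoritative list:

```shell
# Hypothetical helper: map a few cudaError_t values to their enum names.
# Values are assumed from the CUDA 11.x runtime headers.
cuda_error_name() {
    case "$1" in
        0)   echo cudaSuccess ;;
        35)  echo cudaErrorInsufficientDriver ;;
        100) echo cudaErrorNoDevice ;;
        803) echo cudaErrorSystemDriverMismatch ;;
        804) echo cudaErrorCompatNotSupportedOnDevice ;;
        *)   echo "unknown ($1)" ;;
    esac
}

cuda_error_name 100
```

If that mapping is right, the runtime is reporting that no device is visible (100) rather than that the driver is too old (35), which is part of what confuses me.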
So to summarize: I built a binary with CUDA 11.2 and ran it with driver 455.45.01, but the GPU is not detected. According to CUDA Enhanced Compatibility, this configuration should work.
Any idea what is wrong here?
Thanks!