Nvidia-container-cli reporting wrong CUDA version

Hi

I think I’ve installed everything correctly: the CUDA samples run (well, the BlackScholes one does, anyway), and I’m able to run containers like this one:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

Running nvidia-smi.exe shows:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.20       Driver Version: 460.20       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
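For comparison, I was also going to check what a container reports from the inside with something like the command below (assuming the nvidia/cuda:11.0-base image tag, and assuming nvidia-smi actually works inside a container under WSL2 — I haven’t confirmed either):

docker run --gpus all nvidia/cuda:11.0-base nvidia-smi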

However, nvidia-container-cli info shows:

NVRM version:   460.0
CUDA version:   11.0

Device Index:   0
Device Minor:   0
Model:          UNKNOWN
Brand:          UNKNOWN
GPU UUID:       GPU-00000000-0000-0000-0000-000000000000
Bus Location:   0
Architecture:   UNKNOWN
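In case it helps with diagnosis, I can also dump the libnvidia-container version itself with the command below; I’m only guessing that a mismatch between the library and the driver could explain the UNKNOWN fields:

nvidia-container-cli --version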

And indeed, when I try to run one of our internal containers, which is based on NVIDIA’s CUDA 11.2 base image, I get:

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.1, please update your driver to a newer version, or use an earlier cuda container\\\\n\\\"\"": unknown.
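As a temporary experiment (not a proper fix), I understand the NVIDIA container runtime honours an NVIDIA_DISABLE_REQUIRE environment variable that skips the cuda>= requirement check, so something like the following might at least confirm the version check is the only thing failing (internal-cuda-app is a placeholder for our image name):

docker run --gpus all -e NVIDIA_DISABLE_REQUIRE=1 internal-cuda-app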

Why is nvidia-container-cli reporting a different CUDA version from nvidia-smi.exe? And am I right in assuming that’s the core issue here?

I’m running Windows 10 build 20251.1.

Thanks!
Dan
