Unable to start CUDA container with recent update on November 10

Unable to start a container after Nvidia’s recent update of images on November 10.

For example:
Command:

```
docker run -it mcr.microsoft.com/mirror/nvcr/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04 bash
```

Output:

```
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.8, please update your driver to a newer version, or use an earlier cuda container: unknown.
ERRO[0000] error waiting for container:
```

`nvidia-smi` output on the host:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.108.03    Driver Version: 510.108.03    CUDA Version: 11.6   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce …    Off  | 00000000:17:00.0 Off |                  N/A |
|  0%   35C    P8     5W / 151W |      6MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce …    Off  | 00000000:73:00.0 Off |                  N/A |
|  0%   37C    P8     5W / 151W |     53MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
```


I have the exact same issue. +1 from me, and I'm linking to my question from today.


As a temporary workaround, I was able to use one of my older images, based on the previous nvidia/cuda builds, as my new base image. I could then install whatever new requirements or code into the working directories and build a new image.
This works as long as I only need minimal changes to the underlying image (a fair assumption for a lot of use cases), but it's fragile long-term.
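For example, pulling an image whose CUDA requirement the 510 driver still satisfies (the `11.6.2` tag here is just an illustration; substitute whichever older tag your existing image was built from):

```shell
# Run an older CUDA image (cuda>=11.6) that the 510 driver satisfies,
# instead of the 11.8.0 image that triggers the requirement error.
# The exact tag is an assumption; use the tag your previous image was based on.
docker run -it \
  mcr.microsoft.com/mirror/nvcr/nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04 bash
```

From inside that container (or a Dockerfile `FROM` it) you can install the newer requirements and commit/build a new image.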

Just passing along the tip.


I solved this issue by adding `NVIDIA_DISABLE_REQUIRE=true` to the container's environment variables.
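As a sketch, using the image from the original post (passing the variable via `-e` is one way to set it):

```shell
# NVIDIA_DISABLE_REQUIRE=true tells nvidia-container-cli to skip its
# CUDA requirement check, so the container starts even though the host
# driver (510, CUDA 11.6) is older than what the image declares (cuda>=11.8).
docker run -it -e NVIDIA_DISABLE_REQUIRE=true \
  mcr.microsoft.com/mirror/nvcr/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04 bash
```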

I read on another forum that this method will let you start the container, but that you may run into other issues while running it.

To solve this properly, I updated the NVIDIA driver from 510 to 525, which resolved the issue locally. My only concern is that this is not an easy fix to roll out everywhere, especially on the compute infrastructure I am using.