That all seemed to go okay, but I now need an updated nvidia-container-toolkit to support the later CUDA version. The nvidia-container-toolkit version currently installed is 1.11.0~rc.1-1 and I’d like 1.16.1-1, which apt-cache policy lists as the candidate.
However, if I run sudo apt install nvidia-container-toolkit I see:
The following additional packages will be installed:
libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit-base
The following packages will be REMOVED:
nvidia-container nvidia-jetpack nvidia-jetpack-dev nvidia-jetpack-runtime
The following NEW packages will be installed:
nvidia-container-toolkit-base
The following packages will be upgraded:
libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit
3 upgraded, 1 newly installed, 4 to remove and 2 not upgraded.
nvidia-jetpack appears to depend on most of the NVIDIA software, so removing it seems like a bad idea.
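For anyone wanting to reproduce this without touching the system: apt-get has a simulate flag (-s) that prints the planned actions as Inst/Conf/Remv lines and changes nothing. The helper below is only a sketch; list_removals is a name made up for this post, and it just filters the "Remv" lines out of that simulated output.

```shell
# Hypothetical helper: read `apt-get -s` output on stdin and print
# the names of the packages that would be removed ("Remv" lines).
list_removals() {
  awk '$1 == "Remv" { print $2 }'
}
```

It could be used as `apt-get -s install nvidia-container-toolkit=1.16.1-1 | list_removals`, and `apt-cache depends nvidia-jetpack` should show which packages the meta-package pulls in.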
I’m trying to run an Ubuntu 22.04 Docker image on my JetPack 5.1.2 system. I built the image on top of nvidia/cuda:12.2.2-devel-ubuntu22.04 (although I could use any CUDA version from 11.7 onward, which is the earliest I could find).
When I run the image with the command:
docker run -it --runtime nvidia --gpus all <image>
I get the error:
docker: Error response from daemon: failed to create task for container:
failed to create shim task: OCI runtime create failed:
nvidia-container-runtime did not terminate successfully: exit status 1:
time="2024-08-27T12:01:33Z" level=error
msg="failed to create NVIDIA Container Runtime: failed to construct OCI spec modifier:
requirements not met: unsatisfied condition: cuda>=12.2 (cuda=11.4)"
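As far as I understand, the "cuda>=12.2" condition is not computed by the runtime itself but read from the NVIDIA_REQUIRE_CUDA environment variable that the nvidia/cuda images set, and then compared against the CUDA version reported on the host. A rough way to see what a locally pulled image asks for is sketched below; the two function names are made up for this post, and only the standard docker image inspect command is assumed.

```shell
# Hypothetical helper: keep only the NVIDIA_REQUIRE_* variables from
# KEY=VALUE lines on stdin (no output, and no error, if there are none).
filter_nvidia_requirements() {
  grep '^NVIDIA_REQUIRE_' || true
}

# Hypothetical helper: print the requirement variables baked into a
# locally available image, e.g. nvidia/cuda:12.2.2-devel-ubuntu22.04.
show_image_requirements() {
  docker image inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$1" \
    | filter_nvidia_requirements
}
```

Running `show_image_requirements nvidia/cuda:12.2.2-devel-ubuntu22.04` should list the requirement string that the runtime is rejecting.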
I noticed that the release notes for NVIDIA Container Runtime mentioned updating the base CUDA image, which is why I thought that might be the solution.
$ docker run -it --runtime nvidia --gpus all nvidia/cuda:11.7.1-devel-ubuntu22.04
docker: Error response from daemon: failed to create task for container:
failed to create shim task: OCI runtime create
failed: nvidia-container-runtime did not terminate successfully:
exit status 1: time="2024-08-27T23:57:51Z" level=error
msg="failed to create NVIDIA Container Runtime: failed to construct OCI spec modifier:
requirements not met: unsatisfied condition: cuda>=11.7 (cuda=11.4)"
: unknown.