Please install nvidia-docker2 and the NVIDIA driver:
$ sudo apt-get install -y nvidia-docker2
$ sudo apt install nvidia-driver-525
Hi,
thank you for your answer.
Both of them I had already installed, nvidia-docker2 version is 2.13.0-1. Driver version is 525.125.06.
Can you find the lib?
$ sudo find / -name libnvidia-ml.so.1
Please add --runtime=nvidia to the docker run command.
Yes, it can be found. Here is the output:
/usr/lib/i386-linux-gnu/libnvidia-ml.so.1
/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
find: ‘/run/user/1000/doc’: Permission denied
find: ‘/run/user/1000/gvfs’: Permission denied
/var/snap/docker/common/var-lib-docker/overlay2/20cdacd0d96be0cd178f108fd419b3e05a943f4956de496986539a41e57d2cf3/diff/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
When I add --runtime=nvidia, there is also an error:
docker: Error response from daemon: Unknown runtime specified nvidia.
See ‘docker run --help’.
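An "Unknown runtime specified nvidia" error generally means the Docker daemon has no runtime named nvidia registered. A minimal sketch for checking this follows. The helper name has_nvidia_runtime is hypothetical, and both file paths are assumptions: /etc/docker/daemon.json is the usual location for a package-installed Docker, while a snap-installed Docker is assumed to keep its own daemon.json under /var/snap/docker.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given daemon.json mentions an "nvidia" entry.
# This is a plain text match, not a full JSON parse.
has_nvidia_runtime() {
    config_file="$1"
    [ -r "$config_file" ] || return 1
    grep -q '"nvidia"' "$config_file"
}

# Both paths are assumptions; adjust them to match how Docker was installed.
for f in /etc/docker/daemon.json /var/snap/docker/current/config/daemon.json; do
    if has_nvidia_runtime "$f"; then
        echo "nvidia runtime configured in $f"
    else
        echo "nvidia runtime not found in $f"
    fi
done
```

If neither file mentions the runtime, the daemon that is actually running probably never had it registered, which may explain why reinstalling nvidia-docker2 alone did not change the error.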
Could you please double-check whether nvidia-docker is installed on the machine?
Please use the commands mentioned in Error while running action recognition net - #9 by Morganh and retry.
Hi Morganh,
I’ve retried these commands. They had actually been run before, and the issue still exists.
Can you try the commands below and share the results?
$ docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
and
$ sudo docker run --rm --runtime=nvidia --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
The results are the same as before.
docker: Error response from daemon: Unknown runtime specified nvidia.
See ‘docker run --help’.
Please try the following:
sudo apt install -y nvidia-docker2
sudo systemctl daemon-reload
sudo systemctl restart docker
It still doesn’t work. Whenever I run it with --runtime=nvidia, I get this error:
docker: Error response from daemon: Unknown runtime specified nvidia.
See ‘docker run --help’.
Without --runtime, but with --gpus all, I get this:
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as ‘legacy’
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
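The find output earlier in the thread lists a copy of the library under /var/snap/docker, which suggests (but does not prove) that Docker on this machine was installed as a snap. A snap-confined Docker may be unable to see host libraries such as libnvidia-ml.so.1 or the runtime registered in /etc/docker/daemon.json; that is an assumption here, not something confirmed in this thread. A minimal sketch for checking where the docker binary comes from, with is_snap_path as a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given path lives under a snap directory.
is_snap_path() {
    case "$1" in
        /snap/*|/var/snap/*|/var/lib/snapd/snap/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Locate the docker binary on PATH, if there is one.
docker_path=$(command -v docker 2>/dev/null || true)
if [ -z "$docker_path" ]; then
    echo "docker not found on PATH"
elif is_snap_path "$docker_path"; then
    echo "docker appears to be a snap install: $docker_path"
else
    echo "docker does not look like a snap install: $docker_path"
fi
```

If the binary does resolve to a snap path, replacing it with a package-installed Docker before reinstalling nvidia-docker2 may be worth trying.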
Can you share /etc/nvidia-container-runtime/config.toml ?
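One way to share that file compactly is to print only its active settings. This is a minimal sketch; show_active_config is a hypothetical helper, and it assumes comments in the file start with '#'.

```shell
#!/bin/sh
# Hypothetical helper: print a config file without blank lines or '#' comment lines.
show_active_config() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Path taken from the question above; '|| true' keeps the script going if the file is absent.
show_active_config /etc/nvidia-container-runtime/config.toml || true
```

Commented-out entries are dropped from the output, so only the settings that are actually in effect are shown.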
Please try to reinstall nvidia-driver.
Uninstall:
sudo apt purge nvidia-driver-525
sudo apt autoremove
sudo apt autoclean
Install:
sudo apt install nvidia-driver-525
There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.
Please try the steps in New computer install GPU Docker error - #6 by david9xqqb, especially sudo systemctl restart docker.service.