Can I upgrade/change the NVIDIA driver in a Docker container using a Dockerfile?

I am new to NVIDIA driver installation in Docker containers. I pulled the nvcr.io/nvidia/deepstream:5.0.1-20.09-devel image and tried to install NVIDIA-Linux-x86_64-450.66.run on top of it.

  • Is it possible to upgrade/change NVIDIA drivers in a container?
  • When I tried to do this, I got the following error:

ERROR: Unable to find the module utility modprobe; please make sure you have the package 'module-init-tools' or 'kmod' installed. If you do have 'module-init-tools' or 'kmod' installed, then please check that modprobe is in your PATH.

Then I installed module-init-tools by adding RUN apt-get install module-init-tools to the Dockerfile.

After that I got the following error:

ERROR: An NVIDIA kernel module 'nvidia-uvm' appears to already be loaded in your kernel…

I installed kmod through the Dockerfile. I also ran the NVIDIA driver installer with the following flags:

CMD ["./NVIDIA-Linux-x86_64-450.66.run", "--no-opengl-files -a -s --no-kernel-module"]

Now I get new errors:

WARNING: You specified the '--no-kernel-module' command line option, nvidia-installer will not install a kernel module as part of this driver installation and it will not remove existing NVIDIA kernel modules not part of an earlier NVIDIA driver installation. Please ensure that an NVIDIA kernel module matching this driver version is installed separately.

WARNING: nvidia-installer was forced to guess the X library path '/usr/lib' and X module path '/usr/lib/xorg/modules'; these paths were not queryable from the system. If X fails to find the NVIDIA X driver module, please install the pkg-config utility and the X.Org SDK/development package for your distribution and reinstall the driver.

ERROR: Unable to delete '/usr/lib/x86_64-linux-gnu/libnvcuvid.so.450.66' (Device or resource busy).

ERROR: Unable to backup file '/usr/lib/x86_64-linux-gnu/libnvcuvid.so.450.66'.

ERROR: Unable to delete '/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.450.66' (Device or resource busy).

ERROR: Unable to backup file '/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.450.66'.

ERROR: Unable to delete '/usr/lib/x86_64-linux-gnu/libcuda.so.450.66' (Device or resource busy).

ERROR: Unable to backup file '/usr/lib/x86_64-linux-gnu/libcuda.so.450.66'.

ERROR: Unable to delete '/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.450.66' (Device or resource busy).

ERROR: Unable to backup file '/usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.450.66'.

ERROR: Unable to delete '/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.450.66' (Device or resource busy).

ERROR: Unable to backup file '/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.450.66'.

I used the following docker run command:

sudo docker run --gpus all -it --rm --privileged --cap-add=ALL -v /dev:/dev -v /var/run/docker.sock:/var/run/docker.sock -v /lib/modules:/lib/modules:ro -v /usr:/host/usr:ro -v /tmp/.X11-unix:/tmp/.X11-unix --network host -p 9000-9999:9000-9999 -e DISPLAY=$DISPLAY --name caml-nvidia_v1-container_v16 caml-nvidia_v4

Hi @ravip.777,
Apologies for the delay.
Please allow me some time to check on this.
Thanks!

Hi @ravip.777,
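No, NVIDIA drivers should not be installed inside containers (this applies generally, not just to these containers). [It also shouldn't usually be necessary to run in --privileged mode, and mounting in the docker.sock is also unusual unless you're attempting certain docker-in-docker setups.] There appears to be a fundamental misunderstanding of how this is supposed to work. The driver runs on the host, outside the container, and Docker mounts the necessary driver components from the host system into the container. That is most likely why the installer reports "Device or resource busy" for libraries such as libcuda.so.450.66: those files are mounted in from the host and cannot be replaced from inside the container. If you need a different driver version, upgrade the driver on the host.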
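
As a rough illustration of this, a container can simply read the driver version that the host exposes instead of installing its own. The sketch below is only one possible way to check it: driver_major, driver_at_least and check_host_driver are hypothetical helper names, and nvidia-smi is queried only if it is present (it is normally made available when the container is started with --gpus all).

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of any NVIDIA tool.

# Print the major part of a driver version string, e.g. "450.66" -> "450".
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed if driver version $1 has a major version >= $2.
# Returns 2 if the major version is empty or not numeric.
driver_at_least() {
    major=$(driver_major "$1")
    case "$major" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$major" -ge "$2" ]
}

# Read the driver version exposed by the host and compare it with a
# required major version ($1). Nothing is installed or modified here.
check_host_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; was the container started with --gpus all?" >&2
        return 2
    fi
    version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if driver_at_least "$version" "$1"; then
        echo "host driver $version satisfies the requirement (major >= $1)"
    else
        echo "host driver $version does not satisfy major >= $1; upgrade the driver on the host" >&2
        return 1
    fi
}
```

For example, calling check_host_driver 450 inside a container started with --gpus all would report whether the host driver is at least version 450. If it is not, the fix belongs on the host, not in the Dockerfile.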

Thanks!