Hi, I’m new to the forum. I started using Docker a few months ago and I’m working on my graduation thesis.
I know that there are many topics like this on the forum. I’ve already read all of them, but they didn’t solve my problem.
I’m currently running Ubuntu 20.04 in dual boot with Windows 10. I’ve already disabled Secure Boot. Windows runs on disk C, the primary, and Ubuntu runs on disk D, so they are separate.
I believe my computer meets all the requirements to run CUDA. I checked them against the Installation Guide Linux :: CUDA Toolkit Documentation.
There is something weird. I can install the driver and switch to my Nvidia GPU (sudo prime-select nvidia) directly from the Ubuntu terminal, reboot, and run the command “nvidia-smi”. It works without problems.
But when I build my Docker image using the same commands inside the Dockerfile, I receive this error:
Failed to initialize NVML: Unknown Error
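To make the host and container results easier to compare, the same check can be wrapped in a small helper. This is only a sketch: check_nvml is a hypothetical name, and my-image in the comments is a placeholder for whatever tag the image is built with.

```shell
# Run the given command and report whether NVML could be initialized.
# Hypothetical helper, not part of any NVIDIA or Docker tooling.
check_nvml() {
  output=$("$@" 2>&1)
  status=$?
  if printf '%s\n' "$output" | grep -q 'Failed to initialize NVML'; then
    echo "NVML error: $*"
    return 1
  elif [ "$status" -ne 0 ]; then
    echo "command failed (exit $status): $*"
    return "$status"
  fi
  echo "OK: $*"
}

# Example calls (my-image is a placeholder tag):
#   check_nvml nvidia-smi
#   check_nvml docker run --rm my-image nvidia-smi
```

The helper only looks at the command's output and exit status, so it reports the same way on the host and inside a container.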
SPECS OF MY SYSTEM
$ uname -m && cat /etc/*release
x86_64
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.1 LTS"
NAME="Ubuntu"
VERSION="20.04.1 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.1 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="Data privacy | Ubuntu"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
Kernel version: $uname -r
Linux 5.4.0-48-generic x86_64
Docker version: $docker -v
Docker version 19.03.13
GPUs on the computer (integrated and dedicated):
$sudo lshw -c video
*-display
description: VGA compatible controller
product: GP107M [GeForce GTX 1050 Ti Mobile]
vendor: NVIDIA Corporation
physical id: 0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller bus_master cap_list rom
configuration: driver=nouveau latency=0
*-display
description: VGA compatible controller
product: UHD Graphics 630 (Mobile)
vendor: Intel Corporation
physical id: 2
version: 00
width: 64 bits
clock: 33MHz
capabilities: pciexpress msi pm vga_controller bus_master cap_list rom
configuration: driver=i915 latency=0
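To see at a glance which kernel driver each adapter is bound to, the lshw output above can be reduced to one line per device. This is only a sketch: video_drivers is a hypothetical helper name, and it assumes the product: and configuration: lines appear in the order shown above.

```shell
# Read `lshw -c video` output on stdin and print "<product> -> <driver>"
# for each display adapter. Hypothetical helper for illustration.
video_drivers() {
  awk '
    /product:/ {
      sub(/^.*product: */, "")   # keep only the product name
      product = $0
    }
    /configuration:/ {
      driver = "unknown"
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^driver=/) { driver = substr($i, 8) }  # strip "driver="
      }
      print product " -> " driver
    }
  '
}

# Example call:
#   sudo lshw -c video | video_drivers
```

For the output listed above, this would print the GTX 1050 Ti with nouveau and the UHD Graphics 630 with i915.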
Nvidia GPU: GeForce GTX 1050 Ti
$lspci | grep -i nvidia
01:00.0 VGA compatible controller: NVIDIA Corporation GP107M [GeForce GTX 1050 Ti Mobile] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev ff)
Recommended Nvidia Driver:
nvidia-driver-450
$dpkg --get-selections | egrep "nvidia|bbswitch"
libnvidia-cfg1-450:amd64 install
libnvidia-common-450 install
libnvidia-compute-450:amd64 install
libnvidia-decode-450:amd64 install
libnvidia-encode-450:amd64 install
libnvidia-extra-450:amd64 install
libnvidia-fbc1-450:amd64 install
libnvidia-gl-450:amd64 install
libnvidia-ifr1-450:amd64 install
nvidia-compute-utils-450 install
nvidia-dkms-450 install
nvidia-driver-450 install
nvidia-kernel-common-450 install
nvidia-kernel-source-450 install
nvidia-prime install
nvidia-settings install
nvidia-utils-450 install
xserver-xorg-video-nvidia-450 install
GCC version: $gcc -v
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
GLIBC version: $ldd --version
ldd (Ubuntu GLIBC 2.31-0ubuntu9.1) 2.31
Here is my Dockerfile:
#OpenCV with CUDA Acceleration Test | by Mikkel Wilson | Medium
FROM nvidia/cuda:11.0-devel-ubuntu20.04
RUN apt-get update
#I’ve added my user to the docker group, so “sudo” should be unnecessary, but to be safe I’ve used it anyway
RUN apt-get install -y sudo unzip nano git wget coreutils
RUN sudo apt-get update
#Verify the System has the Correct Kernel Headers and Development Packages Installed
#The kernel headers and development packages for the currently running kernel can be installed with:
RUN sudo apt-get install -y linux-headers-$(uname -r)
#---------------------------- AVOID TZDATA & KEYBOARD CONFIG ------------------------------------------
#Avoiding user interaction with tzdata
#for apt to be noninteractive
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y keyboard-configuration
ENV TZ=Europe/Minsk
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
RUN apt-get update
#------------------------------------------------------------------------------------------------
#------------------------------ SET-UP Driver GPU ------------------------------------------
#lspci needs pciutils - to check Linux system hardware information GPU
RUN sudo apt-get install -y pciutils
#use: lspci -k | grep -A 2 -i "VGA" or lspci | grep VGA or lspci -vnnn | perl -lne 'print if /^\d+:.+([\S+:\S+])/' | grep VGA
#How To Switch Between Intel and Nvidia Graphics Card on Ubuntu
RUN sudo apt-get update
#Install nvidia-smi and Nvidia driver for my GPU: Nvidia GeForce GTX 1050 Ti
RUN sudo apt-get install -y ubuntu-drivers-common
#All compatible drivers for my GPU
RUN sudo ubuntu-drivers devices
RUN sudo ubuntu-drivers autoinstall
#Or I could only install the 450 because it’s the one recommended.
#RUN sudo apt-get install -y nvidia-driver-450
#Now I switch to my Nvidia GPU
RUN sudo prime-select nvidia
#If the Nvidia GPU was selected, I can see the result with the following command
RUN prime-select query
#-------------------------------------------------------------------------------------------------------
#--------------------------------------- CUDA Installation ------------------------------------------------------
#RUN apt-get install nvidia-container-toolkit #DOES NOT WORK!
#------------- CUDA Toolkit 11.1 - Installer for Linux Ubuntu 20.04 x86_64
#CUDA Toolkit 11.7 Update 1 Downloads | NVIDIA Developer
RUN sudo apt-get update
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
RUN sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
RUN wget https://developer.download.nvidia.com/compute/cuda/11.1.0/local_installers/cuda-repo-ubuntu2004-11-1-local_11.1.0-455.23.05-1_amd64.deb
#dpkg → is the default package manager on Ubuntu. You can use it to install, configure, update or remove packages.
RUN sudo dpkg -i cuda-repo-ubuntu2004-11-1-local_11.1.0-455.23.05-1_amd64.deb
RUN sudo apt-key add /var/cuda-repo-ubuntu2004-11-1-local/7fa2af80.pub
RUN sudo apt-get update
#---------------------------- AVOID TZDATA & KEYBOARD CONFIG ------------------------------------------
#Avoiding user interaction with tzdata
#for apt to be noninteractive
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y keyboard-configuration
ENV TZ=Europe/Minsk
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
RUN apt-get update
#------------------------------------------------------------------------------------------------
RUN sudo apt-get -y install cuda
RUN sudo apt-get update
#----------- End Dockerfile --------------------------------------
I’ve already tried the following commands, but NONE of them solved my problem…
1) Remove all Nvidia packages and reinstall
RUN sudo apt-get autoremove -y --purge $(dpkg --get-selections | grep nvidia | awk '{print $1}')
RUN sudo ubuntu-drivers autoinstall
2) Disable the nouveau driver
RUN mkdir -p /etc/modprobe.d/ && touch /etc/modprobe.d/blacklist-nvidia-nouveau.conf
RUN echo "blacklist nouveau" > /etc/modprobe.d/blacklist-nvidia-nouveau.conf
RUN echo "options nouveau modeset=0" >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf
RUN cat /etc/modprobe.d/blacklist-nvidia-nouveau.conf
RUN apt-get update
3) Nvidia fallback service, nouveau and bbswitch
#NVIDIA GPU, Optimus Prime and Ubuntu 18.04 Woes | by Amitosh Swain Mahapatra | Medium
RUN sudo systemctl disable nvidia-fallback.service
RUN sudo apt-get --reinstall install -y grub-pc
RUN sudo apt-get update
#Blacklist nouveau driver using GRUB config. In /etc/default/grub look for a line GRUB_CMDLINE_LINUX .
#Add nouveau.blacklist=1 into that parameter. #If the line is not present add this line GRUB_CMDLINE_LINUX="nouveau.blacklist=1"
WORKDIR /etc/default/
RUN sed -i grub -e '11s!GRUB_CMDLINE_LINUX=""!GRUB_CMDLINE_LINUX="nouveau.blacklist=1"!'
RUN sudo update-grub
WORKDIR /
#bbswitch (only for laptop users interested for power savings, if your system supports it.)
RUN sudo apt-get install -y bbswitch-dkms
#Configure the system to load it by appending bbswitch in /etc/modules
#To disable the card on boot run
RUN sudo echo "options bbswitch load_state=0" | sudo tee /etc/modprobe.d/bbswitch.conf
RUN sudo apt-get update
RUN sudo prime-select intel
RUN sudo prime-select nvidia
RUN sudo apt-get update
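The GRUB edit in attempt 3 only works if GRUB_CMDLINE_LINUX sits on line 11 of the file. A version that does not depend on the line number, and that also covers the "add the line if it is not present" case from the comment, could be sketched like this. set_nouveau_blacklist is a hypothetical helper name, and the sketch deliberately leaves a non-empty GRUB_CMDLINE_LINUX untouched.

```shell
# Set GRUB_CMDLINE_LINUX="nouveau.blacklist=1" in the given grub defaults file.
# - empty value      -> replaced
# - line missing     -> appended
# - non-empty value  -> left as is, with a notice
set_nouveau_blacklist() {
  grub_file="$1"
  if grep -q '^GRUB_CMDLINE_LINUX=""' "$grub_file"; then
    # Write to a temp file and move it back, so no in-place sed flag is needed
    sed 's/^GRUB_CMDLINE_LINUX=""/GRUB_CMDLINE_LINUX="nouveau.blacklist=1"/' \
      "$grub_file" > "$grub_file.tmp" && mv "$grub_file.tmp" "$grub_file"
  elif grep -q '^GRUB_CMDLINE_LINUX=' "$grub_file"; then
    echo "GRUB_CMDLINE_LINUX already has a value, not changing it" >&2
  else
    echo 'GRUB_CMDLINE_LINUX="nouveau.blacklist=1"' >> "$grub_file"
  fi
}

# Example call:
#   set_nouveau_blacklist /etc/default/grub
```

The pattern is anchored on GRUB_CMDLINE_LINUX= so it does not touch the separate GRUB_CMDLINE_LINUX_DEFAULT line.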
Can someone help me? Thank you very much.