Docker container can't use GPU

Hello, I have a PC with a GPU, but I can't get a Docker container running on it to use the GPU. How can I run GPU-enabled containers?

Dockerfile:
FROM nvcr.io/nvidia/tensorflow:21.12-tf2-py3

Commands to build the image and run the container:
sudo docker build --tag=solonvidiatensorflow:latest .
sudo docker run --tty --detach --name container_nvidiatensorflow solonvidiatensorflow:latest

If I run:

a)
a.1)
nvidia-smi:

Fri Jun 24 16:10:44 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
|  0%   41C    P0    28W / 200W |    278MiB /  6078MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1705      G   /usr/lib/xorg/Xorg                175MiB |
|    0   N/A  N/A      1838      G   /usr/bin/gnome-shell               27MiB |
|    0   N/A  N/A      2435      G   ...AAAAAAAAA= --shared-files       71MiB |
+-----------------------------------------------------------------------------+

a.2)
sudo docker exec container_nvidiatensorflow nvidia-smi:
OCI runtime exec failed: exec failed: unable to start container process: exec: "nvidia-smi": executable file not found in $PATH: unknown

b)
b.1)
python3 -c "import tensorflow as tf ; print('Num GPUs Available: ', len(tf.config.list_physical_devices('GPU')))":

2022-06-24 17:11:33.019941: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-24 17:11:33.038839: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-24 17:11:33.038983: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
Num GPUs Available: 1

b.2)
sudo docker exec container_nvidiatensorflow python3 -c "import tensorflow as tf ; print('Num GPUs Available: ', len(tf.config.list_physical_devices('GPU')))":

2022-06-24 21:21:30.820790: W tensorflow/stream_executor/platform/default/dso_loader.cc:65] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2022-06-24 21:21:30.820806: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-06-24 21:21:30.820816: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
Num GPUs Available: 0

c)
c.1)
python3 -c "import tensorflow as tf ; print(tf.test.gpu_device_name())":

2022-06-24 17:14:26.986965: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-24 17:14:27.016277: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-24 17:14:27.035709: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-24 17:14:27.035852: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-24 17:14:27.307803: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-24 17:14:27.307963: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-24 17:14:27.308069: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-24 17:14:27.308173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:0 with 4948 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1060 6GB, pci bus id: 0000:01:00.0, compute capability: 6.1
/device:GPU:0

c.2)
sudo docker exec container_nvidiatensorflow python3 -c "import tensorflow as tf ; print(tf.test.gpu_device_name())":

2022-06-24 21:22:29.321447: W tensorflow/stream_executor/platform/default/dso_loader.cc:65] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/compat/lib:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2022-06-24 21:22:29.321463: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-06-24 21:22:29.321477: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist

If you need further information, just ask.
Thanks in advance!

Hi,

It looks like you're missing the --gpus all option in your docker run command. We also recommend using the latest container image.
This forum focuses on updates and issues related to cuDNN. For container-related questions, please reach out on NVIDIA's container platform forum to get better help.
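
For reference, a minimal sketch of the corrected commands, assuming the NVIDIA Container Toolkit is already installed on the host (the --gpus flag depends on it):

sudo docker run --gpus all --tty --detach --name container_nvidiatensorflow solonvidiatensorflow:latest
sudo docker exec container_nvidiatensorflow nvidia-smi
sudo docker exec container_nvidiatensorflow python3 -c "import tensorflow as tf ; print('Num GPUs Available: ', len(tf.config.list_physical_devices('GPU')))"

If --gpus is not recognized, installing the nvidia-container-toolkit package and restarting the Docker daemon should make it available.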

Thank you.