Hi all,
I am running the DIGITS Docker container, but it fails to recognize the host's GPU. It reports no GPUs where I expect one: the upper right corner of the DIGITS home page shows no GPU indicator, and during training DIGITS uses only the CPU.
I have a GeForce GT 640 graphics card:
$ nvidia-smi -L
GPU 0: GeForce GT 640 (UUID: GPU-f2583df9-404d-2564-d332-e7878a94d087)
$ lspci
...
VGA compatible controller: NVIDIA Corporation GK107 [GeForce GT 640 OEM] (rev a1)
...
GK107 is the code name of the GeForce GT 640 (GDDR5) (source: GeForce 600 series - Wikipedia), which, according to CUDA GPUs - Compute Capability | NVIDIA Developer, has compute capability 3.5. That should be supported, since the Installation Guide — NVIDIA Cloud Native Technologies documentation requires a compute capability above 2.1.
This is my docker run command:
$ docker run --gpus all -d --name digits --rm -p 8888:5000 -v /home/userx/data:/data -v /home/userx/jobs:/workspace/jobs nvcr.io/nvidia/digits:20.12-tensorflow-py3
When I run nvidia-smi inside the Docker container, it does see the graphics card:
$ docker exec -it digits bash
root@e58b860504a9:/workspace# nvidia-smi
Fri Feb 12 23:33:17 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GT 640      Off  | 00000000:01:00.0 N/A |                  N/A |
| 40%   32C    P8    N/A /  N/A |    260MiB /  1992MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
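To compare what the host and the container report, the `nvidia-smi -L` listing can be reduced to a GPU count. This is only a minimal sketch: `count_gpus` is a hypothetical helper name, and it assumes each GPU appears as one line starting with `GPU <index>:`, as in the output above.

```shell
# count_gpus: hypothetical helper (not part of nvidia-smi or Docker).
# Reads `nvidia-smi -L` output on stdin and prints the number of
# lines of the form "GPU <index>: ...".
count_gpus() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the exit status clean in that case.
    grep -c '^GPU [0-9][0-9]*:' || true
}

# Intended use (assumes the container is named "digits", as in my run command):
#   nvidia-smi -L | count_gpus
#   docker exec digits nvidia-smi -L | count_gpus
```

If both counts match, the driver is visible from the container, so the problem would lie between the driver and DIGITS rather than in the Docker GPU passthrough.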
I am using the latest versions of Docker and NVIDIA Docker:
$ docker --version
Docker version 20.10.3, build 48d30b5
$ nvidia-docker version
NVIDIA Docker: 2.5.0
Client: Docker Engine - Community
 Version:           20.10.3
 API version:       1.41
 Go version:        go1.13.15
 Git commit:        48d30b5
 Built:             Fri Jan 29 14:33:21 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.3
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       46229ca
  Built:            Fri Jan 29 14:31:32 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
I am running Ubuntu 20.04:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal
I installed the most recent version of the NVIDIA driver for Ubuntu:
$ modinfo nvidia
filename: /lib/modules/5.4.0-65-generic/updates/dkms/nvidia.ko
alias: char-major-195-*
version: 460.32.03
supported: external
license: NVIDIA
srcversion: 9BFA7969070552C6938D8A8
alias: pci:v000010DEd*sv*sd*bc03sc02i00*
alias: pci:v000010DEd*sv*sd*bc03sc00i00*
depends:
retpoline: Y
name: nvidia
vermagic: 5.4.0-65-generic SMP mod_unload
...
Would anyone be kind enough to explain why DIGITS running in Docker does not recognize my graphics card?
Thanks,
Bojan