I'm trying to make an NVIDIA GeForce GTX 1050 Ti graphics card available to Docker containers. Following the links below, I installed the graphics driver and CUDA, as well as the NVIDIA Container Toolkit for Docker, on Rocky Linux.
https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
- The relevant drivers were installed following the guides above; the nouveau module is not loaded and the nvidia module is loaded.
However, when a Docker container is running and I run nvidia-smi on the Rocky host, it reports that no processes were found:
Sun Feb 4 14:49:33 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08 Driver Version: 545.23.08 CUDA Version: 12.3 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce GTX 1050 Ti On | 00000000:04:00.0 Off | N/A |
| 0% 48C P5 N/A / 75W | 1MiB / 4096MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
- I started the container several times, each time with one of the following parameters, but it made no difference:
--pid=host --privileged --gpus 'all,capabilities=utility' --runtime=nvidia
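For illustration, here is a rough sketch of how the GPU-related flags listed above might be combined into a single test invocation that runs nvidia-smi in a throwaway container. The image name `ubuntu` and the helper `gpu_run_cmd` are hypothetical placeholders, not what was actually used here; by default the script only prints the command, and executing it assumes Docker and the NVIDIA Container Toolkit are installed.

```shell
#!/bin/sh
# Hypothetical placeholder image; the post does not say which image was used.
IMAGE="${IMAGE:-ubuntu}"

# Print a `docker run` command line that starts a throwaway container with
# the NVIDIA runtime and the "utility" capability, then runs nvidia-smi in it.
gpu_run_cmd() {
    printf "docker run --rm --runtime=nvidia --gpus 'all,capabilities=utility' %s nvidia-smi\n" "$1"
}

cmd=$(gpu_run_cmd "$IMAGE")
echo "$cmd"

# Dry run by default; set RUN=1 to actually execute the printed command.
if [ "${RUN:-0}" = 1 ]; then
    sh -c "$cmd"
fi
```

If the container toolkit is wired up correctly, the executed command should print the same GPU table as on the host.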
The output of strace nvidia-smi is attached:
ou.txt (32.5 KB)
My questions:
- Why doesn't nvidia-smi show any processes, and how can I tell whether the graphics card is actually available to the Docker container? :(
- Should nvidia-smi be run on the Rocky host, or inside the container, as follows?
docker exec -it af868d81d6f4 nvidia-smi
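As a possible way to compare both environments, the sketch below runs `nvidia-smi -L` (which lists the detected GPUs) once on the host and once inside the container via `docker exec`, and reports whether each side lists a GPU. The helper name `lists_gpu` is made up for illustration, the container ID is the one from the command above, and the script assumes the container is already running.

```shell
#!/bin/sh
# Container ID taken from the command above; override with CONTAINER=... if needed.
CONTAINER="${CONTAINER:-af868d81d6f4}"

# Succeed if the text in $1 contains at least one line in the
# "GPU <index>: <name> (UUID: ...)" form printed by `nvidia-smi -L`.
lists_gpu() {
    printf '%s\n' "$1" | grep -q '^GPU [0-9][0-9]*: '
}

# Host side: only query nvidia-smi if it is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    host_out=$(nvidia-smi -L 2>/dev/null)
    if lists_gpu "$host_out"; then
        echo "host: GPU visible"
    else
        echo "host: no GPU listed"
    fi
fi

# Container side: only query if the docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
    ctr_out=$(docker exec "$CONTAINER" nvidia-smi -L 2>/dev/null)
    if lists_gpu "$ctr_out"; then
        echo "container: GPU visible"
    else
        echo "container: no GPU listed"
    fi
fi
```

If the host lists the GPU but the container does not, that would suggest the card is not being passed through to the container.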