Adding a GPU to Docker on Rocky Linux

I want to make an NVIDIA GeForce GTX 1050 Ti graphics card available to Docker containers. Following the guides linked below, I installed the graphics driver and CUDA, as well as the NVIDIA Container Toolkit for Docker, on Rocky Linux.

https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

  • The relevant drivers are installed, the nouveau module is not loaded, and the nvidia module is loaded.
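For reference, the module state described above can be checked with something like the sketch below. The helper names `module_loaded` and `report_modules` are made up for illustration; they only inspect `lsmod`-style text supplied on standard input.

```shell
# Hypothetical helper: succeeds if the named kernel module appears as the
# first column of lsmod-style output supplied on standard input.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit(found ? 0 : 1) }'
}

# Hypothetical helper: report the state of nouveau and nvidia from a
# captured lsmod listing read from standard input.
report_modules() {
    listing=$(cat)
    for mod in nouveau nvidia; do
        if printf '%s\n' "$listing" | module_loaded "$mod"; then
            echo "$mod: loaded"
        else
            echo "$mod: not loaded"
        fi
    done
}
```

On the host this would be used as `lsmod | report_modules`. The expected result for the setup described here is that nouveau is not loaded and nvidia is loaded.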

However, when a Docker container is running and I run the nvidia-smi command on the Rocky host, it reports that no running processes were found:

Sun Feb  4 14:49:33 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1050 Ti     On  | 00000000:04:00.0 Off |                  N/A |
|  0%   48C    P5              N/A /  75W |      1MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
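The Processes table above only lists processes that currently hold a context on the GPU, so a container that is running but has not started any GPU workload would not be expected to appear in it. As a rough sketch, the same information can be queried in CSV form and counted. The helper name `count_gpu_processes` is hypothetical; it only counts non-empty lines read from standard input.

```shell
# Hypothetical helper: count process rows in the CSV produced by
#   nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader
# The CSV is read from standard input; blank lines are ignored.
count_gpu_processes() {
    awk 'NF > 0 { n++ } END { print n + 0 }'
}
```

It would be used on the host as `nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv,noheader | count_gpu_processes`. A count of 0 matches the "No running processes found" message and does not by itself show that the container lacks access to the GPU.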

  • I started the container several times, trying each of the following parameters separately, but it made no difference:
--pid=host --privileged --gpus 'all,capabilities=utility' --runtime=nvidia
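For comparison, the NVIDIA Container Toolkit guide linked above describes a sample workload that starts a throwaway container with GPU access and runs nvidia-smi inside it. A minimal sketch of that check follows; the wrapper name `gpu_smoke_test` and the default image `ubuntu` are assumptions for illustration, not part of my setup.

```shell
# Hypothetical wrapper around the sample workload from the NVIDIA Container
# Toolkit guide: start a temporary container with all GPUs attached and run
# nvidia-smi inside it. The image defaults to "ubuntu" (an assumption).
gpu_smoke_test() {
    image="${1:-ubuntu}"
    docker run --rm --runtime=nvidia --gpus all "$image" nvidia-smi
}
```

Calling `gpu_smoke_test` should print the same GPU table as on the host if the card is visible inside containers. Depending on how Docker is installed, the call may need to be run as root.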

The output of strace nvidia-smi is attached:
ou.txt (32.5 KB)

My questions:

  1. Why doesn’t the nvidia-smi command show any processes, and how can I tell whether the graphics card is actually available to the Docker container? :(

  2. Should the nvidia-smi command be run on the Rocky host, or inside the container, as follows?

  • docker exec -it af868d81d6f4 nvidia-smi

Is there no one who can help me? I have been dealing with this issue for several days and it is still not resolved.

Is there no one at NVIDIA who can help me understand why the processes are not showing up?