Adding GPU to Docker on Rocky Linux platform

I want to make an “NVIDIA GeForce GTX 1050 Ti” graphics card available to Docker containers. Following the guides linked below, I installed the graphics driver and CUDA, as well as the NVIDIA Container Toolkit for Docker, on Rocky Linux.

https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html

  • The relevant drivers were installed according to the following path. The nouveau module is not loaded and the nvidia module is loaded.

But when a Docker container is running and I run nvidia-smi on the Rocky host, it reports that no processes were found:

Sun Feb  4 14:49:33 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1050 Ti     On  | 00000000:04:00.0 Off |                  N/A |
|  0%   48C    P5              N/A /  75W |      1MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

  • I started the container several times, trying each of the following parameters separately, but it made no difference.
--pid=host --privileged --gpus 'all,capabilities=utility' --runtime=nvidia
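
One way to check whether a container can see the GPU at all is to start a throwaway container that only runs nvidia-smi. The sketch below assumes the NVIDIA Container Toolkit is already configured as a Docker runtime. The function name gpu_smoke_test and the default image ubuntu are illustrative choices, not taken from the post. It requests the compute and graphics driver capabilities in addition to utility, on the assumption that the real workload needs more than nvidia-smi itself.

```shell
#!/bin/sh
# Sketch: run nvidia-smi in a throwaway container to confirm GPU visibility.
# gpu_smoke_test and the default image name are hypothetical choices.
gpu_smoke_test() {
    image="${1:-ubuntu}"
    # "utility" exposes nvidia-smi; "compute" and "graphics" are requested
    # on the assumption that CUDA or OpenGL workloads will follow.
    docker run --rm --runtime=nvidia \
        -e NVIDIA_VISIBLE_DEVICES=all \
        -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics \
        "$image" nvidia-smi
}
```

Calling gpu_smoke_test with no argument uses the default image. If the usual nvidia-smi table is printed, the device and driver libraries were injected into the container.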

The output of the strace nvidia-smi command is as follows:
ou.txt (32.5 KB)

My questions:

  1. Why doesn’t nvidia-smi show any processes, and how can I tell whether the graphics card is actually available to the Docker container? :(

  2. Should nvidia-smi be executed on the Rocky host, or inside the container, as follows?

  • docker exec -it af868d81d6f4 nvidia-smi

Is there no one who can help me? I have been dealing with this issue for several days and it has still not been resolved.

Is there no one at NVIDIA who can help me understand why the processes are not showing up?

We are trying to find someone who could help.
I have not used Rocky or this container, but nvidia-smi reporting that no processes are running is what I would expect, unless you are actually running some GPU-accelerated task in the background.
I would suggest running some of the CUDA Samples, something that takes a good few minutes, and then running nvidia-smi to see whether the process shows up.
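
The check suggested here could be scripted roughly as follows: start the long-running CUDA sample in one terminal, then sample the list of compute processes a few times in another. This is only a sketch. The helper name watch_gpu_processes and its default sample count and interval are assumptions for illustration. Note that --query-compute-apps lists compute (CUDA) processes only, and that inside a container the list may stay empty if the container does not share the host PID namespace, so running it on the host is the safer test.

```shell
#!/bin/sh
# Sketch: list GPU compute processes a few times while a workload runs.
# watch_gpu_processes is a hypothetical helper name.
watch_gpu_processes() {
    samples="${1:-5}"      # number of samples to take
    interval="${2:-2}"     # seconds to wait between samples
    i=0
    while [ "$i" -lt "$samples" ]; do
        # One CSV row per compute process; no rows means none were found.
        nvidia-smi --query-compute-apps=pid,process_name,used_memory \
                   --format=csv,noheader
        i=$((i + 1))
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, watch_gpu_processes 10 1 takes ten samples one second apart.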

Sorry again for the lack of response.

Thank you for your answer @nadeemm.
This is how I use it inside the container: Google Earth is pointed, via the $DISPLAY environment variable, at Xvfb acting as the X server, and a VNC server then reads from that X server.

Maybe the GPU doesn’t support this method?
What’s the solution?
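
One way to see what an application on a given X display would render with is to ask that display for its OpenGL renderer string. The sketch below assumes glxinfo is installed inside the container (its package name varies by distribution) and that the Xvfb display number is known. The helper name show_gl_renderer is hypothetical. A renderer string mentioning llvmpipe would point to software rendering, while one naming the NVIDIA card would suggest the GPU is being used for OpenGL.

```shell
#!/bin/sh
# Sketch: report which OpenGL renderer an X display is using.
# show_gl_renderer is a hypothetical helper; glxinfo must be installed.
show_gl_renderer() {
    # Use the display given as the first argument, else $DISPLAY, else :0.
    display="${1:-${DISPLAY:-:0}}"
    DISPLAY="$display" glxinfo -B | grep -i "OpenGL renderer"
}
```

For example, show_gl_renderer :99 queries display :99, where :99 stands in for whatever display Xvfb was started on.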

I downloaded the CUDA Samples from https://github.com/nvidia/cuda-samples and ran them in the container as follows:

Then I ran nvidia-smi in the container at the same time, and it reported varying utilization percentages (3%, 9%, 16%).

I still don’t know whether the GPU is being used, because nvidia-smi doesn’t show any process, even though I was checking its output in a loop.

Do you think the GPU is being used?
If it is, why doesn’t the Google Earth application running inside the container use the GPU?
What should I do?