Help setting up a Linux VM with GPU passthrough enabled

Hi all,

I’m having some issues running a Docker container (Triton Inference Server) on a VM that uses the host GPU via passthrough.

My setup is the following:

Host:
Linux Ubuntu 20.04
NVIDIA RTX P4000
16GB RAM
800 GB Disk
16 cores

VM:
Linux Ubuntu 20.04
10GB RAM
48 GB Disk
8 Cores
Virtualized using QEMU

I believe the passthrough is configured correctly, since I can see the graphics card with lspci inside the VM and nvidia-smi identifies it correctly:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 4000     On   | 00000000:05:00.0 Off |                  N/A |
| 30%   36C    P8     9W / 125W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I’ve installed nvidia-docker2 and configured it as the default Docker runtime. I’ve also installed the kernel headers and development packages, plus the CUDA drivers (470.103.01-1).

When I run a container that uses the GPU (such as Triton Inference Server), the container freezes, and it takes the whole VM down with it (it is not even possible to docker stop or docker rm the container).

I’m not sure whether I need an NVIDIA license server, or whether my graphics card is simply not compatible with GPU passthrough/IOMMU.

Any help is welcome. Thanks!

Howdy

You may need to configure your container to allow access and pass through the proper device files.

See link.
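
To make the device-file suggestion a bit more concrete, here is a rough sketch (not a definitive recipe) that looks for the NVIDIA device nodes inside the VM and prints a docker run command that requests the GPU and also passes those nodes through explicitly. It only prints the command and does not start anything. The helper names and the image name are placeholders I made up for illustration. I am assuming Docker 19.03 or newer with the NVIDIA container runtime installed, which is what makes the --gpus flag available, and that nvidia-smi is available inside the container.

```python
import glob
import os
import shlex

# Placeholder image name (an assumption): substitute the image you actually use.
IMAGE = "your-registry/your-gpu-image:tag"


def find_nvidia_devices(dev_dir="/dev"):
    """Return NVIDIA device nodes (e.g. nvidia0, nvidiactl, nvidia-uvm) under dev_dir.

    Directories such as nvidia-caps are skipped, so only plain nodes are listed.
    """
    paths = glob.glob(os.path.join(dev_dir, "nvidia*"))
    return sorted(p for p in paths if not os.path.isdir(p))


def build_docker_command(image, devices=(), gpus="all"):
    """Build the argument list for a `docker run` that exposes the GPU.

    gpus:    value for Docker's --gpus flag, or None to leave the flag out.
    devices: device nodes to pass through explicitly with --device.
    """
    cmd = ["docker", "run", "--rm"]
    if gpus is not None:
        cmd += ["--gpus", gpus]
    for dev in devices:
        # Map each host node to the same path inside the container.
        cmd += ["--device", f"{dev}:{dev}"]
    # Run nvidia-smi as a smoke test instead of the real workload.
    cmd += [image, "nvidia-smi"]
    return cmd


if __name__ == "__main__":
    devices = find_nvidia_devices()
    if not devices:
        print("No /dev/nvidia* nodes found; the driver may not be loaded in the VM.")
    # Print the command for review; nothing is executed here.
    print(shlex.join(build_docker_command(IMAGE, devices)))
```

If the printed command with nvidia-smi as the workload already hangs the VM, the problem is probably below Docker (driver or passthrough setup) rather than in the Triton container itself.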