Hi all,
I’m having some issues running a Docker container (Triton Inference Server) on a VM that uses the host GPU via passthrough.
My setup is the following:
Host:
Ubuntu 20.04
NVIDIA Quadro RTX 4000
16 GB RAM
800 GB disk
16 cores
VM:
Ubuntu 20.04
10 GB RAM
48 GB disk
8 cores
Virtualized using QEMU
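For reference, the relevant part of the QEMU invocation is along these lines (a minimal sketch: the host PCI address 05:00.0 and the disk image name are assumptions, and all unrelated flags are left out):

# Minimal sketch of the VM launch; host=05:00.0 is the card's address on the
# host (assumed here; it can differ from the guest Bus-Id that nvidia-smi shows)
qemu-system-x86_64 -enable-kvm -machine q35 -m 10G -smp 8 \
  -device vfio-pci,host=05:00.0 \
  -drive file=vm.qcow2,if=virtio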
I believe the passthrough is configured correctly, since I can see the graphics card using lspci (shown after the nvidia-smi output below), and nvidia-smi identifies it correctly:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01   Driver Version: 470.103.01   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro RTX 4000     On   | 00000000:05:00.0 Off |                  N/A |
| 30%   36C    P8     9W / 125W |      1MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
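The lspci check I mentioned is roughly this (run inside the VM; the grep pattern is just illustrative):

# The passed-through card should show up as an NVIDIA device, with the
# nvidia kernel driver bound to it
lspci -nnk | grep -iA3 nvidia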
I’ve installed nvidia-docker2 and configured it as Docker’s default runtime. I’ve also installed the kernel headers, the development packages, and the CUDA drivers (470.103.01-1).
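For the default runtime, my /etc/docker/daemon.json follows the standard nvidia-docker2 example:

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}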
When I run a container that uses the GPU (such as Triton Inference Server), the container freezes not only itself but the whole VM: it is not possible to docker stop or docker rm it, and the VM becomes unresponsive.
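For what it’s worth, the run command is essentially the standard one from the Triton docs (the image tag and the model repository path are placeholders):

# 8000/8001/8002 are Triton's HTTP, gRPC, and metrics ports;
# <xx.yy> stands for the Triton release tag
docker run --gpus all --rm \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:<xx.yy>-py3 \
  tritonserver --model-repository=/models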
I’m not sure whether I need an NVIDIA license server, or whether my graphics card is simply not compatible with GPU passthrough/IOMMU.
Any help is welcome. Thanks!