Hi everyone
We recently started using NVIDIA graphics cards in virtual machines by passing them through with OpenNebula. Below, I explain the issue in more detail.
First, we installed Debian 11 on the servers that host the graphics cards. We use OpenNebula for virtualization and installed it on these Debian servers.
The CPU details of the OpenNebula hypervisor server are as follows:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 43 bits physical, 48 bits virtual
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD Ryzen Threadripper 3960X 24-Core Processor
Stepping: 0
Frequency boost: enabled
CPU MHz: 2199.662
CPU max MHz: 6635.1558
CPU min MHz: 2200.0000
BogoMIPS: 7600.17
Virtualization: AMD-V
Then, following the OpenNebula documentation, we set up PCI passthrough for the GPUs.
In the Sunstone panel (the OpenNebula web app), we were able to assign a number of GPUs to the virtual machine and use them.
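For anyone reproducing this, PCI passthrough generally depends on how the host groups devices into IOMMU groups. The following is a minimal sketch, not part of our actual setup, that lists the groups on the hypervisor. It assumes the kernel exposes them under /sys/kernel/iommu_groups (the usual location when the IOMMU is enabled); the function name iommu_groups is only illustrative.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    if not os.path.isdir(root):
        # IOMMU is disabled or the kernel does not expose the groups here.
        return {}
    groups = {}
    for group in os.listdir(root):
        dev_dir = os.path.join(root, group, "devices")
        if os.path.isdir(dev_dir):
            groups[group] = sorted(os.listdir(dev_dir))
    return groups


if __name__ == "__main__":
    # Group directory names are numeric, so sort them as integers.
    for group, devices in sorted(iommu_groups().items(), key=lambda kv: int(kv[0])):
        print(f"group {group}: {', '.join(devices)}")
```

If a GPU and its audio function share a group with unrelated devices, passing the GPU through on its own is usually not possible, so this is worth checking on the host first.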
The output of the lspci command in the VM that uses the passthrough GPUs is as follows:
01:01.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2204] (rev a1)
01:02.0 Audio device [0403]: NVIDIA Corporation Device [10de:1aef] (rev a1)
01:03.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2204] (rev a1)
01:04.0 Audio device [0403]: NVIDIA Corporation Device [10de:1aef] (rev a1)
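In case it helps with scripting checks like this, here is a small sketch that extracts the slot, class ID, and vendor:device ID from lines in the format shown above. It assumes the numeric IDs are present in square brackets, as in the listing; the names LSPCI_LINE and parse_lspci are illustrative only.

```python
import re

# Matches lines such as:
# 01:01.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:2204] (rev a1)
LSPCI_LINE = re.compile(
    r"^(?P<slot>[0-9a-f:.]+)\s+(?P<cls>.+?) \[(?P<class_id>[0-9a-f]{4})\]: "
    r".*\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)


def parse_lspci(text):
    """Return one dict per recognised line; unrecognised lines are skipped."""
    devices = []
    for line in text.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if match:
            devices.append(match.groupdict())
    return devices


if __name__ == "__main__":
    sample = (
        "01:01.0 VGA compatible controller [0300]: "
        "NVIDIA Corporation Device [10de:2204] (rev a1)"
    )
    for dev in parse_lspci(sample):
        print(dev["slot"], dev["class_id"], f'{dev["vendor"]}:{dev["device"]}')
```

Here 10de is the NVIDIA vendor ID, so filtering on that field is a simple way to pick out the passed-through GPU functions.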
The output of nvidia-smi -L in the VM is as follows:
GPU 0: NVIDIA GeForce RTX 3090 (UUID: GPU-9ba68e69-e2ce-5d2c-ad15-5884706fd049)
GPU 1: NVIDIA GeForce RTX 3090 (UUID: GPU-a71db27c-bec8-5f4b-cba3-b4c5a91cf19f)
We use the graphics cards by installing the NVIDIA drivers and the NVIDIA Docker runtime in the VM, which runs Ubuntu 20.04. We installed the following packages in Ubuntu:
nvidia-headless-510 nvidia-utils-510 cuda-toolkit-11-6
and, for Docker: nvidia-docker2
The output of the following command shows that the GPUs are accessible from Docker in the virtual machine and that the NVIDIA driver is working:
root@localhost:~# docker run --rm --gpus all nvidia/cuda:11.2.0-runtime-ubuntu20.04 nvidia-smi
Mon Dec 26 11:06:43 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.108.03 Driver Version: 510.108.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:01.0 Off | N/A |
| 0% 32C P8 12W / 350W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... On | 00000000:01:03.0 Off | N/A |
| 0% 36C P8 27W / 350W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+